Boosting Higgs pair production in the $b\bar{b}b\bar{b}$ final state with multivariate techniques

Behr, J. Katharina; Bortoletto, Daniela; Frost, James A.; Hartland, Nathan P.; Issever, Cigdem; Rojo, Juan

doi:10.1140/epjc/s10052-016-4215-5

Regular Article - Theoretical Physics
Open access
Published: 08 July 2016

Volume 76, article number 386, (2016)
The European Physical Journal C

A preprint version of the article is available at arXiv.

Abstract

The measurement of Higgs pair production will be a cornerstone of the LHC program in the coming years. Double Higgs production provides a crucial window upon the mechanism of electroweak symmetry breaking and has a unique sensitivity to the Higgs trilinear coupling. We study the feasibility of a measurement of Higgs pair production in the $b\bar{b}b\bar{b}$ final state at the LHC. Our analysis is based on a combination of traditional cut-based methods with state-of-the-art multivariate techniques. We account for all relevant backgrounds, including the contributions from light and charm jet mis-identification, which are ultimately comparable in size to the irreducible 4b QCD background. We demonstrate the robustness of our analysis strategy in a high pileup environment. For an integrated luminosity of ${\mathcal {L}}=3$ ab$^{-1}$, a signal significance of $S/\sqrt{B}\simeq 3$ is obtained, indicating that the $b\bar{b}b\bar{b}$ final state alone could allow for the observation of double Higgs production at the High Luminosity LHC.

1 Introduction

The measurement of double Higgs production will be one of the central physics goals of the LHC program in its recently started high-energy phase, as well as for its future high-luminosity upgrade (HL-LHC) which aims to accumulate a total integrated luminosity of 3 ab$^{-1}$ [1, 2]. Higgs pair production [3] is directly sensitive to the Higgs trilinear coupling $\lambda $ and provides crucial information on the electroweak symmetry breaking mechanism. It also probes the underlying strength of the Higgs interactions at high energies, and it can be used to test the composite nature of the Higgs boson [4, 5]. While Standard Model (SM) cross sections are small, many Beyond the SM (BSM) scenarios predict enhanced rates for double Higgs production; therefore searches have already been performed by ATLAS and CMS with Run I data [6–10] and will continue at Run II. The study of Higgs pair production will also be relevant to any future high-energy collider, either at a 100 TeV circular machine [11–14] or at a linear or circular electron–positron collider [15].

Analogously to single Higgs production [16], in the SM the dominant mechanism for the production of a pair of Higgs bosons at the LHC is gluon fusion (see [3, 17] and references therein). For a center-of-mass energy of $\sqrt{s} = 14$ TeV, the next-to-next-to-leading order (NNLO) total cross section is approximately 40 fb [18], which is increased by a further few percent once next-to-next-to-leading logarithmic (NNLL) corrections are accounted for [19]. Feasibility studies in the case of a SM-like Higgs boson in the gluon-fusion channel at the LHC have been performed for different final states, including $b\bar{b}\gamma \gamma $ [20–22], $b\bar{b}\tau ^+\tau ^-$ [23–26], $b\bar{b}W^+W^-$ [25, 27] and $b\bar{b}b\bar{b}$ [21, 23, 25, 28, 29]. While these studies differ in their quantitative conclusions, a consistent picture emerges: the ultimate precision in the determination of the Higgs trilinear coupling $\lambda $ requires the full integrated luminosity of the HL-LHC, ${\mathcal {L}}=3$ ab$^{-1}$, and should rely on the combination of different final states. The interplay between kinematic distributions for the extraction of $\lambda $ from the measured cross sections and the role of the associated theoretical uncertainties have been intensely scrutinised recently [17, 30–37].

In addition to the gluon-fusion channel, Higgs pairs can also be produced in the vector-boson fusion channel hhjj [5, 26, 38, 39], the associated production modes hhW and hhZ [3, 40, 41] (also known as Higgs-Strahlung), and also in association with top quark pairs $hht\bar{t}$ [42]. All these channels are challenging due to the small production rates: at 14 TeV, the inclusive total cross sections are 2.0 fb for VBF hhjj [43], 0.5 fb for W(Z)hh [3] and 1.0 fb for $hht\bar{t}$ [42].

While the SM production rates for Higgs pairs are small, they are substantially enhanced in a variety of BSM scenarios. Feasibility studies of Higgs pair production in New Physics models have been performed in a number of different frameworks, including Effective Field Theories (EFTs) with higher-dimensional operators and anomalous Higgs couplings [14, 44–50], resonant production in models such as extra dimensions [51–54], and Supersymmetry and Two Higgs Doublet models (2HDMs) [55–61]. Since BSM dynamics modify the kinematic distributions of the Higgs decay products, for instance boosting the di-Higgs system, different analysis strategies might be required for BSM Higgs pair searches as compared to SM measurements.

Searches for the production of Higgs pairs have already been performed with 8 TeV Run I data by ATLAS in the $b\bar{b}b\bar{b}$ [7] and $b\bar{b}\gamma \gamma $ [8] final states, and by CMS in the same $b\bar{b}b\bar{b}$ [9] and $b\bar{b}\gamma \gamma $ [10] final states. In addition, ATLAS has presented [6] a combination of its di-Higgs searches in the $bb\tau \tau ,$ $\gamma \gamma WW^*$, $\gamma \gamma bb$ and bbbb final states. Many other exotic searches involve Higgs pairs in the final state, such as the recent search for heavy Higgs bosons H [62].

In the context of SM production, the main advantage of the $b\bar{b}b\bar{b}$ final state is the enhancement of the signal yield from the large branching fraction of Higgs bosons into $b\bar{b}$ pairs, $\mathrm{BR}\left( H\rightarrow b\bar{b}\right) \simeq 0.57$ [16]. However, a measurement in this channel needs to deal with an overwhelming QCD multi-jet background. Recent studies of Higgs pair production in this final state [28, 29] estimate that, for an integrated luminosity of ${{\mathcal {L}}}=3$ ab$^{-1}$, a signal significance of around $S/\sqrt{B}\simeq 2.0$ can be obtained. In these analyses, irreducible backgrounds such as 4b and $t\bar{t}$ are included; however, the reducible components, in particular bbjj and jjjj, are neglected. These can contribute to the signal yield when light and charm jets are mis-identified as b-jets. Indeed, due to both selection effects and b-quark radiation in the parton shower, the contribution of the 2b2j process is as significant as the irreducible 4b component.

In this work, we revisit the feasibility of SM Higgs pair production by gluon fusion in the $b\bar{b}b\bar{b}$ final state at the LHC. Our strategy is based upon a combination of traditional cut-based methods and multivariate analysis (MVA). We account for all relevant backgrounds, including the contribution from mis-identified light and charm jets. We also assess the robustness of our analysis strategy in an environment with high pileup (PU). Our results indicate that the $b\bar{b}b\bar{b}$ final state alone should allow for the observation of double Higgs production at the HL-LHC.

The structure of this paper proceeds as follows. In Sect. 2 we present the modeling of the signal and background processes with Monte Carlo event generators. In Sect. 3 we introduce our analysis strategy, in particular the classification of individual events into different categories according to their topology. Results of the cut-based analysis are then presented in Sect. 4. In Sect. 5 we illustrate the enhancement of signal significance using multivariate techniques, and we assess the robustness of our results against the effects of PU. In Sect. 6 we conclude and outline future studies to estimate the accuracy in the determination of the trilinear coupling $\lambda $ and to provide constraints in BSM scenarios.

2 Modeling of signal and background processes

In this section we discuss the Monte Carlo generation of the signal and background process samples used in this analysis. We shall also discuss the modeling of detector resolution effects.

2.1 Higgs pair production in gluon fusion

Higgs pair production is simulated at leading order (LO) using MadGraph5_aMC@NLO [63]. We use a tailored model [34] for gluon-fusion Higgs boson pair production which includes mass effects from the exact form factors for the top-quark triangle and box loops [64]. Equivalent results can be obtained using the recently available functionalities for the calculation of loop-induced processes [65] in MadGraph5_aMC@NLO. The calculation is performed in the $n_f=4$ scheme, accounting for b-quark mass effects. The renormalisation and factorisation scales are taken to be $\mu _F=\mu _R=H_T/2$, with

$$\begin{aligned} H_T\equiv \sum _i \sqrt{p_{T,i}^2+m_i^2}, \end{aligned}$$

(1)

the scalar sum of the transverse masses of all final-state particles. For the input parton distribution functions (PDFs) we adopt the NNPDF 3.0 $n_f=4$ LO set [66] with $\alpha _s(m_Z^2)=0.118$, interfaced via LHAPDF6 [67]. The Higgs boson couplings and branching ratios are set to their SM values, and its mass is taken to be $m_h=125$ GeV [68–70]. In the SM, the Higgs trilinear coupling is given by $\lambda =m_h^2/2v^2$, with $v\simeq 246$ GeV the Higgs vacuum expectation value.
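The scale choice of Eq. (1) and the SM value of the trilinear coupling can be illustrated with a short numerical sketch. The function names and the example event below are hypothetical and serve only to make the definitions concrete; they are not part of the simulation chain described in the text.

```python
import math

def transverse_mass(pt, mass):
    """Transverse mass sqrt(pT^2 + m^2) of a single particle, in GeV."""
    return math.sqrt(pt * pt + mass * mass)

def dynamical_scale(particles):
    """Return mu_F = mu_R = H_T / 2, with H_T as in Eq. (1).

    `particles` is a list of (pT, mass) pairs in GeV for the
    final-state particles of the hard process.
    """
    h_t = sum(transverse_mass(pt, m) for pt, m in particles)
    return 0.5 * h_t

def sm_trilinear(m_h=125.0, v=246.0):
    """SM Higgs trilinear coupling, lambda = m_h^2 / (2 v^2)."""
    return m_h ** 2 / (2.0 * v ** 2)

# Hypothetical gg -> hh event: two Higgs bosons, each with pT = 100 GeV.
example_event = [(100.0, 125.0), (100.0, 125.0)]
mu = dynamical_scale(example_event)   # about 160 GeV
lam = sm_trilinear()                  # about 0.129
```

With the inputs quoted in the text, $m_h=125$ GeV and $v\simeq 246$ GeV, this gives $\lambda \simeq 0.13$.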

Table 1 Details of the signal and background Monte Carlo samples used in this work. Also provided are the inclusive K-factors which are applied to reproduce the known higher-order results

In Fig. 1 we show representative Feynman diagrams for LO Higgs pair production in gluon fusion. The non-trivial interplay between the heavy quark box and the triangle loop diagrams can lead to either constructive or destructive interference and complicates the extraction of the trilinear coupling $\lambda $ from the measurement of the Higgs pair production cross section. Higher-order corrections [17, 18] are dominated by gluon radiation from either the initial-state gluons or from the heavy quark loops.

The total inclusive cross section for this process is known up to NNLO [18]. Resummed NNLO+NNLL calculations for Higgs pair production are also available [19], leading to a moderate enhancement of the order of a few percent as compared to the fixed-order NNLO calculation. To achieve the correct higher-order value of the integrated cross section, we rescale our LO signal sample to match the NNLO+NNLL inclusive calculation. This corresponds to a K-factor $\sigma _\mathrm{NNLO+NNLL}/\sigma _\mathrm{LO}=2.4$, as indicated in Table 1.

Parton-level signal events are then showered with the Pythia8 Monte Carlo [71, 72], version v8.201. We use the default settings for the modeling of the underlying event (UE), multiple parton interactions (MPI), and PU, by means of the Monash 2013 tune [73], based on the NNPDF2.3LO PDF set [74, 75].

2.2 Backgrounds

Background samples are generated at leading order with SHERPA [76] v2.1.1. As in the case of the signal generation, the NNPDF 3.0 $n_f = 4$ LO set with strong coupling $\alpha _s(m_Z^2)=0.118$ is used for all samples, and we use as factorisation and renormalisation scales $\mu _F=\mu _R=H_T/2$. We account for all relevant background processes that can mimic the $hh\rightarrow 4b$ signal process. This includes QCD 4b multi-jet production, as well as QCD 2b2j and 4j production, and top-quark pair production. The latter is restricted to the fully hadronic final state, since leptonic decays of top quarks can be removed by requiring a lepton veto. Single Higgs production processes such as $Z(\rightarrow b\bar{b})h(\rightarrow b\bar{b})$ and $t\bar{t}h(\rightarrow b\bar{b})$ (see Appendix A) along with electroweak backgrounds e.g. $Z(\rightarrow b\bar{b})b\bar{b}$, are much smaller than the QCD backgrounds [28, 29] and are therefore not included in the present analysis.

The LO cross sections for the background samples have been rescaled so that the integrated distributions reproduce known higher-order QCD results. For the 4j sample, we rescale the LO cross section using the BLACKHAT [77] calculation, resulting in an NLO/LO K-factor of 0.6. For the 4b and 2b2j samples NLO/LO K-factors of 1.6 and 1.3, respectively, have been determined using MadGraph5_aMC@NLO [63]. Finally, the LO cross section for $t\bar{t}$ production has been rescaled to match the NNLO+NNLL calculation of Ref. [78], leading to a K-factor of 1.4. The K-factors that we use to rescale the signal and background samples are summarised in Table 1.
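As a minimal sketch of how such inclusive K-factors enter the event normalisation, the snippet below rescales an LO cross section and converts it into a per-event weight for a given integrated luminosity. The K-factor values are those quoted in the text; the function names, the LO cross section and the sample size used in the example are placeholders chosen for illustration, not the entries of Table 1.

```python
# Inclusive K-factors quoted in the text:
# hh and ttbar: NNLO+NNLL / LO; 4b, 2b2j and 4j: NLO / LO.
K_FACTORS = {"hh": 2.4, "4b": 1.6, "2b2j": 1.3, "4j": 0.6, "ttbar": 1.4}

def rescaled_cross_section(process, sigma_lo_fb):
    """Rescale an LO cross section (in fb) by the inclusive K-factor."""
    return K_FACTORS[process] * sigma_lo_fb

def event_weight(process, sigma_lo_fb, n_generated, lumi_fb=3000.0):
    """Weight per generated event so that the sample sums to K * sigma_LO * L.

    The default luminosity is 3 ab^-1 = 3000 fb^-1.
    """
    return rescaled_cross_section(process, sigma_lo_fb) * lumi_fb / n_generated

# Placeholder inputs (NOT the values of Table 1): sigma_LO = 10 fb, 10^6 events.
w = event_weight("hh", 10.0, 1_000_000)
```

Since the K-factors are inclusive, they change only the overall normalisation of each sample and leave the shapes of the kinematic distributions unchanged.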

At the generation level, the following loose selection cuts are applied to background events. Each final-state particle in the hard process must have $p_T \ge 20$ GeV, and be located in the central rapidity region with $| \eta | \le 3.0$. At the matrix-element level all final-state particles must also be separated by a minimum $\Delta R_{\mathrm {min}} =0.1$. We have checked that these generator-level cuts are loose enough to have no influence over the analysis cuts. From Table 1 we see that the $t\bar{t}$ and QCD 4b cross sections are of the same order of magnitude. However, the former can be efficiently reduced by using top quark reconstruction criteria. The bbjj cross section is more than two orders of magnitude larger than the 4b result, but it will be suppressed by the light and charm jet mis-identification rates that are required for it to contribute to the 4b final state.

As a cross-check of the SHERPA background cross sections reported in Table 1, we have produced leading-order multi-jet samples using MadGraph5_aMC@NLO, benchmarked with the results for the same processes reported in Ref. [63]. Using common settings, we find agreement, within scale uncertainties, between the MadGraph5_aMC@NLO and SHERPA calculations of the multi-jet backgrounds.

2.3 Modeling of detector resolution

While it is beyond the scope of this work to perform a full detector simulation, it is important to include an estimate of detector effects in the analysis, particularly for the finite energy and angular resolutions which directly degrade the reconstruction of important kinematic variables, such as the invariant mass of the Higgs candidates. Here we simulate the finite energy resolution of the ATLAS and CMS hadronic calorimeters by applying a Gaussian smearing of the transverse momentum $p_T$ with mean zero and standard deviation $\sigma _E$ for all final-state particles before jet clustering, that is,

$$\begin{aligned} p_T^{(i)} \, \rightarrow \, p_T^{(i)\prime }= \left( 1+ r_i\cdot \sigma _E \right) \, p_T^{(i)}, \quad i=1,\ldots ,N_\mathrm{part}, \end{aligned}$$

(2)

with $r_i$ a univariate Gaussian random number, different for each of the $N_\mathrm{part}$ particles in the event. We take as a baseline value for the transverse-momentum smearing a factor of $\sigma _E=5~\%$.

To account for the finite angular resolution of the calorimeter, the $\left( \eta ,\phi \right) $ plane is divided into cells of size $\Delta \eta \times \Delta \phi =0.1\times 0.1$, and each final-state particle that falls within a given cell is assigned the $\eta $ and $\phi $ values of the center of that cell. Finally, the energy of each final-state particle is recalculated from the smeared $p_T^\prime $, $\eta ^\prime $ and $\phi ^\prime $ values to ensure that the resulting four-momentum is that of a light-like particle, since we neglect all jet constituent masses in this analysis.
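The smearing procedure of this section can be sketched in a few lines. The code below is an illustrative reimplementation under stated assumptions, not the code used for the analysis: the function names are hypothetical, and the alignment of the grid (cell edges at integer multiples of 0.1) is an assumption, since the text does not specify it.

```python
import math
import random

def smear_particle(pt, eta, phi, r, sigma_e=0.05, cell=0.1):
    """Apply Eq. (2) and the angular granularity to one particle.

    `r` is a unit Gaussian random number. Returns a massless
    four-momentum (E, px, py, pz).
    """
    pt_s = (1.0 + r * sigma_e) * pt
    # Move the particle to the centre of its (eta, phi) cell.
    # Assumption: cell edges sit at integer multiples of `cell`.
    eta_s = (math.floor(eta / cell) + 0.5) * cell
    phi_s = (math.floor(phi / cell) + 0.5) * cell
    # Recompute the energy for a light-like particle: E = pT cosh(eta).
    energy = pt_s * math.cosh(eta_s)
    return (energy,
            pt_s * math.cos(phi_s),
            pt_s * math.sin(phi_s),
            pt_s * math.sinh(eta_s))

def smear_event(particles, sigma_e=0.05, cell=0.1, seed=None):
    """Smear a list of (pT, eta, phi) particles, one random number each."""
    rng = random.Random(seed)
    return [smear_particle(pt, eta, phi, rng.gauss(0.0, 1.0), sigma_e, cell)
            for pt, eta, phi in particles]

# Example: a +1 sigma fluctuation for a 100 GeV particle.
e, px, py, pz = smear_particle(100.0, 0.23, 1.07, r=1.0)
```

In this example the transverse momentum becomes 105 GeV and the direction is moved to $(\eta ,\phi )=(0.25,1.05)$, the center of the cell containing the original direction.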

Our modeling of detector simulation has been tuned to lead to a mass resolution of the reconstructed Higgs candidates consistent with the hadronic mass resolutions of the ATLAS and CMS detectors [79–81], as discussed in Sect. 3.5.

3 Analysis strategy

In this section we describe our analysis strategy. First of all we discuss the settings for jet clustering and the strategy for jet b-tagging. Following this we discuss the categorisation of events into different topologies, and how the different topologies may be prioritised. We motivate our choice of analysis cuts by comparing signal and background distributions for representative kinematic variables. Finally, we describe the simulation of PU and validate the PU-subtraction strategy.

3.1 Jet reconstruction

After the parton shower, final-state particles are clustered using the jet reconstruction algorithms of FastJet [82, 83], v3.1.0. Here we use the following jet definitions:

Small-R jets. These are jets reconstructed with the anti-$k_T$ clustering algorithm [84] with $R=0.4$ radius. These small-R jets are required to have transverse momentum $p_T \ge 40$ GeV and pseudo-rapidity $|\eta |<2.5$, within the central acceptance of ATLAS and CMS, and therefore within the region where b-tagging is possible.
Large-R jets. These jets are also constructed with the anti-$k_T$ clustering algorithm, now using a $R=1.0$ radius. Large-R jets are required to have $p_T \ge 200$ GeV and lie in a pseudo-rapidity region of $|\eta |<2.0$. The more restrictive range in pseudo-rapidity as compared to the small-R jets is motivated by mimicking the experimental requirements in ATLAS and CMS related to the track-jet based calibration [85, 86]. In addition to the basic $p_T$ and $\eta $ acceptance requirements, large-R jets should also satisfy the BDRS mass-drop tagger (MDT) [87] conditions, where the FastJet default parameters of $\mu _\mathrm{mdt} = 0.67$ and $y_\mathrm{mdt}=0.09$ are used. Before applying the BDRS tagger, the large-R jet constituents are reclustered with the Cambridge/Aachen (C/A) algorithm [88, 89] with $R=1.0$. In the case of the analysis including PU, a trimming algorithm [106] is applied to all large-R jets to mitigate the effects of PU, especially on the jet mass. For further details, see Sect. 3.5.
Small-R subjets. All final-state particles are clustered using the anti-$k_T$ algorithm, but this time with a smaller radius parameter, namely $R=0.3$. The resulting anti-$k_T$ $R=0.3$ (AKT03) jets are then ghost-associated to each large-R jet in order to define its subjets [7]. These AKT03 subjets are required to satisfy $p_T > 50$ GeV and $|\eta |<2.5$, and they will be the main input for b-tagging in the boosted category.
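The acceptance requirements of the three jet definitions above can be collected in a small lookup table. This is only a sketch with hypothetical names; jet clustering, the mass-drop tagger and trimming are not reproduced here. The threshold comparisons use $\ge $ throughout, which for the subjets is an assumption, since the text quotes both $p_T > 50$ GeV and $p_T \ge 50$ GeV.

```python
# Acceptance cuts quoted in the text (pT in GeV).
JET_ACCEPTANCE = {
    "small_R": {"pt_min": 40.0, "abs_eta_max": 2.5},   # anti-kT, R = 0.4
    "large_R": {"pt_min": 200.0, "abs_eta_max": 2.0},  # anti-kT, R = 1.0
    "subjet": {"pt_min": 50.0, "abs_eta_max": 2.5},    # anti-kT, R = 0.3
}

def passes_acceptance(kind, pt, eta):
    """Check the pT and pseudo-rapidity acceptance for one jet."""
    cuts = JET_ACCEPTANCE[kind]
    return pt >= cuts["pt_min"] and abs(eta) < cuts["abs_eta_max"]

# Example: a 45 GeV jet at eta = -2.2 is accepted as a small-R jet,
# while a 250 GeV jet at eta = 2.2 fails the large-R requirement.
small_ok = passes_acceptance("small_R", 45.0, -2.2)
large_ok = passes_acceptance("large_R", 250.0, 2.2)
```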

For the boosted and intermediate categories, which involve the use of large-R jets, we use jet substructure variables [90, 91] to improve the significance of the discrimination between signal and background events in the MVA. In particular we consider the following substructure variables:

The $k_T$-splitting scale [87, 92]. This variable is obtained by reclustering the constituents of a jet with the $k_T$ algorithm [93], which typically clusters the hardest constituents last, and then taking the $k_T$ distance measure between the two subjets at the final stage of the recombination procedure,
$$\begin{aligned} \sqrt{d_{12}} \equiv \mathrm{min}\left( p_{T,1},p_{T,2}\right) \cdot \Delta R_{12}, \end{aligned}$$
(3)
with $p_{T,1}$ and $p_{T,2}$ the transverse momenta of the two subjets merged in the final step of the clustering, and $\Delta R_{12}$ the corresponding angular separation.
The ratio of 2-to-1 subjettiness $\tau _{21}$ [94, 95]. The N-subjettiness variables $\tau _N$ are defined by clustering the constituents of a jet with the exclusive $k_T$ algorithm [96] and requiring that N subjets are found,
$$\begin{aligned} \tau _N\equiv & {} \frac{1}{d_0} \sum _k p_{T,k}\cdot \mathrm{min}\left( \delta R_{1k}, \ldots , \delta R_{Nk}\right) ,\nonumber \\ d_0\equiv & {} \sum _k p_{T,k}\cdot R, \end{aligned}$$
(4)
where $p_{T,k}$ is the $p_T$ of the constituent particle k and $\delta R_{ik}$ the distance from subjet i to constituent k. In this work we use as input to the MVA the ratio of 2-subjettiness to 1-subjettiness, namely
$$\begin{aligned} \tau _{21} \equiv \frac{\tau _2}{\tau _1}, \end{aligned}$$
(5)
which provides good discrimination between QCD jets and jets arising from the decay of a heavy resonance.
The ratios of energy correlation functions (ECFs) $C^{(\beta )}_2$ [97] and $D_2^{(\beta )}$ [98]. The ratio of energy correlation functions $C_2^{(\beta )}$ is defined as
$$\begin{aligned} C_2^{(\beta )} \equiv \frac{ \mathrm{ECF}(3,\beta ) \mathrm{ECF}(1,\beta )}{\left[ \mathrm{ECF}(2,\beta )\right] ^2}, \end{aligned}$$
(6)
while $D_2^{(\beta )}$ is instead defined as a double ratio of ECFs, that is,
$$\begin{aligned}&e_3^{(\beta )}\equiv \frac{ \mathrm{ECF}(3,\beta )}{\left[ \mathrm{ECF}(1,\beta )\right] ^3}, \quad e_2^{(\beta )}\equiv \frac{ \mathrm{ECF}(2,\beta )}{\left[ \mathrm{ECF}(1,\beta )\right] ^2},\nonumber \\&\quad D_2^{(\beta )} \equiv \frac{ e_3^{(\beta )}}{\left( e_2^{(\beta )} \right) ^3}. \end{aligned}$$
(7)
The energy correlation functions $\mathrm{ECF}(N,\beta )$ are defined in [97] with the motivation that $(N+1)$-point correlators are sensitive to N-prong substructure. The free parameter $\beta $ is set to a value of $\beta =2$, as recommended by Refs. [97, 98].
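The substructure variables of Eqs. (3)–(7) can be evaluated directly from a list of jet constituents, as in the sketch below. Several ingredients are assumptions made for illustration: the function names are hypothetical, the subjet axes entering $\tau _N$ are taken as input rather than obtained from an exclusive $k_T$ clustering, and the explicit form of $\mathrm{ECF}(N,\beta )$ (a sum over N-tuples of constituents of the product of their $p_T$ times the product of pairwise distances raised to the power $\beta $) is the standard definition of Ref. [97] as recalled here, not one given in the text.

```python
import math
from itertools import combinations

def delta_r(a, b):
    """Distance in the (eta, phi) plane; a and b are (pT, eta, phi)."""
    dphi = math.remainder(a[2] - b[2], 2.0 * math.pi)
    return math.hypot(a[1] - b[1], dphi)

def kt_splitting_scale(sub1, sub2):
    """Eq. (3): min(pT1, pT2) * DeltaR_12 for the last two subjets."""
    return min(sub1[0], sub2[0]) * delta_r(sub1, sub2)

def n_subjettiness(constituents, axes, radius=1.0):
    """Eq. (4) for given subjet axes (assumed to be found beforehand)."""
    d0 = sum(c[0] for c in constituents) * radius
    num = sum(c[0] * min(delta_r(c, ax) for ax in axes) for c in constituents)
    return num / d0

def ecf(constituents, n, beta=2.0):
    """N-point energy correlation function (assumed standard definition)."""
    total = 0.0
    for group in combinations(constituents, n):
        pt_prod = math.prod(c[0] for c in group)
        angle_prod = math.prod(delta_r(a, b) for a, b in combinations(group, 2))
        total += pt_prod * angle_prod ** beta
    return total

def c2(constituents, beta=2.0):
    """Eq. (6)."""
    e1, e2, e3 = (ecf(constituents, n, beta) for n in (1, 2, 3))
    return e3 * e1 / e2 ** 2

def d2(constituents, beta=2.0):
    """Eq. (7)."""
    e1, e2, e3 = (ecf(constituents, n, beta) for n in (1, 2, 3))
    return (e3 / e1 ** 3) / (e2 / e1 ** 2) ** 3

# Toy jet with three constituents (pT, eta, phi), for illustration only.
toy = [(1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (1.0, 0.0, 1.0)]
c2_toy, d2_toy = c2(toy), d2(toy)
```

For a jet with exactly two constituents the three-point correlator vanishes, so both $C_2^{(\beta )}$ and $D_2^{(\beta )}$ are zero, consistent with small values of these ratios indicating two-prong structure.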

3.2 Tagging of b-jets

In this analysis we adopt a b-tagging strategy along the lines of current ATLAS performance [91, 99], though differences with respect to the corresponding CMS settings [100, 101] do not modify qualitatively our results. For each jet definition described above, a different b-tagging strategy is adopted:

Small-R jets. If a small-R jet has at least one b-quark among its constituents, it will be tagged as a b-jet with probability $f_b$. In order to be considered in the b-tagging algorithm, b-quarks inside the small-R jet should satisfy $p_T \ge 15$ GeV [99]. The probability of tagging a jet is not modified if more than one b-quark is found among the jet constituents. If no b-quarks are found among the constituents of this jet, it can still be tagged as a b-jet with a mistag rate of $f_l$, unless a charm quark is present instead, in which case the mistag rate is $f_c$. Only jets that contain at least one (light or charm) constituent with $p_T \ge 15$ GeV can induce a fake b-tag. We attempt to b-tag only the four (two) hardest small-R jets in the resolved (intermediate) category. Attempting to b-tag all of the small-R jets that satisfy the acceptance cuts worsens the overall performance as the rate of fake b-tags increases substantially.
Large-R jets. Large-R jets are b-tagged by ghost-associating anti-$k_T$ $R=0.3$ (AKT03) subjets to the original large-R jets [7, 91, 102, 103]. A large-R jet is considered b-tagged if the leading and the subleading AKT03 subjets, ordered in subjet $p_T$, are both individually b-tagged, with the same criteria as the small-R jets. Therefore, a large-R jet where the two leading subjets each contain at least one b-quark will be tagged with probability $f_b^2$. As in the case of small-R jets, we only attempt to b-tag the two leading subjets, since otherwise one finds a degradation of the signal significance. The treatment of the b-jet mis-identification from light and charm jets is the same as for the small-R jets.

For the b-tagging probability $f_b$, along with the b-mistag probability of light ($f_l$) and charm ($f_c$) jets, we use the values $f_b=0.8$, $f_l=0.01$ and $f_c=0.1$.
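To illustrate how these efficiencies combine, the sketch below computes the probability that every jet in a set is b-tagged, assuming that the tagging decisions are independent of each other. The function names and the flavour labels are hypothetical.

```python
# Tagging and mistag probabilities quoted in the text.
F_B, F_C, F_L = 0.8, 0.1, 0.01

def tag_probability(flavour):
    """Probability for a jet of the given true flavour to be b-tagged."""
    return {"b": F_B, "c": F_C, "light": F_L}[flavour]

def all_tagged_probability(flavours):
    """Probability that every jet in the list is b-tagged (independent tags)."""
    prob = 1.0
    for flavour in flavours:
        prob *= tag_probability(flavour)
    return prob

p_4b = all_tagged_probability(["b", "b", "b", "b"])           # 0.8^4 = 0.4096
p_2b2j = all_tagged_probability(["b", "b", "light", "light"])  # 6.4e-5
p_2b2c = all_tagged_probability(["b", "b", "c", "c"])          # 6.4e-3
p_large_r = all_tagged_probability(["b", "b"])                 # f_b^2 = 0.64
```

Under these assumptions a genuine four-b event passes four b-tags about 41% of the time, while a 2b2j event with two light jets does so with probability $6.4\times 10^{-5}$, which indicates the size of the suppression of the reducible backgrounds from mis-identification alone.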

3.3 Event categorisation

The present analysis follows a strategy similar to the scale-invariant resonance tagging of Ref. [51]. Rather than restricting ourselves to a specific event topology, we aim to consistently combine the information from the three possible topologies: boosted, intermediate and resolved, with the optimal cuts for each category being determined separately. This approach is robust under variations of the underlying production model of Higgs pairs, for instance in the case of BSM dynamics, which can substantially increase the degree of boost in the final state.

The three categories are defined as follows:

Boosted category. An event which contains at least two large-R jets, with the two leading jets being b-tagged. Each of these two b-tagged, large-R jets is therefore a candidate to contain the decay products of a Higgs boson.
Intermediate category. An event with exactly one b-tagged, large-R jet, which is assigned to be the leading Higgs candidate. In addition, we require at least two b-tagged, small-R jets, which must be separated with respect to the large-R jet by an angular distance of $\Delta R\ge 1.2$. The subleading Higgs boson candidate is reconstructed by selecting the two b-tagged small-R jets that minimise the difference between the invariant mass of the large-R jet and that of the dijet obtained from the sum of the two small-R jets.
Resolved category. An event with at least four b-tagged small-R jets. The two Higgs candidates are reconstructed out of the leading four small-R jets in the event by considering all possible combinations of forming two pairs of jets and then choosing the configuration that minimises the relative difference of dijet masses.

Once a Higgs boson candidate has been identified, its invariant mass is required to lie within a fixed window of width $80~\mathrm{GeV}$ around the nominal Higgs boson mass of $m_h= 125$ GeV. Specifically we require the condition

$$\begin{aligned} |m_{h,j} - 125~\mathrm{GeV}| < 40~\mathrm{GeV},\quad j=1,2, \end{aligned}$$

(8)

where $m_{h,j}$ is the invariant mass of each of the two reconstructed Higgs candidates. This cut is substantially looser than the corresponding cut used in the typical ATLAS and CMS $h\rightarrow b\bar{b}$ analyses [79, 80]. The motivation for such a loose cut is that further improvements of the signal significance will be obtained using an MVA. Only events where the two Higgs candidates satisfy Eq. (8) are classified as signal events.
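The pairing of the four leading small-R jets in the resolved category, followed by the mass window of Eq. (8), can be sketched as follows. The names are hypothetical, and the relative mass difference is taken to be $|m_1-m_2|/(m_1+m_2)$, which is an assumption since the text does not give its explicit form.

```python
import math

def invariant_mass(*vectors):
    """Invariant mass of the sum of four-momenta (E, px, py, pz)."""
    e, px, py, pz = (sum(v[i] for v in vectors) for i in range(4))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

# The three ways of splitting four jets into two pairs.
PAIRINGS = (((0, 1), (2, 3)), ((0, 2), (1, 3)), ((0, 3), (1, 2)))

def pair_higgs_candidates(jets):
    """Choose the pairing that minimises the relative dijet mass difference.

    Assumption: relative difference = |m1 - m2| / (m1 + m2).
    Returns (pair1, pair2, m1, m2).
    """
    best = None
    for (i, j), (k, l) in PAIRINGS:
        m1 = invariant_mass(jets[i], jets[j])
        m2 = invariant_mass(jets[k], jets[l])
        total = m1 + m2
        rel = abs(m1 - m2) / total if total > 0.0 else float("inf")
        if best is None or rel < best[0]:
            best = (rel, (i, j), (k, l), m1, m2)
    return best[1:]

def in_mass_window(mass, m_h=125.0, half_width=40.0):
    """Eq. (8): |m - 125 GeV| < 40 GeV."""
    return abs(mass - m_h) < half_width

# Hypothetical massless jets: (0, 2) and (1, 3) each reconstruct 125 GeV.
jets = [(62.5, 62.5, 0.0, 0.0), (100.0, 0.0, 100.0, 0.0),
        (62.5, -62.5, 0.0, 0.0), (78.125, 0.0, 0.0, 78.125)]
pair1, pair2, m1, m2 = pair_higgs_candidates(jets)
is_signal_like = in_mass_window(m1) and in_mass_window(m2)
```

In this toy configuration the two wrong pairings give dijet masses of about 112 and 99 GeV, so the mass-difference criterion selects the correct assignment.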

These three categories are not exclusive: a given event can be assigned to more than one category, for example, satisfying the requirements of both the intermediate and the resolved categories at the same time. The exception is the boosted and intermediate categories, which have conflicting jet selection requirements.

To combine the three categories consistently, each event must ultimately be assigned to a single category. This is achieved as follows. First of all we perform an inclusive analysis, and optimise the signal significance $S/\sqrt{B}$ in each of the three categories separately, including the MVA. We find that the category with highest significance is the boosted one, followed by the intermediate and the resolved topologies, the latter two with similar significance. Therefore, when ascertaining in which category an event is to be exclusively placed: if the event satisfies the boosted requirements, it is assigned to this category, else we check if it suits the intermediate requirements. If the event also fails the intermediate category requirements, we then check if it passes the resolved selection criteria. The resulting exclusive event samples are then separately processed through the MVA, allowing for a consistent combination of the significance of the three event categories.
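The priority ordering described above amounts to a simple cascade. In the sketch below the three boolean inputs stand for the outcome of the inclusive selection in each category; the function name is hypothetical.

```python
def assign_category(is_boosted, is_intermediate, is_resolved):
    """Exclusive category in the order boosted > intermediate > resolved.

    Returns None if the event passes none of the three selections.
    """
    if is_boosted:
        return "boosted"
    if is_intermediate:
        return "intermediate"
    if is_resolved:
        return "resolved"
    return None

# An event passing both the intermediate and resolved selections
# is placed in the intermediate category.
category = assign_category(False, True, True)
```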

3.4 Motivation for basic kinematic cuts

We now motivate the kinematic cuts applied to the different categories, comparing representative kinematic distributions between signal and background events. First of all, we present results without PU, and then discuss the impact of PU on the description of the kinematic distributions. In the following, all distributions are normalised to their total integral.

In Fig. 2 we show the $p_T$ distributions of the leading and subleading large-R jets in the boosted category. We observe that the background distribution falls off more rapidly as a function of $p_T$ than the di-Higgs signal. On the other hand, the cut in $p_T$ cannot be too strong, to avoid a substantial degradation of signal selection efficiency, especially taking into account the subleading large-R jet. This comparison justifies the cut of $p_T \ge 200$ GeV for the large-R jets that we impose in the boosted category.

Another selection requirement for the boosted category is that the two leading AKT03 subjets of the large-R jet should satisfy $p_T \ge 50$ GeV. To motivate this cut, in Fig. 3 we show the distribution in $p_T$ of the leading and subleading AKT03 subjets in the subleading large-R jet in events corresponding to the boosted category. It is clear from the comparison that the subjet $p_T$ spectrum is relatively harder in the signal with respect to the background. On the other hand, considering the subleading AKT03 subjet, this cut in $p_T$ cannot be too harsh if a high signal selection efficiency is to be maintained. Therefore, as for the previous distribution, the chosen cut value is a compromise between suppressing backgrounds and keeping a large fraction of signal events.

Turning to the resolved category, an important aspect to account for in the selection cuts is the fact that the $p_T$ distribution of the four leading small-R jets of the event can be relatively soft, especially for the subleading jets. As noted in [29], this is due to the fact that the boost from the Higgs decay is moderate; therefore the $p_T$ selection cuts for the small-R jets cannot be too large. In Fig. 4 we show the distribution in $p_T$ of the four leading small-R jets in signal and background events: we observe that both distributions peak at $p_T \le 50$ GeV, with the signal distribution falling off less steeply at large $p_T$. The feasibility of triggering on four small-R jets with a relatively soft $p_T$ distribution is one of the experimental challenges for exploiting the resolved category in this final state, and hence the requirement that $p_T \ge 40$ GeV for the small-R jets. In Fig. 4 we also show the rapidity distribution of the small-R jets in the resolved category. As expected, the production is mostly central, and more so in the case of signal events, since backgrounds are dominated by QCD t-channel exchange; therefore the selection criteria on the jet rapidity are very efficient.

One of the most discriminating selection cuts is the requirement that the invariant mass of the Higgs candidate (di)jets must lie within a window around the nominal Higgs value, Eq. (8). In Fig. 5 we show the invariant mass of the leading reconstructed Higgs candidates, before the Higgs mass window selection is applied, for the resolved and boosted categories. While the signal distribution naturally peaks at the nominal Higgs mass, the background distributions show no particular structure. The width of the Higgs mass peak is driven both by QCD effects, such as initial-state radiation (ISR) and out-of-cone radiation, and by the four-momentum smearing applied to final-state particles as part of our minimal detector simulation.

The invariant mass of the di-Higgs system is another important kinematic distribution for this process. The di-Higgs invariant mass is a direct measure of the boost of the system, which in BSM scenarios can be substantially enhanced, for instance due to specific $d=6$ EFT operators [14]. One important advantage of the $b\bar{b}b\bar{b}$ final state for di-Higgs production is that it significantly increases the reach in $m_{hh}$ as compared to other channels with smaller branching ratios, such as $2b2\gamma $ or $2b2\tau $. In Fig. 6 we show the invariant mass distribution of the reconstructed Higgs pairs, comparing the resolved and the boosted categories.

In the resolved case, we see that the distribution in $m_{hh}$ is rather harder for the signal as compared to the background, and therefore one expects that cutting in $m_{hh}$ would help signal discrimination. For the boosted category the overall trend of the $m_{hh}$ distribution is different because of the selection criteria, and the distribution now peaks at higher values of the invariant mass. In this case, signal and background distributions are not significantly differentiated. Note that at parton level the $m_{hh}$ distribution for signal events has a kinematic cut-off at $m_{hh}^\mathrm{min}=250$ GeV, which is smeared due to parton shower and detector resolution effects.

In Fig. 7 we show the transverse momentum of the di-Higgs system, $p_T^{hh}$, for the resolved and boosted categories. Once more we see that the background has a steeper fall-off in $p_T^{hh}$ than the signal, in both categories; this variable should therefore provide additional discrimination power, motivating its inclusion as one of the inputs for the MVA. In our LO simulation the $p_T^{hh}$ distribution is generated by the parton shower; an improved theoretical description would require merging higher-multiplicity matrix elements [35] or matching to the NLO calculation [17].
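As a minimal sketch of how the two di-Higgs observables discussed above are built, the following example computes $m_{hh}$ and $p_T^{hh}$ from the four-momenta of two Higgs candidates. The function name, the `(E, px, py, pz)` tuple convention and the value $m_h=125$ GeV are assumptions made for illustration only; they are not taken from the analysis code.

```python
import math

def dihiggs_kinematics(h1, h2):
    """Return (m_hh, pT_hh) in GeV for two four-momenta given as (E, px, py, pz)."""
    E, px, py, pz = (a + b for a, b in zip(h1, h2))
    m2 = E * E - (px * px + py * py + pz * pz)
    m_hh = math.sqrt(max(m2, 0.0))  # guard against tiny negative rounding errors
    pt_hh = math.hypot(px, py)
    return m_hh, pt_hh

M_H = 125.0  # GeV, assumed nominal Higgs mass

# Two Higgs bosons at rest: the parton-level threshold m_hh = 2 m_h.
at_rest = (M_H, 0.0, 0.0, 0.0)
threshold = dihiggs_kinematics(at_rest, at_rest)

# Back-to-back Higgs bosons with |p| = 300 GeV along x: larger m_hh, zero pT_hh.
E_boosted = math.hypot(M_H, 300.0)
back_to_back = dihiggs_kinematics((E_boosted, 300.0, 0.0, 0.0),
                                  (E_boosted, -300.0, 0.0, 0.0))
```

In this toy configuration a non-zero $p_T^{hh}$ only arises if the two candidates are not exactly back to back, which is why in an LO simulation this distribution is populated by additional radiation.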

We shall now investigate the discrimination power provided by jet substructure quantities. In Fig. 8 we show the distributions of representative substructure variables for the boosted category: the $k_T$ splitting scale $\sqrt{d_{12}}$, Eq. (3), the ECF ratio $C_2^{(\beta )}$, Eq. (6), and the 2-to-1 subjettiness ratio $\tau _{21}$, Eq. (5), all for the leading Higgs candidates, and also $\tau _{21}$ for the subleading Higgs candidates.

From Fig. 8 we observe how for these substructure variables the shapes of the signal and background distributions reflect the inherent differences in the internal structure of QCD jets and jets originating from Higgs decays. Signal and background distributions peak in rather different regions. For example, the $k_T$ splitting scale $\sqrt{d_{12}}$ peaks around 80 GeV (40 GeV) for signal (background) events, while the distribution of the ECF ratio $C_2^{(\beta )}$ is concentrated at small values for signal and is much broader for background events. From Fig. 8 we also see that the distributions of the subjettiness ratio $\tau _{21}$ are reasonably similar for both the leading and the subleading jets.

3.5 Impact of pileup

Now we turn to discuss how the description of kinematic distributions for signal and background processes is modified in the presence of pileup (PU). To study the impact of PU, Minimum Bias events have been generated with Pythia8, and then superimposed on the signal and background samples described in Sect. 2. We have explored two scenarios, one with a number of PU vertices per bunch crossing of $n_\mathrm{PU}=80$, and another with $n_\mathrm{PU}=150$. In the following we adopt $n_\mathrm{PU}=80$ as our baseline, and denote this scenario by PU80. We have verified that the combined signal significance is similar if $n_\mathrm{PU}=150$ is adopted instead.

In order to subtract PU in hadronic collisions, a number of techniques are available [87, 102, 104–114].^{Footnote 1} In this work, PU is subtracted with the SoftKiller (SK) method [111], as implemented in FastJet, whose performance has been shown to improve on the commonly used area-based subtraction [104]. The idea underlying SoftKiller consists of eliminating particles below a given cut-off in their transverse momentum, $p_T^\mathrm{(cut)}$, whose value is dynamically determined so that the event-wide transverse-momentum flow density $\rho $ vanishes, where $\rho $ is defined as

$$\begin{aligned} \rho \equiv \mathrm{median}_i \Bigg \{ \frac{p_{Ti}}{A_i}\Bigg \}, \end{aligned}$$

(9)

and where the median is computed over all the regions i with area $A_i$ and transverse momentum $p_{Ti}$ in which the $\left( \eta ,\phi \right) $ plane is partitioned.

From its definition in terms of the median, it follows that the value of $p_T^\mathrm{(cut)}$ will be dynamically raised until half of the regions have $p_{Ti}=0$. The size and number of these regions are free parameters of the algorithm; here we will use square regions with length $a=0.4$. We restrict ourselves to the central rapidity region, $|\eta | \le 2.5$, for the estimation of the $p_T$ flow density $\rho $. The SoftKiller subtraction is then applied to particles at the end of the parton shower, before jet clustering.
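The condition that half of the regions end up empty can be illustrated with a short sketch. It assumes the standard SoftKiller prescription, in which the cut is the median, taken over all patches, of the $p_T$ of the hardest particle in each patch. The function name, the `(pt, eta, phi)` tuple layout and the toy event are hypothetical and do not reproduce the FastJet implementation used in the analysis.

```python
import math
import random
import statistics

def softkiller_cut(particles, a=0.4, eta_max=2.5):
    """Return the pT cut for particles given as (pt, eta, phi) tuples."""
    n_eta = max(1, int(2.0 * eta_max / a + 0.5))
    n_phi = max(1, int(2.0 * math.pi / a + 0.5))
    # Hardest particle pT per patch; empty patches count as pT = 0.
    hardest = [[0.0] * n_phi for _ in range(n_eta)]
    for pt, eta, phi in particles:
        if abs(eta) > eta_max:
            continue  # only the central region enters the estimate
        i = min(int((eta + eta_max) / (2.0 * eta_max) * n_eta), n_eta - 1)
        j = min(int((phi % (2.0 * math.pi)) / (2.0 * math.pi) * n_phi), n_phi - 1)
        hardest[i][j] = max(hardest[i][j], pt)
    return statistics.median(pt for row in hardest for pt in row)

# Toy event: uniform soft "pileup" with pT = 1 GeV plus four hard particles.
rng = random.Random(1)
soft = [(1.0, rng.uniform(-2.5, 2.5), rng.uniform(0.0, 2.0 * math.pi))
        for _ in range(5000)]
hard = [(50.0, 0.3, 1.0), (45.0, -0.4, 4.1), (60.0, 1.1, 2.5), (42.0, -1.2, 5.5)]
event = soft + hard

pt_cut = softkiller_cut(event)
# The strict inequality is a choice made for this toy, where all soft
# particles share exactly the same pT.
survivors = [p for p in event if p[0] > pt_cut]
```

In this toy event the cut settles at the common soft scale, so only the four hard particles survive and every patch that contained nothing else becomes empty.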

In addition, jet trimming [106], as implemented in FastJet, is applied to large-R jets. The trimming parameters are chosen such that the constituents of a given jet are reclustered into $k_T$ subjets with $R_{\text {sub}} = 0.2$. Subjets with transverse momentum less than 5 % of the total transverse momentum of the large-R jet are then removed. The use of trimming in addition to PU removal with SoftKiller is necessary to correct the jet mass in the boosted category, which is particularly susceptible to soft, wide-angle contaminations. No trimming is applied to the small-R jets or to the case without PU.
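The subjet removal step of trimming can be sketched as follows. This is only the final filtering stage: the $k_T$ reclustering into $R_{\text {sub}}=0.2$ subjets is assumed to have been done already, the jet $p_T$ is approximated by the scalar sum of subjet transverse momenta, and the function name and toy values are hypothetical.

```python
def trim_subjets(subjet_pts, f_cut=0.05):
    """Keep subjets carrying at least f_cut of the untrimmed large-R jet pT."""
    jet_pt = sum(subjet_pts)  # scalar-sum proxy for the jet pT (assumption)
    return [pt for pt in subjet_pts if pt >= f_cut * jet_pt]

# Hypothetical subjet transverse momenta in GeV for one large-R jet.
subjets = [180.0, 95.0, 12.0, 8.0, 5.0]
trimmed = trim_subjets(subjets)
```

Here the untrimmed jet has $p_T=300$ GeV, so the threshold is 15 GeV and the three softest subjets, which in a PU environment would typically be contamination, are discarded.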

In Fig. 9 we show the invariant mass distributions of the Higgs candidates for signal events in the resolved and boosted categories. In the resolved category, we compare the results without PU with those with PU80, with and without SK subtraction. If PU is not subtracted, there is a large shift in the Higgs mass peak, by more than 30 GeV. Once SK subtraction is performed, we recover a distribution much closer to the no PU case, with only a small shift of a few GeV and a broadening of the mass distribution. In the boosted case, the comparison is performed between no PU, PU with only SK subtraction, and PU with both SK and trimming. We find that the mass distribution for jets to which no trimming is applied peaks at around 160 GeV, even after PU subtraction with SoftKiller. When trimming is applied in addition to SoftKiller, the distribution peaks close to the nominal Higgs mass, as in the case of the resolved category.

In Fig. 10 we compare the transverse momentum of the leading Higgs candidate, $p_T^{h}$, and the invariant mass of the di-Higgs system, $m_{hh}$, in both the boosted and the resolved categories, between the no PU and the PU+SK+Trim cases. In the case of the $p_T^{h}$ distribution, the difference between the selection criteria for the resolved and boosted categories is reflected in the rightward shift of the latter. After subtraction, the effects of PU are small in the two categories. A similar behaviour is observed in the di-Higgs invariant mass distribution.

We can also assess the impact of PU on the substructure variables that will be used as input to the MVA in the boosted and intermediate categories. In Fig. 11 we show the 2-to-1 subjettiness ratio $\tau _{21}$, Eq. (5), and the ratio of energy correlation functions $C_2^{(\beta )}$, Eq. (6), for the leading Higgs candidate. We observe that the shapes of both substructure variables are reasonably robust in an environment including significant PU. Therefore we can consider the PU subtraction strategy as validated for the purposes of this study, although further optimisation should still be possible, both in terms of the SoftKiller and of the trimming input settings.

It is also interesting to quantify how the relative differences between signal and background distributions are modified by the inclusion of PU. Considering the boosted category initially, in Fig. 12 we compare various kinematic distributions for signal and background events, with and without PU for the leading Higgs candidate: the transverse-momentum distribution $p_T$, the $p_T$ of the leading AKT03 subjet, the 2-to-1 subjettiness ratio $\tau _{21}$, and the $k_T$ splitting scale $\sqrt{d_{12}}$. We verify that the relevant qualitative differences between signal and background distributions are maintained in the presence of PU. This is especially noticeable for the substructure variables, which exhibit a similar discriminatory power both with and without PU.

We can also perform a similar comparison for the resolved category. In Fig. 13 we compare the kinematic distributions for signal and background events, with and without PU, for the invariant mass and the transverse momentum of the leading Higgs candidate. Again, the PU-subtracted background distributions appear reasonably close to their counterparts without PU, and thus the distinctive features between signal and background are maintained after PU subtraction.

It is illustrative to determine the mass resolution obtained for the reconstructed Higgs candidates in the various cases considered in the present study. In Table 2 we indicate the shift of the fitted invariant mass peak as compared to the nominal Higgs mass, $\langle m_h^\mathrm{reco}\rangle -m_h$, and the corresponding width of the distribution, $\sigma _{m_h}$, obtained from fitting a Gaussian to the mass distributions of leading and subleading Higgs candidates in the resolved and boosted categories. We show results for three cases: without PU, with PU80 but without subtraction (only for the resolved category), and the same with SK+Trim subtraction.

In both categories, we find a mass resolution of around 9 GeV in the case without PU. In the case of PU with SK+Trim subtraction, in the resolved category the mass resolution worsens only slightly to around 11 GeV, while in the boosted category we find the same resolution as in the no PU case. We also note that after SK+Trim subtraction, the peak of the invariant mass distributions of Higgs candidates coincides with the nominal values of $m_h$ within a few GeV for the two categories.
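The peak shift and width can be illustrated with a minimal sketch. Note that it does not reproduce the binned Gaussian fit described above: it uses the unbinned maximum-likelihood estimate of a Gaussian, which reduces to the sample mean and standard deviation. The toy sample (a 2 GeV shift and a 9 GeV width) and all names are hypothetical and are not the values of Table 2.

```python
import random
import statistics

def gaussian_peak(masses, m_h=125.0):
    """Return (shift, width): <m_reco> - m_h and the Gaussian sigma, in GeV."""
    mu = statistics.fmean(masses)
    sigma = statistics.pstdev(masses, mu)  # ML estimate, divides by N
    return mu - m_h, sigma

# Toy Higgs candidate masses: Gaussian peak at 127 GeV with 9 GeV resolution.
rng = random.Random(0)
toy_masses = [rng.gauss(127.0, 9.0) for _ in range(20000)]
shift, width = gaussian_peak(toy_masses)
```

For real distributions with non-Gaussian tails, the estimate would have to be restricted to a window around the peak, or replaced by a proper fit.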

4 Pre-MVA loose cut-based analysis

In this section we present the results of the pre-MVA loose cut-based analysis described in the previous section, and provide cut flows for the different analysis steps. We study how the signal significance is affected if only the 4b component of the QCD multi-jet background is taken into account. This section presents the results in an environment without pileup; the following one contains those obtained including significant PU.

Table 2 Resolution of the invariant mass distribution of reconstructed Higgs candidates in the resolved and boosted categories. We show three cases: no PU, with PU80 without subtraction (only for resolved), and the same with SK+Trim subtraction. We indicate the shift of the fitted invariant mass peak $\left\langle m_h^\mathrm{reco}\right\rangle $ for the Higgs candidates as compared to the nominal Higgs mass $m_h$, as well as the fitted Gaussian width $\sigma _{m_h}$


Table 3 Definition of the cuts imposed successively for the three selections


4.1 Cut flow and signal significance

Here we compare the cross sections for signal and background events at various stages of the analysis. We consider all relevant backgrounds (see Sect. 2), and discuss how results are modified in the case where only the 4b background is considered. In Table 3 the different steps of the cut flow in the present analysis are summarised, separated into the boosted, intermediate, and resolved topologies. The different analysis steps proceed as follows:

C1a: check that we have at least two large-R jets (in the boosted case), one large-R jet and at least two small-R jets (in the intermediate case), or at least four small-R jets (in the resolved case). In addition, require that these jets satisfy the corresponding $p_T$ thresholds, namely $p_T \ge 200$ GeV for large-R jets and $p_T \ge 40$ GeV for small-R jets, as well as the associated rapidity acceptance constraints.
C1b: the two leading large-R jets must be mass-drop tagged in the boosted category. In the intermediate category, the large-R jet must also be mass-drop tagged.
C1c: after the two Higgs candidates have been reconstructed, their invariant masses are required to lie within a window around $m_h$, in particular between 85 and 165 GeV, Eq. (8).
C2: the b-tagging conditions are imposed (see Sect. 3.2), and the event is categorised exclusively into one of the three topologies, according to the hierarchy determined in Sect. 3.3.

Signal and background events satisfying all the analysis cuts up to the C2 level are then used as input for the MVA training, to be described next in Sect. 5.
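The kinematic part of this cut flow can be sketched for the resolved category as follows. Only steps C1a and C1c are shown, since C1b and C2 require a mass-drop tagger and a b-tagging model that are not reproduced here. The function names, the dictionary layout of the jets and the acceptance value $|\eta |\le 2.5$ are assumptions made for illustration.

```python
def passes_c1a_resolved(jets, pt_min=40.0, eta_max=2.5):
    """C1a, resolved case: at least four small-R jets inside the acceptance."""
    good = [j for j in jets if j["pt"] >= pt_min and abs(j["eta"]) <= eta_max]
    return len(good) >= 4

def passes_c1c(m_h1, m_h2, m_lo=85.0, m_hi=165.0):
    """C1c: both Higgs candidate masses inside the 85-165 GeV window."""
    return m_lo <= m_h1 <= m_hi and m_lo <= m_h2 <= m_hi

# Hypothetical small-R jets (pT in GeV).
jets = [{"pt": 95.0, "eta": 0.4}, {"pt": 70.0, "eta": -1.1},
        {"pt": 52.0, "eta": 1.9}, {"pt": 41.0, "eta": -0.2},
        {"pt": 35.0, "eta": 0.8}]

event_ok = passes_c1a_resolved(jets) and passes_c1c(118.0, 131.0)
event_soft = passes_c1a_resolved(jets[1:])   # only three jets above 40 GeV
event_offpeak = passes_c1c(118.0, 170.0)     # subleading candidate outside window
```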

In Table 4 we collect the values for the signal and background cross sections at the different analysis steps. Results are divided into the resolved, intermediate and boosted categories, and they are inclusive up to the C2 level, where exclusivity is imposed. In Table 4 we also provide the signal over background ratio, $S/B$, and the signal significance, $S/\sqrt{B}$, corresponding to an integrated luminosity of ${\mathcal {L}}=3$ ab$^{-1}$. These are computed either taking into account all the background components or the 4b QCD background only. We find that after b-tagging, the 2b2j component is of the same order of magnitude as the 4b component in all categories. This implies that the signal significance at the end of the cut-based analysis is degraded due to the contribution of light and charm jets being mis-identified as b-jets.

Table 4 The cross sections for the signal and the background processes at different steps of the analysis (see Table 3), for the resolved (upper), intermediate (middle) and boosted (lower table) categories, for the analysis without PU. For each step, the signal over background ratio $S/B$, and the signal significance $S/\sqrt{B}$ for ${\mathcal {L}}=3$ ab$^{-1}$ are also provided, considering either the total background, or only the 4b component


In the boosted category, at the end of the loose cut-based analysis, we find that around 500 signal events are expected at the HL-LHC, with a large number, ${\simeq } 10^6$, of background events. This leads to a pre-MVA signal significance of $S/\sqrt{B}=0.5$ and a signal over background ratio of $S/B=0.06~\%$. From Table 4 it is also possible to compute the corresponding pre-MVA expectations for the LHC Run II with ${\mathcal {L}}=300$ fb$^{-1}$: one expects in the boosted category around 50 signal events, with signal significance dropping down to $S/\sqrt{B}\simeq 0.16$. Such signal significances could have been enhanced by applying tighter selection requirements, but our analysis cuts have been left deliberately loose so that such optimisation may be performed by the MVA.
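The rescaling from the HL-LHC to ${\mathcal {L}}=300$ fb$^{-1}$ follows from the fact that both signal and background yields grow linearly with the luminosity, so that $S/\sqrt{B}$ scales as $\sqrt{{\mathcal {L}}}$ while $S/B$ is unchanged. A minimal sketch using the rounded boosted-category yields quoted above (the function name is hypothetical):

```python
import math

def rescale(n_sig, n_bkg, lumi_ratio):
    """Scale event yields by a luminosity ratio; return (S, B, S/sqrt(B), S/B)."""
    s, b = n_sig * lumi_ratio, n_bkg * lumi_ratio
    return s, b, s / math.sqrt(b), s / b

# Rounded HL-LHC yields (3 ab^-1) rescaled to 300 fb^-1.
s_run2, b_run2, significance_run2, purity_run2 = rescale(500.0, 1.0e6, 300.0 / 3000.0)
```

With these rounded inputs one recovers about 50 signal events and $S/\sqrt{B}\simeq 0.16$; the small difference between the resulting $S/B$ and the quoted 0.06 % reflects the rounding of the yields.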

The resolved category benefits from higher signal yields, but this enhancement is compensated for by the corresponding increase in the QCD multi-jet background. In both resolved and intermediate categories the signal significance is $S/\sqrt{B}\simeq 0.4$, similar to that of the boosted category. A further drawback of the resolved case is that $S/B$ is substantially reduced as compared to the boosted and intermediate cases.

Combining the results from the boosted, intermediate and resolved categories, we obtain an overall pre-MVA significance for the observation of the Higgs pair production in the $b\bar{b}b\bar{b}$ final state at the HL-LHC of $(S/\sqrt{B})_\mathrm{tot} \simeq 0.8$.

4.2 The role of light and charm jet mis-identification

One of the main differences in the present study as compared to previous work is the inclusion of both irreducible and reducible background components, which allows us to quantify the impact of light and charm jet mis-identification. Two recent works that have also addressed the feasibility of SM Higgs pair production in the $b\bar{b}b\bar{b}$ final state are from the UCL group [28] and from the Durham group [29]. The UCL study is based on requiring at least four b-tagged $R=0.4$ anti-$k_T$ jets in central acceptance with $p_T \ge 40$ GeV, which are then used to construct dijets (Higgs candidates) with $p_T \ge 150$ GeV, $85 \le m_\mathrm{dijet} \le 140$ GeV and $\Delta R \le 1.5$ between the two components of the dijet. In addition to the basic selection cuts, the constraints from additional kinematic variables are included by means of a Boosted Decision Tree (BDT) discriminant. The backgrounds included are the 4b and 2b2c QCD multijets, as well as $t\bar{t}$, Zh, $t\bar{t}h$ and $hb\bar{b}$. For the HL-LHC, a signal significance of $S/\sqrt{B}\simeq 2.1$ is obtained.

The Durham group study [29] requires events to have two $R=1.2$ C/A jets with $p_T\ge 200$ GeV, and in addition two b-tagged subjets inside each large-R jet with $p_T \ge 40$ GeV each. To improve the separation between signal and background, both the BDRS method and the Shower Deconstruction (SD) [116, 117] technique are used. The backgrounds considered are QCD 4b as well as $Zb\bar{b}$, hZ and hW. At the HL-LHC, their best result is obtained by requiring two SD-tagged large-R jets, which leads to $S/\sqrt{B}\simeq 2.1$. Using the BDRS tagger results in slightly poorer performance.

From our results in Table 4, we observe that the signal significance for the boosted, intermediate, and resolved categories is increased to 1.1, 0.6 and 0.6, respectively, when only the QCD 4b background is included. Combining the signal significance in the three categories, we obtain $(S/\sqrt{B_\mathrm{4b}})_\mathrm{tot}\simeq 1.4$, roughly twice as large as the result found when all background components are included. Note the importance of the combination of the three exclusive event topologies, as opposed to the exploitation of a single specific category. Taking into account the loose selection cuts, we see that our pre-MVA results including only the 4b background are consistent with those reported in previous studies.
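The combined significances quoted in this section can be cross-checked under the assumption that the three exclusive categories are statistically independent and are combined in quadrature. The combination formula is an assumption of this sketch, although it reproduces the quoted totals from the rounded per-category values.

```python
import math

def combine(significances):
    """Quadrature sum of significances from statistically independent categories."""
    return math.sqrt(sum(z * z for z in significances))

# Boosted, intermediate and resolved pre-MVA values quoted in the text.
all_backgrounds = combine([0.5, 0.4, 0.4])
only_4b = combine([1.1, 0.6, 0.6])
```

The two results round to 0.8 and 1.4 respectively, and the quadrature sum makes explicit why no single category accounts for the total.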

From Table 4 we can compare the interplay between the reducible and irreducible components of the QCD backgrounds. In all cases, the 4b and 2b2j components have comparable magnitudes within the uncertainties from missing higher-order corrections. On the other hand, the 4j component is always substantially smaller. So while the 4j component can be safely neglected, the inclusion of the 2b2j component is essential to assess the feasibility of measuring Higgs pairs in this final state robustly, especially in the boosted category. This has the important consequence that a promising avenue to improve the prospects of this measurement would be to reduce, as much as possible, the light and charm jet mis-identification rate.

Table 5 The relative fractions $n^{(\text {b-jet})}_j$ of events for the resolved selection for which out of the four leading small-R jets of the event, j jets contain at least one b-quark with $p_T^b\ge 15$ GeV. This information is provided for the di-Higgs signal events and for the three QCD background samples. The last column indicates the overall selection efficiency as defined in Eq. (10)


In Fig. 14 we show a comparison of the shapes of the 4b and 2b2j components of the QCD background for the transverse momentum $p_T^h$ of the leading Higgs candidate and for the invariant mass $m_{hh}$ of the reconstructed di-Higgs system in the resolved and boosted categories. The two components possess a rather similar shape for the two distributions, albeit with some differences. In the boosted category, the 4b component exhibits a less steep fall-off of the $p_T^h$ distribution at large $p_T$, while in the resolved case the 2b2j component has a slightly harder distribution of the invariant mass $m_{hh}$. We also observe that the 2b2j distributions are affected by somewhat larger Monte Carlo fluctuations as compared to 4b, despite the large size of the initial sample.

In the resolved category, the cross section before b-tagging is two orders of magnitude larger in the 2b2j sample as compared to the 4b sample. After b-tagging, a naive assessment would suggest a suppression of the 2b2j cross section by a factor $(f_l/f_b)^2 \simeq 1.5\times 10^{-4}$, as compared to the 4b component, since a total of four b-tags are required to classify the event as a Higgs candidate. In this case the ratio of 2b2j over 4b would be around $3~\%$, and therefore negligible. While we have checked that this expectation is borne out at the parton level, we find that when parton shower effects are accounted for the situation is different, due both to radiation of $b\bar{b}$ pairs and to selection effects. As a result, the number of b quarks in the final state is increased substantially in the 2b2j component as compared to the parton level, while at the same time the number of events in the 4b sample with 4 b-jets passing selection cuts is reduced.

We can make these statements more quantitative in the following way. To first approximation, neglecting the contribution from charm mis-identification, the overall efficiency of the b-tagging requirements in the resolved category will be given by the following expression:

$$\begin{aligned} \mathrm{EFF}_{\text {b-tag}}\simeq \sum _{j=0}^{4}n^{(\text {b-jet})}_j\cdot f_b^{j}\cdot f_l^{4-j}, \end{aligned}$$

(10)

with $n^{(\text {b-jet})}_j$ being the fraction of events satisfying all the selection requirements, where j jets out of the leading four jets of the event contain b quarks (with $p_T^b\ge 15$ GeV). Similar expressions can be derived for the boosted and intermediate categories.

The naive expectation is that all events in the 4b sample have $n^{(\text {b-jet})}_4\simeq 1$ and $n^{(\text {b-jet})}_j\simeq 0$ for $j\ne 4$, while the events in the 2b2j sample should have $n^{(\text {b-jet})}_2\simeq 1$ and zero otherwise. This leads to the ratio of overall b-tagging selection efficiencies

$$\begin{aligned} \frac{ \mathrm{EFF}_{\text {b-tag}} \left[ 2b2j \right] }{\mathrm{EFF}_{\text {b-tag}} \left[ 4b\right] } \simeq \left( \frac{f_l}{f_b}\right) ^2 \simeq 1.5\times 10^{-4}. \end{aligned}$$

(11)

However, after the parton shower, the above estimate is no longer accurate. First of all, we will have a non-negligible fraction $n^{(\text {b-jet})}_j$ with $j=3,4$ also in the 2b2j sample, due to b-quark pair radiation during the shower. Second, not all events in the 4b sample will lead to four small-R b-jets, due to a combination of selection cuts and parton shower effects.

In Table 5 we collect the values of $n^{(\text {b-jet})}_j$ for the signal and the three QCD background samples. We find that rather than the estimate Eq. (11), the correct ratio of b-tagging selection efficiencies is instead

$$\begin{aligned} \frac{\mathrm{EFF}_{\text {b-tag}} \left[ 2b2j\right] }{\mathrm{EFF}_{\text {b-tag}} \left[ 4b\right] }= \frac{0.04~\%}{8.4~\%} \simeq 5\times 10^{-3}. \end{aligned}$$

(12)

This suppression factor is of the same order as the ratio of 4b to 2b2j cross sections in the resolved category before b-tagging. This explains why the 2b2j contribution cannot be neglected as compared to the irreducible 4b component of the QCD background. A similar calculation from the numbers in Table 5 shows that, on the other hand, the 4j component of the background can be neglected.
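The comparison between the naive estimate of Eq. (11) and the result of Eq. (12) can be sketched numerically. The tagging probabilities $f_b=0.8$ and $f_l=0.01$ are assumed values, chosen because they reproduce the quoted ratio $(f_l/f_b)^2\simeq 1.5\times 10^{-4}$. The fractions of Table 5 are not reproduced here; only the idealised parton-level fractions and the two efficiencies quoted in Eq. (12) are used.

```python
def btag_efficiency(n_bjet, f_b=0.8, f_l=0.01):
    """Eq. (10): n_bjet[j] is the fraction of events with j true b-jets, j = 0..4."""
    return sum(n_j * f_b**j * f_l**(4 - j) for j, n_j in enumerate(n_bjet))

# Naive parton-level expectation: all 4b events have four b-jets,
# all 2b2j events have exactly two.
naive_4b = btag_efficiency([0, 0, 0, 0, 1])
naive_2b2j = btag_efficiency([0, 0, 1, 0, 0])
naive_ratio = naive_2b2j / naive_4b

# Ratio of the efficiencies quoted in Eq. (12), after the parton shower.
showered_ratio = 0.04 / 8.4
enhancement = showered_ratio / naive_ratio
```

Under these assumptions the 2b2j efficiency relative to 4b is enhanced by a factor of about 30 with respect to the naive estimate, which is what brings the 2b2j yield up to the level of the 4b one.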

5 Multivariate analysis

At the end of the loose cut-based analysis, by combining the three event topologies, we obtain a signal significance of $S/\sqrt{B}\simeq 0.8~(1.4)$ with all backgrounds (only QCD 4b) considered. This section describes how this signal significance can be enhanced when the cut-based analysis is complemented by multivariate techniques. These are by now a mature tool in high-energy physics data analysis, opening new avenues to improve the performance of many measurements and searches at high-energy colliders. In particular, the classification of events into signal and background processes by means of MVAs is commonly used in LHC applications [28, 46, 80, 118–120].

In this section, we first present the specific MVA that we use, based on feed-forward multi-layer neural networks. We then introduce the input variables that are used in the MVA, including the jet substructure variables, and present the signal significance obtained by applying the MVA. Finally, we assess the robustness of the MVA strategy in the case of significant contamination from pileup.

5.1 Deep artificial neural networks

The specific type of MVA that we use to disentangle signal and background events is a multi-layer feed-forward artificial neural network (ANN), known as a perceptron.^{Footnote 2} This family of ANNs is also known as deep neural networks, due to their multi-layered architecture. The MVA inputs are a set of kinematic variables describing the signal and background events which satisfy the requirements of the cut-based analysis. The output of the trained ANNs also allows for the identification, in a fully automated way, of the most relevant variables in the discrimination between signal and background.

In this work, the ANN that we use has the following architecture,

$$\begin{aligned} N_{\mathrm {var}}\times 5\times 3\times 1, \end{aligned}$$

(13)

where $N_{\mathrm {var}}$ represents the number of input variables for the MVA, which is different in the resolved, intermediate, and boosted categories. All neural-network layers use a sigmoid activation function, allowing for a probabilistic interpretation of the ANN output. In Fig. 15 we show an illustrative example of an ANN used in this work, corresponding to the case of the boosted category (thus $N_{\mathrm {var}}=21$, as we explain below).
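A forward pass through a network with the architecture of Eq. (13) can be sketched as follows. The inclusion of a threshold (bias) term per neuron, the random Gaussian initialisation and all function names are assumptions of this illustration and are not specified in the text.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def init_network(n_var, hidden=(5, 3), seed=0):
    """Build an n_var x 5 x 3 x 1 network as a list of (weights, biases) per layer."""
    rng = random.Random(seed)
    sizes = [n_var, *hidden, 1]
    # weights[k][i] connects neuron i of the previous layer to neuron k of this one.
    return [([[rng.gauss(0.0, 1.0) for _ in range(n_in)] for _ in range(n_out)],
             [rng.gauss(0.0, 1.0) for _ in range(n_out)])
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def forward(network, inputs):
    """Propagate the inputs through all layers; every layer uses a sigmoid."""
    a = list(inputs)
    for weights, biases in network:
        a = [sigmoid(sum(w * x for w, x in zip(row, a)) + b)
             for row, b in zip(weights, biases)]
    return a[0]  # single output neuron, a value in (0, 1)

net = init_network(21)            # boosted category: N_var = 21
y = forward(net, [0.5] * 21)      # inputs assumed rescaled to O(1) values
n_params = sum(len(w) * len(w[0]) + len(b) for w, b in net)
```

With a bias per neuron, the boosted-category network has 132 free parameters, which illustrates how compact this architecture is.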

The training of the ANN for the signal/background classification task proceeds as follows. Given a set of $N_{\mathrm {var}}$ kinematic variables $\{k\}_i$ associated with the event i and a set of neural-network weight parameters $\{\omega \}$, we interpret the neural-network output $y_i$ (the activation state of the neuron in the last layer) as the probability that the event i originates from the signal process,

$$\begin{aligned} y_i = P(y^\prime _i=1|\{k\}_i, \{\omega \} )\, , \end{aligned}$$

(14)

where $y_i^\prime $ represents the true classification of the event i, i.e., $y^\prime _i = 1$ for signal and $y^\prime _i = 0$ for background events. With this interpretation, our general classification probability including background events is given by

$$\begin{aligned} P(y_i^\prime |\{k\}_i, \{\omega \}) = y_i^{y^\prime _i}(1-y_i)^{1-y^\prime _i}, \end{aligned}$$

(15)

and consequently we can define an error function $E(\{\omega \})$ to be minimised during the ANN training. In this case, the error function is the cross-entropy function, defined as

$$\begin{aligned} E(\{\omega \})\equiv & {} -\log \left( \prod _i^{N_{\text {ev}}} P(y_i^\prime |\{k\}_i, \{\omega \})\right) \nonumber \\= & {} -\sum _i^{N_{\text {ev}}} \left[ y^\prime _i\log {y_i} + (1-y^\prime _i)\log {(1-y_i)}\right] , \end{aligned}$$

(16)

where $N_{\text {ev}}$ is the number of Monte Carlo events that are used for the ANN training. The ANN is trained both on the signal and background MC events, so it is important to ensure that the input MC sample is large enough to avoid contamination from MC statistical fluctuations.
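The cross-entropy error, defined as minus the logarithm of the product of the per-event probabilities of Eq. (15), can be evaluated with a few lines of code. The function name and the clipping parameter `eps`, introduced only to avoid taking the logarithm of zero, are assumptions of this sketch.

```python
import math

def cross_entropy(y_true, y_pred, eps=1e-12):
    """Minus the log-likelihood of the true labels given the ANN outputs."""
    total = 0.0
    for t, y in zip(y_true, y_pred):
        y = min(max(y, eps), 1.0 - eps)  # numerical safeguard only
        total -= t * math.log(y) + (1 - t) * math.log(1.0 - y)
    return total

labels = [1, 1, 0, 0]                      # 1 = signal, 0 = background
undecided = cross_entropy(labels, [0.5, 0.5, 0.5, 0.5])
good = cross_entropy(labels, [0.9, 0.8, 0.2, 0.1])
bad = cross_entropy(labels, [0.1, 0.2, 0.8, 0.9])
```

An uninformative network with $y_i=0.5$ for every event gives $E=N_{\text {ev}}\log 2$, while the error is non-negative and decreases as the outputs approach the true classification.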

The training of the neural networks therefore consists of the minimisation of the cross-entropy error, Eq. (16), which in this work is achieved using a Genetic Algorithm (GA). GAs [125–128] are non-deterministic minimisation strategies suitable for the solution of complex optimisation problems, for instance when a very large number of quasi-equivalent minima are present. GAs are inspired by natural selection processes and emulate biological evolution. In our case, the GA training is performed for a very large number of generations, $N_\mathrm{gen}=5\times 10^{4}$, to avoid the risk of under-training. We have verified that if a much larger number of generations is used, the results are unchanged.

In addition, in order to avoid the possibility of over-fitting, we have used a cross-validation stopping criterion, in particular the same one as that used in the NNPDF3.0 analysis [66]. This cross-validation proceeds by dividing the input MC dataset into two disjoint sets, using one for training the ANN and the other for validation: the optimal stopping point is then given by the minimum of the error function Eq. (16) evaluated on the validation sub-sample. This indicates the point where the ANN begins to train upon statistical fluctuations in the input MC samples, rather than learning the underlying (smooth) physical distributions.
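The interplay between GA minimisation and cross-validation stopping can be illustrated with a deliberately simplified sketch. It is not the algorithm of the analysis or of NNPDF3.0: the network is replaced by a single sigmoid neuron with one input, the GA uses Gaussian mutations with elitist selection only, and the toy samples, the function names and all parameter values are hypothetical.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-max(min(x, 50.0), -50.0)))

def error(w, data):
    """Cross-entropy of y = sigmoid(w0 + w1 * k) on a list of (k, label) pairs."""
    e = 0.0
    for k, label in data:
        y = min(max(sigmoid(w[0] + w[1] * k), 1e-12), 1.0 - 1e-12)
        e -= label * math.log(y) + (1 - label) * math.log(1.0 - y)
    return e

def train_ga(train, valid, n_gen=100, n_mut=10, step=0.3, seed=0):
    rng = random.Random(seed)
    best = [0.0, 0.0]
    # Look-back stopping point: (validation error, weights, generation).
    stop = (error(best, valid), list(best), 0)
    for gen in range(1, n_gen + 1):
        mutants = [[w + rng.gauss(0.0, step) for w in best] for _ in range(n_mut)]
        # Elitist selection on the training error: the parent survives if no mutant is better.
        best = min(mutants + [best], key=lambda w: error(w, train))
        e_val = error(best, valid)
        if e_val < stop[0]:
            stop = (e_val, list(best), gen)
    return best, stop

def toy_sample(rng, n):
    """Signal (label 1) and background (label 0) drawn from shifted Gaussians."""
    return ([(rng.gauss(1.0, 1.0), 1) for _ in range(n)]
            + [(rng.gauss(-1.0, 1.0), 0) for _ in range(n)])

rng = random.Random(42)
train, valid = toy_sample(rng, 100), toy_sample(rng, 100)
weights, (e_val, w_stop, gen_stop) = train_ga(train, valid)
e_val_start = error([0.0, 0.0], valid)
```

The weights retained are those at the generation with the smallest validation error, `w_stop`, rather than those at the end of the training, `weights`; the two coincide only if no over-fitting has set in.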

5.2 Input kinematic variables

In this work we use different sets of input variables for the three categories. In the case of large-R jets, we exploit the available information on jet substructure. For the three categories, boosted, intermediate and resolved, the following common variables are used as input to the MVA:

The transverse momenta of the leading and subleading Higgs candidates, $p_{T,h_1}$ and $p_{T,h_2}$.
The transverse momentum of the reconstructed Higgs pair, $p_{T,hh}$.
The invariant masses of the leading and subleading Higgs candidates, $m_{h,1}$ and $m_{h,2}$.
The invariant mass of the reconstructed Higgs pair, $m_{hh}$.
The separation in the $\phi $–$\eta $ plane between the two Higgs candidates, $\Delta R_{hh}$.
The separation in $\eta $ between the two Higgs candidates, $\Delta \eta _{hh}$.
The separation in $\phi $ between the two Higgs candidates, $\Delta \phi _{hh}$.

In addition, in the boosted category we use the transverse momenta of the leading, $p_{T,h_{1,1}}$ and $p_{T,h_{1,2}}$, and subleading, $p_{T,h_{2,1}}$ and $p_{T,h_{2,2}}$, Higgs candidate AKT03 subjets. In the resolved category instead, the corresponding variables are the transverse momenta $p_{T,i}$ of the four leading b-tagged small-R jets in the event. In the intermediate category, we use the transverse momenta of the subjets from the large-R jet, $p_{T,h_{1,1}}$ and $p_{T,h_{1,2}}$, and the transverse momenta $p_{T,i}$ of the two leading b-tagged small-R jets. Therefore, each of the three categories has 13 variables of this kind: the nine common variables listed above plus four jet or subjet transverse momenta.

In the boosted and intermediate categories, we also include the jet substructure variables introduced in Sect. 3 for the large-R jets: the $k_T$ splitting scales $\sqrt{d_{12}}$, the ratio of 2-to-1 subjettiness $\tau _{21}$, and the ratios of energy correlation functions $C^{(\beta )}_2$ and $D_2^{(\beta )}$. This leads to a total of $N_{\mathrm {var}}=13,17$ and 21 variables for the resolved, intermediate and boosted categories, respectively.

Given that the MVA is able to identify the most discriminatory variables in an automated way, and to suppress those which have little effect, it is advantageous to include a wide array of input variables. This is one of the main advantages of ANNs in this context: their inherent redundancy means that adding further information, even if it carries very little weight, should not degrade the classification power of the MVA.

5.3 MVA results

We now present the results of the MVA, first without PU, and then later including the effects of PU. First of all, in Fig. 16 we show the distribution of the ANN output at the end of the GA minimisation, separately for the boosted, intermediate and resolved categories. All distributions are normalised so that their integral adds up to one. The separation between signal and background is achieved by introducing a cut, $y_\mathrm{cut}$, on the ANN output, so that MC events with $y_i\ge y_\mathrm{cut}$ are classified as signal events, and those with $y_i < y_\mathrm{cut}$ as background events. Therefore, the more differentiated the distribution of the ANN output is for signal and background events, the more efficient the MVA discrimination will be.

From Fig. 16 we see that in the boosted category the MVA can produce a clear discrimination between signal and background, with the two distributions forming peaks at their respective optimal limits. This indicates that introducing a suitable cut $y_\mathrm{cut}$ in the ANN output will substantially reduce the background, while keeping a reasonable signal efficiency. The performance of the MVA discrimination is similar, although slightly worse, in the intermediate and resolved categories.

The results for the signal selection efficiency and the background rejection rate as a function of the cut in the ANN output $y_\mathrm{cut}$ define the so-called Receiver-Operating Characteristic (ROC) curve, shown in Fig. 17. It is clear that we can achieve high signal efficiency by using a small value of $y_\mathrm{cut}$, but such a choice would be affected by poor background rejection. Conversely, using a higher value of the cut will increase background rejection at the cost of dropping signal efficiency. As could already be inferred from the distribution of neural-networks output in Fig. 16, we find that our MVA is reasonably efficient in discriminating signal over background. The performance is best in the case of the boosted category, and then slightly worse in the resolved and intermediate categories, consistent with the distributions of the ANN outputs in Fig. 16.
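The construction of the ROC curve from the ANN outputs can be sketched as a simple scan over $y_\mathrm{cut}$. The function name and the toy output values below are hypothetical and serve only to illustrate the trade-off between signal efficiency and background rejection.

```python
def roc_points(sig_out, bkg_out, cuts):
    """Return (y_cut, signal efficiency, background rejection) for each cut."""
    points = []
    for y_cut in cuts:
        eff = sum(y >= y_cut for y in sig_out) / len(sig_out)   # kept signal
        rej = sum(y < y_cut for y in bkg_out) / len(bkg_out)    # rejected background
        points.append((y_cut, eff, rej))
    return points

# Hypothetical ANN outputs for a handful of signal and background events.
sig_out = [0.95, 0.9, 0.8, 0.6, 0.3]
bkg_out = [0.05, 0.1, 0.2, 0.4, 0.7]
roc = roc_points(sig_out, bkg_out, [0.0, 0.5, 1.0])
```

The two end points of the scan correspond to accepting every event (full efficiency, no rejection) and to rejecting every event, with the useful working points lying in between.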

It is useful to estimate, for each value of the cut in the ANN output $y_\mathrm{cut}$, how many signal and background events are expected at the HL-LHC with ${\mathcal {L}}=3$ ab$^{-1}$. This comparison is shown in Fig. 17. We observe that in the boosted category, for a value $y_\mathrm{cut}\simeq 0.9$ we end up with around 300 signal events and $10^4$ background events. Similar results are obtained in the intermediate and resolved categories: in the former we find 130 ($3\times 10^3$) signal (background) events for $y_\mathrm{cut}\simeq 0.85$ (0.60), and in the latter 630 ($10^5$) signal (background) events for $y_\mathrm{cut}\simeq 0.6$. Therefore, the MVA achieves a substantial background suppression with only a moderate reduction of signal efficiency.

A useful property of MVAs such as the one used in our analysis is that they can provide direct physical insight about which of the input variables contribute to the separation between signal and background. In the case of ANNs, this can be quantified by computing the sum of the absolute values of all the weights connected to a given input neuron i, that is,

$$\begin{aligned} \omega ^\mathrm{(tot)}_i \equiv \sum _{k=1}^{n^{(2)}} \Big |\omega ^{(2)}_{ki}\Big |, \quad i=1,\ldots ,N_\mathrm{var}, \end{aligned}$$

(17)

with $\omega ^{(2)}_{ki}$ the value of the weight connecting the kth neuron of the second layer with the ith neuron of the first (input) layer, and $n^{(2)}=5$ the number of neurons in the second layer. Those input variables with a larger value of $\omega ^\mathrm{(tot)}_i$ will be those that play a more significant role in enhancing the signal discrimination using the MVA. We note, however, that the estimate provided by Eq. (17) is necessarily qualitative.
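Equation (17) amounts to a column-wise sum of absolute values of the first weight matrix. The following sketch shows this with NumPy; the array layout (rows labelled by the second-layer neuron k, columns by the input i) and the numerical weights are hypothetical choices made for illustration only.

```python
import numpy as np

def total_associated_weight(w2):
    """Eq. (17): for each input i, sum |w2[k, i]| over second-layer neurons k.

    w2 is assumed to have shape (n2, n_var), where row k holds the weights
    connecting the kth second-layer neuron to all input neurons.
    """
    w2 = np.asarray(w2, dtype=float)
    return np.abs(w2).sum(axis=0)

# Hypothetical weights for n2 = 5 second-layer neurons and 3 input variables
w2 = np.array([
    [ 0.5, -1.0,  0.1],
    [-0.5,  2.0,  0.0],
    [ 1.0, -0.5, -0.1],
    [ 0.0,  1.5,  0.2],
    [-1.0,  0.0, -0.1],
])
omega_tot = total_associated_weight(w2)
# Input indices ordered from largest to smallest total associated weight
ranking = np.argsort(omega_tot)[::-1]
```

In this toy case the second input carries the largest total weight and would be read as the most discriminating variable, subject to the qualitative caveat stated above.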

In Fig. 18 we show the distribution of the total associated weight, Eq. (17), for each of the $N_\mathrm{var}$ input variables of the three categories, using the notation for the kinematic variables as in Sect. 5.2. In the resolved category, the variables that carry the highest discrimination power are the transverse momenta of the two reconstructed Higgs candidates and their invariant masses. In the case of the boosted category, the invariant mass distribution of the Higgs candidates is also the most discriminatory variable, followed by the subjet $p_T$ distributions and substructure variables such as $C_2^{(\beta )}$ and $D_2^{(\beta )}$.

The results for the signal significance $S/\sqrt{B}$ and the signal over background ratio $S/B$ as a function of $y_\mathrm{cut}$ for the three categories are given in Fig. 19. The values for $y_\mathrm{cut}=0$ correspond to those at the end of the loose cut-based analysis. We observe that in all three categories there is a marked improvement in signal significance as compared to the pre-MVA results. We also observe a substantial enhancement in $S/B$, arising from the background suppression achieved by the MVA, reaching values of 1, 6 and 3.5 % in the resolved, intermediate and boosted categories. This improvement in $S/B$ is crucial to ensure the feasibility of this measurement, since the systematic uncertainties in the background determination then need to be at most of a similar size.

The optimal value of the cut in the ANN output, $y_\mathrm{cut}$, can be determined from the maximisation of $S/\sqrt{B}$, ensuring that the number of signal events $N_\mathrm{ev}$ expected at the HL-LHC does not become too low. In addition, we require that the number of MC events used to define the signal category (events with $y_i \ge y_\mathrm{cut}$) is sufficiently large in order to avoid the biases and statistical fluctuations associated with a small training sample. In Table 6 we quote, for the optimal value of $y_\mathrm{cut}$ in each category, the number of signal and background events $N_\mathrm{ev}$ expected at the HL-LHC, as well as $S/\sqrt{B}$ and $S/B$. For completeness, we also include the corresponding pre-MVA results.
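The selection of the optimal cut can be illustrated with a simple scan. The sketch below is only one possible reading of the procedure: the function name `scan_cut`, the thresholds `min_signal` and `min_mc`, and the toy events and weights are hypothetical and are not taken from the analysis.

```python
import numpy as np

def scan_cut(y_sig, w_sig, y_bkg, w_bkg, cuts, min_signal=0.0, min_mc=1):
    """Return (best_cut, best_significance) maximising S/sqrt(B).

    y_* are ANN outputs and w_* the per-event weights that normalise the MC
    sample to the expected yield. A cut is skipped when the expected signal
    yield falls below min_signal, or when fewer than min_mc MC events of
    either type pass it (both thresholds are assumptions of this sketch).
    """
    y_sig, w_sig = np.asarray(y_sig, float), np.asarray(w_sig, float)
    y_bkg, w_bkg = np.asarray(y_bkg, float), np.asarray(w_bkg, float)
    best_cut, best_z = None, -np.inf
    for c in cuts:
        pass_s = y_sig >= c
        pass_b = y_bkg >= c
        s = w_sig[pass_s].sum()  # expected signal events after the cut
        b = w_bkg[pass_b].sum()  # expected background events after the cut
        if b <= 0 or s < min_signal:
            continue
        if min(pass_s.sum(), pass_b.sum()) < min_mc:
            continue
        z = s / np.sqrt(b)
        if z > best_z:
            best_cut, best_z = c, z
    return best_cut, best_z

# Toy events: 4 signal MC events of weight 1, 5 background events of weight 10
y_sig = [0.2, 0.6, 0.9, 0.95]
w_sig = [1.0, 1.0, 1.0, 1.0]
y_bkg = [0.1, 0.2, 0.3, 0.7, 0.92]
w_bkg = [10.0, 10.0, 10.0, 10.0, 10.0]
best_cut, best_z = scan_cut(y_sig, w_sig, y_bkg, w_bkg, cuts=[0.0, 0.5, 0.9])
```

In this toy scan the intermediate cut wins, since tightening further removes signal faster than it reduces $\sqrt{B}$.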

Table 6 Post-MVA results, for the optimal value of the ANN discriminant $y_\mathrm{cut}$ in the three categories, compared with the corresponding pre-MVA results ($y_\mathrm{cut}=0$). We quote the number of signal and background events expected for ${\mathcal {L}}=3$ ab$^{-1}$, the signal significance $S/\sqrt{B}$ and the signal over background ratio $S/B$. The pre-MVA results correspond to row C2 in Table 4


From Table 6 we see that following the application of the MVA, the signal significance in the boosted category increases from 0.5 to 2.7, with $S/B$ increasing from 0.06 to $3~\%$. For the intermediate and resolved categories, $S/\sqrt{B}$ increases from 0.4 to 2.3 and 1.9, respectively, with the signal over background ratio rising from 0.05 and $0.01~\%$ to 4 and 1 %. Combining the three categories, taking into account all background components, we obtain the overall signal significance:

$$\begin{aligned} \left( \frac{S}{\sqrt{B}}\right) _\mathrm{tot} \simeq 4.0~(1.3),\quad {\mathcal {L}}=3000~(300)\,\mathrm{fb}^{-1}. \end{aligned}$$

(18)

The signal significance for ${\mathcal {L}}=3$ ab$^{-1}$ is thus well above the threshold for the observation of Higgs pair production. However, given that the HL-LHC will be a high-PU environment, which will affect the description of the various kinematic distributions used as input to the MVA, it is essential to quantify the robustness of these results in a realistic environment including the effects of significant PU.
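The combined value in Eq. (18) can be cross-checked against the per-category significances quoted from Table 6. The sketch below assumes that the three categories are statistically independent, so that their significances add in quadrature, and that $S/\sqrt{B}$ scales as $\sqrt{{\mathcal {L}}}$ because both $S$ and $B$ grow linearly with luminosity. Neither assumption is spelled out in this section, but together they reproduce the quoted numbers after rounding.

```python
import math

def combine_in_quadrature(significances):
    """Combine independent categories: Z_tot = sqrt(sum of Z_i^2)."""
    return math.sqrt(sum(z * z for z in significances))

def rescale_luminosity(z, lumi_from, lumi_to):
    """Rescale S/sqrt(B), assuming S and B are both proportional to luminosity."""
    return z * math.sqrt(lumi_to / lumi_from)

# Post-MVA significances without PU: boosted, intermediate, resolved
z_no_pu = combine_in_quadrature([2.7, 2.3, 1.9])
# Same analysis rescaled from 3000 fb^-1 to 300 fb^-1
z_no_pu_300 = rescale_luminosity(z_no_pu, 3000.0, 300.0)
```

Rounded to one decimal place, these give 4.0 and 1.3, in agreement with Eq. (18).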

It should be emphasised that MVAs such as the ANNs used in this work can always be understood as a combined set of correlated cuts. Once the ANNs have been trained, it is possible to compare kinematical distributions before and after the ANN cut to verify its impact. This information would in principle allow one to perform a cut-based analysis, without the need for ANNs, and to find similar results.

To illustrate this point, in Fig. 20 we show the $p_T$ distribution of the leading AKT04 small-R jets and the invariant mass of reconstructed Higgs candidates in the resolved category, comparing the pre-MVA results ($y_\mathrm{cut}=0$) with the post-MVA results ($y_\mathrm{cut}=0.60$) for signal and background events. The distributions are not normalised, to better visualise the effect of the MVA cut. Unsurprisingly, the ANN cut selects events for which the kinematical distributions of signal and background are similar. In the case of the small-R jets $p_T$ distribution, the ANN cut favours the high-$p_T$ region, while for the invariant mass distribution only the region around the Higgs mass peak is selected for background events.
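A comparison of this kind only requires filling the same histogram twice, once for all events and once for those passing the ANN cut. The following sketch uses hypothetical names and toy values (five events with an invariant mass in GeV and an ANN output each); it is not the plotting code behind Fig. 20.

```python
import numpy as np

def histogram_before_after(values, ann_output, weights, y_cut, bins):
    """Unnormalised histograms of a kinematic variable before and after y >= y_cut."""
    values = np.asarray(values, dtype=float)
    ann_output = np.asarray(ann_output, dtype=float)
    weights = np.asarray(weights, dtype=float)
    before, edges = np.histogram(values, bins=bins, weights=weights)
    keep = ann_output >= y_cut  # events classified as signal-like
    after, _ = np.histogram(values[keep], bins=edges, weights=weights[keep])
    return before, after, edges

# Toy background events: Higgs-candidate mass (GeV) and ANN output
mass = [90.0, 120.0, 125.0, 130.0, 160.0]
ann = [0.10, 0.70, 0.90, 0.65, 0.20]
weights = [1.0, 1.0, 1.0, 1.0, 1.0]
before, after, edges = histogram_before_after(
    mass, ann, weights, y_cut=0.60, bins=[80.0, 110.0, 140.0, 170.0]
)
```

In the toy example only the central mass bin survives the cut, which mimics the behaviour described above for the background invariant mass distribution.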

A particularly challenging aspect of our analysis is the modeling of the 2b2j and 4j backgrounds, especially the latter, which requires extremely large MC samples. In the analysis reported here, out of the original 3M generated 4j events, only around 100 survive the analysis cuts, and these low statistics carry a potentially large uncertainty in the calculation of the post-MVA 4j cross section. On the other hand, since the 4j cross sections are always considerably smaller than the sum of the 4b and 2b2j components, these low statistics should not modify our conclusions above qualitatively. To verify this expectation explicitly, and to obtain a more robust estimate of the background cross section from mis-identified jets, we have increased the size of the 2b2j and 4j background samples by a factor of 10, up to a total of 30M each. Processing these events through our analysis, including retraining the MVA, we find $(S/\sqrt{B})_\mathrm{tot}=3.9$, consistent with Eq. (18), indicating that the low statistics of the 4j background are not a limiting factor.

5.4 Impact of PU in the MVA

In this section we study how the MVA results are modified when the analysis is performed including significant PU. The loose cut-based analysis and the subsequent MVA optimisation have been performed using the same settings as in the case without PU. In Table 7 we provide the pre-MVA cut flow in the case of PU80, the corresponding version without PU being Table 4. The interplay between the signal cross sections and the various background components is qualitatively unchanged as compared to the no PU case.

Table 7 Same as Table 4, now for the case of PU80+SK+Trim


Table 8 Same as Table 6, now for the case of PU80+SK+Trim


In Table 8 we compare the results for the PU80+SK+Trim case between the pre-MVA loose cut-based analysis and the post-MVA results for the optimal values of the ANN output cut $y_\mathrm{cut}$. As in Table 6, we also quote the number of signal and total background events expected for ${\mathcal {L}}=3$ ab$^{-1}$ and the values of $S/\sqrt{B}$ and $S/B$. We observe that the pre-MVA signal significance is close to the results of the simulations without PU for the three categories. We now find values for $S/\sqrt{B}$ of 0.4, 0.3 and 0.6, in the resolved, intermediate and boosted categories, respectively, to be compared with the corresponding values without PU, namely 0.4, 0.4 and 0.5. The number of selected signal events in each category at the end of the cut-based analysis is only mildly affected by PU. The slight pre-MVA improvement in $S/\sqrt{B}$ for the boosted case arises from a reduction in the number of background events that are classified in this category as compared to the case without PU.

Once the MVA is applied, the signal significance in the resolved, intermediate and boosted categories increases to 2.0, 1.9 and 1.5 respectively, to be compared with the corresponding values without PU, namely 1.9, 2.3 and 2.7. Therefore, the post-MVA effect of PU on $S/\sqrt{B}$ is a moderate degradation of the boosted and intermediate categories, especially for the former, while the resolved category is largely unchanged.^{Footnote 3} We also observe that, due to the MVA, the signal over background ratio is increased from 0.007, 0.03 and 0.1 % up to 1, 3 and 1 % in the resolved, intermediate and boosted categories, respectively. This indicates that, while this measurement is still highly challenging, requiring a careful extraction of the QCD background from the data, it should be within reach.

In Fig. 21 we show the number of signal and background events that are expected for ${\mathcal {L}}=3$ ab$^{-1}$ as a function of $y_\mathrm{cut}$, together with the corresponding ROC curve. The slight degradation of the boosted category in the case of PU can be seen by comparing with the corresponding results without PU in Fig. 17. In Fig. 22 we show the signal significance, $S/\sqrt{B}$, and the signal over background ratio, $S/B$, accounting now for the effects of PU. The corresponding results in the case without PU were shown in Fig. 19. As can be seen, the MVA-driven enhancement remains robust in the presence of PU, with $S/\sqrt{B}$ only moderately degraded. Therefore, the qualitative conclusions drawn in the case without PU also hold when the analysis is performed in a high-PU environment. Since no specific effort has been made to optimise PU subtraction, for instance by tuning the values of the patch length a in SoftKiller or the $p_T$ threshold during jet trimming, we believe that there should still be room for further improvement.

It is useful to quantify which of the MVA input variables carry the highest discrimination power in the case of PU, by means of Eq. (17), and compare this with the corresponding results without PU shown in Fig. 18. We have verified that the relative weight of the different input variables to the MVA is mostly unchanged in the case of PU. In the resolved category, the highest total associated weight is carried by the Higgs candidates $p_T$ and invariant mass, as well as by the $p_T$ of the individual small-R jets. For the boosted category, the highest weight is carried by the Higgs invariant mass, followed by the Higgs $p_T$, $m_{hh}$, the $p_T$ of the AKT03 subjets and the substructure variables, with a similar weighting among them.

In Table 9 we provide the post-MVA number of signal and background events expected for ${\mathcal {L}}=3$ ab$^{-1}$. For the backgrounds, we quote both the total number, $N_\mathrm{ev}^\mathrm{tot}$, and the QCD 4b component only, $N_\mathrm{ev}^\mathrm{4b}$. We quote results for the no PU and PU80+SK+Trim cases. We also quote in each case the corresponding values for the signal significance and the signal over background ratio. Note that the MVA is always trained to the inclusive background sample, though differences in the kinematic distributions of the 4b and 2b2j processes are moderate; see Fig. 14. From Table 9 one observes that all categories exhibit a marked improvement from eliminating the contamination from light and charm jet mis-identification. For instance, in the intermediate category, $S/\sqrt{B}$ increases from 2.3 to 3.3 (1.9 to 2.9) in the no PU (PU80) case, with similar improvements in the resolved and boosted categories.
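As a rough cross-check of the relative size of the reducible backgrounds, one can compare the two significances quoted above for the intermediate category. Assuming the same signal yield $S$ in both cases, and bearing in mind that the quoted inputs are rounded, the ratio of significances fixes the ratio of the total to the 4b background:

$$\begin{aligned} \frac{B_\mathrm{tot}}{B_\mathrm{4b}} = \left( \frac{S/\sqrt{B_\mathrm{4b}}}{S/\sqrt{B_\mathrm{tot}}}\right) ^2 \simeq \left( \frac{3.3}{2.3}\right) ^2 \simeq 2.1 \quad (\text{no PU}), \qquad \left( \frac{2.9}{1.9}\right) ^2 \simeq 2.3 \quad (\text{PU80}). \end{aligned}$$

In this category the irreducible 4b component therefore makes up roughly half of the total post-MVA background, consistent with the statement that the mis-identification backgrounds are comparable in size to the 4b one.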

Table 9 Post-MVA number of signal and background events with ${\mathcal {L}}=3$ ab$^{-1}$. For the backgrounds, both the total number, $N_\mathrm{ev}^\mathrm{tot}$, and the 4b component only, $N_\mathrm{ev}^\mathrm{4b}$, are shown. Also provided are the values of the signal significance and the signal over background ratio, both separated in categories and for their combination. We quote the results without PU and for PU80+SK+Trim


In Table 9 we also provide the results for $S/\sqrt{B}$ obtained by combining the three categories. Taking into account all background components, we obtain for the case of $n_\mathrm{PU}=80$ an overall signal significance of

$$\begin{aligned} \left( \frac{S}{\sqrt{B}}\right) _\mathrm{tot} \simeq 3.1~(1.0),\quad {\mathcal {L}}=3000~(300)\,\mathrm{fb}^{-1}, \end{aligned}$$

(19)

indicating that a measurement of Higgs pair production in the $b\bar{b}b\bar{b}$ final state at the HL-LHC should be above the threshold for observation, even when realistic PU conditions are accounted for. A similar signal significance is obtained in the case of $n_\mathrm{PU}=150$. Under the assumption that the only relevant background would be the irreducible QCD 4b component, one obtains instead

$$\begin{aligned} \left( \frac{S}{\sqrt{B_\mathrm{4b}}}\right) _\mathrm{tot} \simeq 4.7~(1.5),\quad {\mathcal {L}}=3000~(300)\,\mathrm{fb}^{-1}. \end{aligned}$$

(20)

Therefore, a measurement of Higgs pair production in the $b\bar{b}b\bar{b}$ final state at the HL-LHC might be even above the threshold for discovery, provided the effects due to mis-identification of light and charm jets as b-jets can be reduced.

6 Conclusions and outlook

In this work we have presented a feasibility study for the measurement of Higgs pair production in the $b\bar{b}b\bar{b}$ final state at the LHC. Our strategy is based on the combination of traditional cut-based analysis with state-of-the-art multivariate techniques. We take into account all relevant backgrounds, in particular the irreducible 4b and the reducible 2b2j and 4j QCD multijets. We have illustrated how the 2b2j component leads to a contribution comparable to that of QCD 4b production, due to a combination of parton shower effects, b-quark pair radiation, and selection requirements. We have also demonstrated the robustness of our analysis strategy under the addition of significant PU. In particular, we have explored two scenarios, $n_\mathrm{PU}=80$ and $n_\mathrm{PU}=150$, and we found a comparable overall signal significance in the two cases.

Combining the contributions from the resolved, intermediate and boosted categories, we find that, for ${\mathcal {L}}=3$ ab$^{-1}$, the signal significance for the production of Higgs pairs turns out to be $S/\sqrt{B}\simeq 3$. This indicates that, already from the $b\bar{b}b\bar{b}$ final state alone, it should be possible to claim observation of Higgs pair production at the HL-LHC. Our study also suggests possible avenues that the LHC experiments could explore to further improve this signal significance. One handle would be to reduce the contribution from light and charm jet mis-identification, ensuring that the irreducible 4b background dominates over the 2b2j component. This would allow one to enhance $S/\sqrt{B}$ almost to the discovery level; see Table 9. It would also be advantageous to improve the b-tagging efficiency, which would lead to higher signal yields. Another possibility would be to improve the mass resolution of the Higgs reconstruction in high-PU environments, and, more generally, to optimise the PU subtraction strategy in order to reduce the impact of PU in the modeling of kinematic variables and the associated degradation in the MVA discrimination.

Another challenging aspect of the measurement of Higgs pairs in the $b\bar{b}b\bar{b}$ final state is achieving an efficient triggering strategy. In order to reduce the rate from background QCD processes sufficiently, while being able to access the relevant $p_T$ regimes, (multi-)jet triggers using b-quark tagging information online for one or more jets are likely to be necessary. The additional rejection provided by these triggers could enable events to be selected efficiently, with four jets down to $p_T=40$ GeV in the resolved category, and boosted Higgs decays in large-R jets down to jet transverse momenta of $p_T=200$ GeV. In addition, good control of the multijet backgrounds and the experimental systematics of the MVA inputs will be important to achieve these sensitivities.

Our strategy relies on the modeling of the kinematic distributions of signal and background events, since these provide the inputs to the MVA discriminant. In this respect, it would be important, having established the key relevance of the $b\bar{b}b\bar{b}$ channel for the study of Higgs pair production, to revisit and improve the theoretical modeling of our signal and background simulation, in particular using NLO calculations matched to parton showers both for signal [17, 35] and for backgrounds [63, 76].

One important implication of this work is that it should be possible to significantly improve the accuracy of the extraction of the Higgs trilinear coupling $\lambda $ from a measurement of the $\sigma \left( hh\rightarrow b\bar{b}b\bar{b}\right) $ cross section, as compared to existing estimates. A determination of $\lambda $ in our approach is, however, rather non-trivial, involving not only generating signal samples for a wide range of values of $\lambda $, but also repeating the analysis optimisation, including the MVA training, for each of these values. This study is left to a future publication, where we will also compare the precision from the $b\bar{b}b\bar{b}$ final state with the corresponding precision that has been reported from other final states such as $b\bar{b}\gamma \gamma $ and $b\bar{b}\tau \tau $. It will also be interesting to perform this exercise for a 100 TeV hadron collider [11–14]. While at 100 TeV the signal yields would be increased, the (gluon-driven) QCD multijet background would also grow strongly. Revisiting the present analysis, including the MVA optimisation, at 100 TeV would also allow us to assess the accuracy of an extraction of the trilinear coupling $\lambda $ from the $b\bar{b}b\bar{b}$ final state at 100 TeV.

In this work we have considered only the SM production mechanism, but many BSM scenarios predict deviations in Higgs pair production, both at the level of total rates and of differential distributions. In the absence of new explicit degrees of freedom, deviations from the SM can be parametrised in the EFT framework using higher-order operators [14, 48]. Therefore, we plan to study the constraints on the coefficients of these effective operators that can be obtained from measurements of various kinematic distributions in the $hh\rightarrow b\bar{b}b\bar{b}$ process. Note that the higher rates of the $b\bar{b}b\bar{b}$ final state as compared to other final states, such as $b\bar{b}\gamma \gamma $, allow for better constraints upon operators that modify the high-energy behaviour of the theory; for instance, it would become possible to access the tail of the $m_{hh}$ distribution.

As in the case of the extraction of the Higgs trilinear coupling $\lambda $, such a study would be a computationally intensive task, since BSM dynamics will modify the shapes of the kinematic distributions and thus in principle each point in the EFT parameter space would require a re-optimisation with a newly trained MVA. In order to explore efficiently the BSM parameters without having to repeat the full analysis for each point, modern statistical techniques such as the Cluster Analysis method proposed in Ref. [46] might be helpful.

Notes

These techniques also have important applications in the subtraction of the UE/MPI contamination for jet reconstruction in heavy ion collisions [115].
This type of ANN is the same as that used to parametrise Parton Distribution Functions in the NNPDF global analyses [121–124].
The impact of PU on the separate significance of the three categories exhibits some dependence on the specific choice for $n_\mathrm{PU}$ and on the settings of the PU subtraction strategy. We find, however, that the overall signal significance from combining the three categories is similar in the $n_\mathrm{PU}=80$ and $n_\mathrm{PU}=150$ cases.

References

ATLAS Collaboration, Physics at a high-luminosity LHC with ATLAS, in Community Summer Study 2013: Snowmass on the Mississippi (CSS2013), Minneapolis, MN, USA, July 29-August 6, 2013. arXiv:1307.7292
CMS Collaboration, Projected performance of an upgraded CMS detector at the LHC and HL-LHC: contribution to the snowmass process, in Community Summer Study 2013: Snowmass on the Mississippi (CSS2013), Minneapolis, MN, USA, July 29-August 6, 2013. arXiv:1307.7135
J. Baglio, A. Djouadi, R. Grober, M. Muhlleitner, J. Quevillon et al., The measurement of the Higgs self-coupling at the LHC: theoretical status. JHEP 1304, 151 (2013). arXiv:1212.5581
G. Giudice, C. Grojean, A. Pomarol, R. Rattazzi, The strongly-interacting light Higgs. JHEP 0706, 045 (2007). arXiv:hep-ph/0703164
R. Contino, C. Grojean, M. Moretti, F. Piccinini, R. Rattazzi, Strong double Higgs production at the LHC. JHEP 1005, 089 (2010). arXiv:1002.1011
ATLAS Collaboration, G. Aad et al., Searches for Higgs boson pair production in the $hh\rightarrow bb\tau \tau , \gamma \gamma WW^*, \gamma \gamma bb, bbbb$ channels with the ATLAS detector. Phys. Rev. D 92, 092004 (2015). arXiv:1509.04670
ATLAS Collaboration, G. Aad et al., Search for Higgs boson pair production in the $b\bar{b}b\bar{b}$ final state from pp collisions at $\sqrt{s}=8$ TeV with the ATLAS detector. Eur. Phys. J. C 75(9), 412 (2015). arXiv:1506.00285
ATLAS Collaboration, G. Aad et al., Search for Higgs boson pair production in the $\gamma \gamma b\bar{b}$ final state using pp collision data at $\sqrt{s}=8$ TeV from the ATLAS detector. Phys. Rev. Lett. 114(8), 081802 (2015). arXiv:1406.5053
CMS Collaboration, V. Khachatryan et al., Search for resonant pair production of Higgs bosons decaying to two bottom quark–antiquark pairs in proton–proton collisions at 8 TeV. Phys. Lett. B 749, 560–582 (2015). arXiv:1503.04114
CMS Collaboration, Search for the resonant production of two Higgs bosons in the final state with two photons and two bottom quarks. (2014). Report no. CMS-PAS-HIG-13-032
N. Arkani-Hamed, T. Han, M. Mangano, L.-T. Wang, Physics opportunities of a 100 TeV proton–proton collider (2015). arXiv:1511.06495
A.J. Barr, M.J. Dolan, C. Englert, D.E. Ferreira de Lima, M. Spannowsky, Higgs self-coupling measurements at a 100 TeV hadron collider. JHEP 1502, 016 (2015). arXiv:1412.7154
A. Papaefstathiou, Discovering Higgs boson pair production through rare final states at a 100 TeV collider. Phys. Rev. D 91(11), 113016 (2015). arXiv:1504.04621
A. Azatov, R. Contino, G. Panico, M. Son, Effective field theory analysis of double Higgs boson production via gluon fusion. Phys. Rev. D 92(3), 035001 (2015). arXiv:1502.00539
R. Contino, C. Grojean, D. Pappadopulo, R. Rattazzi, A. Thamm, Strong Higgs interactions at a linear collider. JHEP 1402, 006 (2014). arXiv:1309.7038
LHC Higgs Cross Section Working Group, S. Dittmaier, C. Mariotti, G. Passarino, R. Tanaka (eds.), Handbook of LHC Higgs cross sections: 2. Differential distributions. CERN-2012-002 (CERN, Geneva, 2012). arXiv:1201.3084
R. Frederix, S. Frixione, V. Hirschi, F. Maltoni, O. Mattelaer, P. Torrielli, E. Vryonidou, M. Zaro, Higgs pair production at the LHC with NLO and parton-shower effects. Phys. Lett. B 732, 142–149 (2014). arXiv:1401.7340
D. de Florian, J. Mazzitelli, Higgs boson pair production at next-to-next-to-leading order in QCD. Phys. Rev. Lett. 111, 201801 (2013). arXiv:1309.6594
D. de Florian, J. Mazzitelli, Higgs pair production at next-to-next-to-leading logarithmic accuracy at the LHC. JHEP 09, 053 (2015). arXiv:1505.07122
U. Baur, T. Plehn, D.L. Rainwater, Probing the Higgs selfcoupling at hadron colliders using rare decays. Phys. Rev. D 69, 053004 (2004). arXiv:hep-ph/0310056
V. Barger, L.L. Everett, C. B. Jackson, G. Shaughnessy, Higgs-pair production and measurement of the triscalar coupling at LHC(8,14). Phys. Lett. B 728, 433–436 (2014). doi:10.1016/j.physletb.2013.12.013. arXiv:1311.2931
C.T. Lu, J. Chang, K. Cheung, J.S. Lee, An exploratory study of Higgs-boson pair production. JHEP 1508, 133 (2015). arXiv:1505.00957
U. Baur, T. Plehn, D.L. Rainwater, Examining the Higgs boson potential at lepton and hadron colliders: a comparative analysis. Phys. Rev. D 68, 033001 (2003). arXiv:hep-ph/0304015
A.J. Barr, M.J. Dolan, C. Englert, M. Spannowsky, Di-Higgs final states augMT2ed—selecting $hh$ events at the high luminosity LHC. Phys. Lett. B 728, 308–313 (2014). arXiv:1309.6318
M.J. Dolan, C. Englert, M. Spannowsky, Higgs self-coupling measurements at the LHC. JHEP 1210, 112 (2012). arXiv:1206.5001
M.J. Dolan, C. Englert, N. Greiner, M. Spannowsky, Further on up the road: $hhjj$ production at the LHC. Phys. Rev. Lett. 112, 101802 (2014). arXiv:1310.1084
A. Papaefstathiou, L.L. Yang, J. Zurita, Higgs boson pair production at the LHC in the $b \bar{b} W^+ W^-$ channel. Phys. Rev. D 87, 011301 (2013). arXiv:1209.1489
D. Wardrope, E. Jansen, N. Konstantinidis, B. Cooper, R. Falla et al., Non-resonant Higgs-pair production in the $b\overline{b}b\overline{b}$ final state at the LHC. Eur. Phys. J. C 75(5), 219 (2015). arXiv:1410.2794
D.E. Ferreira de Lima, A. Papaefstathiou, M. Spannowsky, Standard model Higgs boson pair production in the ($b\overline{b}$)($ b\overline{b}$) final state. JHEP 1408, 030 (2014). arXiv:1404.7139
M. Slawinska, W. van den Wollenberg, B. van Eijk, S. Bentvelsen, Phenomenology of the trilinear Higgs coupling at proton–proton colliders. (2014). arXiv:1408.5010
C.-R. Chen, I. Low, Double take on new physics in double Higgs boson production. Phys. Rev. D 90(1), 013018 (2014). arXiv:1405.7040
F. Goertz, A. Papaefstathiou, L.L. Yang, J. Zurita, Higgs Boson self-coupling measurements using ratios of cross sections. JHEP 1306, 016 (2013). arXiv:1301.3492
S. Dawson, A. Ismail, I. Low, What’s in the loop? The anatomy of double Higgs production. Phys. Rev. D 91(11), 115008 (2015). arXiv:1504.05596
F. Maltoni, E. Vryonidou, M. Zaro, Top-quark mass effects in double and triple Higgs production in gluon-gluon fusion at NLO. JHEP 1411, 079 (2014). arXiv:1408.6542
P. Maierhöfer, A. Papaefstathiou, Higgs boson pair production merged to one jet. JHEP 1403, 126 (2014). arXiv:1401.0007
J. Grigo, J. Hoff, K. Melnikov, M. Steinhauser, On the Higgs boson pair production at the LHC. Nucl. Phys. B 875, 1–17 (2013). arXiv:1305.7340
J. Grigo, K. Melnikov, M. Steinhauser, Virtual corrections to Higgs boson pair production in the large top quark mass limit. Nucl. Phys. B 888, 17–29 (2014). arXiv:1408.2422
M.J. Dolan, C. Englert, N. Greiner, K. Nordstrom, M. Spannowsky, $hhjj$ production at the LHC. Eur. Phys. J. C 75(8), 387 (2015). arXiv:1506.08008
G. Brooijmans, R. Contino, B. Fuks, F. Moortgat, P. Richardson et al., Les Houches 2013: physics at TeV colliders: new physics working group report. (2014). arXiv:1405.1617
V.D. Barger, T. Han, R.J.N. Phillips, Double Higgs boson bremsstrahlung from $W$ and $Z$ bosons at supercolliders. Phys. Rev. D 38, 2766 (1988)
Q.-H. Cao, Y. Liu, B. Yan, Measuring trilinear Higgs coupling in $WHH$ and $ZHH$ productions at the HL-LHC. (2015). arXiv:1511.03311
C. Englert, F. Krauss, M. Spannowsky, J. Thompson, Di-Higgs phenomenology in $t\bar{t}hh$: the forgotten channel. Phys. Lett. B 743, 93–97 (2015). arXiv:1409.8074
L.-S. Ling, R.-Y. Zhang, W.-G. Ma, L. Guo, W.-H. Li et al., NNLO QCD corrections to Higgs pair production via vector boson fusion at hadron colliders. Phys. Rev. D 89(7), 073001 (2014). arXiv:1401.7754
K. Nishiwaki, S. Niyogi, A. Shivaji, $ttH$ Anomalous coupling in double Higgs production. JHEP 1404, 011 (2014). arXiv:1309.6907
Q.H. Cao, B. Yan, D.M. Zhang, H. Zhang, Resolving the degeneracy in single Higgs production with Higgs pair production. Phys. Lett. B 752, 285 (2016). arXiv:1508.06512
A. Carvalho, M. Dall’Osso, T. Dorigo, F. Goertz, C.A. Gottardo, M. Tosi, Higgs pair production: choosing benchmarks with cluster analysis. JHEP 04, 126 (2016). doi:10.1007/JHEP04(2016)126. arXiv:1507.02245
N. Liu, S. Hu, B. Yang, J. Han, Impact of top-Higgs couplings on di-Higgs production at future colliders. JHEP 1501, 008 (2015). arXiv:1408.4191
F. Goertz, A. Papaefstathiou, L.L. Yang, J. Zurita, Higgs boson pair production in the D = 6 extension of the SM. JHEP 1504, 167 (2015). arXiv:1410.3471
H.-J. He, J. Ren, W. Yao, Probing new physics of cubic Higgs boson interaction via Higgs pair production at hadron colliders. Phys. Rev. D 93, 015003 (2016). doi:10.1103/PhysRevD.93.015003. arXiv:1506.03302
R. Grober, M. Muhlleitner, M. Spira, J. Streicher, NLO QCD corrections to Higgs pair production including dimension-6 operators. JHEP 09, 092 (2015). arXiv:1504.06577
M. Gouzevitch, A. Oliveira, J. Rojo, R. Rosenfeld, G.P. Salam et al., Scale-invariant resonance tagging in multijet events and new physics in Higgs pair production. JHEP 1307, 148 (2013). arXiv:1303.6636
B. Cooper, N. Konstantinidis, L. Lambourne, D. Wardrope, Boosted $hh\rightarrow b\overline{b}b\overline{b}$: a new topology in searches for TeV-scale resonances at the LHC. Phys. Rev. D 88(11), 114005 (2013). arXiv:1307.0407
J.M. No, M. Ramsey-Musolf, Probing the Higgs portal at the LHC through resonant di-Higgs production. Phys. Rev. D 89(9), 095031 (2014). arXiv:1310.6035
Z. Wen-Juan, M. Wen-Gan, Z. Ren-You, L. Xiao-Zhou, G. Lei, C. Chong, Double Higgs boson production and decay in Randall–Sundrum model at hadron colliders. Phys. Rev. D 92(11), 116005 (2015). doi:10.1103/PhysRevD.92.116005. arXiv:1512.01766
A. Belyaev, M. Drees, O.J.P. Eboli, J.K. Mizukoshi, S.F. Novaes, Supersymmetric Higgs pair discovery prospects at hadron colliders, in Proceedings, International Europhysics Conference on High energy physics (EPS-HEP 1999), pp. 748–751. arXiv:hep-ph/9910400
C. Han, X. Ji, L. Wu, P. Wu, J.M. Yang, Higgs pair production with SUSY QCD correction: revisited under current experimental constraints. JHEP 1404, 003 (2014). arXiv:1307.3790
B. Hespel, D. Lopez-Val, E. Vryonidou, Higgs pair production via gluon fusion in the two-Higgs-doublet model. JHEP 1409, 124 (2014). arXiv:1407.0281
L. Wu, J.M. Yang, C.-P. Yuan, M. Zhang, Higgs self-coupling in the MSSM and NMSSM after the LHC Run 1. Phys. Lett. B 747, 378–389 (2015). arXiv:1504.06932
J. Cao, D. Li, L. Shang, P. Wu, Y. Zhang, Exploring the Higgs sector of a most natural NMSSM and its prediction on Higgs pair production at the LHC. JHEP 1412, 026 (2014). arXiv:1409.8431
U. Ellwanger, Higgs pair production in the NMSSM at the LHC. JHEP 1308, 077 (2013). arXiv:1306.5541
J. Cao, Z. Heng, L. Shang, P. Wan, J.M. Yang, Pair production of a 125 GeV Higgs boson in MSSM and NMSSM at the LHC. JHEP 1304, 134 (2013). arXiv:1301.6437
CMS Collaboration, V. Khachatryan et al., Searches for a heavy scalar boson ${\rm H}$ decaying to a pair of 125 GeV Higgs bosons hh or for a heavy pseudoscalar boson A decaying to Zh, in the final states with h to tau tau. Phys. Lett. B 755, 217–244 (2016). doi:10.1016/j.physletb.2016.01.056. arXiv:1510.01181
J. Alwall, R. Frederix, S. Frixione, V. Hirschi, F. Maltoni et al., The automated computation of tree-level and next-to-leading order differential cross sections, and their matching to parton shower simulations. JHEP 1407, 079 (2014). arXiv:1405.0301
T. Plehn, M. Spira, P. Zerwas, Pair production of neutral Higgs particles in gluon–gluon collisions. Nucl. Phys. B 479, 46–64 (1996). arXiv:hep-ph/9603205
V. Hirschi, O. Mattelaer, Automated event generation for loop-induced processes. JHEP 10, 146 (2015). arXiv:1507.00020
NNPDF Collaboration, R.D. Ball et al., Parton distributions for the LHC Run II. JHEP 1504, 040 (2015). arXiv:1410.8849
A. Buckley, J. Ferrando, S. Lloyd, K. Nordström, B. Page et al., LHAPDF6: parton density access in the LHC precision era. Eur. Phys. J. C 75(3), 132 (2015). arXiv:1412.7420
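ATLAS Collaboration, G. Aad et al., Measurement of the Higgs boson mass from the $H\rightarrow \gamma \gamma $ and $H\rightarrow ZZ^{*}\rightarrow 4\ell $ channels with the ATLAS detector using 25 fb$^{-1}$ of $pp$ collision data. Phys. Rev. D 90(5), 052004 (2014). arXiv:1406.3827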
CMS Collaboration, V. Khachatryan et al., Precise determination of the mass of the Higgs boson and tests of compatibility of its couplings with the standard model predictions using proton collisions at 7 and 8 TeV. Eur. Phys. J. C 75(5), 212 (2015). arXiv:1412.8662
ATLAS, CMS Collaboration, G. Aad et al., Combined measurement of the Higgs boson mass in $pp$ collisions at $\sqrt{s}=7$ and 8 TeV with the ATLAS and CMS experiments. Phys. Rev. Lett. 114, 191803 (2015). arXiv:1503.07589
T. Sjöstrand, S. Mrenna, P.Z. Skands, A brief introduction to PYTHIA 8.1. Comput. Phys. Commun. 178, 852–867 (2008). arXiv:0710.3820
T. Sjöstrand, S. Ask, J.R. Christiansen, R. Corke, N. Desai et al., An introduction to PYTHIA 8.2. Comput. Phys. Commun. 191, 159–177 (2015). arXiv:1410.3012
P. Skands, S. Carrazza, J. Rojo, Tuning PYTHIA 8.1: the Monash 2013 tune. Eur. Phys. J. C 74(8), 3024 (2014). arXiv:1404.5630
R.D. Ball, V. Bertone, S. Carrazza, C.S. Deans, L. Del Debbio et al., Parton distributions with LHC data. Nucl. Phys. B 867, 244–289 (2013). arXiv:1207.1303
NNPDF Collaboration, R.D. Ball, V. Bertone, S. Carrazza, L. Del Debbio, S. Forte, A. Guffanti, N.P. Hartland, J. Rojo, Parton distributions with QED corrections. Nucl. Phys. B 877, 290–320 (2013). arXiv:1308.0598
T. Gleisberg et al., Event generation with SHERPA 1.1. JHEP 02, 007 (2009). arXiv:0811.4622
Z. Bern, G. Diana, L. Dixon, F. Febres Cordero, S. Hoeche et al., Four-jet production at the large hadron collider at next-to-leading order in QCD. Phys. Rev. Lett. 109, 042001 (2012). arXiv:1112.3940
M. Czakon, P. Fiedler, A. Mitov, Total top-quark pair-production cross section at hadron colliders through $O(\alpha _S^4)$. Phys. Rev. Lett. 110, 252004 (2013). arXiv:1303.6254
ATLAS Collaboration, G. Aad et al., Search for the Standard Model Higgs boson produced in association with a vector boson and decaying to a $b$-quark pair with the ATLAS detector. Phys. Lett. B 718, 369–390 (2012). arXiv:1207.0210
CMS Collaboration, S. Chatrchyan et al., Search for the standard model Higgs boson produced in association with a W or a Z boson and decaying to bottom quarks. Phys. Rev. D 89(1), 012003 (2014). arXiv:1310.3687
ATLAS Collaboration, G. Aad et al., Search for the $b\bar{b}$ decay of the Standard Model Higgs boson in associated $(W/Z)H$ production with the ATLAS detector. JHEP 01, 069 (2015). arXiv:1409.6212
M. Cacciari, G.P. Salam, G. Soyez, FastJet user manual. Eur. Phys. J. C 72, 1896 (2012). arXiv:1111.6097
M. Cacciari, G.P. Salam, Dispelling the $N^{3}$ myth for the $k_t$ jet-finder. Phys. Lett. B 641, 57–61 (2006). arXiv:hep-ph/0512210
M. Cacciari, G.P. Salam, G. Soyez, The anti-$k_t$ jet clustering algorithm. JHEP 0804, 063 (2008). arXiv:0802.1189
ATLAS Collaboration, G. Aad et al., Jet energy measurement and its systematic uncertainty in proton–proton collisions at $\sqrt{s}=7$ TeV with the ATLAS detector. Eur. Phys. J. C 75(1), 17 (2015). arXiv:1406.0076
ATLAS Collaboration, Performance of large-R jets and jet substructure reconstruction with the ATLAS detector. (2012). Report no. ATLAS-CONF-2012-065
J.M. Butterworth, A.R. Davison, M. Rubin, G.P. Salam, Jet substructure as a new Higgs search channel at the LHC. Phys. Rev. Lett. 100, 242001 (2008). arXiv:0802.2470
Y.L. Dokshitzer, G. Leder, S. Moretti, B. Webber, Better jet clustering algorithms. JHEP 9708, 001 (1997). arXiv:hep-ph/9707323
M. Wobisch, T. Wengler, Hadronization corrections to jet cross-sections in deep inelastic scattering, in Monte Carlo generators for HERA physics. Proceedings, Workshop, Hamburg, Germany, 1998-1999 (1998). arXiv:hep-ph/9907280
G.P. Salam, Towards jetography. Eur. Phys. J. C 67, 637–686 (2010). arXiv:0906.1833
ATLAS Collaboration, G. Aad et al., Performance of jet substructure techniques for large-$R$ jets in proton–proton collisions at $\sqrt{s}=7$ TeV using the ATLAS detector. JHEP 1309, 076 (2013). arXiv:1306.4945
J. Butterworth, B. Cox, J.R. Forshaw, $W W$ scattering at the CERN LHC. Phys. Rev. D 65, 096014 (2002). arXiv:hep-ph/0201098
S.D. Ellis, D.E. Soper, Successive combination jet algorithm for hadron collisions. Phys. Rev. D 48, 3160–3166 (1993). arXiv:hep-ph/9305266
J. Thaler, K. Van Tilburg, Identifying boosted objects with N-subjettiness. JHEP 1103, 015 (2011). arXiv:1011.2268
J. Thaler, K. Van Tilburg, Maximizing boosted top identification by minimizing N-subjettiness. JHEP 1202, 093 (2012). arXiv:1108.2701
S. Catani, Y.L. Dokshitzer, M.H. Seymour, B.R. Webber, Longitudinally invariant $K_t$ clustering algorithms for hadron hadron collisions. Nucl. Phys. B 406, 187–224 (1993)
A.J. Larkoski, G.P. Salam, J. Thaler, Energy correlation functions for jet substructure. JHEP 1306, 108 (2013). arXiv:1305.0007
A.J. Larkoski, I. Moult, D. Neill, Power counting to better jet observables. JHEP 1412, 009 (2014). arXiv:1409.6298
ATLAS Collaboration, G. Aad et al., Performance of $b$-jet identification in the ATLAS experiment. JINST 11(04), P04008 (2016). doi:10.1088/1748-0221/11/04/P04008. arXiv:1512.01094
CMS Collaboration, V. Khachatryan et al., Measurement of $B\bar{B}$ angular correlations based on secondary vertex reconstruction at $\sqrt{s}=7$ TeV. JHEP 1103, 136 (2011). arXiv:1102.3194
CMS Collaboration, S. Chatrchyan et al., Identification of b-quark jets with the CMS experiment. JINST 8, P04013 (2013). arXiv:1211.4462
M. Cacciari, G.P. Salam, Pileup subtraction using jet areas. Phys. Lett. B 659, 119–126 (2008). arXiv:0707.1378
ATLAS Collaboration, Calibration of b-tagging using dileptonic top pair events in a combinatorial likelihood approach with the ATLAS experiment. (2014). Report no. ATLAS-CONF-2014-004
M. Cacciari, G.P. Salam, S. Sapeta, On the characterisation of the underlying event. JHEP 1004, 065 (2010). arXiv:0912.4926
ATLAS Collaboration, Pile-up subtraction and suppression for jets in ATLAS. (2013). Report no. ATLAS-CONF-2013-083
D. Krohn, J. Thaler, L.-T. Wang, Jet trimming. JHEP 02, 084 (2010). arXiv:0912.1342
D. Krohn, M.D. Schwartz, M. Low, L.-T. Wang, Jet cleansing: pileup removal at high luminosity. Phys. Rev. D 90(6), 065020 (2014). arXiv:1309.4777
M. Cacciari, J. Rojo, G.P. Salam, G. Soyez, Quantifying the performance of jet definitions for kinematic reconstruction at the LHC. JHEP 12, 032 (2008). arXiv:0810.1304
S.D. Ellis, C.K. Vermilion, J.R. Walsh, Recombination algorithms and jet substructure: pruning as a tool for heavy particle searches. Phys. Rev. D 81, 094023 (2010). arXiv:0912.0033
D. Bertolini, P. Harris, M. Low, N. Tran, Pileup per particle identification. JHEP 10, 059 (2014). arXiv:1407.6013
M. Cacciari, G.P. Salam, G. Soyez, SoftKiller, a particle-level pileup removal method. Eur. Phys. J. C 75(2), 59 (2015). arXiv:1407.0408
M. Cacciari, G.P. Salam, G. Soyez, Use of charged-track information to subtract neutral pileup. Phys. Rev. D 92(1), 014003 (2015). arXiv:1404.7353
P. Berta, M. Spousta, D.W. Miller, R. Leitner, Particle-level pileup subtraction for jets and jet shapes. JHEP 06, 092 (2014). arXiv:1403.3108
A.J. Larkoski, S. Marzani, G. Soyez, J. Thaler, Soft drop. JHEP 05, 146 (2014). arXiv:1402.2657
M. Cacciari, J. Rojo, G.P. Salam, G. Soyez, Jet reconstruction in heavy ion collisions. Eur. Phys. J. C 71, 1539 (2011). arXiv:1010.1759
D.E. Soper, M. Spannowsky, Finding physics signals with shower deconstruction. Phys. Rev. D 84, 074002 (2011). arXiv:1102.3480
D.E. Soper, M. Spannowsky, Finding top quarks with shower deconstruction. Phys. Rev. D 87, 054012 (2013). arXiv:1211.3140
P. Baldi, P. Sadowski, D. Whiteson, Enhanced Higgs boson to $\tau ^+\tau ^-$ search with deep learning. Phys. Rev. Lett. 114(11), 111801 (2015). arXiv:1410.3469
CDF and D0 Collaboration, T. Aaltonen et al., Evidence for a particle produced in association with weak bosons and decaying to a bottom–antibottom quark pair in Higgs boson searches at the Tevatron. Phys. Rev. Lett. 109, 071804 (2012). arXiv:1207.6436
Z. Kang, P. Ko, J. Li, New avenues to heavy right-handed neutrinos with pair production at hadronic colliders. Phys. Rev. D 93(7), 075037 (2016). doi:10.1103/PhysRevD.93.075037. arXiv:1512.08373
The NNPDF Collaboration, L. Del Debbio, S. Forte, J.I. Latorre, A. Piccione, J. Rojo, Unbiased determination of the proton structure function $F_2^p$ with faithful uncertainty estimation. JHEP 03, 080 (2005). arXiv:hep-ph/0501067
The NNPDF Collaboration, R.D. Ball et al., A determination of parton distributions with faithful uncertainty estimation. Nucl. Phys. B 809, 1–63 (2009). arXiv:0808.1231
The NNPDF Collaboration, R.D. Ball et al., Impact of heavy quark masses on parton distributions and LHC phenomenology. Nucl. Phys. B 849, 296–363 (2011). arXiv:1101.1300
The NNPDF Collaboration, R.D. Ball et al., A first unbiased global NLO determination of parton distributions and their uncertainties. Nucl. Phys. B 838, 136–206 (2010). arXiv:1002.4407
B.C. Allanach, D. Grellscheid, F. Quevedo, Genetic algorithms and experimental discrimination of SUSY models. JHEP 07, 069 (2004). arXiv:hep-ph/0406277
J. Rojo, J.I. Latorre, Neural network parametrization of spectral functions from hadronic tau decays and determination of QCD vacuum condensates. JHEP 01, 055 (2004). arXiv:hep-ph/0401047
S. Abel, J. Rizos, Genetic algorithms and the search for viable string vacua. JHEP 08, 010 (2014). arXiv:1404.7359
S. Nesseris, J. Garcia-Bellido, A new perspective on dark energy modeling via genetic algorithms. JCAP 1211, 033 (2012). arXiv:1205.0364


Acknowledgments

We thank F. Bishara, R. Contino, A. Papaefstathiou and G. Salam for useful discussions on the topic of Higgs pair production. We thank E. Vryonidou and M. Zaro for assistance with di-Higgs production in MadGraph5_aMC@NLO. The work of K. B. is supported by a Rhodes Scholarship. D. B., J. F. and C. I. are supported by the STFC. J. R. and N. H. are supported by a European Research Council Starting Grant “PDF4BSM”. J. R. is supported by an STFC Rutherford Fellowship and Grants ST/K005227/1 and ST/M003787/1.

Author information

Authors and Affiliations

Physics Department, University of Oxford, 1 Keble Road, Oxford, UK
J. Katharina Behr, Daniela Bortoletto, James A. Frost, Nathan P. Hartland, Cigdem Issever & Juan Rojo

Corresponding author

Correspondence to Juan Rojo.

Appendix A: Single Higgs backgrounds

As discussed in Sect. 2, in our analysis we neglect single Higgs production processes, since they are much smaller than both the signal and the main QCD multijet backgrounds. To explicitly demonstrate this, we have generated LO samples using MadGraph5_aMC@NLO for the following single-Higgs processes:

1. $Z(\rightarrow b\bar{b})h(\rightarrow b\bar{b})$ (electroweak)
2. $t\bar{t}h(\rightarrow b\bar{b})$
3. $b\bar{b}h(\rightarrow b\bar{b})$ (QCD)

For each process, we have generated 1M events, and in Table 10 we list the resulting LO and NLO cross sections at the generation level. The subsequent decays and the corresponding branching fractions are not included in these cross sections, since these are taken care of by the Pythia8 parton shower. The values of these branching fractions are listed in Table 11, corresponding to the most recent averages from the PDG. In the case of the $t\bar{t}h$ process, we consider only the fully hadronic decays of the top quark, since leptonic and semi-leptonic decays can be suppressed by means of a lepton veto.
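To make the relation between the generation-level cross sections and the branching fractions concrete, the following minimal Python sketch scales a generation-level cross section by the product of the relevant branching fractions. The function name `effective_cross_section` and all numerical inputs are illustrative assumptions, not values taken from Tables 10 or 11.

```python
from math import prod


def effective_cross_section(sigma_gen_pb, branching_fractions):
    """Scale a generation-level cross section (in pb) by the product of
    the branching fractions of the subsequent decays."""
    for br in branching_fractions:
        if not 0.0 <= br <= 1.0:
            raise ValueError(f"branching fraction out of range: {br}")
    return sigma_gen_pb * prod(branching_fractions)


# Placeholder inputs chosen for easy arithmetic; they are NOT the
# values of Tables 10 and 11.
sigma_zh_pb = 1.0   # hypothetical generation-level Zh cross section
br_z_bb = 0.2       # hypothetical BR(Z -> b bbar)
br_h_bb = 0.5       # hypothetical BR(h -> b bbar)

# Cross section for Z(-> b bbar) h(-> b bbar) after both decays.
sigma_zh_4b_pb = effective_cross_section(sigma_zh_pb, [br_z_bb, br_h_bb])
print(f"sigma(Zh -> 4b) = {sigma_zh_4b_pb:.3f} pb")
```

For the $t\bar{t}h$ process, the same function would take one hadronic branching fraction per top quark in addition to the $h\rightarrow b\bar{b}$ one.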

Table 10 LO and NLO cross sections at the generation level for the single-Higgs background processes listed above, computed using MadGraph5_aMC@NLO. The subsequent decays and the corresponding branching fractions are not included in these generation-level cross sections


Table 11 The values of the branching fractions applied to the single-Higgs background processes from Table 10, corresponding to the most recent PDG values


Table 12 Signal and background cross sections at the end of the cut-based analysis (before the MVA is applied), in the case without PU. We separate the results into the three exclusive categories used in our analysis


In Table 12 we show the signal and background cross sections at the end of the cut-based analysis, before the MVA is applied, in the case without PU. We separate the results into the three exclusive categories used in our analysis. From this comparison, we see that, as expected, the single-Higgs backgrounds at the end of the cut-based analysis are smaller than the QCD multijet background by several orders of magnitude. In addition, we find that already at this stage the di-Higgs signal is larger than all the single-Higgs backgrounds in all the selection categories. Since this discrimination can only be improved by the MVA, we conclude that neglecting single-Higgs backgrounds is a reasonable approximation. From Table 12 we also observe that in the resolved and intermediate categories $Zh\rightarrow b\bar{b}b\bar{b}$ is the dominant single-Higgs background, while $t\bar{t}h(\rightarrow b\bar{b})$ is instead the most important one in the boosted category.
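The argument above can be checked numerically with a short sketch such as the one below, which converts cross sections into expected event yields, $N=\sigma \,{\mathcal {L}}$, and compares $S/\sqrt{B}$ with and without a small additional background. The function names and all cross-section values are hypothetical placeholders rather than entries of Table 12; only the integrated luminosity of 3 ab$^{-1}$ is taken from the text.

```python
from math import sqrt

AB_INV_TO_FB_INV = 1000.0  # 1 ab^-1 = 1000 fb^-1


def expected_events(sigma_fb, lumi_fb_inv):
    """Expected number of events N = sigma * L."""
    return sigma_fb * lumi_fb_inv


def significance(sigma_signal_fb, sigma_backgrounds_fb, lumi_fb_inv):
    """Signal significance S / sqrt(B) for a list of background cross sections."""
    s = expected_events(sigma_signal_fb, lumi_fb_inv)
    b = expected_events(sum(sigma_backgrounds_fb), lumi_fb_inv)
    if b <= 0.0:
        raise ValueError("total background must be positive")
    return s / sqrt(b)


lumi = 3.0 * AB_INV_TO_FB_INV  # HL-LHC integrated luminosity, in fb^-1

# Placeholder cross sections in fb; NOT the values of Table 12.
sigma_hh = 1.0               # hypothetical di-Higgs signal
sigma_qcd = 1.0e4            # hypothetical QCD multijet background
sigma_single_higgs = 0.1     # hypothetical sum of single-Higgs backgrounds

sig_qcd_only = significance(sigma_hh, [sigma_qcd], lumi)
sig_with_single_h = significance(sigma_hh, [sigma_qcd, sigma_single_higgs], lumi)
relative_change = (sig_with_single_h - sig_qcd_only) / sig_qcd_only

print(f"S/sqrt(B), QCD only:        {sig_qcd_only:.4f}")
print(f"S/sqrt(B), + single Higgs:  {sig_with_single_h:.4f}")
print(f"relative change:            {relative_change:.2e}")
```

With a background that is several orders of magnitude below the QCD multijet one, the relative change in $S/\sqrt{B}$ is correspondingly tiny, which is the sense in which such a contribution can be neglected.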

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Funded by SCOAP³


About this article

Cite this article

Behr, J.K., Bortoletto, D., Frost, J.A. et al. Boosting Higgs pair production in the $b\bar{b}b\bar{b}$ final state with multivariate techniques. Eur. Phys. J. C 76, 386 (2016). https://doi.org/10.1140/epjc/s10052-016-4215-5


Received: 26 January 2016
Accepted: 17 June 2016
Published: 08 July 2016
DOI: https://doi.org/10.1140/epjc/s10052-016-4215-5
