Search for Higgs boson pair production in the $b\bar{b} b\bar{b}$ final state from $pp$ collisions at $\sqrt{s} = 8$ TeV with the ATLAS detector

A search for Higgs boson pair production $pp \to hh$ is performed with 19.5 fb$^{-1}$ of proton--proton collision data at $\sqrt{s}=$ 8 TeV, which were recorded by the ATLAS detector at the Large Hadron Collider in 2012. The decay products of each Higgs boson are reconstructed as a high-momentum $b\bar{b}$ system with either a pair of small-radius jets or a single large-radius jet, the latter exploiting jet substructure techniques and associated $b$-tagged track-jets. No evidence for resonant or non-resonant Higgs boson pair production is observed. The data are interpreted in the context of the Randall--Sundrum model with a warped extra dimension as well as the two-Higgs-doublet model. An upper limit on the cross-section for $pp \to G^{*}_{\mathrm{KK}} \to hh \to b\bar{b} b\bar{b}$ of 3.2 (2.3) fb is set for a Kaluza--Klein graviton $G^{*}_{\mathrm{KK}}$ mass of 1.0 (1.5) TeV, at the 95\% confidence level. The search for non-resonant Standard Model $hh$ production sets an observed 95\% confidence level upper limit on the production cross-section $\sigma(pp \to hh \to b\bar{b}b\bar{b})$ of 202 fb, compared to a SM prediction of $\sigma(pp \to hh \to b\bar{b}b\bar{b}) = 3.6 \pm 0.5$ fb.


Introduction
The discovery of a Higgs boson (h) [1,2] at the Large Hadron Collider (LHC) consistent with the predictions of the Standard Model (SM) [3,4] motivates an enhanced effort to search for new physics via the Higgs sector. Many new physics models predict rates of Higgs boson pair production significantly higher than the SM rate [5][6][7]. For example, TeV-scale resonances such as the first Kaluza-Klein (KK) excitation of the graviton, G * KK , predicted in the bulk Randall-Sundrum (RS) model [8,9] or the heavy neutral scalar, H, of two-Higgs-doublet models (2HDM) [10] can decay into pairs of Higgs bosons, hh. Enhanced non-resonant pp → hh production can arise in models such as those with new, light, coloured scalars [11], or direct tthh vertices [12,13].
ATLAS has carried out a search in the bbγγ final state [14], setting limits on both resonant (masses between 260 GeVand 500 GeV) and non-resonant Higgs boson pair production. CMS has searched in the multi-lepton and multi-lepton+photons final-states in the context of 2HDM extensions of the Higgs sector [15]. CMS has also searched for narrow resonances in the bbbb channel [16].
Recent phenomenological studies have demonstrated that despite the fully hadronic final state being subject to a large multijet background, searches for new physics in the pp → hh → bbbb process have good sensitivity for both resonant [17,18] and non-resonant signals [19]. One contributing factor to this sensitivity is the high expected branching ratio for h → bb. The analysis presented in this paper is designed to search for two high-momentum bb systems with masses consistent with m h , where each bb system contains two jets identified as containing b-hadrons (the jets are "b-tagged"). Compared to a more inclusive bbbb final-state analysis, this topology has many benefits due to the large required momentum and angular separation between the two bb systems: (i) excellent rejection of all backgrounds; (ii) highly efficient triggering using b-tagged multijet triggers; and (iii) negligible combinatorial ambiguity in forming Higgs boson candidates.
Two Higgs boson reconstruction techniques, which are complementary in their acceptance, are presented. The first-"resolved"-technique reconstructs Higgs boson candidates from pairs of nearby anti−k t jets [20] with radius parameter R = 0.4, each b-tagged with a multivariate b-tagging algorithm [21]. This resolved technique offers good efficiency over a wide range of Higgs boson momenta and so can be used to reconstruct di-Higgs-boson resonances with mass m X in the range between 500 and 1500 GeV. The sensitivity is best for this technique in the range 500 ≤ m X 1100 GeV. It can be seen in Fig. 1 however, that the acceptance for four b-tagged anti−k t R = 0.4 jets decreases for m X 1200 GeV. This loss of acceptance is due to the increased boost of the Higgs boson, which reduces the average separation between the b-andb-quarks from the Higgs boson decay, ∆R = (∆η) 2 + (∆φ) 2 , to values below 0.4. This motivates the use of a second-"boosted"-Higgs boson reconstruction technique that maintains acceptance for these higher-mass resonances through the use of jet substructure techniques. The Higgs boson candidate is reconstructed as a single, trimmed [22] anti−k t R = 1.0 jet which must be associated with two b-tagged anti−k t R = 0.3 track-jets [23]. The use of track-jets with a smaller R parameter allows Higgs bosons with higher transverse momentum (p T ) to be reconstructed.
The analysis is performed with the dataset recorded by ATLAS in 2012 at √ s = 8 TeV, corresponding to an integrated luminosity of 19.5 fb −1 . For the non-resonant search, a counting experiment is performed and the results are interpreted in the context of SM non-resonant Higgs boson pair production. This interpretation is only carried out for the resolved analysis due to its higher sensitivity to such a signal. For the resonant search, a fit to the reconstructed mass spectrum of hh candidates is carried out and the results are interpreted in the context of both bulk RS G * KK (spin-2) and 2HDM CP-even H boson (spin-0) 8π is the reduced Planck mass), which cover much of the possible parameter space [8]. The 2HDMs considered have CP-conserving scalar potentials (Type-I, Type-II, Lepton-specific and Flipped) [10], in the regime m H = m A = m H ± , with the potential parameter that mixes the two Higgs doublets m 2 12 = m 2 A tan β/(1 + tan 2 β). Interpretations are made as a function of tan β and cos (β − α). The parameter tan β is the ratio of vacuum expectation values of the two Higgs doublets and α is the mixing angle between the two neutral CP-even scalars.

The ATLAS detector
ATLAS is a multi-purpose particle physics experiment [24] at the LHC. The detector 1 consists of inner tracking devices surrounded by a superconducting solenoid, electromagnetic and hadronic calorimeters, and a muon spectrometer. The inner tracking system provides charged-particle tracking in the pseudorapidity region |η| < 2.5 and vertex reconstruction. It consists of a silicon pixel detector, a silicon microstrip tracker, and a straw-tube transition radiation tracker. The system is surrounded by a solenoid that produces a 2 T axial magnetic field. The central calorimeter system consists of a liquid-argon electromagnetic sampling calorimeter with high granularity covering |η| < 3.2 and a steel/scintillator-tile calorimeter providing hadronic energy measurements in the central pseudorapidity range (|η| < 1.7). The endcap and forward regions are instrumented with liquid-argon calorimeters for both electromagnetic and hadronic energy measurements up to |η| = 4.9. The muon spectrometer is operated in a magnetic field provided by air-core superconducting toroids and includes tracking chambers for precise muon momentum measurements up to |η| = 2.7 and trigger chambers covering the range |η| < 2.4. A three-level trigger system is used to select interesting events [25]. The Level-1 trigger reduces the event rate to below 75 kHz using hardware-based trigger algorithms acting on a subset of detector information. Two software-based trigger levels, referred to collectively as the High-Level Trigger (HLT), further reduce the event rate to about 400 Hz using information from the entire detector.

Data and simulation samples
The data sample used in this analysis, after applying data quality requirements that include the availability of b-jet triggers, corresponds to an integrated luminosity of 19.5±0.5 fb −1 . The uncertainty in the integrated luminosity (2.8%) is derived following the same methodology as that detailed in Ref. [26]. The data sample is selected by a combination of five triggers requiring multiple jets or b-jets, where b-jets are identified by a dedicated HLT b-tagging algorithm. This combination of triggers is > 99.5% efficient for signal events passing the offline selection, across the full mass range considered.
Simulated Monte Carlo (MC) event samples are used to model the different signals, as well as the small background contributions from top-quark pair production (tt) and Z+jets events. The dominant multijet background source is estimated directly from data. Signal samples for both models studied are generated with Madgraph v1.5.1 [27,28], interfaced to Pythia v8.175 [29] for parton showering, hadronization and underlying-event simulation. The Higgs boson mass is set to 125 GeV. The CTEQ6L1 [30] leading-order (LO) parton distribution functions (PDFs) are used. Table 1 provides the calculated cross-sections and widths for different signal model parameters. The bulk RS model predictions are calculated at leading order using Madgraph. The 2HDM prediction corresponds to the cross-section for gluon-fusion production plus b-associated production plus vector-boson-fusion production. The gluon-fusion cross-section is calculated using SusHi v1.3.0 [31][32][33][34][35][36] at next-to-next-to-leading-order (NNLO) accuracy in QCD. For b-associated production, an empirical matching of the four-and five-flavour scheme is used [37]. The four-flavour cross-section is calculated at next-to-leading-order (NLO) accuracy in QCD following Refs. [38,39], while the five-flavour cross-section is calculated at NNLO in QCD using SusHi. The vector-bosonfusion cross-section at NNLO accuracy in QCD and NLO in electroweak is taken from Ref. [40] and corrected by a multiplicative factor of cos 2 (β − α) [10]. The 2HDM branching ratios are calculated using 2HDMC v1.6.4 [41].
For the G * KK → hh → bbbb signal, three sets of MC samples are generated for each of the three coupling 4 Resolved analysis 4

.1 Event reconstruction
Jets are reconstructed from topological clusters of calorimeter cell energy deposits at the electromagnetic scale [63] using the anti−k t jet clustering algorithm, with a radius parameter of R = 0.4. The effects of pile-up on jet energies are accounted for by a jet-area-based correction [64]. The jets are then calibrated using p T -and η-dependent calibration factors based on MC simulations and the combination of several in situ techniques applied to data [65]. Following this, the jets undergo Global Sequential calibration [63] which reduces flavour-dependent differences in calorimeter response. If a muon with p T > 4 GeV and |η| < 2.5 is found within a cone of ∆R = 0.4 around the jet axis, the four-momentum of the muon is added to that of the jet (after correcting for the expected energy deposited by the muon in the calorimeter). Such muons are reconstructed by combining measurements from the inner tracking and muon spectrometer systems, and are required to satisfy tight muon identification quality criteria [66]. Jets with a significant energy contribution from pile-up interactions [67] are removed using tracking information. For jets with p T < 50 GeV and |η| < 2.4, the p T sum of tracks matched to the jet is calculated and it is required that at least 50% of this p T sum is due to tracks originating from the primary vertex. 2 Jets with |η| < 2.5 are b-tagged using the properties of the tracks associated with them, the most important being the impact parameter (defined as the track's distance of closest approach to the primary vertex in the transverse plane) of each track, as well as the presence and properties of displaced vertices. The MV1 b-tagging algorithm [21] used in this analysis combines the above information using a neural network and is configured to achieve an efficiency of 70% for tagging b-jets, 3 with a charm-jet rejection of approximately 5 and a light-quark or gluon jet rejection of around 140, as determined in an MC sample of tt events. The b-tagging efficiency in the simulation is scaled to reproduce the one measured in data [68].

Selection
The combined acceptance times efficiency (A × ε) at different stages of the event selection is shown in Fig. 2 as a function of resonance mass for the resonant signal models. The event selection begins with the requirement of at least four b-tagged jets, each with p T > 40 GeV (shown in Fig. 2 as "4 b-tagged jets"). SM non-resonant Higgs boson pair production has a softer Higgs boson p T spectrum than the m X = 500 GeV resonances, resulting in a lower A × ε = 4.9% for this requirement. The four highest-p T btagged jets are then used to form two dijet systems, demanding that the angular distance, ∆R, between the jets in each of the dijets is smaller than 1.5. The transverse momentum of the leading (in p T ) dijet system, p lead T , is required to be greater than 200 GeV, while the subleading dijet system must have p subl T > 150 GeV. In the rare case that a jet could be used to create more than one dijet which satisfies the above kinematic requirements, the dijet with the highest mass is chosen. Thus two unique dijet systems, with no jets in common, are selected (shown as "2 dijets" in Fig. 2). For SM non-resonant Higgs boson pair production, after this requirement, A × ε = 1.2%. The impact of different decay kinematics can be seen by comparing Fig. 2b to Fig. 2a: the decay of spin-0 H bosons gives a softer Higgs boson p T spectrum than in the case of  the spin-2 G * KK decay (due to the differing angular distributions of hh), resulting in lower acceptance for these kinematic requirements at low resonance mass.
The resolved analysis considers a large range of resonance masses, 500 ≤ m X ≤ 1500 GeV. Due to the differing kinematics, the optimal selection for low-mass resonances differs from the optimum for higher masses. To increase the analysis sensitivity, three requirements which vary with the reconstructed resonance mass are used. These selection requirements are optimized simultaneously, by performing a three-dimensional scan of threshold values, using the statistical-only exclusion limit (Sect. 6.2) as the objective function. There are mass-dependent requirements (shown in Fig. 2 as "MDC") on the minimum p T of the leading and subleading dijets as well as on the maximum difference in pseudorapidity, ∆η dijets , between them. These requirements are written in terms of the four-jet mass m 4j expressed in GeV: if m 4j < 520 GeV, 0.235m 4j + 28 GeV otherwise, The different m 4j thresholds shown above are chosen to obtain a continuously varying set of requirements. The requirement on ∆η dijets leads to a lower acceptance for H compared to G * KK for m X ≥ 700 GeV because of the effect of the boson spin on the angular distribution of its decay products.
After selecting two dijets that satisfy the mass-dependent criteria, tt constitutes approximately 10% of the total background. This tt background predominantly comprises events where both top quarks decayed hadronically. These hadronic decays lead to three jets for each top quark: one b-jet directly from the top decay and two from the decay of the W boson. Since the probability to mis-tag charm-jets is much higher than the probability to mis-tag light-jets, in the majority of cases the dijet is formed from the b-jet and a charm-jet from the decay of the W boson. In order to reduce the tt background, jets not already used in the formation of the two dijets ("extra jets") in the event are used to reconstruct W boson and top quark candidates by combining them with each of the dijets. These extra jets are required to have p T > 30 GeV, |η| < 2.5, and ∆R < 1.5 relative to the dijet. The W boson candidates are reconstructed by adding the four-momentum of each of the possible extra jets to the four-momentum of the jet in the dijet system with the lowest probability of being a b-jet according to the multivariate b-tagging algorithm. Top quark candidates are then reconstructed by summing the dijets with each of the extra jets. The compatibility with the top quark decay hypothesis is then determined using the variable: where m W and m t are the invariant masses of the W boson and top quark candidates, σ m W = 0.1 m W , σ m t = 0.1 m t ,m W = 80.4 GeV andm t = 172.5 GeV. The values of σ m W and σ m t reflect the dijet and three-jet system mass resolutions. If either dijet in an event has X tt < 3.2 for any possible combination with an extra jet, the event is rejected. This requirement reduces the tt background by ∼ 60%, whilst retaining ∼ 90% of signal events (shown as "tt Veto" in Fig. 2).
The event selection criteria described above are collectively referred to as the "4-tag" selection requirements. These requirements select 1891 events.
Following the 4-tag selection, a requirement on the leading and subleading dijet masses (m lead 2j and m subl 2j , respectively) is used to define the signal region. The central value of this region corresponds to the median values of the narrowest dijet mass intervals that contain 90% of the MC signal (these were found to be stable with resonance mass). The definition of the signal region is where the 0.1 m 2j terms represent the widths of the leading and subleading dijet mass distributions. The signal region is defined as X hh < 1.6. This corresponds to the kinematical requirements illustrated by the inner region in Fig. 3, albeit with data from the 2-tag sample shown. It is optimized to maximize the expected sensitivity of the search. The acceptance times efficiency of the full selection, including this signal region requirement, is shown in Fig. 2 as "Signal Region". For SM non-resonant Higgs boson pair production, the full selection has an A × ε = 0.60%.
The final step of the Higgs boson pair resonant production search is to perform a fit to the four-jet mass m 4j in the signal region. The sensitivity of this fit is increased by improving the m 4j resolution in this region, using the constraint that the two dijet masses should equal the Higgs boson mass, i.e. m lead 2j = m subl 2j = m h . To this end, each dijet four-momentum is multiplied by a correction factor α dijet = m h /m dijet . This leads to an improvement of ∼ 30% in the signal m 4j resolution-with a significant reduction of the low-mass tails caused by energy loss-with little impact on the background.

Background estimation
After the 4-tag selection described above, about 95% of the remaining background in the signal region is expected to originate from multijet events, which are modelled using data. The remaining ∼ 5% of the background is tt events. The tt yield is determined from data, while the m 4j shape is taken from MC simulation. The Z+jets contribution is < 1% of the total background and is modelled using MC simulation. The background from all other sources-including processes featuring Higgs bosons-is negligible.

Multijet background
The multijet background is modelled using an independent data sample selected by the same trigger and selection requirements as described in Sect. 4.2, except for the b-tagging requirement: only one of the two selected dijets has to be formed from b-tagged jets, while the other dijet can be formed from jets that are not b-tagged. This "2-tag" selection yields a data sample comprising 485377 events, 98% of which are multijet events and the remaining 2% are tt. The predicted contamination by the signal is negligible.
This 2-tag sample is normalized to the 4-tag sample and its kinematical distributions are corrected for differences introduced by the additional b-tagging. These differences arise because the b-tagging efficiency as well as the charm-and light-jet rejection vary as a function of jet p T and η, the various multijet processes contribute in different fractions, and the fraction of events passed by each trigger path changes. The normalization and kinematic corrections are determined using a signal-free sideband region of the m lead 2j -m subl 2j plane, in dedicated samples collected without mass-dependent requirements, which increases the statistical precision of the kinematic corrections. The resulting background model is verified and the associated uncertainties are estimated using a control region. The sideband and control regions are shown in Fig. 3. The sideband region is defined as: while the control region is defined as the region between the signal and sideband regions. These definitions are chosen to be orthogonal to the signal region and to give approximately equal event yields in the sideband and control regions.
The normalization of the multijet background prediction is set by scaling the number of events in each region of the 2-tag sample by the following factor, µ QCD , calculated in the sideband region: where N To predict the distributions of the multijet background in each region, the predicted tt and Z+jets 2-tag distributions are first subtracted from the 2-tag data sample distribution before the distribution is scaled by µ QCD .
The correction for the kinematic differences between 2-tag and 4-tag samples is performed by reweighting events in the 2-tag sample. The weights are derived in the sideband region from linear fits to the ratio of the total background model to data for three kinematic distributions which are found to have the largest disagreement between 2-tag and 4-tag events: the leading dijet p T , the ∆R separation between the jets in the subleading dijet, and the ∆R separation between the two dijets. The reweighting is done using one-dimensional distributions, but is iterated so that correlations between the three variables are approximately accounted for. Three iterations are found to be sufficient. After the correction process, there is good agreement between the background model and sideband region data in kinematic variables that were not explicitly corrected. Systematic uncertainties in the normalization and shape of the multijet background model in the signal region are assessed using control-region data, as described in Sect. 4.4.

tt background
The tt background is described using a hybrid model: the normalization is derived from data in a tt control sample, while the shape is taken from MC simulation because there are too few events in the tt control sample to describe the shape precisely enough.
The tt control sample is formed from events which pass the 4-tag selection, except for the top veto, which is reversed: if either of the dijets fails the top veto, the event enters the tt control sample. This selection leads to a sample of 41 events within the signal region of the tt control sample, of which ∼ 50% are tt and ∼ 50% multijet. The multijet background component is estimated using the same methods as used for the nominal selection, but with a wider control region in order to reduce the sideband region tt fraction. After subtracting the multijet background, the tt control sample yield is then extrapolated to predict the tt yield in the nominal signal region, N tt , using the following equation: where N CS tt is the number of events in the signal region, after subtraction of the multijet background, within the tt control sample, and t is the efficiency for a selected dijet in a tt event to pass the top veto. This equation relies on the assumption that the t of each dijet in the event is uncorrelated, an assumption validated in tt MC simulation. The t is measured using an independent "semileptonic tt" data sample that has a high tt purity. Events in this sample are selected by requiring one dijet candidate to pass the nominal selection with p T > 150 GeV and one "leptonic top-quark" candidate. The leptonic top quark candidate is defined using a reconstructed muon and one b-tagged jet. This b-tagged jet is required to be distinct from jets in the dijet candidate, and the muon is required to have p T > 25 GeV, be isolated, and fall within a cone of radius 1.2 around the b-tagged jet. The leptonic top quark candidate is required to have p T > 180 GeV, where the leptonic top p T is defined as the vector sum of the b-jet p T , the muon p T , and the missing transverse momentum in the event. The latter is reconstructed from energy deposits in the calorimeter, including corrections for identified jets, electrons and muons. The tt veto efficiency is then measured as the fraction of the reconstructed dijet candidates which passed the tt veto, yielding Equation (3) gives a data-driven tt background prediction of 5.2 ± 2.6 events in the signal region after the full selection. The uncertainty is dominated by the statistical uncertainty in N CS tt , with a smaller contribution from the uncertainty in t .
Due to the limited number of events in the tt control sample, the m 4j shape of the tt background is modelled using MC simulation. However, despite the use of a large tt sample, very few events pass the full 4-tag selection. Therefore, the tt shape is derived from MC simulation using the "2-tag" selection, with a systematic uncertainty assigned to cover differences between the 2-tag and 4-tag m 4j distributions.

Systematic uncertainties
Two classes of systematic uncertainties are evaluated: those affecting the modelling of the signal and those affecting the background prediction.
The signal modelling uncertainties comprise: theoretical uncertainties in the acceptance, uncertainties in the jet energy scale (JES) and resolution (JER), and uncertainties in the b-tagging efficiency.
The theoretical uncertainties considered arise from initial-and final-state radiation modelling (ISR and FSR), PDF uncertainties and uncertainty in the LHC beam energy. These are estimated using particlelevel samples generated using the same generator configurations as the nominal signal samples but with appropriate variations, assessing the difference in yields after the full analysis selection. The ISR and FSR uncertainty is evaluated by varying the relevant parton shower parameters in Pythia 8. The PDF uncertainty is estimated by taking the maximum difference between the predictions when using MSTW2008nlo [69], NNPDF2.3 [70] and CTEQ6L1. The uncertainty due to the beam energy is determined by varying The JES systematic uncertainty is evaluated using 15 separate and orthogonal uncertainty components, which allow for the correct treatment of correlations across the kinematic bins [65]. The JER uncertainty is evaluated by smearing jet energies according to the systematic uncertainties of the resolution measurement performed with data [65]. For b-jets with p T < 300 GeV the uncertainty in the b-tagging efficiency is evaluated by propagating the systematic uncertainty in the measured tagging efficiency for b-jets [68], which ranges from 2% to 8% depending on b-jet p T and η. However, for the higher resonance masses considered in this analysis, there are a significant number of events containing at least one b-jet with p T > 300 GeV. The systematic uncertainties in the tagging efficiencies of these jets are derived from MC simulation and are larger, reaching 24% for p T > 800 GeV.
Systematic uncertainties in the normalization and shape of the multijet background model are assessed in the control region. Table 3 shows the estimated background yields in the control and sideband regions. The control region background prediction agrees with the observed data within the data statistical uncertainty of ±3.5%. To further test the robustness of the background estimation and the assumptions behind it, predictions are made with different sideband and control region definitions and different b-tagging requirements on the 2-tag sample. Redefinitions of the sideband and control region changed the kinematic composition of these regions, enhancing the sideband region in either high mass or low mass dijets and therefore altering the kinematic corrections that are applied. These variations induce a maximum change of ±6% in the estimated multijet yield and so the uncertainty is set to this value. Different b-tagging requirements on the b-tagged dijet in the 2-tag sample are used in order to change the composition of the sample and to vary the degree of b-tagging-related kinematic bias. No additional uncertainty is required.
The uncertainty in the description of the multijet m 4j distribution is determined by comparing the total background prediction to data in the control region, as shown in Fig. 4. Good agreement in the shape is observed and a straight line fit to the ratio of the distributions gives a slope consistent with zero. This fit, along with its uncertainties, is shown in the bottom panel of Fig. 4. The uncertainty in the multijet background shape is defined using the uncertainty in the fitted slope.
The uncertainty in the tt normalization is dominated by the statistical uncertainty of the yield in the tt control sample, with a subdominant contribution from the uncertainties in the top veto efficiency, t , leading to a total uncertainty of ±50%. The uncertainty in the MC-derived tt m 4j distribution is dominated by the uncertainty associated with using the shape after the 2-tag selection, rather than the 4-tag selection. This uncertainty is assessed by comparing the 2-tag to 4-tag MC predictions in the signal region. A straight line fit to the ratio of the normalized distributions is made and used to define a shape uncertainty in the same way as the multijet background. Due to the large statistical uncertainties of the 4-tag tt sample, the assigned shape uncertainty is large: ∼ 30% and ∼ 100% in the event yield at m 4j = 400 GeV and 1500 GeV, respectively. Table 4 shows the relative impact of the uncertainties in the event yields. Figure 5 shows the relative impact on the expected limit for σ pp → G * KK → hh → bbbb . The calculation of the expected limit is described in Sect. 6.2. It can be seen that for resonance masses below 700 GeV, the effect on the limit is dominated by the multijet description, with a small contribution from the tt background since both backgrounds are predominately at low mass. Above m X = 700 GeV, the uncertainty associated with the modelling of the b-tagging efficiency has the largest impact, since the larger high-p T uncertainties have an increasingly important effect with mass. Table 5 shows the predicted number of background events in the signal region, the number of events observed in the data, and the predicted yield for two potential signals. The numbers of predicted background events and observed events are in excellent agreement. Figure 6 shows a comparison of the predicted m 4j background distribution to that observed in the data. The predicted background agrees with the observed distributions, with no significant deviation.  Figure 5: The individual impact of the systematic uncertainties considered in the resolved analysis on the expected σ pp → G * KK → hh → bbbb 95% confidence level exclusion limit, as a function of graviton mass. The calculation of the expected limit is described in Sect. 6.2. Only the mass-dependent uncertainties are shown. The impact is the ratio of the limit calculated using all systematic uncertainties sources to the limit calculated using all systematic uncertainty sources excluding those under investigation.

Event reconstruction
The boosted analysis differs from the resolved analysis primarily by the use of large-radius jets designed to contain the decay products of a single h → bb decay. Those large-radius jets, denoted by the subscript J in the remainder of this paper, are reconstructed from locally calibrated topological clusters of calorimeter cells [63] using the anti−k t jet clustering algorithm with a radius parameter of R = 1.0. To minimize the Table 5: The number of predicted background events in the hh signal region for the resolved analysis, compared to the data. Uncertainties correspond to the total uncertainties in the predicted event yields. The yield for two potential signals, SM non-resonant Higgs boson pair production and a 500 GeV G * KK in the bulk RS model with k/M Pl = 1 are shown, with the uncertainties taken from Table 4 Events / 20 GeV  impact of energy depositions due to pile-up and the underlying event, the jets are trimmed [22]. This trimming algorithm reconstructs subjets within the large-R jet using the k t algorithm with radius parameter R sub = 0.3, then removes any subjet with p T less than 5% of the large-R jet p T . Further calibration of both the energy and mass scales is applied as a function of p T and η as determined from simulation and in situ measurements [65].
A novel aspect of the boosted technique presented here is the use of track-jets [23] to identify the presence of b-quarks inside the large-R jet. Such track-jets are built solely from tracks with p T > 0.5 GeV and |η| < 2.5, satisfying a set of hit and impact parameter criteria to make sure that those tracks are consistent with originating from the primary vertex, thereby reducing the effects of pile-up. Track jets are reconstructed using the anti−k t algorithm with R = 0.3. Flavour-tagging of those track-jets proceeds in the same way as for the R = 0.4 calorimeter jets used in the resolved analysis described in the previous section, except for a slightly looser requirement on the output of the MV1 neural network for a track-jet to be b-tagged. This leads to b-jets being b-tagged with an efficiency of 74%, with a charm-jet rejection factor of approximately 4 and a light-quark or gluon jet rejection factor of around 65, as determined in an MC sample of tt events. The b-tagging efficiency for track-jets in the MC simulation is adjusted based on studies of tt events in the data (Sect. 5.4).

Selection
The combined acceptance times efficiency at different stages of the event selection for the boosted analysis is shown in Fig. 7.
Events are required to contain at least two large-R jets with p T > 250 GeV and |η| < 2.0. To suppress contamination from tt events, the leading jet is additionally required to have p T > 350 GeV. This ensures that the top-quark decay products are typically fully contained in a single large-R jet with mass close to that of the top quark. These requirements are shown in Fig. 7 as "2 large-R jets". Only the leading and subleading large-R jets are retained for further consideration.
Track jets are associated with large-R jets using "ghost association" [64,72,73]. Each of the leading and subleading large-R jets must have at least two track-jets ghost-associated with their respective untrimmed parents, where the track-jets must have p T > 20 GeV and |η| < 2.5, as well as be consistent with originating from the primary vertex of the event (shown in Fig. 7 as "4 track-jets"). The drop in the A × value at masses above 1500 GeV is due to the decrease in the angular separation between the two track-jets from the h → bb decay to below ∆R = 0.3.
To suppress contamination from multijet events, the two selected large-R jets in the event are required to have a separation |∆η| < 1.7. This requirement (shown in Fig. 7 as "∆η") has only a small impact on the signal acceptance since high-mass resonances tend to produce jets that are more central than those from multijet background processes.
Selection of h → bb candidates proceeds by requiring that both the leading and subleading track-jets associated with each of the two large-R jets satisfy the b-tagging selection (shown in Fig. 7 as "4 b-tagged jets").
A final correction to the large-R jet four-momentum is applied to account for semileptonic b-hadron decays. If a muon passing the requirements outlined in Sect. 4.1 is ghost-associated with any of the selected b-tagged track-jets, its four-momentum is added to that of the large-R jet. If more than one muon is associated with a given track-jet, the muon closest to the track-jet axis is used. This correction improves the mass resolution for large-R jets in signal MC simulation, especially for the subleading jet.
The last requirement used to select signal event candidates is to require that the large-R jet mass is consistent with the Higgs boson mass. This requirement is defined identically to that for the resolved analysis in Eq. (1), except for the replacement of the small-R dijet mass with the large-R jet mass. The signal region is defined by the requirement X hh < 1.6. This final selection is shown in Fig. 7 as "Signal Region".

Background estimation
After the 4-tag selection described in Sect. 5.2, the background composition is similar to that of the resolved analysis. Multijet events comprise approximately 90% of the total background and are modelled entirely using data. The remaining ∼ 10% of the background is tt events. The tt yield is determined using data, while the m 2J shape is taken from MC simulation. The Z+jets contribution is < 1% of the total background and is modelled using MC simulation. The background from all other sources-including processes featuring Higgs bosons-is negligible.
Estimation and validation of the background described below relies on two data samples defined as follows.
• The "4-tag" sample corresponds to the set of events that satisfy all the requirements detailed in Sect. 5.2, except that the final requirement on the mass of the leading and subleading large-R jets is not applied.
• The "2+3-tag" sample is identical to the 4-tag sample except for having only two or three of the four track-jets b-tagged. For events with only two b-tagged track-jets, both are required to be associated with the same large-R jet.
Both samples are further subdivided based on the large-R jet masses, with each subsample having a sideband region to determine the multijet background kinematics and a control region to validate the background estimate. The control region is defined by requirements on the mass of the leading and subleading large-R jets of 95 < m lead J < 160 GeV and 85 < m subl J < 155 GeV respectively, while excluding the signal region defined by X hh < 1.6. The sideband region is complementary to the signal and control regions. Figure 8 illustrates the sideband and control regions with data from the 2+3-tag sample.
The choice of control region (and consequently sideband region) ensures that the multijet background can be estimated by extrapolation of event yields and kinematic properties from the 2+3-tag sample to the 4-tag sample with a normalization given by the relative event yields in the sideband region. Furthermore, the control region is chosen such that event kinematics in that region are representative of the kinematics in the signal region. are the numbers of events in the 4-tag tt and Z+jets MC samples. The parameter µ QCD corresponds to the ratio of multijet event yields in the 4-tag and 2+3-tag data samples, as defined in Eq. (2), except for including both 2-and 3-tag events in the denominator. Finally, the parameter α tt is a scale factor designed to adjust the tt event yield from the MC simulation. Both µ QCD and α tt are extracted from a binned likelihood fit to the leading large-R jet mass distribution obtained in the sideband region of the 4-tag data sample, as depicted in Fig. 9. Due to the large minimum p T requirement for the leading large-R jet, much of the tt contribution is concentrated at high mass close to the top-quark mass. In this fit, the multijet distribution is extracted from the 2+3-tag data sample, after subtraction of the tt and Z+jets contributions predicted by the MC simulation. The tt and Z+jets distributions in the sideband region of the 4-tag data sample are taken from the MC simulation, but the Z+jets contribution is very small and its distribution is added to the multijet distribution for the fit. The resulting fit values are µ QCD = 0.0071 ± 0.0007 and α tt = 1.44 ± 0.50 with a correlation coefficient of −0.67 between these two parameters. yield and kinematics is further validated by testing in the control region of the 4-tag data sample. Good agreement is observed between the data and the predicted background in various kinematical distributions for leading and subleading large-R jets, as well as for the dijet mass, as shown in Fig. 10b. The shapes of the tt kinematical distributions in the signal region are determined from the MC simulation requiring only three b-tagged track-jets instead of four due to the limited MC sample size. The number of tt events is then normalized to the expected yield in the 4-tag sample times α tt . It was checked that this does not introduce a bias discernible with the statistical precision of the tt MC sample.

Systematic uncertainties
Systematic uncertainties can be grouped into two classes: those affecting modelling of the signal as extracted from simulation and those arising from the background estimate.
The signal modelling is affected by two main sources of experimental uncertainty. One is related to large-R jets and the other is related to the efficiency for b-tagging track-jets. For large-R jets, the following uncertainties are accounted for: jet energy scale and resolution, as well as jet mass scale (JMS) and resolution (JMR). In the kinematic region relevant to this analysis, the JES uncertainty is below 2% and that for JMS is ∼ 2-5%. The ES uncertainty is derived with the γ-jet balance method for p T < 800 GeV and the track-jets double-ratio method for p T > 800 GeV, as described in Ref. [72]. The latter method is also used for the derivation of JMS uncertainties in the full p T range. An uncertainty of 20% is applied to account for modelling of the jet energy and mass resolutions. The magnitude of this resolution uncertainty is estimated from studies of boosted W boson production performed using the 2012 data. Jet energy and mass uncertainties are treated as uncorrelated in the statistical analysis. The uncertainty in modelling the b-tagging efficiency for the track-jets used in this analysis is applied to the signal and Z+jets MC samples. It is extracted as a function of p T using the tag-and-probe method on a sample of dilepton events from semileptonic tt decays. The resulting uncertainty varies between 2% and 7%, with the largest value obtained for track-jets with p T > 100 GeV. This uncertainty includes the following effects: statistical precision of the calibration data sample, choice of event generator and parton shower for the simulated tt sample, initial-and final-state radiation, and flavour composition. For p T > 250 GeV, the uncertainties must be evaluated using MC simulation due to the small number of data events. Consequently, the uncertainties increase, reaching 14% for p T > 600 GeV. Studies in a tt data sample with a single-lepton+jets final state indicate that the presence of nearby jets does not have a measurable effect on the b-tagging efficiency and thus no additional uncertainty is required for nearby jets.
In addition to purely statistical sources of uncertainty, the background estimate is sensitive to the following other sources. The multijet background normalization is validated with the observed yield in the control region and the statistical uncertainty of this test is included as a systematic uncertainty. The shape of the tt background used in the fit shown in Fig. 9 is varied by extracting the shape from MC samples with zero, one, two or three b-tagged track-jets. Similarly, the uncertainty in the shape of other tt kinematical distributions is extracted from those samples. The uncertainty in the shape of the multijet background extracted from the sideband region of the 2+3-tag sample is constrained by the level of agreement between the background prediction and the observed data in the control region following the procedure described in Sect. 4.4. Good agreement is observed between the data and the predicted background in both the sideband and control regions of the 4-tag sample as shown in Table 6.
Systematic uncertainties in both the background and signal event yields are summarized in Table 7. A 2.8% luminosity uncertainty is applied to the Z+jets background and to the signal samples. The JER/JES/JMR/JMS uncertainties are applied to signal, tt and Z+jets samples. The track-jet b-tag uncertainty is applied only to the signal samples as the normalization and shape differences in the tt sample are accounted for through other sources of systematic uncertainty.
Theoretical uncertainties affecting the signal acceptance are also considered, as described in Sect. 4.4. These sources do not have significant dependence on the assumed resonance mass and the largest contribution is found to be due to the ISR modelling.
The uncertainty in the multijet event yield is derived from the difference between the predicted and observed multijet yields in the control region. This source of uncertainty is dominated by the statistical  uncertainty in that region. The tt entry in Table 7 accounts for the shape uncertainty in the simulated tt leading-jet mass distribution in the sideband region used to fit for µ QCD and α tt . This uncertainty is determined by comparing the shape of the 4-tag and 2-tag tt distributions. Finally, the "Bkgd stat" accounts for the statistical uncertainties in the extraction of µ QCD and α tt . Uncertainties in the m 2J shape of the multijet and tt backgrounds are not listed in Table 7, as they do not affect the event yield, but are accounted for in the statistical analysis. Figure 11 presents the impact of each source of systematic uncertainty on the expected cross-section limit for the production of G * KK as a function of resonance mass with the choice k/M Pl = 1. These values are obtained following the statistical analysis described below while neglecting each source of uncertainty in turn. The multijet background uncertainty dominates for resonance masses below 1000 GeV, with b-tagging, large-R jet mass and the number of sideband data events for the background estimate becoming  Figure 11: The individual relative impact of the systematic uncertainties considered in the boosted analysis on the expected σ pp → G * KK → hh → bbbb 95% confidence level exclusion limit, as a function of graviton mass. The calculation of the expected limit is described in Sect. 6.2. Only the mass-dependent uncertainties are shown. The impact is the ratio of the limit calculated using all systematic uncertainties sources to the limit calculated using all systematic uncertainty sources excluding those under investigation.

Results of the boosted analysis
A total of 34 events is observed in the data whereas the background expectation is estimated to be 25.7±4.2, see Table 8 for a breakdown of the various sources of background. The significance of this excess of events in the data is evaluated below.
The dijet mass distribution in the signal region is shown in Fig. 12. For this distribution and the statistical analysis, the estimated background prediction from multijet (tt) events is fit to an exponential function at masses above 900 (800) GeV and the associated uncertainty is propagated to the statistical analysis.

Results
The results from the analyses in Sects. 4.5 and 5.5 are interpreted separately using the statistical procedure described in Ref. [1] and references therein. Hypothesized values of µ, the global signal strength factor, are tested with a test statistic based on the profile likelihood ratio [74,75]. In the profile likelihoods, the maximum likelihood values are obtained with the systematic uncertainties treated as independent, Gaussian or log-normal constraint terms. The statistical analysis described below is performed using data from the signal region solely. In the case of the search for non-resonant hh production, only the number of events passing the final selection is used whereas the m 4j or m 2J distributions are used in the case of the search for hh resonances.

Background-only hypothesis tests
Tests of the background-only hypothesis (µ = 0) are carried out to determine if there are any statistically significant local excesses in the data. The significance of an excess is quantified using the local p 0 , the probability that the background could produce a fluctuation greater than or equal to the excess observed in data. A global p 0 is also calculated for the most significant discrepancy, using background-only pseudo-experiments to derive a correction for the look-elsewhere effect across the mass range tested.
In the case of the resolved analysis, the largest deviation from the background-only hypothesis is found to be 2.1 σ for a pp → H → hh → bbbb signal with fixed Γ H = 1 GeV at m 4j = 1200 GeV. This corresponds to a global significance of 0.42 σ. The significance of any deviation for a G * KK signal with k/M Pl = 1 is very similar, albeit with slightly smaller local discrepancies as a result of the larger signal m 4j width.
In the case of the boosted analysis, the largest local deviation corresponds to the data excess at m 2J ∼ 900 GeV apparent in Fig. 12, with a local significance of 2.6 σ for pp → G * KK → hh → bbbb with k/M Pl = 1. The global significance of this deviation corresponds to 0.78 σ.
Given these low significance values, the results of both analyses are consistent with the background-only hypothesis. Of the 117 events selected in the data by either the resolved or boosted analysis, only four events are common to both.

Exclusion limits
The data are used to set upper limits on the cross-sections for the different benchmark signal processes. Exclusion limits are based on the value of the statistic CL s [76], with a value of µ regarded as excluded at 95% confidence level (CL) when CL s is less than 5%.
The non-resonant search is performed using the resolved analysis, since it has better sensitivity than the boosted analysis. Using the SM hh non-resonant production as the signal model, the observed 95% CL upper limit is σ(pp → hh → bbbb) = 202 fb. This can be compared to the inclusive SM prediction (as defined in Sect. 3) of σ(pp → hh → bbbb) = 3.6 ± 0.5 fb.
For the resonant Higgs boson pair production search, the resolved and boosted analyses offer their best sensitivity in complementary resonance mass regions. Figure 13 shows the expected and observed crosssection upper limits from each analysis for pp → G * KK → hh → bbbb within the bulk RS model with k/M Pl = 1. The resolved analysis can be seen to give a more stringent expected exclusion limit for resonance masses up to 1100 GeV, while the boosted analysis offers better sensitivity beyond that mass. This motivates a simple combination of the separate exclusion limits from the resolved and boosted analyses. For each of the signal models, the limit for each mass point is taken from the analysis which offers the most stringent expected exclusion.   Figure 14 shows the combined 95% CL upper limits for three signal models: pp → G * KK → hh → bbbb within the bulk RS model with k/M Pl = 1 and 2, and the pp → H → hh → bbbb with a fixed Γ H = 1 GeV. The most stringent limits of σ pp → X → hh → bbbb ∼ 3 fb are set in the range 900 < m X < 1600 GeV,    where there is little expected background and either the resolved or boosted analysis provides good signal acceptance. The excluded mass ranges for the bulk RS KK graviton are shown in Table 9.
The excluded mass range for the 2HDM is parameter dependent, principally because the production cross-section varies, but also because the exclusion limit depends on the parameter-dependent H boson width, Γ H . The theoretical cross-section used to determine the 95% CL excluded regions is the sum of the cross-sections of gluon-fusion production, vector-boson-fusion production and b-associated production.
The effects of Γ H are accounted for by creating m H distributions with a range of widths, 0 < Γ H /m H ≤ 0.5, for each m H considered. These distributions are based on parameterizations which include resolution and acceptance effects combined with a Breit-Wigner line-shape. A grid of limits are calculated with each of these mass distributions. Then, for each point in m H , cos (β − α), and tan β space, the cross-section limit is determined by interpolating between the appropriate limits, based on the Γ H given by the model for that point. For the widest signals considered, the exclusion limits worsen by up to a factor of three.

Conclusions
Two searches for Higgs boson pair production with the ATLAS detector at the LHC using the bbbb final state have been presented: one reconstructs Higgs boson candidates from pairs of nearby anti−k t b-tagged jets with R = 0.4; the other reconstructs Higgs boson candidates using trimmed anti−k t jets with R = 1.0 matched to two b-tagged anti−k t track-jets with R = 0.3. Thanks to the high expected h → bb branching ratio and the large background rejection factors offered by the boosted dijet topology, the sensitivity for Higgs boson pair production is high, with a mass reach spanning the range between 500 and 2000 GeV. There is no evidence for any signal in 19.5 fb −1 of pp collision data with √ s = 8 TeV. The largest deviation from the background-only hypothesis has a global significance of only 0.78 σ. The observed 95% CL upper limit on σ pp → X → hh → bbbb is 3.2 (2.3) fb for narrow resonances with a mass of 1.0 (1.5) TeV.
Constraints are placed on several benchmark models. For the bulk RS model with k/M Pl = 1, KK gravitons in the mass range 500 ≤ m G * KK ≤ 720 GeV are excluded at the 95% CL. For non-resonant signals, using Standard Model hh non-resonant production as the benchmark, the observed 95% CL upper limit on σ(pp → hh → bbbb) is 202 fb, in good agreement with the expected exclusion. This is to be compared to a SM prediction of 3.6 ± 0.5 fb.
The ATLAS Collaboration