Identification of boosted Higgs bosons decaying into b-quark pairs with the ATLAS detector at 13 TeV

This paper describes a study of techniques for identifying Higgs bosons at high transverse momenta decaying into bottom-quark pairs, H → bb̄, for proton–proton collision data collected by the ATLAS detector at the Large Hadron Collider at a centre-of-mass energy √s = 13 TeV. These decays are reconstructed from calorimeter jets found with the anti-k t R = 1.0 jet algorithm. To tag Higgs bosons, a combination of requirements is used: b-tagging of R = 0.2 track-jets matched to the large-R calorimeter jet, and requirements on the jet mass and other jet substructure variables. The Higgs boson tagging efficiency and corresponding multijet and hadronic top-quark background rejections are evaluated using Monte Carlo simulation. Several benchmark tagging selections are defined for different signal efficiency targets. The modelling of the relevant input distributions used to tag Higgs bosons is studied in 36 fb⁻¹ of data collected in 2015 and 2016 using g → bb̄ and Z(→ bb̄)γ event selections. Both processes are found to be well modelled within the statistical and systematic uncertainties.


Introduction
The Large Hadron Collider (LHC) centre-of-mass energy of 13 TeV greatly extends the sensitivity of the ATLAS experiment [1] to heavy new particles. In several new physics scenarios [2][3][4], these heavy new particles may have decay chains including the Higgs boson [5,6]. The large mass-splitting between these resonances and their decay products results in a high-momentum Higgs boson, causing its decay products to be collimated. The decay of the Higgs boson into a bb pair has the largest branching fraction within the Standard Model (SM), and thus is a major decay mode to use when searching for resonances involving high-momentum Higgs bosons (see e.g. Ref. [7]), as well as for measuring the SM Higgs boson properties. The signature of a boosted Higgs boson decaying into a bb pair is a collimated flow of particles, in this document called a 'Higgs-jet', having an energy and angular distribution of the jet constituents consistent with a two-body decay and containing two b-hadrons. The techniques described in this paper to identify Higgs bosons decaying into bottom-quark pairs have been used successfully in several analyses [8-10] of 13 TeV proton-proton collision data recorded by ATLAS.
In order to identify, or tag, boosted Higgs bosons it is paramount to understand the details of b-hadron identification and the internal structure of jets, or jet substructure, in such an environment [11]. The approach to tagging presented in this paper is built on studies from LHC runs at √ s = 7 and 8 TeV, including extensive studies of jet reconstruction and grooming algorithms [12], detailed investigations of track-jet-based b-tagging in boosted topologies [13], and the combination of substructure and b-tagging techniques applied in the Higgs boson pair search in the four-b-quark final state [14] and for discrimination of Z bosons from W bosons [15]. Gluon splitting into b-quark pairs at small opening angles has been studied at √ s = 13 TeV by ATLAS [16]. The identification of Higgs bosons at high transverse momenta through the use of jet substructure has also been studied by the CMS Collaboration and their techniques are described in Refs. [17,18].
The Higgs boson tagging efficiency and background rejection for the two most common background processes, the multijet and hadronic top-quark backgrounds, are evaluated using Monte Carlo simulation. In addition, two processes with a topology similar to the signal, Z → bb decays and g → bb splitting, are used to validate Higgs-jet tagging techniques in data at √ s = 13 TeV. In particular the modelling of relevant Higgs-jet properties in Monte Carlo simulation is compared with data. The g → bb process allows the modelling of one of the main backgrounds to be validated. The Z → bb process is a colour-singlet resonance with a mass close to the Higgs boson mass and thus very similar to the H → bb signal.
After a brief description of the ATLAS detector in Section 2 and of the data and simulated samples in Section 3, the object reconstruction, selection and labelling is discussed in Section 4. Section 5 describes relevant systematic uncertainties. The Higgs-jet tagging algorithm and its performance are presented in Section 6. Sections 7 and 8 discuss a comparison between relevant distributions in data control samples dominated by g → bb and Z(→ bb)γ and the corresponding simulated events, respectively. Finally, conclusions are presented in Section 9.
Several Monte Carlo (MC) simulated event samples were used for the optimisation of the Higgs boson tagger, estimation of its performance, and the comparisons between data and simulation.
Simulated events with a broad transverse momentum (p T ) spectrum of Higgs bosons were generated as decay products of Randall-Sundrum gravitons G* in a benchmark model with a warped extra dimension [2], G* → HH → bbbb, over a range of graviton masses between 300 and 6000 GeV. The events were simulated using the MadGraph5_aMC@NLO generator [22]. Parton showering, hadronisation and the underlying event were simulated with Pythia 8 [23] using the leading-order (LO) NNPDF2.3 parton distribution function (PDF) set [24] and the ATLAS A14 [25] set of tuned parameters.
Events containing the Z(→ bb)γ and γ + jets processes were simulated with the Sherpa v2.1.1 [26][27][28][29] LO generator. The matrix elements were configured to allow up to three partons in the final state in addition to the Z boson or the photon. The Z boson was produced on-shell and required to decay hadronically. The CT10 next-to-leading-order (NLO) PDF set [30,31] was used. The ttγ MC events were modelled by MadGraph interfaced with Pythia 8 for showering, hadronisation and the underlying event with the LO NNPDF2.3 PDF set and the A14 underlying-event tune. Simulated events of hadronically decaying Wγ were generated using Sherpa v2.1.1, with the same configuration as the one used for the Zγ sample.
To cover a large range of top-quark transverse momenta, hadronically decaying top quarks were generated using Z′ bosons decaying into tt pairs over a range of Z′ boson masses between 400 and 5000 GeV. These samples were simulated using Pythia 8 with the LO NNPDF2.3 PDF set and the A14 underlying-event tune.
Finally, inclusive multijet events were generated using Pythia 8 with the LO NNPDF2.3 PDF set and the A14 underlying-event tune, and using Herwig++ [32] with the CTEQ [33] PDF set and the UEEE [34] underlying-event tune. To increase the number of simulated events with semimuonically decaying hadrons for the g → bb analysis, samples of multijet events filtered to have at least one muon with p T above 3 GeV and |η| < 2.8 were produced with Pythia 8 and Herwig++ using the same PDF sets and underlying-event tunes as the unfiltered multijet samples.
In all cases except events generated using Sherpa, EvtGen [35] was used to model the decays of b- and c-hadrons. All simulated event samples included the effect of multiple pp interactions in the same and neighbouring bunch crossings ('pile-up') by overlaying simulated minimum-bias events on each simulated hard-scatter event. The minimum-bias events were simulated with the single-, double- and non-diffractive pp processes of Pythia 8 using the A2 tune [36] and the MSTW2008 LO PDF [37][38][39]. The detector response to the generated events was simulated with Geant4 [40,41].
procedure, and matching to jets with irregular boundaries can be achieved in a way that is less ambiguous than a simple geometric matching.
Jet labelling: The performance of the tagger is evaluated on the basis of labelled large-R jets. Higgs-jets are defined as calorimeter-based large-R jets with a Higgs boson and the corresponding two b-hadrons from the Higgs boson decay found in the MC event record within ∆R = 1 of the large-R jet. Only the Higgs boson with the highest p T in the event is considered and it is required to have p T > 250 GeV and |η| < 2.0. The b-hadron must have p T above 5 GeV and |η| < 2.5. Configurations where more than one Higgs boson is found within the large-R jet are excluded. Top-jets are defined as large-R jets in which exactly one top quark is found in the MC event record within ∆R = 1 of the large-R jet.
Jet flavour labelling: The labelling of the flavour of the track-jets in simulation is done by geometrically matching the jet with truth hadrons. If a weakly decaying b-hadron with p T above 5 GeV is found within ∆R = 0.2 of the track-jet's direction, the track-jet is labelled as a b-jet. In the case that the b-hadron could match more than one track-jet, only the closest track-jet is labelled as a b-jet. If no b-hadron is found, the procedure is repeated for weakly decaying c-hadrons to label c-jets. If no c-hadron is found, the procedure is repeated for τ-leptons to label τ-jets. A jet for which no such matching can be made is labelled as a light-flavour jet.
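The labelling hierarchy described above can be sketched in a few lines. This is a minimal illustration, not the ATLAS implementation: the function names and the tuple layout are hypothetical, and applying the same 5 GeV threshold and closest-jet rule to c-hadrons and τ-leptons is an assumption, since the text states them only for b-hadrons.

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance sqrt(d_eta^2 + d_phi^2), with d_phi wrapped into [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)

def label_track_jets(jets, particles, max_dr=0.2, min_pt=5.0):
    """Label each track-jet as 'b', 'c', 'tau' or 'light'.

    jets:      list of (eta, phi) tuples (hypothetical layout)
    particles: list of (kind, pt, eta, phi), kind in {'b', 'c', 'tau'}
    """
    labels = ["light"] * len(jets)
    # Priority order: b-hadrons first, then c-hadrons, then tau-leptons.
    for kind in ("b", "c", "tau"):
        for p_kind, pt, eta, phi in particles:
            if p_kind != kind or pt < min_pt:
                continue
            # Find the closest track-jet within max_dr of this particle.
            best, best_dr = None, max_dr
            for i, (jet_eta, jet_phi) in enumerate(jets):
                dr = delta_r(eta, phi, jet_eta, jet_phi)
                if dr < best_dr:
                    best, best_dr = i, dr
            # Only the closest jet is labelled, and a higher-priority label is kept.
            if best is not None and labels[best] == "light":
                labels[best] = kind
    return labels

jets = [(0.0, 0.0), (0.1, 0.1), (1.0, 1.0), (2.0, -3.0)]
particles = [("b", 20.0, 0.02, 0.0), ("c", 10.0, 0.0, 0.01),
             ("c", 10.0, 1.05, 1.0), ("b", 3.0, 2.0, -3.0)]
labels = label_track_jets(jets, particles)
```

In this toy event the second track-jet stays light-flavour even though a b-hadron lies within ∆R = 0.2 of it, because the b-hadron is closer to the first track-jet.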
b-jet identification: Track-jets containing b-hadrons are identified using a multivariate MV2c10 algorithm [51,57], which exploits the information about the jet kinematics, the impact parameters of tracks within jets, and the presence of displaced vertices. The training is performed on jets from tt events with b-jets as signal, and a mix of approximately 93% light-flavour jets and 7% c-jets as background. A particular b-tagging requirement on MV2c10 results in a given efficiency, known as an efficiency working point (WP). The efficiency WP is calculated from the inclusive p T and η spectra of jets from an inclusive tt sample. For example a WP with 70% efficiency corresponds to a factor of 120 in the light-quark/gluon-track-jet rejection and a factor of seven in the c-track-jet rejection. Different WPs (60%, 70%, 77% and 85%) are studied in the analyses presented in this paper and jets satisfying a particular MV2c10 criterion WP are referred to as 'b-tagged jets'.
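The relation between a working point and a discriminant threshold can be illustrated as follows. This is a sketch under stated assumptions: the scores are uniformly distributed toy numbers rather than MV2c10 outputs, and the function names are hypothetical.

```python
import numpy as np

def wp_threshold(b_jet_scores, target_efficiency):
    """Discriminant cut above which a fraction `target_efficiency` of b-jets lies."""
    return float(np.quantile(b_jet_scores, 1.0 - target_efficiency))

def b_tag_efficiency(scores, threshold):
    """Fraction of jets whose discriminant exceeds the threshold."""
    return float(np.mean(np.asarray(scores) > threshold))

# Toy stand-in for the b-jet discriminant distribution (not real MV2c10 values).
rng = np.random.default_rng(0)
toy_scores = rng.uniform(-1.0, 1.0, size=10000)

cut70 = wp_threshold(toy_scores, 0.70)
eff70 = b_tag_efficiency(toy_scores, cut70)
```

Because the threshold is derived from an inclusive spectrum, the efficiency obtained in a particular p T or η region can differ from the nominal working-point value.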

Large-R jet mass:
To overcome the limited angular resolution for the energy deposits used to reconstruct the calorimeter-based jet mass (m calo ), an independent jet mass estimate using tracking information is developed, the 'track-assisted jet mass', m TA [48]. A weighted combination of calorimeter-based and track-assisted jet masses, m comb [48], is used in the analysis. The m comb resolution is very similar to the m calo resolution at Higgs-jet p T below 700 GeV and improves with increasing p T . Muons from semileptonic b-hadron decays do not leave significant energy deposits in the calorimeter, so they are considered separately in the calculation of the m comb observable. The resulting neutrinos are not taken into account because they are not measured by the detector directly. The four-momentum of the closest muon candidate within ∆R = 0.2 of the b-tagged track-jet is added to the four-momentum of the large-R-jet after subtraction of the muon energy loss in the calorimeter. Only the calorimeter-based component of the m comb observable is corrected [58]. The resolution of the muon-corrected Higgs-jet mass, m corr , is improved by about 10% at transverse momenta below 500 GeV, while the improvement is not as pronounced at higher p T , as was shown in Ref. [59].

Systematic uncertainties
Large-R jets: The uncertainties in the jet energy, mass, and substructure scales are evaluated by comparing the ratio of calorimeter-based to track-based measurements in dijet data and simulation [48]. The sources of uncertainty in these measurements are treated as fully correlated among p T , mass, and substructure scales. The resolution uncertainty of the large-R jet observables is evaluated in measurements documented in Ref. [48] and is assessed by applying an additional smearing to these observables. The jet energy resolution uncertainty is estimated by degrading the nominal resolution by an absolute 2%. Similarly, the jet mass resolution is degraded by a relative 20% to estimate the jet mass resolution uncertainty. The parton-shower-related uncertainty for the g → bb analysis is estimated by comparing the nominal Pythia 8 multijet sample with Herwig++ samples.
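One common way to degrade a resolution by additional smearing is to add an independent Gaussian term in quadrature with the nominal resolution. The sketch below computes the width of that extra term. It assumes Gaussian resolutions, the nominal values are illustrative, and it is not necessarily the procedure of Ref. [48].

```python
import math

def extra_smear_relative(nominal_sigma, relative_increase):
    """Width of the extra Gaussian term that raises sigma by a relative amount."""
    target = nominal_sigma * (1.0 + relative_increase)
    return math.sqrt(target**2 - nominal_sigma**2)

def extra_smear_absolute(nominal_sigma, absolute_increase):
    """Width of the extra Gaussian term that raises sigma by an absolute amount."""
    target = nominal_sigma + absolute_increase
    return math.sqrt(target**2 - nominal_sigma**2)

# Illustrative numbers: a 10 GeV mass resolution degraded by a relative 20%,
# and a 10% fractional energy resolution degraded by an absolute 2%.
mass_term = extra_smear_relative(10.0, 0.20)
energy_term = extra_smear_absolute(0.10, 0.02)
```

With these inputs the mass observable would be smeared by about 6.6 GeV, which combined in quadrature with the nominal 10 GeV gives the degraded 12 GeV resolution.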

Flavour tagging:
The flavour-tagging efficiency and its uncertainty for b- and c-jets are estimated in tt events, while the light-flavour-jet misidentification rate and its uncertainty are determined using dijet events [60][61][62]. Correction factors are applied to the simulated event samples to compensate for differences between data and simulation in the b-tagging efficiency for track-jets with p T < 250 GeV. Correction factors and uncertainties for c-jets and light-flavour jets are derived for calorimeter-based jets and extrapolated to track-jets using MC simulation. An additional term is included to extrapolate the measured uncertainties to p T above 250 GeV. This term is estimated from simulated events by varying the quantities affecting the flavour-tagging performance such as the impact parameter resolution, percentage of poorly measured tracks, description of the detector material, and track multiplicity per jet. The total uncertainties are 1-10%, 15-50%, and 50-100% for b-jets, c-jets, and light-flavour jets respectively.

Muon:
The uncertainties in the muon momentum scale and resolution are derived from data events with dimuon decays of J/ψ and Z bosons. In total, there are three independent components: one corresponding to the uncertainty in the inner detector track p T resolution, one corresponding to the uncertainty in the muon spectrometer p T resolution, and one corresponding to the momentum scale uncertainty [52].

Photon:
The uncertainties in the reconstruction, identification, and isolation efficiency for photons are determined from data samples of Z → ℓℓγ, Z → ee, and inclusive photon events [53]. Uncertainties in the electromagnetic shower energy scale and resolution are taken into account as well [54].
Background modelling uncertainties for ttγ, γ+jets and W(→ qq)γ: These correspond to the main backgrounds in the Z(→ bb)γ studies presented in Section 8. The background modelling uncertainty for the γ+jets sample was estimated with the alternative MC generator, Pythia 8, using the LO NNPDF2.3 PDF set and the A14 underlying-event tune. The alternative sample includes LO photon plus jet events from the hard process and photon bremsstrahlung in dijet events.
In the case of the W(→ qq)γ background, the nominal samples were compared with samples produced using the MadGraph5_aMC@NLO generator interfaced with Pythia 8. For the ttγ background three different sources of modelling uncertainty were considered: uncertainty due to the parton shower and hadronisation, estimated by comparing the nominal samples produced using MadGraph interfaced with Pythia 8 with samples from MadGraph interfaced with Herwig 7 [32,63]; uncertainty due to different initial- and final-state radiation conditions from Pythia 8 tunes with high or low QCD radiation activity; and uncertainty due to the choice of renormalisation and factorisation scales.
Uncertainties related to the photons and the γ+jets, W(→ qq)γ, and ttγ background modelling are applied only in the Z(→ bb)γ analysis.

Higgs-jet tagger
The Higgs-jet tagger algorithm consists of several reconstruction steps. First, the Higgs boson candidate is reconstructed as a large-R jet. Second, the b-tagging requirement is applied to track-jets associated with the large-R jet in order to select candidates corresponding to H → bb decays. Third, the b-tagged large-R jet mass can be required to be around the SM Higgs boson mass of 125 GeV. Finally, a requirement on other large-R jet substructure variables can be applied depending on the Higgs-jet tagger working point.
The signal acceptance for the first reconstruction step, where the Higgs boson candidate is reconstructed as a large-R jet, depends strongly on its transverse momentum. The angular separation between the Higgs boson decay products can be approximated as ∆R ≈ 2m H /p T . Therefore, in most cases the Higgs boson decay products fall within a single large-R jet with a radius parameter of R = 1.0 if the Higgs boson p T is at least 250 GeV. The signal acceptance shown in Figure 1 is determined as the fraction of Higgs bosons in simulation which are reconstructed and labelled as a Higgs-jet following the definition in Section 4. Only Higgs bosons with p T > 250 GeV and |η| < 2.0 whose decay b-hadrons have p T > 5 GeV and |η| < 2.5 are considered. The Higgs boson acceptance is around 50% at 250 GeV, where the jet p T resolution has a significant impact as well, and increases to 95% for transverse momenta above 750 GeV.
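The approximation ∆R ≈ 2m H /p T can be evaluated directly to see where the 250 GeV threshold comes from. The short sketch below uses m H = 125 GeV as quoted in the text; the function names are hypothetical.

```python
M_HIGGS = 125.0  # GeV, SM Higgs boson mass as quoted in the text

def opening_angle(pt, mass=M_HIGGS):
    """Approximate angular separation of the two decay products, 2m/pT."""
    return 2.0 * mass / pt

def min_pt_for_containment(radius, mass=M_HIGGS):
    """Transverse momentum above which the separation drops below `radius`."""
    return 2.0 * mass / radius

dr_at_250 = opening_angle(250.0)
dr_at_1000 = opening_angle(1000.0)
pt_threshold = min_pt_for_containment(1.0)
```

At 1 TeV the estimated separation of 0.25 is comparable to the R = 0.2 track-jet radius, which illustrates why the two b-jets start to merge at high transverse momenta.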
The Higgs-jet tagging efficiency is defined as the number of Higgs-jets passing a given selection requirement divided by the total number of Higgs-jets. The background rejection is defined as the inverse of the efficiency for a background jet to pass the given selection requirement.
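These two definitions translate into a few lines of code. The sketch below allows optional per-jet weights, since the samples are reweighted in the performance studies; the function names and the toy inputs are hypothetical.

```python
import numpy as np

def efficiency(passed, weights=None):
    """Weighted fraction of jets that pass the selection."""
    passed = np.asarray(passed, dtype=bool)
    w = np.ones(passed.shape) if weights is None else np.asarray(weights, dtype=float)
    return float(w[passed].sum() / w.sum())

def rejection(passed, weights=None):
    """Inverse of the efficiency for background jets to pass the selection."""
    eff = efficiency(passed, weights)
    return float("inf") if eff == 0.0 else 1.0 / eff

signal_eff = efficiency([True, False, True, True])
weighted_eff = efficiency([True, False], weights=[3.0, 1.0])
background_rej = rejection([True] + [False] * 249)
```

A rejection of 250 therefore means that one background jet in 250 survives the selection.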

Two-step sample reweighting
To construct the signal sample, all graviton samples are combined. To allow a valid comparison between the signal efficiency and the background rejection, the large-R jet p T spectrum of the combined graviton sample is reweighted to the reconstructed multijet p T spectrum for the Higgs boson tagger performance studies in a two-step procedure. The same two-step reweighting procedure is also applied to the Z′ → tt background sample. The multijet spectrum is chosen as a reference because its smoothly falling p T spectrum is representative of many analyses. During the first step of the reweighting the highest-p T truth Higgs-jet is used, whereas for the second reweighting step the highest-p T reconstructed Higgs-jet is used. The reconstructed Higgs-jet and the truth Higgs-jet must both contain the highest-p T Higgs boson to mitigate effects from initial-state radiation (ISR).
In the first step, the p T spectrum of the truth Higgs-jet in the combined signal sample is reweighted to the p T spectrum of the reconstructed large-R jet in the multijet sample. In the second step, the reconstructed Higgs-jet p T spectrum is reweighted to the reconstructed large-R jet p T spectrum in the multijet sample. A one-step reweighting using the reconstructed Higgs-jet p T spectrum results in large weights for jets with p T much larger or smaller than half of the graviton mass. Furthermore, the reconstructed Higgs-jet can contain additional energy which does not stem from the Higgs boson decay, such as ISR, energy missing due to neutrinos, 'out-of-cone' effects, or trimming. The frequency of these effects depends on the Higgs boson boost, i.e. on the graviton mass, introducing a dependence on the choice of simulated graviton masses used in the combined signal sample. The second step is needed to account for a residual difference between reconstructed and truth Higgs-jet transverse momenta.
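The two-step procedure can be sketched with simple histogram ratios. This is an illustration under assumptions: the binning, the toy p T values and the function names are hypothetical, and the actual analysis may use a different binning or smoothing.

```python
import numpy as np

def ratio_weights(values, target_values, bins, weights=None):
    """Per-jet weights that reshape the weighted histogram of `values`
    into the histogram of `target_values`."""
    values = np.asarray(values, dtype=float)
    w = np.ones(len(values)) if weights is None else np.asarray(weights, dtype=float)
    source_hist, _ = np.histogram(values, bins=bins, weights=w)
    target_hist, _ = np.histogram(target_values, bins=bins)
    # Bin-by-bin ratio; empty source bins get a ratio of zero.
    ratio = np.divide(target_hist, source_hist,
                      out=np.zeros(len(source_hist)), where=source_hist > 0)
    idx = np.clip(np.digitize(values, bins) - 1, 0, len(ratio) - 1)
    inside = (values >= bins[0]) & (values <= bins[-1])
    return np.where(inside, w * ratio[idx], 0.0)

def two_step_weights(truth_pt, reco_pt, multijet_reco_pt, bins):
    """Step 1 uses the truth Higgs-jet pT, step 2 the reconstructed Higgs-jet pT."""
    step1 = ratio_weights(truth_pt, multijet_reco_pt, bins)
    return ratio_weights(reco_pt, multijet_reco_pt, bins, weights=step1)

bins = np.array([250.0, 500.0, 750.0, 1000.0])
truth_pt = [300.0, 300.0, 600.0, 600.0, 600.0, 900.0]
reco_pt = [280.0, 320.0, 580.0, 620.0, 900.0, 950.0]
multijet_pt = [300.0] * 6 + [600.0] * 3 + [900.0]
weights = two_step_weights(truth_pt, reco_pt, multijet_pt, bins)
reweighted_hist, _ = np.histogram(reco_pt, bins=bins, weights=weights)
```

In this toy example the fifth jet migrates from the second truth bin to the third reconstructed bin, and the second step absorbs the resulting residual difference.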

Flavour-tagging working points
To apply b-tagging to identify H → bb decays, the track-jets are matched to the large-R jets by ghost association as described in Section 4. At least two track-jets must be matched to the large-R jet for the double-b-tagging benchmarks, and at least one track-jet in the case of single-b-tagging benchmarks. The track-jet is considered to be b-tagged if its MV2c10 b-tagging discriminant value is larger than a given threshold value. These threshold values are defined for several b-tagging working points: 60%, 70%, 77% and 85% b-jet tagging efficiencies.
The following b-tagging benchmarks are studied:
• double b-tagging: the two highest-p T track-jets must both pass a given b-tagging requirement;
• asymmetric b-tagging: the track-jet which is more consistent with the interpretation of being a b-jet must pass a given fixed 60%, 70%, 77%, or 85% working point, while the b-tagging requirement on the second track-jet is varied;
• single b-tagging: at least one of the two highest-p T track-jets must pass the b-tagging requirement;
• leading single b-tagging: the highest-p T track-jet must pass the b-tagging requirement.
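The four benchmarks can be written as simple predicates on the track-jets matched to a large-R jet. The sketch below is illustrative only: each track-jet is a hypothetical (p T , discriminant) pair, the threshold values are invented, and restricting the asymmetric benchmark to the two highest-p T track-jets is an assumption.

```python
def leading_two(track_jets):
    """Return the (up to) two highest-pT track-jets; each is a (pt, score) tuple."""
    return sorted(track_jets, key=lambda jet: jet[0], reverse=True)[:2]

def double_b_tag(track_jets, cut):
    """Both of the two highest-pT track-jets exceed the threshold."""
    jets = leading_two(track_jets)
    return len(jets) == 2 and all(score > cut for _, score in jets)

def asymmetric_b_tag(track_jets, fixed_cut, varied_cut):
    """The more b-like jet passes the fixed cut, the other one the varied cut."""
    jets = leading_two(track_jets)
    if len(jets) < 2:
        return False
    scores = sorted((score for _, score in jets), reverse=True)
    return scores[0] > fixed_cut and scores[1] > varied_cut

def single_b_tag(track_jets, cut):
    """At least one of the two highest-pT track-jets exceeds the threshold."""
    return any(score > cut for _, score in leading_two(track_jets))

def leading_single_b_tag(track_jets, cut):
    """The highest-pT track-jet exceeds the threshold."""
    jets = leading_two(track_jets)
    return bool(jets) and jets[0][1] > cut

# Hypothetical track-jets as (pT in GeV, discriminant value).
jets = [(200.0, 0.9), (80.0, 0.3), (20.0, 0.95)]
results = (double_b_tag(jets, 0.6), asymmetric_b_tag(jets, 0.6, 0.2),
           single_b_tag(jets, 0.6), leading_single_b_tag(jets, 0.6))
```

The third track-jet has the highest discriminant value but is ignored by all four predicates because it is not among the two highest-p T track-jets.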
The Higgs-jet efficiencies and background rejections as a function of the jet p T for the 70% double-b-tagging benchmark are shown in Figure 2. The signal efficiency varies from 52% at low p T to about 5% for 1500 < p T < 2500 GeV. The drop in efficiency at high transverse momenta due to the increasing collimation and eventual merging of the two b-jets can be partially recovered using single-b-tagging working points as indicated in Figure 6. The multijet (top-jet) rejection is relatively constant over the whole p T range and is about 250 (60) at low p T and 500 (50) at high p T .
The multijet and top-quark background rejections as a function of the Higgs tagging efficiency for various b-tagging benchmarks are shown in Figure 3. Plots on the left show the performance for Higgs-jet p T above 250 GeV and plots on the right show the performance for Higgs-jet p T above 1000 GeV. The double-b-tagging and asymmetric-b-tagging selections give the best background rejection in a large range of Higgs tagging efficiencies. At high Higgs-jet efficiencies above ∼90% (∼55%) for Higgs-jet transverse momenta above 250 (1000) GeV the single-b-tagging benchmark shows a higher multijet and top-quark background rejection. To achieve such a high Higgs-jet efficiency, a very loose double-b-tagging or asymmetric-b-tagging requirement is needed, which results in a low light-flavour jet rejection. The double-b-tagging and asymmetric-b-tagging working points do not reach an efficiency of 100% due to a requirement of at least two track-jets. In the case of asymmetric b-tagging, Higgs tagging efficiencies are below 100% because of the fixed b-tagging working point requirement on one of the track-jets. The drop in performance is pronounced at high jet transverse momenta due to the lower efficiency to reconstruct two subjets and the decrease in the MV2c10 b-tagging performance [64].
Figure 3: The multijet (top) and the top-jet (bottom) rejection as a function of the Higgs tagging efficiency for large-R jet p T above 250 GeV (left) and above 1000 GeV (right) for various b-tagging benchmarks defined in Section 6.2. The stars correspond to the 60%, 70%, 77% and 85% b-tagging WPs (from left to right). The curves for the double-b-tagging and asymmetric-b-tagging working points coincide over a large range of Higgs-jet efficiency.

Mass window optimisation
The reconstructed Higgs boson mass distribution provides a powerful way to distinguish the Higgs boson signal from background processes. The muon-corrected combined mass described in Section 4 is used to impose the Higgs boson mass requirement and select large-R jets with a mass around the SM Higgs boson mass. The Higgs boson mass resolution, σ m , varies as a function of the reconstructed large-R jet p T , so the mass window is optimised and parameterised as a function of Higgs-jet p T . Two working points are defined:
• tight mass window, containing 68% of Higgs-jets;
• loose mass window, containing 80% of Higgs-jets.
The mass window is defined as the smallest window containing the given fraction of Higgs-jets. The out-of-cone effects, ISR and the missing neutrinos from semileptonic b-hadron decays have an impact on the mass resolution that is similar to their impact on the p T response; therefore, the mass window optimisation depends on the applied Higgs-jet selection and on the Higgs-jet p T spectrum. Figure 4 shows the reconstructed Higgs boson mass distribution for Higgs-jets with a p T in the range 350 to 500 GeV. The mass region below 50 GeV is affected by grooming and out-of-cone effects. In the case of asymmetric H → bb decays, where one of the b-hadrons carries a large fraction of the Higgs boson p T , the large-R jet's axis is close to the direction of the higher-p T b-hadron. The decay products of the lower-p T b-hadron could be removed by grooming or not fully captured in the large-R jet. That leads to smaller Higgs-jet masses. The mass region above 150 GeV suffers from additional contributions from initial-state radiation. A large fraction of the ISR is suppressed by selecting the reconstructed Higgs-jet containing the highest-p T Higgs boson candidate. However, the high mass tails are still substantial in high Higgs-jet p T regions and affect the Higgs boson mass window definition.
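The smallest-window definition given above can be sketched by scanning all windows that contain the required number of jets in a sorted mass list. This is a minimal unweighted illustration with invented mass values; the function name is hypothetical.

```python
import numpy as np

def smallest_window(masses, fraction):
    """Narrowest [low, high] interval containing at least `fraction` of the masses."""
    m = np.sort(np.asarray(masses, dtype=float))
    n = len(m)
    k = max(1, int(np.ceil(fraction * n)))  # jets that must fall inside the window
    # Width of every window spanning k consecutive sorted masses.
    widths = m[k - 1:] - m[:n - k + 1]
    i = int(np.argmin(widths))
    return float(m[i]), float(m[i + k - 1])

# Invented masses in GeV with a low-mass and a high-mass tail.
toy_masses = [50, 110, 120, 122, 125, 128, 130, 140, 200, 300]
tight_low, tight_high = smallest_window(toy_masses, 0.68)
```

Because the window is chosen by width rather than centred on a fixed value, it follows the core of the distribution even when the tails are asymmetric.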
In order to suppress the impact of the tails on the mass window definition, a fit of the mass distribution is performed. The fit function is chosen empirically to describe the core of the mass distribution, while mitigating the tails. The chosen function is a linear combination of a Landau function to describe the low mass part of the distribution and a Gaussian function to describe the high mass part.
The fit is performed in 12 Higgs-jet p T bins across the entire range of transverse momentum from 250 to 2500 GeV. A toy MC simulation is used as input to model the mass window and to estimate the statistical uncertainty on the mass window determination. This toy MC simulation samples the fit functions mentioned above and is performed many times in each p T slice. For each toy MC sample, the mass window is calculated by selecting the smallest window containing the required signal fraction. The final upper and lower boundaries for a given p T slice are found by averaging over the upper and lower boundaries from the corresponding toy MC samples. The mean defines the position and the RMS the uncertainty of the window boundaries in each p T slice. Using the mean and RMS from the toy MC samples as input, the mass window is parameterised as a function of the Higgs-jet p T using a fit function.
The jet mass depends primarily on the energies of the jet constituents and their angular separations. Consequently, there are two competing effects: the improving precision of the calorimeter energy scale with increasing jet p T and the decreasing ability of the calorimeter granularity to resolve individual energy deposits due to increasing decay collimation with increasing jet p T . Fit results are shown in Figure 5 for tight and loose mass window working points.
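The toy procedure for a single p T slice can be sketched as below. A Gaussian with invented parameters stands in for the fitted Landau plus Gaussian function, so the numbers are purely illustrative; the function names, the number of toys and the number of jets per toy are assumptions.

```python
import numpy as np

def window_bounds(masses, fraction):
    """Narrowest interval containing at least `fraction` of the masses."""
    m = np.sort(np.asarray(masses, dtype=float))
    k = max(1, int(np.ceil(fraction * len(m))))
    i = int(np.argmin(m[k - 1:] - m[:len(m) - k + 1]))
    return m[i], m[i + k - 1]

def toy_window(sample_mass, fraction, n_toys=100, n_jets=2000, seed=1):
    """Mean and RMS of the window boundaries over many toy samples."""
    rng = np.random.default_rng(seed)
    bounds = np.array([window_bounds(sample_mass(rng, n_jets), fraction)
                       for _ in range(n_toys)])
    # The mean gives the boundary position, the spread its uncertainty.
    return bounds.mean(axis=0), bounds.std(axis=0)

def gaussian_core(rng, n):
    """Stand-in for sampling the fitted mass function (assumed shape)."""
    return rng.normal(125.0, 10.0, size=n)

(mean_low, mean_high), (rms_low, rms_high) = toy_window(gaussian_core, 0.68)
```

Repeating this for every p T slice yields the boundary positions and uncertainties that enter the parameterisation as a function of the Higgs-jet p T .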
The Higgs boson acceptance times efficiency is presented in Figure 6. In addition to the truth-matching requirements defined for Figure 1, the double-and single-b-tagging, tight, loose and no mass window working points are applied. The double-b-tagging requirement in particular leads to a significant drop in the Higgs boson acceptance times efficiency at high Higgs boson transverse momenta, where the efficiency to reconstruct two track-jets and the double-b-tagging efficiency decrease quickly. Figure 7 shows the rejection of the multijet background as a function of the Higgs-jet p T . Applying a combination of loose mass window and double-b-tagging requirements improves the rejection by a factor of about four relative to the corresponding benchmark without the mass requirement shown in Figure 2. The tight mass window requirement leads to an additional improvement of about 30-50% in the background rejection. The efficiency of the mass window requirements changes by a few percent after the application of the double b-tagging-requirement due to the dependence of the b-tagging efficiency on the jet kinematics.
The corresponding rejection of the multijet background as a function of the Higgs-jet efficiency is shown in Figure 8 for different Higgs-jet p T ranges, b-tagging benchmarks, and mass window requirements. Application of the mass window requirement improves the performance of the tagger substantially. For a fixed signal efficiency of 40% and large-R jet p T above 250 GeV, the multijet rejection rises from roughly […]
Figure 9 shows the hadronic top-quark background rejection as a function of the Higgs-jet p T for combinations of mass window and b-tagging benchmarks. The background rejection is higher for multijets than for hadronically decaying top quarks. The rejection varies between 120 (170) at low p T and 1000 (1300) at high p T for the loose (tight) mass window and double-b-tagging benchmark. In comparison with the benchmarks without the mass window requirement, the rejection is improved by about one order of magnitude, but the shape as a function of p T is fundamentally different. At low p T , not all decay products of the top quark are contained in the large-R jet. Thus the reconstructed jet mass has a long tail towards low jet masses with a substantial fraction of jets within the mass window of the tagger. Hence, the rejection at low jet p T is not improved as much as at high jet p T . The tight mass window requirement further improves the background rejection by 15-40% as a function of p T .
The rejection of the hadronic top-quark background as a function of the Higgs tagging efficiency is shown in Figure 10. For the loose mass window requirement, an improvement from 140 to 200 is found at a fixed Higgs-jet efficiency of 40%, whereas for the tight mass window a smaller improvement from 140 to 160 is observed relative to no mass requirement for large-R jet p T above 250 GeV. The rejection values are lower for double b-tagging and asymmetric b-tagging for large-R jet p T above 1 TeV, and for high Higgs tagging efficiency single and single leading b-tagging are better options.
Figure 10: Rejection of the top-jet background as a function of the Higgs tagging efficiency for loose (top) and tight (bottom) mass window requirements for large-R jet p T above 250 GeV (left) and above 1000 GeV (right) for various b-tagging benchmarks. The stars correspond to the 60%, 70%, 77% and 85% b-tagging WPs (from left to right). The curves for the double- and asymmetric-b-tagging working points coincide over a large range of Higgs-jet efficiency.
Many jet substructure variables exist that can capture features of a jet's internal structure and can potentially give additional discrimination power against backgrounds from multijet production and top-quark decays. They are based on the jet constituents and exploit quantities such as transverse momentum and angular distance between the constituents. They give information about different jet attributes such as shape (e.g. sphericity, aplanarity) or number of axes (e.g. two-subjettiness τ 2 ). Ratios are often used to avoid scale dependence of substructure variables. Table 1 lists the jet substructure variables that are investigated in this study, together with a short description and references. Secondary selections on jet mass and the flavour-tagging discriminant for the track-jets, MV2c10, are also considered relative to the previously defined mass window and b-tagging benchmark working points and their performance is compared with that achieved by the application of additional jet substructure variables to these benchmarks. Two categories of secondary selections are used for the b-tagging discriminant MV2c10, and these exploit the potential of tighter b-tagging working points where the criteria are tightened for both track-jets (double b-tagging) or for only one track-jet (single b-tagging).
Figure 11: Multijet background rejection at 80% signal efficiency (ε S = 80%) for a variety of substructure variables using different benchmarks in terms of b-tagging strategy and transverse momentum range. The z-axis colour scale represents the absolute value of the linear-correlation coefficient, |C(m corr , v JSS )|, between the jet mass and the jet substructure variables. The selection efficiency is determined relative to the mass window and b-tagging benchmark working points defined in Sections 6.3 and 6.2 respectively.
For all secondary selection variables an optimal two-sided range is chosen for each variable and each benchmark working point. Searches for new-physics resonances typically use tagging definitions with relatively high signal efficiency, around 40% (75%) for Higgs-jets with p T = 500 GeV for double (single) b-tagging and a mass requirement. Hence, the two-sided range for a secondary variable which contains the smallest fraction of background but at least 80% of signal events is determined. Figures 11 and 12 show the background rejection for an 80% retention of signal efficiency relative to the jet mass and b-tagging benchmark working points for multijet and hadronic top-quark backgrounds, respectively. The matrices in Figures 11 and 12 show the background rejection for substructure variables, secondary jet mass, and MV2c10 b-tagging discriminant on the y-axis for the four benchmark points of the Higgs-jet tagger on the x-axis. The z-axis colour scale represents the absolute value of the linear-correlation coefficient of the substructure variable and the jet mass for the corresponding background. For each benchmark, five variables with the largest background rejection are selected and all selected variables for every benchmark are shown.
Figure 12: Hadronic top-quark background rejection at 80% signal efficiency (ε S = 80%) for a variety of substructure variables using different benchmarks in terms of b-tagging strategy and transverse momentum range. The z-axis colour scale represents the absolute value of the linear-correlation coefficient, |C(m corr , v JSS )|, between the jet mass and the jet substructure variables. The selection efficiency is determined relative to the mass window and b-tagging benchmark working points defined in Sections 6.3 and 6.2 respectively.
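The choice of a two-sided range that keeps at least 80% of the signal while retaining the smallest background fraction can be sketched as a scan over windows bounded by signal values. This is an unweighted illustration with invented inputs and a hypothetical function name, not the procedure used to produce Figures 11 and 12.

```python
import numpy as np

def best_two_sided_range(signal, background, min_signal_eff=0.8):
    """Return (low, high, background_fraction) for the range that keeps at
    least `min_signal_eff` of the signal with the least background."""
    s = np.sort(np.asarray(signal, dtype=float))
    b = np.sort(np.asarray(background, dtype=float))
    n = len(s)
    k = max(1, int(np.ceil(min_signal_eff * n)))  # signal jets to retain
    best = None
    # An optimal range can always be shrunk to k consecutive signal values.
    for i in range(n - k + 1):
        low, high = s[i], s[i + k - 1]
        n_bkg = np.searchsorted(b, high, side="right") - np.searchsorted(b, low, side="left")
        frac = n_bkg / len(b)
        if best is None or frac < best[2]:
            best = (float(low), float(high), float(frac))
    return best

toy_signal = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_background = [0, 0.5, 1.5, 2.5, 9.5, 10.5, 11, 12, 13, 14]
low, high, bkg_fraction = best_two_sided_range(toy_signal, toy_background)
```

The background rejection quoted for a secondary selection then corresponds to the inverse of the retained background fraction, here 1/0.1 = 10 for the toy inputs.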
In general, there are improvements across the various benchmark points. The background rejection is often higher for the multijet background than for the hadronically decaying top quarks. The secondary b-tagging discriminant is very powerful, and there are only a few areas of phase space where substructure yields larger improvements than an optimised b-tagging working point. However, substructure variables are an interesting alternative to tighter b-tagging working points for large-R jet p T above 1 TeV. For the multijet background (Figure 11), a tighter requirement on the double b-tagging achieves a background rejection of 3.62 (1.35) in the inclusive range p T > 250 GeV for the single-b-tagging (double-b-tagging) working point. In contrast, the improvement from the double-b-tagging discriminant is small for working points for p T > 1000 GeV, achieving a background rejection of 1.29 (1.37) for the single-b-tagging (double-b-tagging) working point. At large p T the background rejection for substructure variables varies between 2.12 (D β=1 2 ) and 1.55 (Fox-Wolfram ratio F 1 /F 0 ) for a signal efficiency of 80%. In general, correlations with the jet mass greater than 10% are observed for most of the jet substructure variables. The Fox-Wolfram ratios F 3 /F 0 and F 1 /F 0 show the lowest correlations: less than 1% for most of the benchmarks.
The room for improvement is smaller if secondary jet substructure selections on top of the jet mass window and b-tagging benchmark working points are used in the case of the hadronic top-quark background (Figure 12). A tighter double-b-tagging working point reaches a factor of 4.81 (2.34) background rejection in the inclusive range p T > 250 GeV for the single-b-tagging (double-b-tagging) working point. In contrast, the improvement from the double-b-tagging discriminant is small at large p T , achieving a background rejection of 1.18 (1.57) for the single-b-tagging (double-b-tagging) working point. The background rejection for other variables varies between 1.84 (Fox-Wolfram ratio F 2 /F 0 and exclusive dipolarity) and 1.24 (k t ∆R) for a signal efficiency of 80%. Compared with the multijet background the correlations between the jet mass and the jet substructure variables are smaller in the case of the top-quark background, especially for p T > 1000 GeV. The Fox-Wolfram ratio F 4 /F 1 shows the lowest correlation: less than 1% for most of the benchmarks.
In conclusion, the application of jet substructure variables improves the background rejection moderately, while better improvements are observed for high transverse momenta. Furthermore, it is important to take into account the correlation between the large-R jet mass and the substructure variables since requirements on the substructure variables sculpt the jet mass distribution [79,80].
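The correlation shown on the colour scale of Figures 11 and 12 can be sketched as follows, assuming that the linear-correlation coefficient is the usual Pearson coefficient and that events are unweighted. The function name `abs_linear_correlation` and the toy values are hypothetical.

```python
import math


def abs_linear_correlation(x, y):
    """Absolute value of the Pearson correlation between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length with at least 2 entries")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return abs(cov) / math.sqrt(var_x * var_y)


# Toy jet masses and a toy substructure variable, purely illustrative.
toy_mass = [1.0, 2.0, 3.0, 4.0]
fully_correlated = abs_linear_correlation(toy_mass, [2.0, 4.0, 6.0, 8.0])
anti_correlated = abs_linear_correlation(toy_mass, [8.0, 6.0, 4.0, 2.0])
uncorrelated = abs_linear_correlation(toy_mass, [1.0, -1.0, -1.0, 1.0])
```

A variable with a small value of this coefficient is less likely to sculpt the jet mass distribution when a requirement is placed on it, although a vanishing linear correlation does not exclude a non-linear dependence.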

Modelling tests in g → bb data
Multijet events enriched in b-jets, which predominantly originate from gluon to bb production, are used to evaluate the b-tagging efficiency in data and simulation as well as the modelling of jet substructure variables. The multijet background is one of the main backgrounds for searches in fully hadronic final states, for example the Higgs boson pair search in the four-b-quark final state [81]. This background also provides a unique opportunity to validate the modelling of the double-b-jets in a large data sample. Events with one large-R jet with two ghost-associated track-jets ('g → bb candidate jet') and one recoiling ISR small-R jet ('recoil jet', j recoil ) are used for this study.

Event selection
Events are required to have a primary vertex that has at least two tracks, each with p T > 500 MeV [82]. The primary vertex with the highest p 2 T sum of associated tracks is selected. A single-small-R-jet trigger with an online E T threshold of 380 GeV was used to collect the data. An offline R = 0.4 recoil jet with p T above 500 GeV is matched to the jet which fired the trigger.
Non-collision backgrounds originating from calorimeter noise, beam-halo interactions, or cosmic rays can lead to spurious calorimeter signals. This effect is suppressed by applying the criteria described in Ref. [83].
Selected events are required to have at least one large-R jet with p T > 500 GeV and |η| < 2.0, for which the small-R jet trigger is fully efficient and unbiased. The large-R jet must have at least two ghost-associated R = 0.2 track-jets. To enrich the event sample in jets containing b-hadrons, it is required that at least one of the ghost-associated track-jets be matched to a muon. The highest-p T track-jet matched to a muon is called the muon-tagged jet, j trk µ . The matching is performed using a geometric ∆R < 0.2 requirement between the track-jet's axis and the muon. The highest-p T jet among the remaining track-jets matched by ghost association to the large-R jet is called the non-muon jet, j trk non-µ . The highest-p T large-R jet satisfying these criteria is selected as the gluon-jet candidate. Furthermore, the event must satisfy ∆R( j recoil , j trk µ ) > 1.5. This requirement ensures that the triggering jet and the gluon-jet candidate are well separated.
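The muon matching and the labelling of the two track-jets can be sketched as below. This is an illustration under stated assumptions rather than the ATLAS implementation: ∆R is computed from pseudorapidity and azimuthal angle, objects are plain dictionaries with hypothetical keys `pt`, `eta` and `phi`, and the function names are invented for this example.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance sqrt(d_eta^2 + d_phi^2), with d_phi wrapped into [-pi, pi]."""
    dphi = math.remainder(phi1 - phi2, 2.0 * math.pi)
    return math.hypot(eta1 - eta2, dphi)


def label_track_jets(track_jets, muons, max_dr=0.2):
    """Return (muon_tagged_jet, non_muon_jet), or (None, None) if no track-jet
    is matched to a muon. Each object is a dict with 'pt', 'eta', 'phi'."""
    matched = [
        j for j in track_jets
        if any(delta_r(j["eta"], j["phi"], m["eta"], m["phi"]) < max_dr for m in muons)
    ]
    if not matched:
        return None, None
    # Highest-pT track-jet matched to a muon.
    mu_jet = max(matched, key=lambda j: j["pt"])
    # Highest-pT jet among the remaining ghost-associated track-jets.
    others = [j for j in track_jets if j is not mu_jet]
    non_mu_jet = max(others, key=lambda j: j["pt"]) if others else None
    return mu_jet, non_mu_jet


# Toy event, purely illustrative.
jet_a = {"pt": 100.0, "eta": 0.00, "phi": 0.0}
jet_b = {"pt": 150.0, "eta": 0.30, "phi": 0.1}
jet_c = {"pt": 50.0, "eta": 0.05, "phi": 0.0}
muon = {"pt": 12.0, "eta": 0.02, "phi": 0.01}
mu_jet, non_mu_jet = label_track_jets([jet_a, jet_b, jet_c], [muon])
wrapped = delta_r(0.0, 3.1, 0.0, -3.1)
```

The same `delta_r` helper could be used for the separation requirement between the recoil jet and the muon-tagged jet.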

Flavour fraction corrections
To reduce discrepancies between data and MC simulation in the flavour composition of the large-R jet, the flavour fractions of the sample are determined from the data before applying b-tagging. Each large-R jet carries two flavours, that of j trk µ and j trk non-µ , leaving nine possible flavour combinations for the large-R jet (each track-jet can be a b-jet, c-jet, or light-flavour jet; B, C and L abbreviations are used in the following). The long decay length of b- and c-hadrons makes the signed impact parameter significance, s d 0 , of tracks associated with a jet a good discriminating variable for different jet flavours. The s d 0 of a track is defined as:
\[ s_{d_0} = s_j \, \frac{|d_0|}{\sigma(d_0)}, \]
where d 0 is the track's transverse impact parameter relative to the primary vertex, σ(d 0 ) is the uncertainty in the d 0 measurement, and s j is the sign of d 0 relative to the track-jet's axis, depending on whether the track crosses the track-jet's axis in front of or behind the primary vertex. For a given track-jet, the average s d 0 is built from the three highest-p T tracks associated with the track-jet. The tracks from b- and c-hadron decays are expected to have higher p T than tracks in light-flavour jets, because the heavy-flavour hadrons carry on average a larger fraction of the jet energy. The requirement that s d 0 is built from the three highest-p T tracks helps to distinguish them from light-flavour jets, which may have tracks with large s d 0 values, e.g. from Λ and K S decays.
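The per-jet discriminant can be illustrated with a short sketch. It assumes that the significance of a track is the sign s j multiplied by |d 0 |/σ(d 0 ), and that the sign has already been determined from the track and track-jet geometry and is supplied as +1 or -1. The dictionary keys and function names are hypothetical.

```python
def signed_d0_significance(d0, sigma_d0, sign):
    """Signed impact parameter significance of one track.

    Assumption: `sign` (+1 or -1) is computed elsewhere from the position of the
    track's crossing of the track-jet axis relative to the primary vertex.
    """
    if sigma_d0 <= 0:
        raise ValueError("sigma_d0 must be positive")
    return sign * abs(d0) / sigma_d0


def mean_sd0_leading_tracks(tracks, n_leading=3):
    """Average significance of the `n_leading` highest-pT tracks of a track-jet.

    Each track is a dict with 'pt', 'd0', 'sigma_d0' and 'sign'.
    """
    if not tracks:
        raise ValueError("track-jet has no associated tracks")
    leading = sorted(tracks, key=lambda t: t["pt"], reverse=True)[:n_leading]
    values = [signed_d0_significance(t["d0"], t["sigma_d0"], t["sign"]) for t in leading]
    return sum(values) / len(values)


# Toy track-jet, purely illustrative (pT in GeV, d0 in mm).
toy_tracks = [
    {"pt": 10.0, "d0": 0.10, "sigma_d0": 0.02, "sign": +1},
    {"pt": 8.0, "d0": -0.06, "sigma_d0": 0.02, "sign": +1},
    {"pt": 5.0, "d0": 0.02, "sigma_d0": 0.02, "sign": -1},
    {"pt": 1.0, "d0": 1.00, "sigma_d0": 0.02, "sign": +1},  # soft track, not used
]
mean_sd0 = mean_sd0_leading_tracks(toy_tracks)
```

In the toy example the soft track with a very large significance does not enter the average, which mirrors the motivation given above for restricting the average to the highest-p T tracks.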
The impact parameter resolution depends on the intrinsic track resolution, the traversed detector material, the detector alignment, and other effects. To determine the impact parameter resolution in data, minimum-bias, dijet, and Z+jets events are used. The impact parameter resolution is extracted in fine bins of track p T and η with an iterative method described in Ref. [51]. The simulation is corrected to match the measured impact parameter resolution as a function of track p T and η by using a Gaussian function to smear the impact parameter resolution in the simulation.  Figure 14 shows an example of the flavour fraction fit to the s d 0 distributions of j trk µ and j trk non-µ for one particular bin of the track-jet transverse momenta. The fit uncertainty includes the statistical uncertainty of the templates and is evaluated using toy MC simulations. The flavour fraction corrections relative to the simulated fractions vary between 0.7 and 1.7 in the jet p T bins with a statistical uncertainty below 10%.
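The Gaussian smearing of the simulated impact parameter can be sketched for a single (p T , η) bin. The sketch assumes that the extra width is the quadrature difference between the resolution measured in data and the one in simulation, which is a common choice but is not stated in the text; the function name `smear_d0` and all numerical values are hypothetical.

```python
import math
import random


def smear_d0(d0_values, sigma_mc, sigma_data, rng):
    """Add Gaussian noise to simulated d0 values so that their resolution
    matches the one measured in data.

    Assumption: resolutions add in quadrature and sigma_data >= sigma_mc.
    """
    if sigma_data < sigma_mc:
        raise ValueError("smearing cannot improve the simulated resolution")
    extra_width = math.sqrt(sigma_data ** 2 - sigma_mc ** 2)
    return [d0 + rng.gauss(0.0, extra_width) for d0 in d0_values]


# Toy example for one bin: simulated resolution 10 um, data resolution 12 um (in mm).
rng = random.Random(1)
simulated = [rng.gauss(0.0, 0.010) for _ in range(20000)]
smeared = smear_d0(simulated, sigma_mc=0.010, sigma_data=0.012, rng=rng)
```

In practice the two resolutions would be looked up per track from the binned measurement in track p T and η.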
After correcting for the observed flavour-pair fractions the level of agreement between data and MC simulation is evaluated in the selected event sample before and after b-tagging is applied to the track-jets. The 70% double-b-tagging working point is used.

b-tagging results
Since the flavour fractions are corrected in the MC simulation, differences between the data and predictions after the b-tagging can be attributed to a difference between data and MC simulation in the dependence of the b-tagging performance on the large-R jet topology, in particular on the topology with two closely spaced track-b-jets. Figure 15 shows the flavour-fit-corrected p T spectrum of the large-R jet as well as the j trk µ and j trk non-µ before and after b-tagging. As seen in the ratio plots, there is good agreement within uncertainties between data and MC simulation. The shape differences between data and MC simulations, especially for the j trk non-µ transverse momentum, can be partially explained by the difference observed between Pythia 8 and Herwig++ MC simulations. The double-b-tagging rate is defined as the number of selected large-R jets with at least two track-jets, two of which are b-tagged, divided by the number of all selected large-R jets with at least two track-jets. Figure 16 shows the double-b-tagging rate as a function of the large-R jet p T . Data and MC simulation agree within the uncertainties. The performance of the double b-tagging applied to two track-jets seems not to depend on the large-R jet topology with two closely spaced track-b-jets, and the default b-tagging calibration described in Section 5 can be applied for this analysis.
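The double-b-tagging rate defined above can be computed per large-R jet p T bin as in the following sketch. It assumes unweighted jets described by a hypothetical tuple (p T , number of track-jets, number of b-tagged track-jets); the bin edges are illustrative and do not correspond to those of Figure 16.

```python
from bisect import bisect_right


def double_b_tag_rate(jets, pt_edges):
    """Fraction of large-R jets with two b-tagged track-jets among all large-R jets
    with at least two track-jets, in bins of large-R jet pT.

    `jets` is an iterable of (pt, n_track_jets, n_b_tagged) tuples.
    Returns one rate per bin, or None for bins without any selected jet.
    """
    n_bins = len(pt_edges) - 1
    n_all = [0] * n_bins
    n_tagged = [0] * n_bins
    for pt, n_track_jets, n_b_tagged in jets:
        if n_track_jets < 2:
            continue  # not part of the denominator
        i = bisect_right(pt_edges, pt) - 1
        if i < 0 or i >= n_bins:
            continue  # outside the binned range
        n_all[i] += 1
        if n_b_tagged >= 2:
            n_tagged[i] += 1
    return [t / a if a else None for t, a in zip(n_tagged, n_all)]


# Toy jets, purely illustrative (pT in GeV).
toy_jets = [
    (520.0, 2, 2), (550.0, 3, 1), (580.0, 2, 0), (590.0, 1, 0),
    (650.0, 2, 2), (700.0, 2, 2), (900.0, 2, 2),
]
rates = double_b_tag_rate(toy_jets, [500.0, 600.0, 800.0])
```

Comparing these per-bin rates between data and flavour-corrected simulation is the kind of check summarised in Figure 16.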

Jet substructure results
As possible variations of Higgs taggers may make use of the large-R jet p T , and substructure variables such as mass, n-subjettiness, or D β=1 2 , it is important to ensure that these variables are well modelled by MC simulations. The distributions of kinematic and substructure variables are shown in Figure 17, for double-b-tagged jets after the flavour-fit correction. As seen in the ratio plots, there is acceptable agreement within uncertainties between data and MC simulations.
The relative impact of the systematic uncertainties on the yields of signal and background is presented in Table 2. The dominant signal uncertainty is the modelling uncertainty followed by the b-tagging-related uncertainties. The b-tagging-related uncertainties (misidentification of light-flavour jets and c-jets as b-jets) are dominant for background. The dominant uncertainties are shown separately in Figure 17. The difference in the shapes between data and MC simulations can be partially explained by the difference observed between Pythia 8 and Herwig++ MC simulations.

Modelling tests in Z → bb data
As mentioned in the introduction, the Z → bb process is a colour-singlet resonance with a mass close to the Higgs boson mass, so kinematic properties of the Z → bb and H → bb events are expected to be similar. Events with one double-b-tagged large-R jet ('Z → bb candidate jet') and a photon that are back-to-back are used for this study. The photon requirement improves the signal-to-background ratio in comparison with the fully hadronic final state.

Event selection
Events are selected using a single-photon trigger with a transverse energy (E T ) threshold of 140 GeV and loose photon identification requirements [21]. This trigger is non-prescaled for the entire data-taking period and is fully efficient for offline photons with E T > 175 GeV. The same primary vertex and jet-cleaning requirements are applied as for the g → bb study, described in Section 7.
Exactly one photon and at least one large-R jet are required to be present in the event. The large-R jet is required to have p T > 200 GeV, |η| < 2.0, and mass greater than 30 GeV. A jet-photon overlap removal procedure is applied, removing photons within ∆R = 1.0 of the large-R jet. The large-R jet with the highest p T is chosen as the Z → bb candidate. The two highest-p T track-jets that are associated with the Z → bb candidate are required to be identified as b-jets using the 70% working point.

Background estimate
The dominant SM background in this analysis is γ+jets with gluon-to-bb splitting. The contribution from the Standard Model ttγ and W γ processes is smaller than that from the γ+jets process. Other background contributions such as jets faking photons, electrons faking photons, and tt are found to be negligible. To extract the Z → bb and γ+jets normalisations, the Z → bb candidate jet mass distribution is fitted to data. Both templates are taken from the MC simulation as described in Section 3. The tt + γ and W(qq) + γ background contributions estimated from MC simulation are subtracted before the fit to data. The jet mass variable is used in the fit because the difference between the shapes of the Z → bb and γ+jets templates is larger than for other substructure variables. The extracted normalisations are applied to all other distributions. Figure 18 shows the Z → bb candidate jet mass, p T , D β=1 2 , and τ 21 distributions in data and MC simulation. Systematic uncertainties summarised in Section 5 are applied to the templates, and for each systematic variation the fit to data is performed. The fit uncertainty and the contribution for each of the systematic uncertainties summed in quadrature are presented in Figure 18. The relative impact of the systematic uncertainties on the Z → bb and γ+jets yields is presented in Table 3. The observed data/MC discrepancies are covered by systematic uncertainties.
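The extraction of the two normalisations can be illustrated with a simplified template fit. The sketch below is a stand-in and not the fit used in the analysis: it assumes an unweighted least-squares fit of binned jet mass templates with a closed-form solution, whereas a binned likelihood fit would normally be used. The function name `fit_two_templates` and all histogram contents are hypothetical.

```python
def fit_two_templates(data, signal, background, fixed=None):
    """Least-squares normalisations (mu_signal, mu_background) such that
    data ~ mu_signal * signal + mu_background * background + fixed, bin by bin.

    `fixed` holds contributions taken directly from simulation and subtracted
    from data before the fit.
    """
    if fixed is None:
        fixed = [0.0] * len(data)
    if not (len(data) == len(signal) == len(background) == len(fixed)):
        raise ValueError("all histograms must have the same number of bins")
    y = [d - f for d, f in zip(data, fixed)]
    ss = sum(s * s for s in signal)
    bb = sum(b * b for b in background)
    sb = sum(s * b for s, b in zip(signal, background))
    sy = sum(s * v for s, v in zip(signal, y))
    by = sum(b * v for b, v in zip(background, y))
    det = ss * bb - sb * sb
    if det == 0:
        raise ValueError("templates are degenerate; normalisations are not identifiable")
    mu_signal = (sy * bb - by * sb) / det
    mu_background = (by * ss - sy * sb) / det
    return mu_signal, mu_background


# Toy jet mass histograms with five bins, purely illustrative.
toy_signal = [1.0, 5.0, 10.0, 5.0, 1.0]       # peaking template
toy_background = [10.0, 8.0, 6.0, 4.0, 2.0]   # falling template
toy_fixed = [1.0, 1.0, 1.0, 1.0, 1.0]         # small subtracted contributions
toy_data = [0.98 * s + 1.51 * b + f
            for s, b, f in zip(toy_signal, toy_background, toy_fixed)]
mu_sig, mu_bkg = fit_two_templates(toy_data, toy_signal, toy_background, toy_fixed)
```

The fit is well conditioned only if the two template shapes differ, which is in line with the stated reason for choosing the jet mass as the fit variable.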

Jet substructure results
Further requirements on the jet substructure variables can improve the purity of the selection. Figure 19 shows the Z → bb candidate jet mass after further selections: τ 21 < 0.45 or a requirement on D β=1 2 . Figure 20 shows the D β=1 2 and τ 21 distributions.
Figure 20: D β=1 2 and τ 21 distributions after requiring the Z → bb candidate jet mass to be between 70 and 110 GeV. Events with two b-tagged track-jets are used. The γ+jets background and the Z → bb signal are normalised to data by applying scale factors of 1.51 and 0.98, respectively. The upward- or downward-pointing arrows indicate that the Data/Fit ratio is out of the histogram range for these bins.
The Z(→ bb)γ process provides a unique possibility to validate the Higgs-jet tagging algorithm given the similarity of the H → bb and Z → bb processes. For the current integrated luminosity of 36 fb −1 , the dominant uncertainties are the statistical and systematic uncertainties of the jet scales and jet mass for the Z → bb process and the γ+jets modelling uncertainties. To reduce the dominant uncertainties, a larger dataset is needed. Within the uncertainties the studied jet substructure variables are modelled well by the signal plus background MC simulations.

Conclusions
Techniques to identify Higgs bosons at high transverse momenta decaying into bottom-quark pairs are described in this paper. The identification is based on the b-tagging of R = 0.2 track-jets matched to the Higgs-jet and requirements placed on the Higgs-jet mass and other substructure variables. The modelling of the relevant input distributions is studied in 36 fb −1 of 13 TeV proton-proton collision data recorded by the ATLAS detector at the LHC in 2015 and 2016.
The choice of b-tagging working point for an analysis depends on the required background rejection rate and on the Higgs-jet p T range relevant for the analysis. The double-b-tagging working points give the best background rejection for a large range of the Higgs-jet-tagging efficiency but the efficiency decreases faster with increasing Higgs-jet p T than it does for single-b-tagging working points. At high efficiencies above ∼ 90% (∼ 55%) for Higgs-jet p T above 250 (1000) GeV the single-b-tagging selection provides better background rejection.
Application of the Higgs boson mass window requirement improves the performance of the Higgs-jet tagger substantially. The multijet background rejection improves by a factor of about five by adding a loose (corresponding to 80% signal efficiency) mass window requirement on top of the double-b-tagging requirement. The tight (corresponding to 68% signal efficiency) mass window requirement leads to an additional 30-50% improvement in the multijet background rejection. The multijet background rejection has a weak dependence on the jet p T for both mass window requirements. The hadronic top-quark rejection depends strongly on the jet p T . The rejection varies between 60 and 230 for the loose mass window and double-b-tagging working points. The largest improvement in the top-quark rejection for the tight mass window is about 70% and corresponds to the high p T and double-b-tagging working point.
The performance of the additional jet substructure variables depends on the chosen Higgs-jet tagger working point. The jet mass and other substructure variables are often correlated and the double-b-tagging requirement enforces a two-prong structure. In general, the background rejection is larger for the multijet background than for hadronically decaying top quarks but still below two for the individual variables and the loose mass window working point. The b-tagging discriminant is very powerful but the jet substructure variables offer an alternative to the b-tagging working points. Especially at high Higgs-jet p T the efficiency to reconstruct two track-jets and the double-b-tagging efficiency decrease quickly. A combination of several substructure variables using multivariate methods could potentially increase the gain in performance in this phase space.
The modelling of representative Higgs-jet properties is tested in ATLAS data for g → bb and Z(→ bb)γ event selections. Good modelling is observed given the size of the available data sample and the systematic uncertainties. In particular, the use of jet substructure variables is shown to improve the purity of the Z(→ bb)γ event selection.