Search for Higgs boson pair production in the $\gamma\gamma b\bar{b}$ final state with 13 TeV $pp$ collision data collected by the ATLAS experiment

A search is performed for resonant and non-resonant Higgs boson pair production in the $\gamma\gamma b\bar{b}$ final state. The data set used corresponds to an integrated luminosity of 36.1 fb$^{-1}$ of proton-proton collisions at a centre-of-mass energy of 13 TeV recorded by the ATLAS detector at the CERN Large Hadron Collider. No significant excess relative to the Standard Model expectation is observed. The observed limit on the non-resonant Higgs boson pair cross-section is 0.73 pb at 95% confidence level. This observed limit is equivalent to 22 times the predicted Standard Model cross-section. The Higgs boson self-coupling ($\kappa_\lambda = \lambda_{HHH} / \lambda_{HHH}^{\rm SM}$) is constrained at 95% confidence level to $-8.2<\kappa_\lambda<13.2$. For resonant Higgs boson pair production through X $\rightarrow$ HH $\rightarrow$ $\gamma\gamma b\bar{b}$, the limit is presented, using the narrow-width approximation, as a function of $m_X$ in the range 260 GeV $


Introduction
The Higgs boson (H) was discovered by the ATLAS [1] and CMS [2] collaborations in 2012 using proton-proton (pp) collisions at the Large Hadron Collider (LHC).Measurements of the properties of the boson are in agreement with the predictions of the Standard Model (SM) [3,4].If SM expectations hold, the production of a Higgs boson pair in a single pp interaction should not be observable with the currently available LHC data set.In the SM, the dominant contributions to this process are shown in Figures 1(a) and 1(b).However, some beyond-the-Standard-Model (BSM) scenarios may enhance the Higgs boson pair production rate.
Many BSM theories predict the existence of heavy particles that can decay into a pair of Higgs bosons.These could be identified as a resonance in the Higgs boson pair invariant mass spectrum.They could be produced, for example, through the gluon-gluon fusion mode shown in Figure 1(c).Models with two Higgs doublets [5], such as the minimal supersymmetric extension of the SM [6], twin Higgs models [7] and composite Higgs models [8,9], add a second complex scalar doublet to the Higgs sector.In general, the neutral Higgs fields from the two doublets will mix, which may result in the existence of a heavy Higgs boson that decays into two of its lighter Higgs boson partners.Alternatively, the Randall-Sundrum model of warped extra dimensions [10] predicts spin-0 radions and spin-2 gravitons that could couple to a Higgs boson pair.
In addition to the resonant production, there can also be non-resonant enhancements to the Higgs boson pair cross-section.These can either originate from loop corrections involving new particles, such as light, coloured scalars [11], or through non-SM couplings.Changes to the single Higgs boson production crosssection arising from such loop-corrections are neglected in this paper.Anomalous couplings can either be extensions to the SM, such as contact interactions between two top quarks and two Higgs bosons [12], or be deviations from the SM values of the couplings between the Higgs boson and other particles.In this work, the effective Higgs self-coupling, λ H H H , is parameterised by a scale factor κ λ (κ λ = λ H H H /λ SM H H H ) where the SM superscript refers to the SM value of this parameter.The theoretical and phenomenological implications of such couplings for complete models are discussed in Refs.[13] and [14].The Yukawa coupling between the top quark and the Higgs boson is set to its SM value in this paper, consistent with its recent direct observation [15,16].

H H
Figure 1: Leading-order production modes for Higgs boson pairs.In the SM, there is destructive interference between (a) the heavy-quark loop and (b) the Higgs self-coupling production modes, which reduces the overall cross-section.BSM Higgs boson pair production could proceed through changes to the Higgs couplings, for example the t tH or HHH couplings which contribute to (a) and (b), or through an intermediate resonance, X, which could, for example, be produced through a quark loop as shown in (c).
This paper describes a search for the production of pairs of Higgs bosons in pp collisions at the LHC.The search is carried out in the γγb b final state, and considers both resonant and non-resonant contributions.For the resonant search, the narrow-width approximation is used, focusing on a resonance with mass (m X ) in the range 260 GeV < m X < 1000 GeV.Although this search is for a generic scalar decaying into a pair of Higgs bosons, the simulated samples used to optimise the search were produced in the gluon-gluon fusion mode.Previous searches were carried out by the ATLAS and CMS collaborations in the γγb b channel at √ s = 8 TeV [17,18], as well as in other final states [19][20][21][22] at both √ s = 8 TeV and √ s = 13 TeV.
Events are required to have two isolated photons, accompanied by two jets with dijet invariant mass (m j j ) compatible with the mass of the Higgs boson, m H = 125.09GeV [3].At least one of these jets must be tagged as containing a b-hadron; events are separated into signal categories depending on whether one or both jets are tagged in this way.
Loose and tight kinematic selections are defined, where the tight selection is a strict subset of the loose one.The searches for low-mass resonances and for non-SM values of the Higgs boson self-coupling both use the loose selection, as the average transverse momentum (p T ) of the Higgs bosons is lower in these cases [23].The tight selection is used for signals where the Higgs bosons typically have larger average p T , namely in the search for higher-mass resonances and in the measurement of SM non-resonant HH production.
In the search for non-resonant production, the signal is extracted using a fit to the diphoton invariant mass (m γγ ) distribution of the selected events.The signal consists of a narrow peak around m H superimposed on a smoothly falling background.For resonant production, the signal is extracted from the four-object invariant mass (m γγ j j ) spectrum for events with a diphoton mass compatible with the mass of the Higgs boson, by fitting a peak superimposed on a smoothly changing background.
The rest of this paper is organised as follows.Section 2 provides a brief description of the ATLAS detector, while Section 3 describes the data and simulated event samples used.An overview of object and event selection is given in Section 4, while Section 5 explains the modelling of signal and background processes.The sources of systematic uncertainties are detailed in Section 6. Final results including expected and observed limits are presented in Section 7, and Section 8 summarises the main findings.

ATLAS detector
The ATLAS detector [24] at the LHC is a multipurpose particle detector with a forward-backward symmetric cylindrical geometry1 and a near 4π coverage in solid angle.It consists of an inner tracking detector (ID) surrounded by a thin superconducting solenoid providing a 2 T axial magnetic field, electromagnetic (EM) and hadronic calorimeters, and a muon spectrometer (MS).The inner tracking detector, consisting of silicon pixel, silicon microstrip, and transition radiation tracking systems, covers the pseudorapidity range |η| < 2.5.The innermost pixel layer, the insertable B-layer (IBL) [25], was added between the first and second runs of the LHC, around a new, narrower and thinner beam pipe.The IBL improves the experiment's ability to identify displaced vertices and thereby improves the performance of the b-tagging algorithms [26].Lead/liquid-argon (LAr) sampling calorimeters with high granularity provide energy measurements of EM showers.A hadronic steel/scintillator-tile calorimeter covers the 1 ATLAS uses a right-handed coordinate system with its origin at the nominal interaction point (IP) in the centre of the detector and the z-axis along the beam pipe.The x-axis points from the IP to the centre of the LHC ring, and the y-axis points upwards.Cylindrical coordinates (r, φ) are used in the transverse plane, φ being the azimuthal angle around the z-axis.
The pseudorapidity is defined in terms of the polar angle θ as η = − ln tan(θ/2).Angular distance is measured in units of ∆R ≡ (∆η central pseudorapidity range (|η| < 1.7), while a LAr hadronic endcap calorimeter provides coverage over 1.5 < |η| < 3.2.The endcap and forward regions are instrumented with LAr calorimeters for both the EM and hadronic energy measurements up to |η| = 4.9.The MS surrounds the calorimeters and is based on three large air-core toroidal superconducting magnets, each with eight coils, and with bending power in the range of 2.0 to 7.5 T m.It includes a system of precision tracking chambers, covering the region |η| < 2.7, and fast detectors for triggering purposes, covering the range |η| < 2.4.
A two-level trigger system is used to select interesting events [27].The first-level trigger is implemented in hardware and uses a subset of the total available information to make fast decisions to accept or reject an event, aiming to reduce the rate to around 100 kHz.This is followed by the software-based highlevel trigger (HLT), which runs reconstruction and calibration software, reducing the event rate to about 1 kHz.
3 Data and simulated samples

Data selection
This analysis uses the pp data sample collected at √ s = 13 TeV with the ATLAS detector in 2015 and 2016, corresponding to an integrated luminosity of 36.1 fb −1 .All events for which the detector and trigger system satisfy a set of data-quality criteria are considered.Events are selected using a diphoton trigger, which requires two photon candidates with transverse energy (E T ) above 35 and 25 GeV, respectively.The overall trigger selection efficiency is greater than 99% for events having the characteristics to satisfy the event selection detailed in Section 4.

Simulated event samples
Non-resonant production of Higgs boson pairs via the gluon-gluon fusion process was simulated at nextto-leading-order (NLO) accuracy in QCD using an effective field theory (EFT) approach, with form factors for the top-quark loop from HPAIR [28,29] to approximate finite top-quark mass effects.The simulated events were reweighted to reproduce the m H H spectrum obtained in Refs.[30] and [31], which calculated the process at NLO in QCD while fully accounting for the top-quark mass.The total cross-section is normalised to 33.41 fb, in accordance with a calculation at next-to-next-to-leading order (NNLO) in QCD [32,33].Only the predominant gluon-gluon fusion production mode, which represents over 90% of the SM cross-section, is considered.
Non-resonant BSM Higgs boson pair production with varied κ λ was simulated at LO accuracy in QCD [34] for eleven values of κ λ in the range −10 < κ λ < 10.The total cross-sections for these samples were computed as a function of κ λ at LO accuracy in QCD.A constant NNLO/LO K-factor (2.283) computed at κ λ = 1, was then applied.As the amplitude for Higgs boson pair production can be expressed in terms of κ λ and the top quark's Yukawa coupling, weighted combinations of the simulated samples can produce predictions for other values of κ λ .
Resonant BSM Higgs boson pair production via a massive scalar, was simulated at NLO accuracy for ten different mass points (260, 275, 300, 325, 350, 400, 450, 500, 750 and 1000 GeV) using the narrow-width approximation.For all generated Higgs boson pair samples, both resonant and non-resonant, the branching fractions for H → b b and H → γγ and are taken to be 0.5809 and 0.00227 respectively [32].This analysis is affected both by backgrounds from single-Higgs-boson production and by non-resonant backgrounds with continuum m γγ spectra.Background estimation is carried out using data-driven methods whenever possible; in particular, data are used to estimate the continuum background contribution from SM processes with multiple photons and jets, which constitute the dominant background for this search.Monte Carlo event generators were used for the simulation of different signal hypotheses and the background from SM single Higgs boson production.The major single Higgs boson production channels contributing to the background are gluon-gluon fusion (ggH), associated production with a Z boson (Z H), associated production with a top quark pair (t tH) and associated production with a single top quark (tH).In addition, contributions from vector-boson fusion (VBF H), associated production with a W boson (W H) and associated production with a bottom quark pair (b bH) are also considered.Overall, the largest contributions come from t tH and Z H.More information about these simulated background samples can be found in Ref. [35] and in Table 1.
For all matrix element generators other than S , the resulting events were passed to another program for simulation of parton showering, hadronisation and the underlying event.This is either Herwig++ with the CTEQ6L1 parton distribution function (PDF) set [36] using the UEEE5 set of tuned parameters [37] or P 8 with the NNPDF 2.3 LO PDF set [38] and the A14 set of tuned parameters [39].For all simulated samples except those generated by S , the E G v1.2.0 program [40] was used for modelling the properties of band c-hadron decays.Multiple overlaid pp collisions (pile-up) were simulated with the soft QCD processes of P 8.186 using the A2 set of tuned parameters [41] and the MSTW2008LO PDF set [42].The distribution of the number of overlaid collisions simulated in each event approximately matches what was observed during 2015 and 2016 data-taking.Event-level weights were applied to the simulated samples in order to improve the level of agreement.
The final-state particles were passed either through a G 4 [43] simulation of the ATLAS detector, or through the ATLAS fast simulation framework [44], which has been extensively cross-checked against the G 4 model.The output from this detector simulation step is then reconstructed using the same software as used for the data.A list of the signal and dominant background samples used in the paper is shown in Table 1.

Object and event selection
The photon selection and event selection for the present search follow those in another published ATLAS H → γγ analysis [35].The subsections below detail the selection and identification of all detector-level objects used in the analysis, followed by the event selection criteria and the classification into signal and background control categories.

Object selection
Photon candidates are reconstructed from energy clusters in the EM calorimeter [63].The reconstruction algorithm searches for possible matches between energy clusters and tracks reconstructed in the inner detector and extrapolated to the calorimeter.Well-reconstructed tracks matched to clusters are classified as electron candidates, while clusters without matching tracks are classified as unconverted photon candidates.Clusters matched to a reconstructed conversion vertex or to pairs of tracks consistent with the hypothesis of a γ → e + e − conversion process are classified as converted photon candidates.Photon energies are Table 1: Summary of the event generators and PDF sets used to model the signal and the main background processes.The SM cross-sections σ for the Higgs boson production processes with m H = 125.09GeV are also given separately for √ s = 13 TeV, together with the orders of the calculation corresponding to the quoted crosssections, which are used to normalise samples.The following generator versions were used: P 8.212 [45] (event generation), P 8.186 [46] (pile-up overlay); Herwig++ 2.7.1 [47,48]; P -B r3154 (base) v2 [49][50][51]; M G 5_aMC@NLO 2.4.3 [52]; S 2.2.1 [53][54][55][56].The PDF sets used are: CT10 NLO [57], CTEQ6L1 [36], NNPDF 2.3 LO [38], NNPDF 3.0 LO [58], PDF4LHC15 [59].For the BSM signals, no crosssection is specified as it is the parameter of interest for measurement.For the S background, no cross-section is used, as the continuum background is fit in data.determined by summing the energies of all cells belonging to the associated cluster.Simulation-based corrections are then applied to account for energy losses and leakage outside the cluster [63].The absolute energy scale and response resolution is calibrated using Z → e + e − events from data.For the photons considered in this analysis, the reconstruction efficiency for both the converted and unconverted photons is 97%.Photon identification is based on the lateral and longitudinal energy profiles of EM showers measured in the calorimeter [64].The reconstructed photon candidates must satisfy tight photon identification criteria.These exploit the fine granularity of the first layer of the EM calorimeter in order to reject background photons from hadron decays.The photon identification efficiency varies as a function of E T and |η| and is typically 85-90% (85-95%) for unconverted (converted) photons in the range of 30 GeV < E T < 100 GeV.
All photon candidates must satisfy a set of calorimeter-and track-based isolation criteria designed to reject the background from jets misidentified as photons and to maximise the signal significance of simulated H → γγ events against the continuum background.The calorimeter-based isolation variable E iso T is defined as the sum of the energies of all topological clusters of calorimeter cells within ∆R = 0.2 of the photon candidate, excluding clusters associated to the photon candidate.The track-based isolation variable p iso T is defined as the sum of the transverse momenta (p T ) of all tracks with p T > 1 GeV within ∆R = 0.2 of the photon candidate, excluding tracks from photon conversions and tracks not associated with the interaction vertex.Candidates with E iso T larger than 6.5% of their transverse energy or with p iso T greater than 5% of their transverse energy are rejected.The efficiency of this isolation requirement is approximately 98%.Photons satisfying the isolation criteria are required to fall within the fiducial region of the EM calorimeter defined by |η| < 2.37, excluding a transition region between calorimeters (1.37 < |η| < 1.52).Among the photons satisfying the isolation and fiducial criteria, the two with the highest p T are required to have E T /m γγ > 0.35 and 0.25, where m γγ is the invariant mass of the diphoton system.
A neural network, trained on a simulated gluon-gluon fusion single-Higgs-boson sample, is used to select the primary vertex most likely to have produced the diphoton pair.The algorithm uses the directional information from the calorimeter and, in the case of converted photons, tracking information, to extrapolate the photon trajectories back to the beam axis.Additionally, vertex properties such as the sum of the squared transverse momenta or the scalar sum of the transverse momenta of the tracks associated with the vertex, are used as inputs to this algorithm.Due to the presence of two high-p T jets in addition to the two photons, the efficiency for selecting the correct primary vertex is more than 85%.All relevant tracking and calorimetry variables are recalculated with respect to the chosen primary vertex [35].
Jets are reconstructed via the FastJet package [65] from topological clusters of energy deposits in calorimeter cells [66], using the anti-k t algorithm [67] with a radius parameter of R = 0.4.Jets are corrected for contributions from pile-up by applying an event-by-event energy correction evaluated using calorimeter information [68].They are then calibrated using a series of correction factors, derived from a mixture of simulated events and data, which correct for the different responses to EM and hadronic showers in each of the components of the calorimeters [69].Jets that do not originate from the diphoton primary vertex, as detailed above, are rejected using the jet vertex tagger (JVT) [70], a multivariate likelihood constructed from two track-based variables.A JVT requirement is applied to jets with 20 GeV < p T < 60 GeV and |η| < 2.4.This requirement is 92% efficient at selecting jets arising from the chosen primary vertex.Jets are required to satisfy |η| < 2.5 and p T > 25 GeV; any jets among these that are within ∆R = 0.4 of an isolated photon candidate or within ∆R = 0.2 of an isolated electron candidate are discarded.
The selected jets are classified as b-jets (those containing b-hadrons) or other jets using a multivariate classifier taking impact parameter information, reconstructed secondary vertex position and decay chain reconstruction as inputs [26,71].Working points are defined by requiring the discriminant output to exceed a particular value that is chosen to provide a specific b-jet efficiency in an inclusive t t sample.Correction factors derived from t t events with final states containing two leptons are applied to the simulated event samples to compensate for differences between data and simulation in the b-tagging efficiency [72].The analysis uses two working points which have a b-tagging efficiency of 70% (60%), a c-jet rejection factor of 12 (35) and a light-jet rejection factor of 380 (1540) respectively.Muons [73] within ∆R = 0.4 of a b-tagged jet are used to correct for energy losses from semileptonic b-hadron decays.This correction improves the energy measurement of b-jets and improves the signal acceptance by 5-6%.

Event selection and categorisation
Events are selected for analysis if there are at least two photons and at least two jets, one or two of which are tagged as b-jets, which satisfy the criteria outlined in Section 4.1.The diphoton invariant mass is initially required to fall within a broad mass window of 105 GeV < m γγ < 160 GeV.In order to remain orthogonal to the ATLAS search for HH → b bb b [19], any event with more than two b-jets using the 70% efficient working point is rejected, before the remaining events are divided into three categories.The 2-tag signal category consists of events with exactly two b-jets satisfying the requirement for the 70% efficient working point.Another signal category is defined using events failing this requirement but nevertheless containing exactly one b-jet identified using a more stringent (60% efficient) working point.Here the second jet, which is in this case not identified as a b-jet, is chosen using a boosted decision tree (BDT).Different BDTs are used when applying the loose and tight kinematic selections.These are optimised using simulated continuum background events as well as signal events from lower-mass or higher-mass resonances, respectively.The BDTs use kinematic variables, namely jet p T , dijet p T , dijet mass, jet η, dijet η and the ∆η between the selected jets, as well as information about whether each jet satisfied less stringent b-tagging criteria.The ranking of the jets from best to worst in terms of closest match between the dijet mass and m H , highest jet p T and highest dijet p T are also used as inputs.The jet with the highest BDT score is selected and the event is included in the 1-tag signal category.The efficiency with which the correct jet is selected by this BDT is 60-80% across the range of resonant and non-resonant signal hypotheses considered in this paper.If the event contains no b-jet from either working point, the event is not directly used in the analysis, but is instead reserved for a 0-tag control category, which is used to provide data-driven estimates of the background shape in the signal categories.
Further requirements are then made on the p T of the jets and on the mass of the dijet system, which differ for the loose and tight selections.In the loose selection, the highest-p T jet is required to have p T > 40 GeV, and the next-highest-p T jet must satisfy p T > 25 GeV, with the invariant mass of the jet pair (m j j ) required to lie between 80 and 140 GeV.For the tight selection, the highest-p T and the next-highest-p T jets are required to have p T > 100 GeV and p T > 30 GeV, respectively, with 90 GeV < m j j < 140 GeV.Finally, in the resonant search, the diphoton invariant mass is required to be within 4.7 (4.3) GeV of the Higgs boson mass for the loose (tight) selection.This additional selection on m γγ is optimised to contain at least 95% of the simulated Higgs boson pair events for each mass hypothesis.
For non-resonant Higgs boson pair production, among events in the 2-tag category, the efficiency with which the kinematic requirements are satisfied is 10% and 5.8% for the loose and tight selections, respectively.In the 1-tag category, the corresponding efficiencies are 7.2% and 3.9%, which are slightly lower than for the 2-tag category due to the lower probability of selecting the correct jet pair.For the resonant analysis, efficiencies range from 6% to 15.4% in the 2-tag category and from 5.1% to 12.3% in the 1-tag category for 260 GeV < m X < 1000 GeV.
Due to the differing jet kinematics, the signal acceptance is lower in all cases for the generated NLO signal than for a LO signal.The acceptance of the LO prediction is approximately 15% higher when using the tight selection and 10% higher when using the loose selection.
In the resonant analysis, before reconstructing the four-object mass, m γγ j j , the four-momentum of the dijet system is scaled by m H /m j j .As shown in Figure 2, this improves the four-object mass resolution by 60% on average across the resonance mass range of interest.It also modifies the shape of the non-resonant background in the region below 270 GeV.After the correction, the m γγ j j resolution is approximately 3% for all signal hypotheses considered in this paper.

Signal and background modelling
Both the resonant and non-resonant searches for Higgs boson pairs proceed by performing unbinned maximum-likelihood fits to the data in the 1-tag and 2-tag signal categories simultaneously.The nonresonant search involves a fit to the m γγ distribution, while the search for resonant production uses the m γγ j j distribution.The signal-plus-background fit to the data uses parameterised forms for both the signal and background probability distributions.These parameterised forms are determined through fits to simulated samples.
As the loose selection is used for resonances with m X ≤ 500 GeV and the tight selection for resonances with m X ≥ 500 GeV, different ranges of m γγ j j are used in each case.For the loose (tight) selection, only events with m γγ j j in the range 245 GeV < m γγ j j < 610 GeV (335 GeV < m γγ j j < 1140 GeV) are considered.These ranges are the smallest that contain over 95% of all of the simulated signal sample events with m X below, or above, 500 GeV respectively.

Background composition
Contributions to the continuum diphoton background originate from γγ, γ j, jγ and j j sources produced in association with jets, where j denotes jets misidentified as photons and γ j and jγ differ by the jet faking the sub-leading or the leading photon candidate respectively.These are determined from data using a double two-dimensional sideband method (2x2D) based on varying the photon identification and isolation criteria [74,75].The number and relative fraction of events from each of these sources is calculated separately for the 1-and 2-tag categories.In each case the contribution from γγ events is in the range 80-90%.
The choice of functional form used to fit the background in the final likelihood models is derived using simulated events.Continuum γγ events were simulated using the S event generator as described in Section 3. As this prediction from S does not provide a good description of the m γγ spectrum in data, the mismodelling is corrected for using a data-driven reweighting function.
In the 0-tag control category, the number of events in data is high enough that the 2x2D method can be applied in bins of m γγ .The events generated by S can also be divided into γγ, γ j, jγ and j j sources based on the same photon identification and isolation criteria as used in data.For each of these sources, the m γγ distributions for both S and the data are fit using an exponential function and the ratio of the two fit results is taken as an m γγ -dependent correction function.The size of the correction is less than 5% for the majority of events.These reweighting functions are then applied in the 1-tag and 2-tag signal categories to correct the shape of the S prediction.The fractional contribution from the different continuum background sources is fixed to the relative proportions derived in data with the 2x2D method.Finally, the overall normalisation is chosen such that, in the disjoint sideband region 105 GeV < m γγ < 120 GeV and 130 GeV < m γγ < 160 GeV, the total contribution from all backgrounds is equal to that from data.
The contribution from γγ produced in association with jets is further divided in accord with the flavours of the two jets (for example bb, bc, c + light jet).This decomposition is taken directly from the proportions predicted by the S event generator and no attempt is made to classify the data according to jet flavour.The continuum background in the 1-tag category comes primarily from γγbj events (∼60%) and in the 2-tag category from γγbb events (∼80%).A comparison between data in the 0-tag control category and this data-driven prediction of the total background can be seen in Figure 3

Signal modelling for the non-resonant analysis
The shape of the diphoton mass distribution in HH → γγ j j events is described by the double-sided Crystal Ball function [35], consisting of a Gaussian core with power-law tails on either side.The parameters of this model are determined through fits to the simulated non-resonant SM HH sample described in Section 3.2.

Background modelling for the non-resonant analysis
For the non-resonant analysis, the continuum m γγ background is modelled using a functional form obtained from a fit to the data.The potential bias arising from this procedure, termed 'spurious signal', is estimated by performing signal-plus-background fits to the combined continuum background from simulation, including the γγ, γ j, jγ and j j components [35].The maximum absolute value of the extracted signal, for a signal in the range 121 GeV < m γγ < 129 GeV, is taken as the bias.This method is used to discriminate between different potential fit functions -the function chosen is the one with the smallest spurious signal bias.If multiple functions have the same bias, the one with the smallest number of parameters is chosen.The first-order exponential function has the smallest bias among the seven functions considered and is therefore chosen.The background from single Higgs boson production is described using a double-sided Crystal Ball function, with its parameters determined through fits to the appropriate simulated samples.

Signal modelling for the resonant analysis
For each resonant hypothesis, a fit is performed to the m γγ j j distribution of the simulated events in a window around the nominal m X .The shape of this distribution is described using a function consisting of a Gaussian core with exponential tails on either side.A simultaneous fit to all signal samples is carried out in which each of the model parameters is further parameterised in terms of m X .This allows the model to provide a prediction for any mass satisfying 260 GeV < m X < 1000 GeV, where these boundaries reflect the smallest and largest m X values among the generated samples described in Section 3.2.

Background modelling for the resonant analysis
For the resonant analysis, a spurious-signal study is also carried out, using the m γγ j j distribution for events within the m γγ window described in Section 4.2.The background used to evaluate the spurious-signal contribution is a combination of the continuum m γγ backgrounds together with the single Higgs boson backgrounds.
Due to the different m γγ j j ranges used with the loose and tight selections, the shape of the m γγ j j distribution differs between these two cases and hence different background functions are considered.For the loose (tight) mass selection, the Novosibirsk function2 (exponential function) has the smallest bias among the three (four) functions considered and is therefore chosen.As a result, for low-mass resonances both the signal and background fit functions have a characteristic peaked shape.This degeneracy could potentially introduce a bias in the extracted signal cross-section.In order to stabilise the background fit, nominal values of the shape parameters are estimated by fitting to the simulated events described in Section 5.1.
The shape is then allowed to vary in the likelihood to within the statistical covariance of this template fit.Experimental systematics on the background shape have a small effect and are neglected.The normalisation of the background is estimated by interpolating the m γγ sideband data.Additionally, a simple bias test is performed by drawing pseudo-data sets from the overall probability distribution created by combining the Novosibirsk background function with the signal function.For each mass point and each value of the injected signal cross-section, fits are performed on the ensemble of pseudo-data sets and the median extracted signal cross-section is recorded.For resonances with masses below 400 GeV, a small correction is applied to remove the observed bias.The correction is less than ±0.05 pb everywhere and a corresponding uncertainty of ±0.02 pb in this correction is applied to the extracted signal cross-section.The corresponding uncertainty in the number of events in each category is roughly half that of the spurious signal.

Systematic uncertainties
Although statistical uncertainties dominate the sensitivity of this analysis given the small number of events, care is taken to make the best possible estimates of all systematic uncertainties, as described in more detail below.

Theoretical uncertainties
Theoretical uncertainties in the production cross-section of single Higgs bosons are estimated by varying the renormalisation and factorisation scales.In addition, uncertainties due to the PDF and the running of the QCD coupling constant (α S ) are considered.The scale uncertainties reach a maximum of +20% −24% and the PDF+α S uncertainty is not more than ±3.6% [32].An uncertainty in the rate of Higgs boson production with associated heavy-flavour jets is also considered.A 100% uncertainty is assigned to the ggH and W H production modes, motivated by studies of heavy-flavour production in association with top-quark pairs [77] and W boson production in association with b-jets [78].No heavy-flavour uncertainty is assigned to the Z H and t tH production modes, where the dominant heavy-flavour contribution is already accounted for in the LO process.Finally, additional theoretical uncertainties in single Higgs boson production from uncertainties in the H → γγ and H → b b branching fractions are +2.9%−2.8% and ±1.7%, respectively [32].
The same sources of uncertainty are considered on the SM HH signal samples.The effect of scale and PDF+α S uncertainties on the NNLO cross-section for SM Higgs boson pair production are 4-8% and 2-3% respectively.In addition, an uncertainty of 5% arising from the simplifications used in the EFT approximation is taken into account [30].
In the search for resonant Higgs boson pair production, uncertainties arising from scale and PDF uncertainties, which primarily affect the signal yield, are neglected.For this search, the SM non-resonant HH production is considered as a background, with an overall uncertainty on the cross-section of +7% −8% .Interference between SM HH and the BSM signal is neglected.For all samples, systematic differences between alternative models of parton showering and hadronisation were considered and found to have a negligible impact.

Experimental uncertainties
The systematic uncertainty in the integrated luminosity for the data in this analysis is 2.1%.It is derived following a methodology similar to that detailed in Ref. [79], using beam-separation scans performed in 2015 and 2016.
The efficiency of the diphoton trigger is measured using bootstrap methods [27], and is found to be 99.4% with a systematic uncertainty of 0.4%.Uncertainties associated with the vertex selection algorithm have a negligible impact on the signal selection efficiency.
Differences between data and simulation give rise to uncertainties in the calibration of the photons and jets used in this analysis.As the continuum backgrounds are estimated from data, these uncertainties are applied only to the signal processes and to the single-Higgs-boson background process.In order to calculate the impact of the experimental uncertainties, signal and background fits are performed as described in Section 5, with the relevant observables varied within their uncertainties.Changes in the peak location (m peak ), width (σ peak ) and expected yield in m γγ (m γγ j j ) for the non-resonant (resonant) model, relative to the nominal fits, are extracted.The tail parameters are kept at their nominal values in these modified fits.For the resonant analysis, systematic uncertainties are evaluated for each m X and the maximum across the range is taken as a conservative uncertainty.
The dominant yield uncertainties are listed in Table 2. Uncertainties in the photon identification and isolation directly affect the diphoton selection efficiency; jet energy scale and resolution uncertainties affect the m bb window acceptance [69,80,81], while flavour-tagging uncertainties lead to migration of events between categories.Uncertainties in the peak location (width), which are mainly due to uncertainties in the photon energy scale (energy resolution), are about 0.2-0.6%(5-14%) for both the single-Higgs-boson and Higgs boson pair samples in the resonant and non-resonant analyses.
The spurious signal for the chosen background model, as defined in Sections 5.3 and 5.5, is assessed as an additional uncertainty in the total number of signal events in each category.In the 2-tag (1-tag) category, the uncertainty corresponds to 0.63 (0.25) events for the non-resonant analysis, 0.58 (2.06) events for the resonant analysis with the loose selection, and 0.21 (0.89) events for the resonant analysis with the tight selection.
Finally, as described in Section 5.5, an m X -dependent correction to the signal cross-section, together with its associated uncertainty, is applied in the case of the resonant analysis at low masses to adjust for a small degeneracy bias.

Results
The observed data are in good agreement with the data-driven background expectation, as summarised in Table 3. Across all categories, the number of observed events in data is compatible with the number of expected background events within the calculated uncertainties.
The signal and background models described in Section 5 are used to construct an unbinned likelihood function which is maximised with respect to the observed data.The models for the 1-tag and 2-tag categories are simultaneously fit to the data.In each case the parameter of interest is the signal crosssection, which is related in the likelihood model to the number of signal events after considering the integrated luminosity, branching ratio, phase-space acceptance and detection efficiency of the respective categories.The likelihood model also includes a number of nuisance parameters associated with the background shape and normalisation, as well as the theoretical and experimental systematic uncertainties described in Section 6.These nuisance parameters are included in the likelihood as terms which modulate their respective parameters, such as signal yield, along with a constraint term which encodes the scale of the uncertainty by reducing the likelihood when the parameter is pulled from its nominal value.In general the nuisance parameter for each systematic uncertainty has a correlated effect between 1-tag and Table 2: Summary of dominant systematic uncertainties affecting expected yields in the resonant and non-resonant analyses.For the non-resonant analysis, uncertainties in the Higgs boson pair signal and SM single-Higgs-boson backgrounds are presented.For the resonant analysis, uncertainties on the Higgs boson pair signal for the loose and tight selections are presented.Sources marked '-' and other sources not listed in the table are negligible by comparison.No systematic uncertainties related to the continuum background are considered, since this is derived through a fit to the observed data.Figure 4 shows the observed diphoton invariant mass spectra for the non-resonant analysis with the loose (top) and tight (bottom) selections.The best-fit Higgs boson pair cross-section is 0.04 +0.43 −0.36 (−0.21 +0.33 −0.25 ) pb for the loose (tight) selection.Figure 5 shows the observed four-body invariant mass spectra for the resonant analysis in the loose (top) and tight (bottom) selections.Maximum-likelihood background-only fits are also shown.The largest discrepancy between the background-only hypothesis and the data manifests as an excess at 480 GeV with a local significance of 1.2 σ.The results are also interpreted as upper limits on the relevant Higgs boson pair production cross-sections.
Exclusion limits are set on Higgs boson pair production in the γγb b final state.The limits for both resonant and non-resonant production are calculated using the CL S method [82], with the likelihood-based test statistic qµ which is suitable when considering signal strength µ ≥ 0 [83,84].Because both the expected and observed numbers of events are small in the case of the resonant analysis, test-statistic distributions are evaluated by pseudo-experiments generated by profiling the nuisance parameters of the likelihood model on the observed data, as described in Ref. [84].Better limits on κ λ are expected with the loose selection, Table 3: Expected and observed numbers of events in the 1-tag and 2-tag categories for events passing the selection for the resonant analysis, including the m γγ requirement.The event numbers quoted for the SM Higgs boson pair signal assume that the total production cross-section is 33.41 fb.The uncertainties on the continuum background are those arising from the fitting procedure.The uncertainties on the single-Higgs-boson and Higgs boson pair backgrounds are the systematics from experimental and theoretical sources.The loose and tight selections are not orthogonal.whereas for the SM value κ λ = 1 the strongest limits on the Higgs boson pair cross-section are derived from the tight selection.

Exclusion limits on non-resonant H H production
The 95% confidence level (CL) upper limit for the non-resonant Higgs boson pair cross-section is obtained using the tight selection.Figure 6(a) shows this upper limit, together with ±1σ and ±2σ uncertainty bands.The observed (expected) value is 0.73 (0.93) pb.As a multiple of the SM production cross-section, the observed (expected) limits are 22 (28).The limits and the ±1σ band around each expected limit are presented in Table 4.
Table 4: The 95% CL observed and expected limits on the Higgs boson pair cross-section in pb and as a multiple of the SM production cross-section.The ±1σ band around each 95% CL limit is also indicated.

Exclusion limits on λ H H H
Varying the Higgs boson self-coupling, λ H H H , affects both the total cross-section of the non-resonant Higgs boson pair production and the event kinematics, affecting the signal selection efficiency.In the non-resonant analysis, results are interpreted in the context of κ λ , using the loose selection, which is more sensitive for the range of κ λ values accessible with this data set.As discussed in Section 3.2, the samples used for this interpretation were generated at LO.The 95% CL limits on σ gg→H H are shown together with

Exclusion limits on resonant H H production
The 95% CL limits on resonant Higgs boson pair production are shown in Figure 7, utilising both the loose and tight selections.The SM HH contribution is considered as part of the background in this search although its inclusion has a negligible impact on the results.For resonance masses in the range 260 GeV < m X < 1000 GeV, the observed (expected) limits range between 1.14 (0.90) pb and  [pb] ATLAS s = 13 TeV, 36.1 fb 1 Observed limit Expected limit Expected limit ±1 Expected limit ±2 Figure 7: The expected and observed 95% CL limits on the resonant production cross-section, σ X × B (X → HH) as a function of m X .The loose selection is used for m X ≤ 500GeV, while the tight selection is used for m X ≥ 500GeV.This is delineated with the blue dashed line.

Conclusions
Searches for resonant and non-resonant Higgs boson pair production in the γγb b final state are performed using 36.1 fb −1 of pp collision data collected at √ s = 13 TeV with the ATLAS detector at the LHC in 2015 and 2016.No significant deviations from the Standard Model predictions are observed.A 95% CL upper limit of 0.73 pb is set on the cross-section for non-resonant production, while the expected limit is 0.93 pb.This observed (expected) limit is 22 (28) times the predicted SM cross-section.The Higgs boson selfcoupling is constrained at 95% CL to −8.2 < κ λ < 13.2 whereas the expected limits are −8.3 < κ λ < 13.2.For resonant production of X → HH → γγb b, a limit is presented for the narrow-width approximation as a function of m X .The observed (expected) limits range between 1.1 pb (0.9 pb) and 0.12 pb (0.15 pb) in the range 260 GeV < m X < 1000 GeV.

Figure 2 :
Figure 2: Reconstructed m γγ j j with (solid lines) and without (dashed lines) the dijet mass constraint, for a subset of the mass points used in the resonant analysis.The examples shown here are for (a) the 2-tag category with the loose selection and (b) the 1-tag category with the tight selection.The effect on the continuum background is also shown in (a).

Figure 3 :
Figure3: The predicted number of background events from continuum diphoton plus jet production (blue), other continuum photon and jet production (orange) and single Higgs boson production (green) is compared with the observed data (black points) in the 0-tag control category for (a) the m γγ distribution with the tight selection and (b) the m γγ j j distribution with the loose selection. Photon

Events / 2 Figure 4 :Figure 5 :
Figure 4: For the non-resonant analysis, data (black points) are compared with the background-only fit (blue solid line) for m γγ in the 1-tag (left) and 2-tag (right) categories with the loose (top) and tight (bottom) selections.Both the continuum γγ background and the background from single Higgs boson production are considered.The lower panel shows the residuals between the data and the best-fit background.

Figure 6 :
Figure6: The expected and observed 95% CL limits on the non-resonant production cross-section σ gg→H H (a) for the SM-optimised limit using the tight selection and (b) as a function of κ λ using the loose selection.In (a) the red line indicates the 95% confidence level.The intersection of this line with the observed, expected, and ±1σ and ±2σ bands is the location of the limits.In (b) the red line indicates the predicted HH cross-section if κ λ is varied but all other couplings remain at their SM values.The red band indicates the theoretical uncertainty of this prediction.
[17] ATLAS Collaboration, Search for Higgs Boson Pair Production in the γγb b Final State Using pp Collision Data at √ s = 8 TeV from the ATLAS Detector, Phys.Rev. Lett.114 (b c Also at CERN, Geneva; Switzerland.d Also at CPPM, Aix-Marseille Université, CNRS/IN2P3, Marseille; France.e Also at Département de Physique Nucléaire et Corpusculaire, Université de Genève, Genève; Switzerland.f Also at Departament de Fisica de la Universitat Autonoma de Barcelona, Barcelona; Spain.g Also at Departamento de Física Teorica y del Cosmos, Universidad de Granada, Granada (Spain); Spain.h Also at Department of Applied Physics and Astronomy, University of Sharjah, Sharjah; United Arab Emirates.i Also at Department of Financial and Management Engineering, University of the Aegean, Chios; Greece.j Also at Department of Physics and Astronomy, University of Louisville, Louisville, KY; United States of America.k Also at Department of Physics and Astronomy, University of Sheffield, Sheffield; United Kingdom.l Also at Department of Physics, California State University, Fresno CA; United States of America.m