Search for nonresonant Higgs boson pair production in final states with two bottom quarks and two photons in proton-proton collisions at √s = 13 TeV

A search for nonresonant production of Higgs boson pairs via gluon-gluon and vector boson fusion processes in final states with two bottom quarks and two photons is presented. The search uses data from proton-proton collisions at a center-of-mass energy of √s = 13 TeV recorded with the CMS detector at the LHC, corresponding to an integrated luminosity of 137 fb⁻¹. No significant deviation from the background-only hypothesis is observed. An upper limit at 95% confidence level is set on the product of the Higgs boson pair production cross section and branching fraction into γγbb̄. The observed (expected) upper limit is determined to be 0.67 (0.45) fb, which corresponds to 7.7 (5.2) times the standard model prediction. This search has the highest sensitivity to Higgs boson pair production to date. Assuming all other Higgs boson couplings are equal to their values in the standard model, the observed coupling modifiers of the trilinear Higgs boson self-coupling κλ and of the coupling between a pair of Higgs bosons and a pair of vector bosons c2V are constrained within the ranges −3.3 < κλ < 8.5 and −1.3 < c2V < 3.5 at 95% confidence level. Constraints on κλ are also set by combining this analysis with a search for single Higgs bosons decaying to two photons, produced in association with top quark-antiquark pairs, and by performing a simultaneous fit of κλ and the top quark Yukawa coupling modifier κt.

This paper describes a search for nonresonant production of pairs of Higgs bosons decaying to γγbb using a data sample of 137 fb⁻¹ collected by the CMS experiment from 2016 to 2018. The γγbb final state has a combined branching fraction of (2.63 ± 0.06) × 10⁻³ [16] for a Higgs boson mass of 125 GeV. This channel is one of the most sensitive to HH production because of the large SM branching fraction of Higgs boson decays to bottom quarks, the good mass resolution of the H → γγ channel, and relatively low background rates.

JHEP03(2021)257
The analysis targets the main HH production modes: ggF and VBF. Both modes are analyzed following similar strategies. After reducing the nonresonant γγbb background and the background coming from single Higgs boson production in association with a top quark-antiquark pair (ttH), the events are categorized into ggF- and VBF-enriched signal regions using a multivariate technique. The signal is extracted from a fit to the invariant masses of the Higgs boson candidates in the bb and γγ final states. The analysis described in this paper advances the previous pp → HH → γγbb search [29] by a factor of four, benefiting equally from the larger collected data set and from innovative analysis techniques. The enhanced sensitivity of the present analysis was achieved by improving the b jet energy resolution with a dedicated energy regression, introducing new multivariate methods for background rejection, optimizing the event categorization, and adding dedicated VBF categories.
Finally, the search for Higgs boson pair production is combined with an independent analysis that targets ttH production, where the Higgs boson decays to two photons [32]. The ttH production cross section depends on yt and also includes a trilinear Higgs boson self-coupling contribution from NLO electroweak corrections [33,34]. The combination enables λHHH and yt to be measured simultaneously and provides constraints applicable to a wide range of theoretical models in which both couplings have anomalous values.
This paper is organized as follows: after a brief description of the CMS detector in section 2, the production of Higgs boson pairs is described in section 3. The data samples and simulation, event reconstruction, and analysis strategy are discussed in sections 4, 5, and 6, respectively. Sections 7 and 8 are dedicated to the description of the background rejection methods. The event categorization is described in section 9. Sections 10 and 11 describe the modeling of the signal and background, respectively. The systematic uncertainties are discussed in section 12. Finally, the results are presented in section 13. The analysis and its results are then summarized in section 14.

The CMS detector
The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter (HCAL), each composed of a barrel and two endcap sections. Forward calorimeters extend the pseudorapidity (η) coverage provided by the barrel and endcap detectors. Muons are detected in gas-ionization chambers embedded in the steel flux-return yoke outside the solenoid.
A more detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in ref. [35].
Events of interest are selected using a two-tiered trigger system [36]. The first level (L1), composed of custom hardware processors, uses information from the calorimeters and muon detectors to select events at a rate of around 100 kHz within a time interval of less than 4 µs. The second level, known as the high-level trigger, consists of a farm of processors running a version of the full event reconstruction software optimised for fast processing, and reduces the event rate to around 1 kHz before data storage [37].
The particle-flow algorithm [38] (PF) aims to reconstruct and identify each individual particle in an event (PF candidate), with an optimised combination of information from the various elements of the CMS detector. The energy of photons is obtained from the ECAL measurement. The energy of electrons is determined from a combination of the track momentum at the main interaction vertex, the corresponding ECAL cluster energy, and the energy sum of all bremsstrahlung photons attached to the track. The momentum of muons is obtained from the curvature of the corresponding track. The energy of charged hadrons is determined from a combination of their momentum measured in the tracker and the matching ECAL and HCAL energy deposits, corrected for zero-suppression effects and for the response function of the calorimeters to hadronic showers. Finally, the energy of neutral hadrons is obtained from the corresponding corrected ECAL and HCAL energies.
For each event, hadronic jets are clustered from these reconstructed particles using the infrared- and collinear-safe anti-kT algorithm [39,40] with a distance parameter of 0.4. Jet momentum is determined as the vectorial sum of all particle momenta in the jet, and is found from simulation to be, on average, within 5 to 10% of the true momentum over the whole pT spectrum and detector acceptance. Additional proton-proton interactions within the same or nearby bunch crossings can contribute additional tracks and calorimetric energy depositions, increasing the apparent jet momentum. To mitigate this effect, tracks identified as originating from pileup vertices are discarded, and an offset correction is applied for remaining contributions. Jet energy corrections are derived from simulation studies so that the average measured energy of jets becomes identical to that of particle-level jets. In situ measurements of the momentum balance in dijet, photon+jet, Z+jet, and multijet events are used to determine any residual differences between the jet energy scale in data and in simulation, and appropriate corrections are made [41]. Additional selection criteria are applied to each jet to remove jets potentially dominated by instrumental effects or reconstruction failures. The jet energy resolution typically amounts to 15-20% at 30 GeV, 10% at 100 GeV, and 5% at 1 TeV [41].
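For reference, the anti-kT clustering of refs. [39,40] is defined by the pairwise and beam distances (standard definitions, reproduced here for convenience):

```latex
d_{ij} = \min\!\left(k_{\mathrm{T},i}^{-2},\, k_{\mathrm{T},j}^{-2}\right)\frac{\Delta R_{ij}^{2}}{R^{2}},
\qquad
d_{iB} = k_{\mathrm{T},i}^{-2},
\qquad
\Delta R_{ij}^{2} = (\Delta\eta_{ij})^{2} + (\Delta\phi_{ij})^{2},
```

with R = 0.4. At each step the smallest distance is found: the pair (i, j) is recombined if it is a d_ij, while object i is declared a jet and removed if it is a d_iB.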
The missing transverse momentum vector pTmiss is computed as the negative vector sum of the transverse momenta of all the PF candidates in an event, and its magnitude is denoted as pTmiss [42]. The pTmiss is modified to account for corrections to the energy scale of the reconstructed jets in the event.

Higgs boson pair production
Nonresonant ggF HH production at the LHC can be described using an effective field theory (EFT) approach [16]. Considering operators up to dimension 6 [17], the tree-level interactions of the Higgs boson are modeled by five parameters. Deviations from the SM values of λHHH and yt are parametrized as κλ ≡ λHHH/λHHH^SM and κt ≡ yt/yt^SM, where the SM values of the couplings are defined as λHHH^SM = mH²/(2v²) and yt^SM = √2 mt/v. Here, v = 246 GeV is the vacuum expectation value of the Higgs field, mH is the Higgs boson mass, and mt ≈ 173 GeV is the top quark mass. The anomalous couplings c2g, c2, and cg are not present in the SM. The corresponding part of the Lagrangian can be written as [43]:

$$ \mathcal{L}_{H} = -\kappa_{\lambda}\,\lambda^{\mathrm{SM}}_{HHH}\, v\, H^{3} - \frac{m_{t}}{v}\left(v + \kappa_{t} H + \frac{c_{2}}{v} H^{2}\right)\left(\bar{t}_{L} t_{R} + \text{h.c.}\right) + \frac{\alpha_{s}}{12\pi v}\left(c_{g} H - \frac{c_{2g}}{2v} H^{2}\right) G^{a}_{\mu\nu} G^{a,\mu\nu}, $$

where tL and tR are the top quark fields with left and right chiralities, respectively. The Higgs boson field is denoted as H, Gµν is the gluon field strength tensor, and h.c. denotes the Hermitian conjugate.

Table 1. Coupling parameter values in the SM and in twelve BSM benchmark hypotheses identified using the method described in ref. [44].
At LO, the full cross section of ggF Higgs boson pair production can be expressed as a polynomial with 15 terms, corresponding to the five individual diagrams shown in figure 1 and their interference. It has been observed in ref. [44] that twelve benchmark hypotheses, described by various combinations of the five parameters (κλ, κt, c2, cg, c2g), are able to represent the distributions of the main kinematic observables of the HH processes over the full phase space. The parameter values for these benchmark hypotheses are summarized in table 1. The simulated samples generated with the EFT parameters that describe the twelve benchmark hypotheses are combined to cover all possible kinematic configurations of the EFT parameter space. The specific kinematic configurations at any point in the full 5D parameter space are obtained through a corresponding reweighting procedure [44,45] that parametrizes the changes in the differential ggF HH cross section.
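The counting of 15 terms follows from the five couplings entering the LO amplitude linearly; schematically (the labeling of the amplitude pieces A₁…A₅ is ours, not the paper's notation):

```latex
\mathcal{A}_{\mathrm{ggF}} \;\propto\; \kappa_{\lambda}\kappa_{t}\,\mathcal{A}_{1} + \kappa_{t}^{2}\,\mathcal{A}_{2} + c_{2}\,\mathcal{A}_{3} + \kappa_{\lambda} c_{g}\,\mathcal{A}_{4} + c_{2g}\,\mathcal{A}_{5},
\qquad
\sigma \;\propto\; \left|\mathcal{A}_{\mathrm{ggF}}\right|^{2},
```

so that |A|² contains 5 squared terms and 10 interference terms, i.e. a polynomial with 15 independent coefficients in the couplings.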
The reweighting procedure described in ref. [44] to obtain the distributions of the kinematic observables is implemented for LO only, and cannot be applied to the higher-order simulation because of the presence of additional partons at the matrix element level. Therefore, the 12 BSM signal benchmark hypotheses summarized in table 1 are investigated using an LO Monte Carlo (MC) simulation, and only anomalous values of κλ and κt are studied with the NLO simulation, as described in section 4.
In the SM, three different couplings are involved in HH production via VBF: λHHH, HVV, and HHVV. The amplitudes corresponding to the left, middle, and right diagrams in figure 2 scale with cVκλ, cV², and c2V, respectively, where c2V and cV are the HHVV and HVV coupling modifiers, normalized to their SM values.

Data sample and simulated events
The analyzed data correspond to a total integrated luminosity of 137 fb⁻¹ and were collected over a data-taking period spanning three years: 35.9 fb⁻¹ in 2016, 41.5 fb⁻¹ in 2017, and 59.4 fb⁻¹ in 2018. Events are selected using double-photon triggers with asymmetric thresholds on the photon transverse momenta of pT^γ1 > 30 GeV and pT^γ2 > 18 (22) GeV for the data collected during 2016 (2017 and 2018). In addition, loose calorimetric identification requirements [46], based on the shape of the electromagnetic shower, the isolation of the photon candidate, and the ratio between the hadronic and electromagnetic energy deposits of the shower, are imposed on the photon candidates at the trigger level.
The ggF HH signal samples are simulated at NLO [47-51], including the full top quark mass dependence [52], using powheg 2.0. The samples are generated for different values of κλ. As shown in ref. [49], the dependence of the ggF HH cross section on κλ and κt can be reconstructed from three terms, corresponding to the diagrams involving κλ and κt and their interference. Therefore, samples corresponding to any point in the (κλ, κt) parameter space can be obtained from a linear combination of any three of the generated MC samples with different values of κλ.
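As an illustration of this linear combination: for fixed κt the cross section is a quadratic polynomial in κλ, so three samples generated at distinct κλ values span the space. A minimal sketch (the generated κλ points used below are illustrative assumptions, not necessarily those generated for the analysis):

```python
import numpy as np

# For fixed kappa_t, sigma(kl) = a + b*kl + c*kl**2, so any three samples
# generated at distinct kl values form a basis of the (a, b, c) space.
kl_samples = np.array([0.0, 1.0, 5.0])          # illustrative generated points
V = np.vander(kl_samples, 3, increasing=True)   # rows: [1, kl, kl**2]

def combination_weights(kl_target):
    """Weights w_i such that sum_i w_i * sample_i models kl_target."""
    rhs = np.array([1.0, kl_target, kl_target**2])
    return np.linalg.solve(V.T, rhs)

w = combination_weights(2.45)
# The weights reproduce each monomial of the target point exactly, hence
# any observable built from the three samples is reproduced as well.
```

Because the first monomial is the constant term, the weights always sum to one; negative weights are allowed and simply encode destructive interference between the basis samples.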
In addition, LO signal samples are generated for the BSM benchmark hypotheses described in section 3 using MadGraph5_amc@nlo v2.2.2 (2016) or v2.4.2 (2017 and 2018) [53-55]. The simulated LO signal samples, corresponding to the 12 BSM benchmark hypotheses, are added together to increase the number of events, and then reweighted to any coupling configuration (κλ, κt, c2, cg, c2g) using generator-level information on the HH system.
The VBF HH signal samples are generated at LO [53] using MadGraph5_amc@nlo v2.4.2. The simulated samples are generated for different combinations of the coupling modifier values (κλ, cV, c2V). Similarly to what is done for the ggF HH samples generated at NLO, samples corresponding to any point in the (κλ, cV, c2V) parameter space can be obtained from a linear combination of any six of the generated samples.
We apply a global k-factor to the generated ggF HH and VBF HH signal samples to scale the cross sections to NNLO and next-to-NNLO accuracy, respectively. The k-factor is obtained for the cross section prediction in the SM and applied to all considered scenarios. The k-factor for the ggF HH cross section depends on the invariant mass of the two Higgs bosons; within the region of sensitivity of this analysis, however, this effect is covered by the total scale uncertainty.
The dominant backgrounds in this search are irreducible prompt diphoton production (γγ+jets) and the reducible background from γ+jets events, where the jets are misidentified as isolated photons and b jets. Although these backgrounds are estimated using data-driven methods, simulated samples are used for the training of multivariate discriminants and the optimization of the analysis categories. The γγ + jets background is modeled with sherpa v.2.2.1 [56] at LO and includes up to three additional partons at the matrix element level. In addition, a b-enriched diphoton background is generated with sherpa at LO requiring up to two b jets to increase the number of simulated events in the analysis region of interest. The γ + jets background is modeled with pythia 8.212 [57] at LO.

Single Higgs boson production, where the Higgs boson decays to a pair of photons, is considered as a resonant background. These production processes are simulated at NLO precision in QCD using powheg 2.0 [47,58-60] for ggF H (ggH) and VBF H, and MadGraph5_amc@nlo v2.2.2 (2016) or v2.4.2 (2017 and 2018) for ttH, vector boson associated production (VH), and production in association with a single top quark. The cross sections and decay branching fractions are taken from ref. [16]. The contribution from the other single-H decay modes is negligible.
All simulated samples are interfaced with pythia for parton showering and fragmentation with the standard pT-ordered parton shower (PS) scheme. The underlying event is modeled with pythia, using the CUETP8M1 tune for 2016 and the CP5 tune for 2017-2018 [61,62]. PDFs are taken from the NNPDF3.0 [63] NLO (2016) or NNPDF3.1 [64] NNLO (2017 and 2018) sets for all simulated samples except for the signal simulated at LO, for which the PDF4LHC15_NLO_MC set at NLO [63,65-68] is used. The response of the CMS detector is modeled using the Geant4 [69] package. The simulated events include additional pp interactions within the same or nearby bunch crossings (pileup), as observed in the data.
Additionally, the simulated VBF HH signal events are also interfaced with the pythia dipole shower scheme to model initial-state radiation (ISR) and final-state radiation (FSR) [70]. The dipole shower scheme correctly takes into account the structure of the color flow between incoming and outgoing quark lines, and its predictions are found to be in good agreement with the NNLO QCD calculations, as reported in ref. [71]. These simulated samples are used to derive the uncertainties associated with the pythia PS ISR and FSR parameters.

Event reconstruction and selection
The photon candidates are reconstructed from energy clusters in the ECAL not linked to charged-particle tracks (with the exception of converted photons). The photon energies measured by the ECAL are corrected with a multivariate regression technique based on simulation that accounts for radiation lost in material upstream of the ECAL and imperfect shower containment [46]. The ECAL energy scale in data is corrected using simulated Z → ee events, while the photon energy in simulated events is smeared to reproduce the resolution measured in data.
Photons are identified using a boosted decision tree (BDT)-based multivariate analysis (MVA) technique trained to separate photons from jets (photon ID) [46]. The photon ID is trained using variables that describe the shape of the photon electromagnetic shower and the isolation criteria, defined using sums of the transverse momenta of photons, and of charged hadrons, inside a cone of radius ΔR = √((Δη)² + (Δφ)²) = 0.3 around the photon candidate direction, where φ is the azimuthal angle in radians. The imperfect MC simulation modeling of the input variables is corrected to match the data using a chained quantile regression method [72] based on studies of Z → ee events. In this method, a set of BDTs is trained to predict the cumulative distribution function for a given input. Its prediction is conditional upon the three kinematic variables (pT, |η|, φ) and the global event energy density [46], which are the input variables to the BDTs. The corrections are then applied to the simulated photons such that the predicted cumulative distribution function of the simulated variables is morphed onto the one observed in data.
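The morphing step can be illustrated with a simple unconditional quantile mapping (a sketch only: the actual method predicts conditional CDFs with chained BDTs, while here we match empirical CDFs of toy Gaussian distributions, which are our assumption):

```python
import numpy as np

# Map a simulated variable onto the data distribution by matching
# cumulative distribution functions: x -> F_data^(-1)(F_MC(x)).
rng = np.random.default_rng(0)
mc = rng.normal(0.0, 1.2, 10_000)    # toy simulated shower-shape variable
data = rng.normal(0.1, 1.0, 10_000)  # the same variable in toy "data"

def morph(x, source, target):
    """Quantile-to-quantile correction F_target^(-1)(F_source(x))."""
    # empirical CDF value of each x under the source sample
    u = np.searchsorted(np.sort(source), x) / len(source)
    # corresponding quantile of the target sample
    return np.quantile(target, np.clip(u, 0.0, 1.0))

corrected = morph(mc, mc, data)  # simulated values morphed onto the data shape
```

After morphing, the quantiles of the corrected simulation track those of the data by construction; the chained version of ref. [72] applies this map variable by variable, conditioning each BDT on the previously corrected inputs.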
Events are required to have at least two identified photon candidates within the ECAL and tracker fiducial region (|η| < 2.5), excluding the ECAL barrel-endcap transition region (1.44 < |η| < 1.57), because the reconstruction of a photon object in this region is not optimal. The photon candidates are required to pass the following criteria: 100 < mγγ < 180 GeV, pT^γ1/mγγ > 1/3, and pT^γ2/mγγ > 1/4, where mγγ is the invariant mass of the photon candidates. When more than two photon candidates are found, the photon pair with the highest transverse momentum pT^γγ is chosen to construct the Higgs boson candidate. The primary pp interaction vertex in the event is identified using a multivariate technique based on a BDT, following the same approach described in ref. [73]. The BDT is trained on simulated ggH events and takes as inputs observables related to tracks recoiling against the identified diphoton system. The efficiency of the correct vertex assignment is greater than 99.9%, thanks to the requirement of at least two jets in the γγbb final state.
Jet candidates are required to have pT > 25 GeV and |η| < 2.4 (2.5) for 2016 (2017-2018), and to be separated from the identified photons by a distance of ΔRγj ≡ √((Δηγj)² + (Δφγj)²) > 0.4. The jet η range is extended for the 2017 and 2018 data-taking years because of the new CMS pixel detector installed during the Phase-1 upgrade [74]. In addition, identification criteria are applied to remove spurious jets associated with calorimeter noise [75]. Jets from the hadronization of b quarks are tagged with a secondary vertex algorithm, DeepJet, based on the score from a deep neural network (DNN) [76,77]. We will refer to the output of this DNN as the b tagging score.
In addition to the standard CMS jet energy corrections [78], a b jet energy regression [79] is used to improve the energy resolution of b jets and, therefore, the mjj resolution. For each of the Higgs boson candidate jets, a DNN trained on jet properties simultaneously provides a b jet energy correction and a resolution estimator.
In events with more than two jets, the Higgs boson candidate is reconstructed from the two jets with the highest b tagging scores. The dijet invariant mass is required to satisfy 70 < mjj < 190 GeV.
An additional regression was developed specifically for the γγbb final state to further improve the dijet invariant mass resolution. This regression exploits the fact that there is no genuine missing transverse momentum from the hard-scattering process in the γγbb final state, and follows an approach similar to that used in ref. [29]. The regression targets the dijet invariant mass at the generator level, and is trained using the kinematic properties of the event and pTmiss. The regression is trained on a simulated sample of b-enriched γγ + jets events.
The two regression techniques were validated on data collected by the CMS experiment. The two-step regression technique improves the dijet invariant mass resolution of the SM HH signal by about 20%, and the mjj peak position is shifted by 5.5 GeV (5%) closer to the expected Higgs boson mass.

To select events corresponding to HH production via VBF, additional requirements are imposed. The VBF process is characterized by the presence of two additional energetic jets, corresponding to two quarks, one from each of the colliding protons, scattered away from the beam line. These "VBF-tagged" jets are expected to have a large pseudorapidity separation, |Δηjj^VBF|, and a large dijet invariant mass, mjj^VBF. VBF-tagged jets are required to have pT > 40 (30) GeV for the leading (subleading) jet, |η| < 4.7, and to be separated from the selected photon and b jet candidates by ΔRγj > 0.4 and ΔRbj > 0.4. Jets must also pass an identification criterion designed to reduce the number of selected jets originating from pileup [75]. The dijet pair with the highest dijet invariant mass mjj^VBF is selected as the two VBF-tagged jets. We will refer to these requirements as the "VBF selection criteria".

Analysis strategy
To improve the sensitivity of the search, MVA techniques are used to distinguish the ggF and VBF HH signal from the dominant nonresonant background. The output of the MVA classifiers is then used to define mutually exclusive analysis categories targeting VBF and ggF HH production. The HH signal is extracted from a fit to the invariant masses of the two Higgs boson candidates in the (mγγ, mjj) plane, performed simultaneously in all categories.
We study the properties of the HH system, built from the reconstructed diphoton and dijet candidates, to identify observables that can help distinguish between the signal and background. The invariant mass distributions are shown in figure 3 for diphoton and dijet pairs in data and in signal and background simulation after imposing the selection criteria described in section 5. The signal has a peaking distribution in mγγ and mjj. The data distribution, dominated by the γγ + jets and γ + jets backgrounds, exhibits a falling spectrum because of the nonresonant nature of these processes. In this analysis, these characteristics are used to extract the signal via a fit to mγγ and mjj.
The distribution of MX, defined as MX = mγγjj − (mjj − mH) − (mγγ − mH), where mγγjj is the invariant mass of the two Higgs boson candidates and mH = 125 GeV is the Higgs boson mass, is particularly sensitive to different values of the couplings described in section 3. The MX distribution is less dependent on the dijet and diphoton energy resolutions than mγγjj if the dijet and diphoton pairs originate from a Higgs boson decay [80]. In figure 4, the distribution of MX is shown for several BSM benchmark hypotheses affecting ggF HH production (described in table 1) and for different values of c2V affecting the VBF HH production mode. The SM HH process exhibits a broad structure in MX, induced by the interference between different processes contributing to HH production and shaped by the analysis selection. The signals with c2V = 0 and c2V = 2 have a much harder spectrum than the SM VBF HH signal.

The ttH background rejection
Single Higgs boson production is an important resonant background in the γγbb final state, with ttH production being dominant in high-purity signal regions. To reduce the ttH background contamination, a dedicated classifier (ttHScore) was developed. The classifier is trained on a mixture of SM HH events and events generated for the twelve BSM benchmark hypotheses (described in table 1) as signal, and ttH events as background. The discriminant uses a combination of low-level information from the individual PF candidates and high-level features describing kinematic properties of the event. The kinematic variables used in the training can be classified in three groups: angular variables, variables to distinguish semileptonic decays of W bosons produced in the top quark decay, and variables to distinguish hadronic decays of W bosons. The ttHScore discriminant is implemented with a DNN combining feed-forward and long short-term memory neural networks [81], based on the topology-classifier architecture introduced in ref. [82]. The network is implemented in Keras [83] using the TensorFlow [84] backend, and the hyperparameters are chosen through Bayesian optimization. The ttHScore output is shown in figure 5 (left) for data and simulated events. The events entering the analysis are required to pass a selection based on this classifier, which is optimized as described in section 9.

Background reduction in the ggF HH signal region
An MVA discriminant implemented with a BDT is used to separate the ggF HH signal from the dominant nonresonant γγ + jets and γ + jets backgrounds. We select several discriminating observables to be used in the training. They can be classified in three groups: kinematic variables, object identification variables, and object resolution variables. The first group exploits the kinematic properties of the HH system, the second helps to separate the signal from the reducible γ + jets background, and the third takes into account the resonant nature of the γγ and bb final states for signal. The following discriminating variables were chosen:
• The H candidate kinematic variables: pT^γ/mγγ and pT^j/mjj for the leading and subleading photons and jets, where pT^γ and pT^j are the transverse momenta of the selected photon and jet candidates.
• The HH transverse balance: pT^γγ/mγγjj and pT^jj/mγγjj, where pT^γγ and pT^jj are the transverse momenta of the diphoton and dijet candidates.
• Helicity angles: |cos θHH^CS|, |cos θjj|, and |cos θγγ|, where |cos θHH^CS| is the Collins-Soper angle [85] between the direction of the H → γγ candidate and the average beam direction in the HH center-of-mass frame, while |cos θjj| and |cos θγγ| are the angles between one of the Higgs boson decay products and the direction defined by the Higgs boson candidate.
• Angular distance: the minimum ΔRγj between a photon and a jet, ΔRγj^min, considering all combinations of objects passing the selection criteria, and the ΔRγj of the other photon-jet pair not used in the ΔRγj^min calculation.
• b tagging: the b tagging score of each jet in the dijet candidate.
• photon ID: photon identification variables for leading and subleading photons.
• Object resolution: the energy resolutions of the leading and subleading photons and jets obtained from the photon [46] and b jet [79] energy regressions, and the mass resolution estimators for the diphoton and dijet candidates.
The BDT is trained using the xgboost [86] software package using a gradient boosting algorithm. The γγ+jets and γ+jets MC samples are used as background, while an ensemble of SM HH and the 12 BSM HH benchmark hypotheses listed in table 1 is used as signal.
Training on an ensemble of BSM and SM HH signals makes the BDT sensitive to a broad spectrum of theoretical scenarios. During the training, signal events are weighted with the product of the inverse mass resolution of the diphoton and dijet systems. These resolutions   are obtained using the per-object resolution estimators provided by the energy regressions developed for photons and b jets. In the training, the mass dependence of the classifier is removed by using only dimensionless kinematic variables. The inverse resolution weighting at training time improves the performance by bringing back the information about the resonant nature of the signal. Independent training and testing samples are created by splitting the signal and background samples. The classifier hyperparameters are optimized using a randomized grid search and a 5-fold cross-validation technique [87]. The BDT is trained separately for the 2016, 2017, and 2018 data-taking years. The BDT output distribution is very similar among the three years, leading to the same definitions of optimal signal regions based on the BDT output. Therefore, during the event categorization, a single set of analysis categories is defined using data from 2016-2018. The distributions of the BDT output for signal and background are very well separated. In order to avoid problems of numerical precision when defining optimal signal-enriched regions, the BDT output is transformed such that the signal distribution is uniform. This transformation is applied to all events, both in simulation and data. The distribution of the MVA output for data and simulated events is shown in figure 5 (right).

Background reduction in the VBF HH signal region
Similarly to the ggF HH analysis strategy, an MVA discriminant is employed to separate the VBF HH signal from the background. As in the ggF case, the γγ + jets and γ + jets processes are the dominant sources of background. For the VBF production mode, ggF HH events are considered as background: about a third of the ggF HH events passing the selection requirements described in section 5 also pass the dedicated VBF selection criteria. The distinctive topology of the VBF HH process is used to separate the VBF HH signal from the various sources of background. In addition to the discriminating features of the ggF HH classification, the following variables are used:
• VBF-tagged jet invariant mass: the invariant mass mjj^VBF of the VBF-tagged jets.
• Rapidity gap: the product and the difference of the pseudorapidities of the two VBF-tagged jets.
• Quark-gluon likelihood [88,89] of the two VBF-tagged jets: a likelihood discriminant used to distinguish between jets originating from quarks and those originating from gluons.
• Kinematic variables related to the HH system: MX and the transverse momentum of the pair of reconstructed Higgs bosons.
• Angular distance: minimum ∆R between a photon and a VBF-tagged jet, and between a b jet and a VBF-tagged jet.
• Centrality variables for the reconstructed Higgs boson candidates, defined as CH = exp[−4(ηH − (η1^VBF + η2^VBF)/2)²/(η1^VBF − η2^VBF)²], where H is the Higgs boson candidate reconstructed either from the diphoton or the dijet pair, and η1^VBF and η2^VBF are the pseudorapidities of the two VBF-tagged jets.
We split events into two regions: MX < 500 GeV and MX > 500 GeV. While the MX > 500 GeV region is sensitive to anomalous values of c2V, the MX < 500 GeV region retains the sensitivity to SM VBF HH production.
A multi-class BDT, using a gradient boosting algorithm and implemented in the xgboost [86] framework, is trained to separate the VBF HH signal from the γγ + jets, γ + jets, and SM ggF HH backgrounds. A mix of VBF HH samples with the SM couplings and with quartic coupling c 2V = 0 is used as signal. Training on this mix of samples makes the BDT sensitive to both SM and BSM scenarios. Although the kinematic properties of different BSM signals with anomalous values of c 2V are similar, the cross section of the signal with c 2V = 0 is significantly enhanced with respect to that predicted by the SM. Therefore, the signal samples used for the training were chosen to maximize the sensitivity of the analysis to a range of potential signals. Signal events are weighted with the inverse of the mass resolution of the diphoton and dijet systems during the training, as is done for the ggF MVA. The BDT is trained separately for each of the three data-taking years in the two M X regions. As for the ggF MVA output, data from 2016-2018 are merged to create a single set of analysis categories based on the BDT output. The BDT output is transformed such that the distribution of the mix of the VBF HH signals with SM couplings and quartic coupling c 2V = 0 is uniform. The transformation is applied to all events in the two M X regions. The distribution of the MVA outputs for data and simulated events is shown in figure 6.

Figure 6. The distribution of the two MVA outputs is shown in data and simulated events in the two VBF M X regions: M X > 500 GeV (left) and M X < 500 GeV (right). Data, dominated by the γγ + jets and γ + jets backgrounds, are compared to the VBF HH signal samples with SM couplings and c 2V = 0, SM ggF HH, and single H samples (ttH, ggH, VBF H, VH) after imposing the VBF selection criteria described in section 5. The error bars on the data points indicate statistical uncertainties. The HH signal has been scaled by a factor of 10 3 for display purposes.
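The inverse-resolution weighting used in both the ggF and VBF trainings can be sketched as follows; the resolution values are illustrative toy numbers, and the resulting weights would be passed to the BDT as per-event training weights (e.g. via xgboost's `sample_weight` argument).

```python
import numpy as np

# Toy per-event resolution estimates (in GeV), standing in for the values
# provided by the photon and b jet energy regressions; the numerical ranges
# here are illustrative only.
rng = np.random.default_rng(1)
sigma_mgg = rng.uniform(1.0, 2.5, size=1000)    # diphoton mass resolution
sigma_mjj = rng.uniform(10.0, 20.0, size=1000)  # dijet mass resolution

# Weight each signal event by the product of the inverse resolutions, so that
# well-measured, more "resonant-looking" events count more in the training.
weights = 1.0 / (sigma_mgg * sigma_mjj)
weights *= len(weights) / weights.sum()  # normalize to the raw event count
```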

Event categorization
In order to maximize the sensitivity of the search, events are split into different categories according to the output of the MVA classifier and the mass of the Higgs boson pair system M X . The M X distribution changes significantly for different BSM hypotheses, as shown in figure 4. Therefore, a categorization of HH events in M X creates signal regions sensitive to multiple theoretical scenarios. In the search for VBF HH production, the categories in M X are defined before the MVA is trained, as described in section 8.2. For the categories that target ggF HH production, categories in M X are defined after the MVA is trained.
The categorization is optimized by maximizing the expected significance, estimated as the sum in quadrature of S/√B over all categories in a window centered on m H : 115 < m γγ < 135 GeV. Here, S and B are the numbers of expected signal and background events, respectively. Simulated events are used for this optimization. The SM HH process is considered as signal, while the background consists of the γγ + jets, γ + jets, and ttH processes. The MVA categories are optimized simultaneously with a threshold on the value of ttHScore. Two VBF and three ggF categories are optimized based on the MVA output. For ggF HH, a set of M X categories is then optimized in each MVA category. The optimization procedure leads to 12 ggF analysis categories: four categories in M X in each of the three categories in the MVA score. The optimized selection ttHScore > 0.26 retains 80 (85)% of the signal events; events failing this requirement are rejected since, because of the overwhelming background contamination, such events do not improve the expected sensitivity of the analysis. Similarly, events with ggF MVA scores below 0.37 are not considered in the ggF signal region.
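The optimization can be illustrated with a toy one-boundary scan maximizing the sum in quadrature of S/√B; the analysis optimizes several boundaries in the MVA score and M X simultaneously, but the figure of merit is the same. All inputs here are toy distributions.

```python
import numpy as np

def combined_significance(s_counts, b_counts):
    """Approximate expected significance: sum in quadrature of S/sqrt(B)."""
    s = np.asarray(s_counts, dtype=float)
    b = np.asarray(b_counts, dtype=float)
    return float(np.sqrt(np.sum(s**2 / b)))

def best_single_boundary(sig_scores, bkg_scores):
    """Scan one category boundary on the MVA score, keeping the value that
    maximizes the two-category combined significance."""
    best_cut, best_z = None, -np.inf
    for cut in np.linspace(0.1, 0.9, 81):
        s = [np.sum(sig_scores >= cut), np.sum(sig_scores < cut)]
        b = [np.sum(bkg_scores >= cut), np.sum(bkg_scores < cut)]
        if min(b) < 1:  # skip boundaries leaving a category without background
            continue
        z = combined_significance(s, b)
        if z > best_z:
            best_cut, best_z = cut, z
    return best_cut, best_z

rng = np.random.default_rng(2)
sig = rng.beta(6, 2, size=20_000)  # toy signal MVA scores, peaked near one
bkg = rng.beta(2, 6, size=20_000)  # toy background scores, peaked near zero
boundary, z_split = best_single_boundary(sig, bkg)
z_single = combined_significance([sig.size], [bkg.size])
```

By the Cauchy-Schwarz inequality, splitting a sample into categories with different S/B can only increase this figure of merit, which is why finer categorization helps as long as each category keeps enough background to be modeled.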

Combination of the HH and ttH signals to constrain κ λ and κ t
As discussed in section 3, the HH production cross section depends on κ λ and κ t . The production cross section of the single H processes also depends on κ λ , as a result of NLO electroweak corrections [33]. The ggH and ttH production cross sections additionally depend on κ t . Therefore, the HH → γγbb signal can be combined with the single H production modes to provide an improved constraint on the κ λ and κ t parameters. In the case of anomalous values of κ λ , the single H process with the largest modification of the cross section is ttH. For this reason, additional orthogonal categories targeting the ttH process are included in the analysis: the "ttH leptonic" and the "ttH hadronic" categories, developed and optimized for the measurement of the ttH production cross section in the diphoton decay channel [32]. The events that do not pass the selections for the HH categories defined in table 2 are tested for the ttH categories. This ensures the orthogonality between the events selected by the HH and ttH categories. The H → γγ candidate selection is the same as described in section 5. The ttH leptonic categories target ttH events where at least one W boson, originating from the top or antitop quark, decays leptonically. At least one isolated electron (muon) with |η| < 2.4 and p T > 10 (5) GeV, and at least one jet with p T > 25 GeV are required. The ttH hadronic categories target hadronic decays of W bosons. In these categories at least three jets are required, one of which must be b tagged, and a lepton veto is imposed. In order to maximize the sensitivity, an MVA approach is used to separate the ttH events from the background, dominated by γγ + jets, γ + jets, tt + jets, tt + γ, and tt + γγ events. A BDT classifier is trained for each of the two channels using simulated events.
The variables used for the training include kinematic properties of the reconstructed objects, object identification variables, and global event properties such as jet and lepton multiplicities. The BDT input variables also include the outputs of other machine learning algorithms trained specifically to target different backgrounds. These include DNN classifiers trained to reduce the tt + γγ and γγ + jets background, and a top quark tagger based on a BDT [90]. The output scores of the BDTs are used to reject background-like events and to classify the remaining events in four subcategories for each of the two channels. The boundaries of the categories are optimized by maximizing the expected significance of the ttH signal.

Signal model
In each of the HH categories, a parametric fit in the (m γγ , m jj ) plane is performed. In the ttH categories, the m γγ distribution is fitted to extract the signal. When the HH and ttH categories are combined, both the HH and ttH production modes are considered as signals.
The shape templates of the diphoton and dijet invariant mass distributions are constructed from simulation. In each HH and ttH analysis category, the m γγ distribution is fitted using a sum of, at most, five Gaussian functions. Figure 7 (left) shows the signal model for m γγ in the VBF and ggF CAT0 categories, which are the categories with the best resolution.
For the HH categories, the m jj distributions are modeled with a double-sided Crystal Ball (CB) function, a modified version of the standard CB function [91] with two independent exponential tails. Figure 7 (right) shows the signal model for m jj in the VBF and ggF categories with the best resolution.
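A minimal numpy implementation of the double-sided Crystal Ball shape (unnormalized, with mean, width, and tail parameters chosen arbitrarily for illustration) might look like this:

```python
import numpy as np

def double_sided_cb(x, mu, sigma, a_lo, n_lo, a_hi, n_hi):
    """Unnormalized double-sided Crystal Ball shape: a Gaussian core with
    independent power-law tails below -a_lo and above +a_hi (in units of
    sigma), each matched continuously to the core."""
    t = (np.asarray(x, dtype=float) - mu) / sigma
    out = np.exp(-0.5 * t**2)  # Gaussian core
    lo, hi = t < -a_lo, t > a_hi
    # Power-law tails, normalized to equal the core value at the junctions
    out[lo] = np.exp(-0.5 * a_lo**2) * (
        (n_lo / a_lo) / (n_lo / a_lo - a_lo - t[lo])) ** n_lo
    out[hi] = np.exp(-0.5 * a_hi**2) * (
        (n_hi / a_hi) / (n_hi / a_hi - a_hi + t[hi])) ** n_hi
    return out

# Illustrative m_jj shape: peak at 125 GeV, 12 GeV core width
x = np.linspace(70.0, 190.0, 1201)
shape = double_sided_cb(x, 125.0, 12.0, 1.0, 2.0, 1.5, 5.0)
```

The two independent (a, n) pairs let the low-side tail (energy lost to neutrinos and out-of-cone radiation in b jets) and the high-side tail be fitted separately.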
For the HH signal, the final two-dimensional (2D) signal probability distribution function is a product of the independent m γγ and m jj models. The possible correlations are investigated by comparing the 2D m γγ -m jj distributions in the simulated signal samples with the 2D probability distributions built as a product of the one-dimensional (1D) ones. With the statistical precision available in this analysis, the correlations have been found to be negligible.

Single Higgs background model
The SM single H background shape is constructed from the simulation following the same methodology as used for the signal model described in section 10. For each analysis category and single H production mode, the m γγ distributions are fitted using a sum of, at most, five Gaussian functions. The m jj modeling in the HH categories depends on the production mechanism, and a parametrization is obtained from the simulated distributions: for the ggH and VBF H processes, the m jj distribution is modeled with a Bernstein polynomial; for VH production, a CB function is used to model the distribution of the hadronic decays of vector bosons; for ttH, where the two b jets are produced from a top quark decay, a Gaussian function with a mean around 120 GeV is used. As for the signal modeling, the final 2D SM single H model is a product of the independent models of the m γγ and m jj distributions.
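As a sketch of the Bernstein parametrization used for the ggH and VBF H m jj shapes: the basis can be evaluated directly from binomial coefficients, and non-negative coefficients guarantee a non-negative shape. The degree, coefficients, and mass range below are illustrative, not the fitted values.

```python
import math

def bernstein(x, coeffs):
    """Evaluate a Bernstein polynomial of degree len(coeffs) - 1 on [0, 1].
    With non-negative coefficients the polynomial is non-negative everywhere
    on the interval, which is convenient for modeling a background shape."""
    n = len(coeffs) - 1
    return sum(c * math.comb(n, k) * x**k * (1.0 - x)**(n - k)
               for k, c in enumerate(coeffs))

# Map m_jj from an assumed fit range onto [0, 1] before evaluating.
def mjj_shape(mjj, coeffs, lo=70.0, hi=190.0):
    return bernstein((mjj - lo) / (hi - lo), coeffs)
```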

Nonresonant background model
The model used to describe the nonresonant background is extracted from data using the discrete profiling method [92] as described in ref. [73]. This technique was designed as a way to estimate the systematic uncertainty associated with choosing a particular analytic function to fit the background m γγ and m jj distributions. The method treats the choice of the background function as a discrete nuisance parameter in the likelihood fit to the data.
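A toy version of the discrete profiling idea, with simple least-squares fits standing in for the full maximum likelihood fits: each candidate function family receives a complexity penalty proportional to its number of parameters, and the envelope (minimum) of the corrected NLL over the families is taken.

```python
import numpy as np

rng = np.random.default_rng(3)
# Toy falling diphoton-mass sideband spectrum in 100-180 GeV
edges = np.linspace(100.0, 180.0, 41)
centers = 0.5 * (edges[:-1] + edges[1:])
counts = rng.poisson(2000.0 * np.exp(-0.03 * (centers - 100.0)))

def poisson_nll(pred, obs):
    """Binned Poisson negative log-likelihood (up to a constant)."""
    pred = np.clip(pred, 1e-9, None)
    return float(np.sum(pred - obs * np.log(pred)))

# Candidate families, fitted here by least squares in a transformed space
# as a simple stand-in for the maximum likelihood fits of the analysis.
log_c = np.log(np.clip(counts, 1, None))
a, b = np.polyfit(centers, log_c, 1)                       # exponential
pw_a, pw_b = np.polyfit(np.log(centers), log_c, 1)         # power law
poly = np.polyfit(centers, counts.astype(float), 3)        # cubic polynomial
families = {
    "expo": (np.exp(b + a * centers), 2),
    "power": (np.exp(pw_b + pw_a * np.log(centers)), 2),
    "poly3": (np.clip(np.polyval(poly, centers), 1e-9, None), 4),
}

# Discrete profiling: penalize each family by its number of parameters and
# take the envelope (minimum) of the corrected NLL over all families.
corrected = {name: poisson_nll(pred, counts) + 0.5 * npar
             for name, (pred, npar) in families.items()}
best_family = min(corrected, key=corrected.get)
```

In the actual method the discrete choice is profiled in the fit itself, so that the uncertainty from the choice of function propagates into the result rather than being fixed beforehand.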

JHEP03(2021)257
This method is used to model the m γγ distribution of the nonresonant background in the ttH categories. For the HH categories, the method is generalized to the 2D case as a product of two 1D models for m γγ and m jj .
A set of MC pseudo-experiments was generated with positive and negative correlations between m γγ and m jj injected and then fitted with the factorized 2D model. A negligible bias has been observed, and the correlations have been found to be within the statistical precision of the analysis.

Systematic uncertainties
The systematic uncertainties only affect the signal model and the resonant single H background, since the nonresonant background model is constructed in a data-driven way, with the uncertainties associated with the choice of a background fit function taken into account by the discrete profiling method described in section 11.2. The systematic uncertainties can affect the overall normalization or induce variations in the category yields, representing event migration between the categories. Theoretical uncertainties are applied to the HH and single H normalizations. The following sources of theoretical uncertainty are considered: the uncertainty in the signal cross section arising from scale variations, and uncertainties in α S , in the PDFs, and in the prediction of the branching fraction B(HH → γγbb). The dominant theoretical uncertainties arise from the prediction of the SM HH and ttH production cross sections. In addition, a conservative PS uncertainty is assigned to the VBF HH signal, defined as the full symmetrized difference in yields in each category obtained with simulated samples of VBF HH events interfaced with the standard p T -ordered and dipole shower PS schemes.
The dominant experimental uncertainties are:
• Photon identification BDT score: the uncertainty arising from the imperfect MC simulation of the input variables to the photon ID is estimated by rederiving the corrections with equally sized subsets of the Z → ee events used to train the quantile regression BDTs. Its magnitude corresponds to the standard deviation of the event-by-event differences in the photon ID evaluated on the two different sets of corrected input variables. This uncertainty reflects the limited capacity of the BDTs arising from the finite size of the training set. It is seen to cover the residual discrepancies between data and simulation. The uncertainty in the signal yields is estimated by propagating this uncertainty through the full category selection procedure.
• Photon energy scale and resolution: the uncertainties associated with the corrections applied to the photon energy scale in data and the resolution in simulation are evaluated using Z → ee events [93].
• Per-photon energy resolution estimate: the uncertainty in the per-photon resolution is parametrized as a rescaling of the resolution by ±5% around its nominal value. This is designed to cover all differences between data and simulation in the distribution, which is an output of the energy regression.

• Jet energy scale and resolution corrections: the energy scale of jets is measured using the p T balance of jets with Z bosons and photons in Z → ee, Z → µµ, and γ + jets events, as well as using the p T balance between jets in dijet and multijet events [41,89]. The uncertainty in the jet energy scale and resolution is a few percent and depends on p T and η. The impact of uncertainties on the event yields is evaluated by varying the jet energy corrections within their uncertainties and propagating the effect to the final result. Some sources of the jet energy scale uncertainty are fully (anti-)correlated, while others are considered uncorrelated.
• Jet b tagging: uncertainties in the b tagging efficiency are evaluated by comparing data and simulated distributions for the b tagging discriminator [94]. These include the statistical uncertainty in the estimate of the fraction of heavy- and light-flavor jets in data and simulation.
• Trigger efficiency: the efficiency of the trigger selection is measured with Z → ee events using a tag-and-probe technique [95]. An additional uncertainty is introduced to account for a gradual shift in the timing of the inputs of the ECAL L1 trigger in the region |η| > 2.0, which caused a specific trigger inefficiency during 2016 and 2017 data taking. Both photons and, to a greater extent, jets can be affected by this inefficiency, which has a small impact.
• Photon preselection: the uncertainty in the preselection efficiency is computed as the ratio between the efficiency measured in data and in simulation. The preselection efficiency in data is measured with the tag-and-probe technique in Z → ee events [95].
• Integrated luminosity: uncertainties are determined by the CMS luminosity monitoring for the 2016-2018 data-taking years [96][97][98] and are in the range of 2.3-2.5%. To account for common sources of uncertainty in the luminosity measurement schemes, some sources are fully (anti-)correlated across the different data-taking years, while others are considered uncorrelated. The total 2016-2018 integrated luminosity has an uncertainty of 1.8%.
• Pileup jet identification: the uncertainty in the pileup jet classification output score is estimated by comparing the score of jets in events with a Z boson and one balanced jet in data and simulation. The assigned uncertainty depends on p T and η, and is designed to cover all differences between data and simulation in the distribution.
Most of the experimental uncertainties are uncorrelated among the three data-taking years. Some sources of uncertainty in the measured luminosity and jet energy corrections are fully (anti-)correlated, while others are considered uncorrelated. This search is statistically limited, and the total impact of systematic uncertainties on the result is about 2%.

Results
An unbinned maximum likelihood fit to the m γγ and m jj distributions is performed simultaneously in the 14 HH categories to extract the HH signal. A likelihood function is defined for each analysis category using analytic models to describe the m γγ and m jj distributions of signal and background events, with nuisance parameters to account for the experimental and theoretical systematic uncertainties described in section 12. The fit is performed in the mass ranges 100 < m γγ < 180 GeV and 70 < m jj < 190 GeV for all categories apart from ggF CAT10 and CAT11. In those two categories, a small but nonnegligible shoulder was observed in the m jj distribution. Therefore, the m jj fit range is reduced to 90 < m jj < 190 GeV to avoid a possible bias with minimal impact on the analysis sensitivity.
In order to determine κ λ and κ t , the HH and ttH categories are used together in a simultaneous maximum likelihood fit. In the ttH categories, a binned maximum likelihood fit is performed to m γγ in the mass range 100 < m γγ < 180 GeV.
The data and the signal-plus-background model fit to m γγ and m jj are shown in figure 8 for the best resolution ggF and VBF categories. The distribution of events weighted by S/(S+B) from all HH categories is shown in figure 9 for m γγ and m jj . In this expression, S (B) is the number of signal (background) events extracted from the signal-plus-background fit.
No significant deviation from the background-only hypothesis is observed. We set upper limits at 95% CL on the product of the production cross section of a pair of Higgs bosons and the branching fraction into γγbb, σ HH B(HH → γγbb), using the modified frequentist approach for confidence levels (CL s ), taking the LHC profile likelihood ratio as a test statistic [99][100][101][102] in the asymptotic approximation. The observed (expected) 95% CL upper limit on σ HH B(HH → γγbb) amounts to 0.67 (0.45) fb. The observed (expected) limit corresponds to 7.7 (5.2) times the SM prediction. All results were extracted assuming m H = 125 GeV. We observe a variation smaller than 1% in both the expected and observed upper limits when using m H = 125.38 ± 0.14 GeV, corresponding to the most precise measurement of the Higgs boson mass to date [103].
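The CLs procedure in the asymptotic approximation can be sketched for a single counting experiment with known background; the full analysis uses the profile likelihood ratio over all categories and nuisance parameters, which this toy omits.

```python
import math

def q_mu(n, mu, s, b):
    """Test statistic q_mu for a counting experiment n ~ Poisson(mu*s + b)
    with known background b; the best fit mu_hat is clipped at zero."""
    mu_hat = max(0.0, (n - b) / s)
    if mu_hat > mu:
        return 0.0
    def nll(m):
        lam = m * s + b
        return lam - n * math.log(lam)
    return max(0.0, 2.0 * (nll(mu) - nll(mu_hat)))

def phi(x):  # standard normal CDF
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def cls(n, mu, s, b):
    """Asymptotic CLs, with the expected background-only distribution taken
    from an Asimov data set n = b."""
    q_obs = math.sqrt(q_mu(n, mu, s, b))
    q_asi = math.sqrt(q_mu(b, mu, s, b))
    cl_sb = 1.0 - phi(q_obs)
    cl_b = 1.0 - phi(q_obs - q_asi)
    return cl_sb / cl_b

def upper_limit(n, s, b, alpha=0.05):
    """Bisect for the signal strength with CLs = alpha."""
    lo, hi = 0.0, 1000.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if cls(n, mid, s, b) > alpha else (lo, mid)
    return 0.5 * (lo + hi)
```

Dividing CLs+b by CLb protects against excluding signal strengths to which the experiment has no sensitivity, at the price of slightly conservative limits.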
Limits are also derived as a function of κ λ , assuming that the top quark Yukawa coupling is SM-like (κ t = 1). The result is shown in figure 10. The variation in the excluded cross section as a function of κ λ is directly related to changes in the kinematical properties of HH production. At 95% CL, κ λ is constrained to values in the interval [−3.3, 8.5], while the expected constraint on κ λ is in the interval [−2.5, 8.2]. This is the most sensitive search to date.
Assuming instead that an HH signal exists with the properties predicted by the SM, constraints on λ HHH can be set. The results are obtained both with the HH categories only, and with the HH categories combined with the ttH categories in a simultaneous maximum likelihood fit. The HH signal is considered together with the single H processes (ttH, ggH, VBF H, VH, and Higgs boson production in association with a single top quark). The cross sections and branching fractions of the HH and single H processes are scaled as a function of κ λ , while the top quark Yukawa coupling is assumed to be SM-like, κ t = 1. One-dimensional negative log-likelihood scans for κ λ are shown in figure 11 for an Asimov data set [101] generated with the SM signal-plus-background hypothesis, κ λ = 1, and for the observed data. When combining the HH analysis categories with the ttH categories, we obtain κ λ = 0.6 +6.3 . The observed likelihood in figure 11 is characterized by two minima. This is related to an interplay between the cross section dependence on κ λ and differences in acceptance between the analysis categories.

Figure 11. Negative log-likelihood, as a function of κ λ , evaluated with an Asimov data set assuming the SM hypothesis (left) and the observed data (right). The 68 and 95% CL intervals are shown with the dashed gray lines. The two curves are shown for the HH (blue) and HH + ttH (orange) analysis categories. All other couplings are set to their SM values.
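The appearance of two minima can be reproduced in a toy counting experiment: because the signal yield is a quadratic function of κ λ , two values of κ λ predict the same rate and are degenerate in the likelihood until acceptance differences between categories break the degeneracy. The quadratic coefficients below are illustrative, not the CMS parametrization.

```python
import math

# Toy quadratic dependence of the HH rate on kappa_lambda, normalized so
# that rate(1) = 1 at the SM point; coefficients are illustrative only.
def rate(kl):
    return (kl**2 - 5.0 * kl + 10.0) / 6.0

s_sm, b = 8.0, 50.0
n_obs = s_sm * rate(1.0) + b  # Asimov data set at the SM point

def nll(kl):
    """Poisson negative log-likelihood (up to a constant) for one category."""
    lam = s_sm * rate(kl) + b
    return lam - n_obs * math.log(lam)

# Scan kappa_lambda: two values predict the same total yield, so the
# negative log-likelihood has two degenerate minima.
scan = [(k / 100.0, nll(k / 100.0)) for k in range(-500, 1001)]
nll_min = min(v for _, v in scan)
degenerate = [kl for kl, v in scan if v - nll_min < 1e-9]
```

In the real analysis, the shape information across M X and MVA categories lifts this degeneracy partially, which is why the second minimum appears as a local, not exactly degenerate, minimum.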
The HH and single Higgs boson production cross sections depend not only on κ λ , but also on κ t . To better constrain the κ λ and κ t coupling modifiers, a 2D negative log-likelihood scan in the (κ λ , κ t ) plane is performed, taking into account the modification of the production cross sections and of B(H → bb) and B(H → γγ) for anomalous (κ λ , κ t ) values. The modification of the single H production cross section for anomalous κ λ is modeled at NLO, while the dependence on κ t is parametrized at LO only, neglecting NLO effects [33]. This approximation holds as long as |κ t | is close to unity, roughly in the range 0.7 < κ t < 1.3; the parametric model is not reliable outside of this range. Figure 12 shows the 2D likelihood scans of κ λ versus κ t for an Asimov data set assuming the SM hypothesis and for the observed data. The regions of the 2D scan where the κ t parametrization for anomalous values of κ λ at LO is not reliable are shown with a gray band.
The inclusion of the ttH categories significantly improves the constraint on κ t . The 1D negative log-likelihood scan, as a function of κ t with κ λ fixed at κ λ = 1, is shown in figure 13 for an Asimov data set generated assuming the SM hypothesis, κ t = 1, as well as for the observed data. The measured value of κ t is κ t = 1. Upper limits at 95% CL are also set on the product of the HH VBF production cross section and branching fraction, σ VBF HH B(HH → γγbb), with the yield of the ggF HH signal constrained within uncertainties to the one predicted in the SM. The observed (expected) 95% CL upper limit on σ VBF HH B(HH → γγbb) amounts to 1.02 (0.94) fb. The limit corresponds to 225 (208) times the SM prediction. This is the most stringent constraint on σ VBF HH B(HH → γγbb) to date.

Figure 12. Negative log-likelihood contours at 68 and 95% CL in the (κ λ , κ t ) plane evaluated with an Asimov data set assuming the SM hypothesis (left) and the observed data (right). The contours obtained using the HH analysis categories only are shown in blue, and in orange when combined with the ttH categories. The best fit value for the HH categories only (κ λ = 0.6, κ t = 1.2) is indicated by a blue circle, for the HH + ttH categories (κ λ = 1.4, κ t = 1.3) by an orange diamond, and the SM prediction (κ λ = 1.0, κ t = 1.0) by a black star. The regions of the 2D scan where the κ t parametrization for anomalous values of κ λ at LO is not reliable are shown with a gray band.

Figure 13. Negative log-likelihood scan, as a function of κ t , evaluated with an Asimov data set assuming the SM hypothesis (left) and the observed data (right). The 68 and 95% CL intervals are shown with the dashed gray lines. The two curves are shown for the HH (blue) and the HH + ttH (orange) analysis categories. All other couplings are fixed to their SM values.
Limits are also set, as a function of c 2V , as presented in figure 14. The observed excluded region corresponds to c 2V < −1.3 and c 2V > 3.5, while the expected exclusion is c 2V < −0.9 and c 2V > 3.1. It can be seen in figure 14 that this analysis is more sensitive to anomalous values of c 2V than to the region around the SM prediction. This is related to the fact that, for anomalous values of c 2V , the total cross section is enhanced and the M X spectrum is harder, as shown in figure 4 (right). This leads to an increase in the product of signal acceptance and efficiency, as well as a more distinct signal topology.
Assuming HH production occurs via the VBF and ggF modes, we set constraints on the κ λ and c 2V coupling modifiers simultaneously. A 2D negative log-likelihood scan in the (κ λ , c 2V ) plane is performed using the 14 HH analysis categories. Figure 15 shows 2D likelihood scans for the observed data and for an Asimov data set assuming all couplings are at their SM values.
We also set upper limits at 95% CL for the twelve BSM benchmark hypotheses defined in table 1. In this fit, the yield of the VBF HH signal is constrained within uncertainties to the one predicted in the SM. The limits for different BSM hypotheses are shown in figure 16 (upper). In addition, limits are also calculated as a function of the BSM coupling between two Higgs bosons and two top quarks, c 2 , as presented in figure 16 (lower). The observed excluded region corresponds to c 2 < −0.6 and c 2 > 1.1, while the expected exclusion is c 2 < −0.4 and c 2 > 0.9.

Summary
A search for nonresonant Higgs boson pair production (HH) has been presented, where one of the Higgs bosons decays to a pair of bottom quarks and the other to a pair of photons. This search uses proton-proton collision data collected at √ s = 13 TeV by the CMS experiment at the LHC, corresponding to a total integrated luminosity of 137 fb −1 . No significant deviation from the background-only hypothesis is observed. Upper limits at 95% confidence level (CL) on the product of the HH production cross section and the branching fraction into γγbb are extracted for production in the standard model (SM) and in several scenarios beyond the SM. The expected upper limit at 95% CL on σ HH B(HH → γγbb) is 0.45 fb, corresponding to about 5.2 times the SM prediction, while the observed upper limit is 0.67 fb, corresponding to 7.7 times the SM prediction. The presented search has the highest sensitivity to SM HH production to date. Upper limits at 95% CL on the SM HH production cross section are also derived as a function of the Higgs boson self-coupling modifier κ λ ≡ λ HHH /λ SM HHH , assuming that the top quark Yukawa coupling is SM-like. The coupling modifier κ λ is constrained within the range −3.3 < κ λ < 8.5, while the expected constraint is −2.5 < κ λ < 8.2 at 95% CL.
This search is combined with an analysis that targets top quark-antiquark associated production of a single Higgs boson decaying to a diphoton pair. In the scenario in which the HH signal has the properties predicted by the SM, the coupling modifier κ λ has been constrained. In addition, a simultaneous constraint on κ λ and the modifier of the coupling between the Higgs boson and the top quark κ t is presented when both the HH and single Higgs boson processes are considered as signals.
Limits are also set on the cross section of nonresonant HH production via vector boson fusion (VBF). The most stringent limit to date is set on the product of the HH VBF

Figure 16. Expected and observed 95% CL upper limits on the product of the ggF HH production cross section and B(HH → γγbb) obtained for different nonresonant benchmark models (defined in table 1) (upper) and as a function of the BSM coupling c 2 (lower). In this fit, the yield of the VBF HH signal is constrained within uncertainties to the one predicted in the SM. The green and yellow bands represent, respectively, the one and two standard deviation extensions beyond the expected limit. In the lower plot the long-dashed red line shows the theoretical prediction.

production cross section and the branching fraction into γγbb. The observed (expected) upper limit at 95% CL amounts to 1.02 (0.94) fb, corresponding to 225 (208) times the SM prediction. Limits are also set as a function of the modifier of the coupling between two vector bosons and two Higgs bosons, c 2V . The observed excluded region corresponds to c 2V < −1.3 and c 2V > 3.5, while the expected exclusion is c 2V < −0.9 and c 2V > 3.1.
Numerous hypotheses on coupling modifiers beyond the SM have been explored, both in the context of inclusive Higgs boson pair production and for HH production via gluon-gluon fusion and VBF. The production of Higgs boson pairs was also combined with the top quark-antiquark pair associated production of a single Higgs boson. Overall, all of the results are consistent with the SM predictions.

Acknowledgments
We congratulate our colleagues in the CERN accelerator departments for the excellent performance of the LHC and thank the technical and administrative staffs at CERN and at other CMS institutes for their contributions to the success of the CMS effort. In addition, we gratefully acknowledge the computing centers and personnel of the Worldwide LHC Computing Grid for delivering so effectively the computing infrastructure essential to our analyses. Finally, we acknowledge the enduring support for the construction and operation of the LHC and the CMS detector provided by the following funding agencies: