Search for nonresonant Higgs boson pair production in the four leptons plus two b jets final state in proton-proton collisions at $\sqrt{s}$ = 13 TeV

The first search for nonresonant production of Higgs boson pairs (HH) with one H decaying into four leptons and the other into a pair of b quarks is presented, using proton-proton collisions recorded at a center-of-mass energy of $\sqrt{s}$ = 13 TeV by the CMS experiment. The analyzed data correspond to an integrated luminosity of 138 fb$^{-1}$. A 95% confidence level upper limit of 32.4 is set on the signal strength modifier $\mu$, defined as the ratio of the observed HH production rate in the HH $\to$ ZZ*b$\mathrm{\bar{b}}$ $\to$ 4$\ell$b$\mathrm{\bar{b}}$ decay channel to the standard model expectation. Possible modifications of the H trilinear coupling $\lambda_\text{HHH}$ with respect to the standard model (SM) value are investigated. The coupling modifier $\kappa_{\lambda}$, defined as $\lambda_\text{HHH}$ divided by its SM prediction, is constrained to be within the observed (expected) range $-8.8\,(-9.8) < \kappa_{\lambda} < 13.4\,(15.0)$ at 95% confidence level.


Introduction
The discovery of the Higgs boson (H) by the ATLAS and CMS Collaborations [1, 2] is the first experimental proof of the predicted mechanism of electroweak symmetry breaking. Further measurements of the H properties performed by both Collaborations were found to be compatible with the standard model (SM) predictions [3][4][5]. A determination of the H self-coupling provides an independent test of the SM by probing the Higgs scalar field potential [6]; this parameter can be obtained by measuring the production of H pairs (HH) at the LHC.
By exploiting different decay channels, studies of HH production allow different regions of the kinematic parameters describing possible extensions of the SM and of the nonresonant invariant mass spectrum to be probed. A combination of different decay channels is therefore needed to achieve the best possible sensitivity to HH production.
A wide variety of HH decay channels and their combination were studied by the ATLAS [7-10] and CMS [11][12][13][14][15][16][17] Collaborations: the most recent published results on the HH combination [18,19], based on a data set corresponding to an integrated luminosity of about 36 fb⁻¹, set observed (expected) upper limits of, respectively, 7 (10) and 22 (13) at the 95% confidence level (CL) on the ratio of the HH production cross section to the SM prediction.
The H decay to four leptons (H → ZZ* → 4ℓ, ℓ = e, µ) is the rarest channel observed so far at the LHC, but it has the largest signal-to-background ratio. The analysis presented in this paper focuses on gluon-gluon fusion HH production where one H decays to a Z boson pair, which in turn decays into 4ℓ, and the other to a pair of b quarks (bb), hadronizing into jets. The four leptons provide a clear final-state signature, while the large branching fraction of the bb decay channel partially compensates for the small branching fraction of the 4ℓ channel. The leading-order (LO) gg → HH Feynman diagrams are shown in Fig. 1. The HH SM production cross section via gluon-gluon fusion has been computed at next-to-next-to-LO (NNLO) in quantum chromodynamics (QCD), including next-to-next-to-leading logarithmic (NNLL) corrections and finite top quark mass (mtop) effects at next-to-LO (NLO). Its value is σHH = 31.05 fb +2.2%/−5.0% (scale uncertainties) ± 2.1% (PDF) ± 2.1% (αS) +4.0%/−18.0% (mtop) in proton-proton (pp) collisions at 13 TeV for a H mass of 125 GeV [20], where PDF stands for parton distribution function. After accounting for the branching fraction B(HH → ZZ*bb → 4ℓbb), the cross section is reduced to approximately 4.5 ab in the SM.
Physics beyond the SM can significantly modify the cross section and the kinematic properties of HH production, such as the HH invariant mass distribution. In order to constrain these effects, an electroweak chiral Lagrangian model [21,22], an effective field theory approach that extends the SM with dimension-six operators, is used in a way equivalent to that of Ref. [23], but at NLO in QCD. This approach results in five couplings relevant for HH production, of which the H trilinear coupling λHHH is selected to perform scans of its allowed variations based on the results of this analysis. The variations of this coupling with respect to its SM prediction are parametrized by the coupling modifier κλ, defined as λHHH divided by its SM value.

The CMS detector
The CMS detector features a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter, and a brass and scintillator hadron calorimeter, each composed of a barrel and two endcap sections. Forward calorimeters extend the pseudorapidity (η) coverage provided by the barrel and endcap detectors. Muons are detected in gas-ionization chambers embedded in the steel flux-return yoke outside the solenoid. Events of interest are selected using a two-tiered trigger system. A detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in Ref. [24].

Data sets and simulations
Data from pp collisions collected by the CMS experiment in 2016-2018, at a center-of-mass energy of 13 TeV, are used for the analysis. Data-taking conditions vary year by year, and the samples used correspond to a total integrated luminosity of 138 fb⁻¹. The triggers used for this analysis require the presence of a pair of loosely isolated leptons (muons or electrons), with exact requirements that depend on the data-taking year. Triggers requiring a triplet of low transverse momentum (pT) leptons, as well as isolated single-electron and single-muon triggers, are used to recover efficiency. The overall trigger efficiency for events that satisfy the 4ℓ selection described below is greater than 98%. The trigger scheme is the same as that used in the H → ZZ* → 4ℓ analysis previously published by the CMS Collaboration [25].
Signal samples of the SM gg → HH → ZZ*bb → 4ℓbb process are generated at NLO in QCD using the POWHEG v2 [26][27][28] generator with the implementation described in Refs. [21,22]. Using the same model, beyond-the-SM signal samples with alternative values of κλ in steps of 0.5 between −10 and +10 are generated. Single H production processes constitute the main SM background for this analysis. Descriptions of these processes are obtained using the POWHEG v2 generator for the six main production modes: gluon fusion (ggH), vector boson fusion (VBF) [29], and associated production (WH, ZH, bbH, and ttH [30]). In the case of ggH, WH, and ZH, the MINLO extension of POWHEG is used [31,32] to increase the accuracy in the dijet phase space. The description of the H decay to 4ℓ is obtained using the JHUGEN generator [33] version 7.0.9. In the case of WH, ZH, bbH, and ttH, the H is allowed to decay as H → ZZ* → 2ℓ2X, where X can be any particle from the Z decay, so that 4ℓ events in which two leptons originate from the decay of the associated vector bosons or top quarks are also taken into account. The hadronic shower propagation of parton-level events, referred to as showering, is then performed by allowing QCD emissions at all energies in the shower and vetoing them afterwards according to the POWHEG internal scale [27].
Several backgrounds include genuine nonresonant ZZ* events. Production of ZZ* via quark-antiquark annihilation is generated at NLO using MADGRAPH5 aMC@NLO with up to one extra parton, with appropriate settings to merge jet multiplicities [34]. As this simulation covers a large range of ZZ* invariant masses m(ZZ*), dynamical factorization and renormalization scales are chosen. The simulated sample is rescaled by the ratio (denoted K-factor) of the NNLO to NLO cross sections, and additional NLO electroweak corrections [35] are applied to the Monte Carlo (MC) sample as a function of m(ZZ*).
The gg → ZZ* process is simulated at LO with MCFM [36,37] version 7.0.1. In order to match the gg → H → ZZ* pT spectrum predicted at NLO, the showering for the MCFM samples is performed allowing only emissions up to the parton-level scale. An exact calculation beyond NLO does not exist for the gg → ZZ* background, but it has been shown [38] that the soft collinear approximation describes the background cross section and the interference term at NNLO. The NNLO K-factor for gluon fusion H production is obtained as a function of m(ZZ*) using the HNNLO v2 MC generator [39][40][41]. Triboson production with at least one Z boson decaying leptonically is generated at NLO using MADGRAPH5 aMC@NLO, with similar settings.
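The m(ZZ*)-dependent K-factor reweighting described above can be sketched as follows; the lookup-table values and array names are illustrative assumptions, not the published corrections.

```python
import numpy as np

def apply_kfactor(mzz, weights, kfactor_mzz, kfactor_val):
    """Rescale per-event MC weights by an m(ZZ*)-dependent NNLO/NLO K-factor.

    The K-factor is supplied as a lookup table (illustrative values below)
    and interpolated linearly in m(ZZ*), given in GeV.
    """
    return weights * np.interp(mzz, kfactor_mzz, kfactor_val)

# Hypothetical table: K-factor slowly rising with m(ZZ*).
mzz_grid = np.array([100.0, 200.0, 400.0, 800.0])
k_grid = np.array([1.10, 1.15, 1.20, 1.25])

new_w = apply_kfactor(np.array([150.0, 300.0]), np.array([1.0, 1.0]),
                      mzz_grid, k_grid)
```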
The contribution from Z+jets events is estimated from data-driven techniques as described in Section 5.2. Other small backgrounds, such as ttZ, are estimated from simulation.
The PYTHIA 8.2.30 [42] package is used for parton showering, hadronization, and the underlying event simulation, with parameters set according to the CUETP8M1 tune [43] for the 2016 data-taking period and the CP5 tune [44] for the 2017-2018 data-taking periods. The NNPDF set of PDFs [45] is used, NNPDF3.0 (3.1) for the 2016 (2017-2018) periods.
The detector response is simulated using a detailed description of the CMS detector implemented using the GEANT4 package [46,47]. Simulated events are reconstructed by the same algorithms used for the data. The simulated samples include additional interactions in the same and neighboring bunch crossings, referred to as pileup. Simulated events are weighted to match the pileup distribution observed in the data; this procedure is performed separately for each year.

Event reconstruction and selection
The final state consists of at least two pairs of oppositely charged isolated leptons and at least two jets. The 4ℓ selection is similar to that used in the CMS H → ZZ* → 4ℓ measurement described in Refs. [4,25].
Events are reconstructed using a particle-flow algorithm [48] that aims at reconstructing and identifying each individual particle with an optimized combination of all subdetector information.
The primary interaction vertex, from which the H candidates arise, is chosen to be the one corresponding to the hardest scattering in the event, evaluated using tracking information alone, as described in Ref. [49].
Electrons are identified with a multivariate classifier, which includes observables sensitive to bremsstrahlung along the electron trajectory, the geometrical and kinematic compatibility between the electron track and the associated energy cluster in the electromagnetic calorimeter, the shape of the electromagnetic shower, and variables that discriminate against electrons originating from photon conversions [50].
In order to further suppress electrons from photon conversions and muons originating from in-flight decays of hadrons, the three-dimensional impact parameter significance is required to be smaller than four.
Muons are reconstructed by combining information from the silicon tracker and the muon system [51]. The muons are selected from the reconstructed muon track candidates by applying minimal requirements on the track in both the muon system and silicon tracker, and taking into account their compatibility with small energy deposits in the calorimeters.
Muons are required to be isolated from other particles in the event. The relative isolation is defined as

R_iso = [ Σ pT(charged) + max(0, Σ pT(neutral) + Σ pT(γ) − ∆β) ] / pT,

where the sums run over the charged and neutral hadrons, and photons, in a cone defined by ∆R ≡ √((∆η)² + (∆φ)²) = 0.3 around the lepton trajectory, φ is the azimuthal angle in radians, and pT is the muon transverse momentum. To minimize the contribution of particles originating from pileup interactions to the isolation calculation, charged hadrons are included only if they originate from the primary vertex. The ∆β term is a correction accounting for the energy deposits originating from pileup interactions [52]. Muons with R_iso < 0.35 are considered isolated. A similar requirement is not applied to electrons, because the isolation information is included in the multivariate classifier.
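As a concrete illustration, the ∆β-corrected isolation sum can be written as a short function; the event model (plain lists of per-particle pT values) and the function name are simplifying assumptions.

```python
def muon_rel_isolation(pt_mu, pt_charged_pv, pt_neutral, pt_photon, pt_charged_pu):
    """Sketch of the Delta-beta-corrected relative isolation in a DR = 0.3 cone.

    Inputs are per-particle pT values in GeV; charged hadrons from pileup
    vertices enter only through the Delta-beta correction, taken here as
    half of their summed pT (the usual convention).
    """
    charged = sum(pt_charged_pv)
    neutral = sum(pt_neutral) + sum(pt_photon)
    delta_beta = 0.5 * sum(pt_charged_pu)  # pileup estimate for the neutral part
    return (charged + max(0.0, neutral - delta_beta)) / pt_mu

iso = muon_rel_isolation(40.0, [2.0], [3.0, 1.0], [0.5], [4.0])
```

With these toy inputs the isolation evaluates to (2.0 + max(0, 4.5 − 2.0))/40 ≈ 0.11, below the 0.35 working point.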
The efficiency of the lepton reconstruction and selection is measured in both data and simulated samples in bins of lepton transverse momentum pT and pseudorapidity η, using the tag-and-probe technique applied to leptonic Z boson decays [53]. Correction factors, denoted scale factors (SFs), are used to rescale the simulation so that it matches the data [25]. Lepton energy scale and resolution corrections are derived using J/ψ meson and Z boson leptonic decays in bins of pT and η.
The lepton final-state radiation is recovered following an algorithm described in Ref. [4].
Jets are reconstructed from particle-flow candidates using the anti-k T clustering algorithm [54], as implemented in the FASTJET package [55], with a distance parameter of 0.4. To ensure a good reconstruction efficiency and to reduce the instrumental background as well as the contamination from pileup, loose identification criteria based on the multiplicities and energy fractions carried by charged and neutral hadrons are imposed on jets [56]. Only jets in the tracker fiducial region (|η| < 2.4) are considered.
Jet energy corrections are extracted from data and simulated events to account for the effects of pileup, non-uniformity of the detector response, and residual differences between the jet energy scale in the data and in the simulation. The jet energy scale calibration [57][58][59] relies on corrections parametrized in terms of the raw (uncorrected) p T and η of the jet. In order to ensure that jets are well measured and to reduce the pileup contamination, all jets must have a corrected p T > 20 GeV.
Jets originating from the fragmentation of one b quark are identified by using the DeepCSV algorithm [60], which combines impact parameter significance, secondary vertex, and jet kinematical information to provide a final output discriminator whose score is computed using a deep artificial neural network. Data-to-simulation b-tagging correction factors are applied to simulated events as a function of jet p T , η, and flavour.
A signal event must contain at least two Z boson candidates, each formed from a pair of isolated electrons or muons of opposite electric charge; these leptons are selected according to the studies described in Ref. [4]. Only reconstructed electrons (muons) with pT > 7 (5) GeV are considered. Among the four leptons, the highest-pT one must have pT > 20 GeV, while the second-highest-pT one must have pT > 12 (10) GeV if it is an electron (muon). All leptons are required to be separated from each other by ∆R(ℓ1, ℓ2) > 0.02, and electrons are required to be separated from muons by ∆R(e, µ) > 0.05.
Within each event, all permutations of leptons giving a valid pair of Z boson candidates are considered. For each ZZ candidate, the lepton pair with the invariant mass closest to the nominal Z boson mass is denoted Z1 and is required to have a mass greater than 40 GeV. The other dilepton candidate is denoted Z2 and must satisfy 12 < m(Z2) < 120 GeV, where the lower cut discards low-mass dilepton resonances and the upper threshold corresponds to the upper limit of the Z1 mass range, as described in Ref. [4]. All pairs of oppositely charged leptons that can be built from the ZZ candidate, regardless of flavour, are required to have an invariant mass mℓℓ larger than 4 GeV, to suppress the QCD background with low transverse momentum leptons from hadron decays. If more than one ZZ candidate passes the selection, the one with the highest scalar pT sum of the leptons is retained.
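The Z1/Z2 pairing logic above can be sketched with a simplified event model (leptons as dicts of four-momentum components, charge, and flavour; all field and function names are assumptions, and the mass and ∆R vetoes on individual lepton pairs are omitted for brevity):

```python
import math
from itertools import combinations

MZ = 91.1876  # nominal Z boson mass in GeV

def inv_mass(l1, l2):
    """Invariant mass of a lepton pair from four-momentum components."""
    e = l1["E"] + l2["E"]
    px = l1["px"] + l2["px"]
    py = l1["py"] + l2["py"]
    pz = l1["pz"] + l2["pz"]
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

def pt(l):
    return math.hypot(l["px"], l["py"])

def select_zz(leptons):
    """Choose the Z1/Z2 pairing as described in the text (simplified model)."""
    # Same-flavour, opposite-charge dilepton candidates
    pairs = [(a, b) for a, b in combinations(leptons, 2)
             if a["flav"] == b["flav"] and a["q"] + b["q"] == 0]
    candidates = []
    for za, zb in combinations(pairs, 2):
        if {id(l) for l in za} & {id(l) for l in zb}:
            continue  # the two Z candidates must not share a lepton
        (z1, m1), (z2, m2) = sorted([(za, inv_mass(*za)), (zb, inv_mass(*zb))],
                                    key=lambda t: abs(t[1] - MZ))
        if m1 > 40.0 and 12.0 < m2 < 120.0:
            candidates.append((z1, z2))
    # Keep the candidate with the largest scalar pT sum of its four leptons
    return max(candidates,
               key=lambda c: sum(pt(l) for z in c for l in z),
               default=None)

# Toy event: a muon pair at m = 90 GeV and an electron pair at m = 30 GeV.
l1 = {"px": 45.0, "py": 0.0, "pz": 0.0, "E": 45.0, "q": 1, "flav": "mu"}
l2 = {"px": -45.0, "py": 0.0, "pz": 0.0, "E": 45.0, "q": -1, "flav": "mu"}
l3 = {"px": 0.0, "py": 15.0, "pz": 0.0, "E": 15.0, "q": 1, "flav": "e"}
l4 = {"px": 0.0, "py": -15.0, "pz": 0.0, "E": 15.0, "q": -1, "flav": "e"}
zz = select_zz([l1, l2, l3, l4])
```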
In order to search for HH production, in addition to the ZZ selections, events are required to have at least two jets with pT > 20 GeV, separated from the leptons of the ZZ candidate by ∆R(lepton, jet) > 0.3. At this stage, no requirement on the dijet invariant mass or on the DeepCSV b-tagging score of the jets is applied. The bb candidate is formed from the two jets with the highest b-tagging score.
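A minimal sketch of this jet cleaning and bb pairing, under an assumed dict-based event model:

```python
import math

def delta_r(a, b):
    """DeltaR separation between two objects with eta/phi fields."""
    dphi = math.remainder(a["phi"] - b["phi"], 2.0 * math.pi)
    return math.hypot(a["eta"] - b["eta"], dphi)

def bb_candidate(jets, zz_leptons):
    """Clean jets against the ZZ leptons, then pick the two highest b-tag scores."""
    cleaned = [j for j in jets
               if j["pt"] > 20.0
               and all(delta_r(j, l) > 0.3 for l in zz_leptons)]
    if len(cleaned) < 2:
        return None
    return sorted(cleaned, key=lambda j: j["btag"], reverse=True)[:2]

# Toy event: the first jet overlaps a lepton and is discarded.
jets = [{"pt": 30.0, "eta": 0.0, "phi": 0.0, "btag": 0.9},
        {"pt": 25.0, "eta": 1.0, "phi": 1.0, "btag": 0.2},
        {"pt": 50.0, "eta": -1.0, "phi": 2.0, "btag": 0.8}]
leptons = [{"eta": 0.0, "phi": 0.1}]
bb = bb_candidate(jets, leptons)
```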
The 4ℓ signal region (SR) is defined by the additional requirement on the four-lepton invariant mass of 115 < m(4ℓ) < 135 GeV. Events that do not satisfy this requirement are included in the 4ℓ control region (CR). The expected yields in the SR after the above selection are reported in Table 1. Figure 2 shows the m(4ℓ) and m(bb) distributions for the signal, the estimated background components, and the data.

Irreducible background
The irreducible background to the HH signal is estimated from simulation. One of the main background contributions is the production of a single H decaying into ZZ*, whose description is provided by the POWHEG v2 generator. For the gluon fusion H production mode, NNLO parton shower weights are applied. Another large contribution is the qq → ZZ* process, which is simulated at NLO and then reweighted with NNLO/NLO K-factors to account for the missing higher orders in the cross section. The gg → ZZ* contribution is likewise estimated from simulation and reweighted with NNLO K-factors. Other backgrounds, such as ttZ and ttW, are also estimated from simulation.

Reducible background
The reducible background (Z+X) originates from processes that contain one or more nonprompt leptons (leptons not originating from the hard-interaction primary vertex). The main sources of nonprompt leptons are nonisolated electrons and muons from decays of heavy-flavour hadrons, misidentified jets, and photon conversions. The number of expected events for these background processes is estimated by measuring the probabilities f_e and f_µ for misidentified electrons and muons that pass looser selection criteria to also pass the final selection criteria, in selected samples of Z(ℓℓ) + e + 2 jets and Z(ℓℓ) + µ + 2 jets events, where the additional lepton (e or µ) satisfies a loose selection. The misidentification probabilities are evaluated with the requirement |m(ℓ1ℓ2) − mZ| < 7 GeV, which reduces the contribution from asymmetric photon conversions, dominant at low dilepton masses. The misidentification rates are measured in bins of the pT of the loosely identified lepton, separately for the barrel and endcap regions.
To estimate the Z+X yield in the SR, two control samples are defined as subsets of Z + 2 jets events that pass the Z candidate selection, requiring an additional pair of loose leptons of the same flavour and opposite charge that pass the impact parameter selection. The first control sample is obtained by requiring that the two loose leptons, which do not form the Z1 candidate, fail the final identification and isolation criteria (2P2F region). The second control sample is obtained by requiring that one of the four leptons fails the final identification and isolation criteria while the other three pass them (3P1F region).
The expected number of events from reducible backgrounds in the SR can be written as described in Ref. [4]:

N(Z+X, SR) = (1 − N(ZZ, 3P1F)/N(3P1F)) Σ_{i ∈ 3P1F} f_i/(1 − f_i) − Σ_{ij ∈ 2P2F} [f_i/(1 − f_i)][f_j/(1 − f_j)],

where N(2P2F) is the observed number of events in the first control sample, N(3P1F) is the observed number of events in the second control sample, the second term accounts for N(bkg, 3P1F), the contamination of 2P2F events in the 3P1F CR, N(ZZ, 3P1F) is the number of 3P1F background events with four genuine leptons and two jets, estimated from MC simulation, and f_i and f_j are the estimated misidentification probabilities of the two loose leptons.
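A toy numerical evaluation of this control-region extrapolation (illustrative misidentification probabilities and event counts, not measured ones) might look like:

```python
def zx_estimate(f_3p1f, f_pairs_2p2f, n_3p1f, n_zz_3p1f):
    """Evaluate the Z+X prediction from the 3P1F and 2P2F control regions.

    f_3p1f: misidentification probability of the loose lepton in each 3P1F event;
    f_pairs_2p2f: (f_i, f_j) of the two loose leptons in each 2P2F event;
    the genuine-ZZ contamination of the 3P1F region is removed as a fraction.
    """
    term_3p1f = sum(f / (1.0 - f) for f in f_3p1f)
    term_2p2f = sum((fi / (1.0 - fi)) * (fj / (1.0 - fj))
                    for fi, fj in f_pairs_2p2f)
    return (1.0 - n_zz_3p1f / n_3p1f) * term_3p1f - term_2p2f

# Illustrative inputs: two 3P1F events, one 2P2F event, small ZZ contamination.
n_zx = zx_estimate([0.1, 0.2], [(0.1, 0.1)], n_3p1f=2, n_zz_3p1f=0.5)
```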
The dominant systematic uncertainty in the reducible background estimation arises from the limited number of events both in the CR and in the region where the misidentification rates are computed. Uncertainties arising from the difference in composition between the sample where the misidentification rate is computed and the sample where it is applied are also taken into account.

Multivariate analysis
A boosted decision tree (BDT) is used to further discriminate the signal from the backgrounds. The BDT is implemented in the TMVA tool of the ROOT analysis package [61] and trained using simulated events in the 4ℓ SR; the training uses the gluon-gluon fusion HH process as signal and all the other processes as background.
An optimal set of input variables was chosen by maximizing the area under the receiver operating characteristic curve of the resulting discriminator. The optimized set contains ten variables: the pT of each of the four leptons, the ∆R between the reconstructed H → ZZ* → 4ℓ and H → bb systems, the two b-tagging scores, the pT of each of the two jets with the highest b-tagging score, and their invariant mass. The most important variables for the BDT training are the b-tagging scores of the two selected jets, the invariant mass of the dijet system, and the ∆R between the reconstructed H → ZZ* → 4ℓ and H → bb systems.
To take into account possible differences due to data-taking conditions and final states, the BDT is trained separately for each data-taking year and for each of the leptonic final states (4µ, 4e, and 2e2µ), for a total of nine independent discriminators. Detailed studies were performed on the discrimination ranking, variable correlations, and possible discriminator overtraining. Figure 3 shows the BDT discriminator distribution for the simulated signal, the estimated background components, and the data, with the data-taking years and leptonic final states (4µ, 4e, and 2e2µ) combined. The signal and background normalizations result from the fit procedure described in Section 8. The binning of the BDT output distribution is chosen so that the last four or five bins have nearly the same statistical uncertainty, around 10%, which ensures results that are stable under statistical fluctuations in the most sensitive bins.
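As a rough stand-in for the TMVA training (scikit-learn's gradient-boosted trees here, with toy inputs loosely mimicking two of the ten variables; all distributions and parameters are illustrative assumptions), the training step might look like:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(7)
n = 500

# Toy stand-ins for two inputs: a b-tag score and m(bb) in GeV.
# Signal peaks at high score and m(bb) near 125 GeV; background does not.
sig = np.column_stack([rng.beta(5, 2, n), rng.normal(125.0, 15.0, n)])
bkg = np.column_stack([rng.beta(2, 5, n), rng.uniform(30.0, 250.0, n)])
X = np.vstack([sig, bkg])
y = np.concatenate([np.ones(n), np.zeros(n)])

bdt = GradientBoostingClassifier(n_estimators=100, max_depth=3, random_state=0)
bdt.fit(X, y)
auc = roc_auc_score(y, bdt.predict_proba(X)[:, 1])
```

In the real analysis the discriminator is of course validated on statistically independent samples to guard against overtraining; the AUC here is computed on the training set purely for illustration.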
The output of the BDT is used in a statistical analysis as the input variable of a maximum likelihood fit to set constraints on both the signal strength and the κ λ coupling modifier.

Systematic uncertainties
Systematic uncertainties are modelled in the maximum likelihood fit as nuisance parameters. Two types of nuisance parameters are considered: normalization uncertainties, describing possible modifications of the event yields, and shape uncertainties, which also account for variations of the shape of the distributions. The former are assigned a log-normal probability density function, while the latter are included as template shape variations and taken into account via continuous template morphing [62].
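The effect of a log-normal normalization nuisance on a yield can be illustrated with a one-line model (the function name and the 2.5% example value are illustrative):

```python
def lognormal_yield(nominal, kappa, theta):
    """Yield of a process under a log-normal nuisance parameter.

    theta is the nuisance in units of its standard deviation, and
    kappa = 1 + relative uncertainty (e.g. kappa = 1.025 for a 2.5%
    luminosity uncertainty); theta = 0 returns the nominal yield.
    """
    return nominal * kappa ** theta

y_nom = lognormal_yield(100.0, 1.025, 0.0)  # nominal
y_up = lognormal_yield(100.0, 1.025, 1.0)   # +1 standard deviation
```

Because the yield is multiplicative in κ**θ, it stays positive for any θ, which is why log-normal constraints are preferred over Gaussian ones for normalizations.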
The experimental uncertainties arise from several sources. The uncertainty in the integrated luminosity is 1.2% for 2016 [63], 2.3% for 2017 [64], and 2.5% for 2018 [65]. The uncertainty in the lepton identification and reconstruction efficiency ranges from 1.0 to 15.5% in the overall event yield for the different final states. Muon uncertainties are always below 3%, while electron uncertainties surpass 10% for electrons with pT < 20 GeV and |η| > 1.5. The uncertainties in the jet energy scale (JES) and jet energy resolution (JER) are accounted for by varying the jet pT by one standard deviation for each of the several uncorrelated sources considered [57]. The effects on the signal acceptance are also accounted for. Uncertainties in the b-tagging scale factors are computed and applied as a function of the jet pT, η, and hadron flavour. The above uncertainties apply equally to all simulated signal and background samples. The JES, JER, and b-tagging scale factor uncertainties are treated as shape uncertainties. The JES, together with the statistical uncertainties in the yields of the last bins of the BDT distribution, are the systematic effects with the largest impact on the final result. All the other sources are treated as normalization uncertainties. The experimental uncertainties originating from the reducible background estimation, described in Section 5.2, are also considered; the main contribution comes from the mismatch in background composition between the samples where the misidentification probability is derived and those where it is applied. A summary of the experimental systematic uncertainties is reported in Table 2.
Theoretical uncertainties arise from the choice of the PDF set, the uncertainty in αS, and the renormalization and factorization scales [66]; they affect both signal and background processes. For the HH signal, an additional uncertainty caused by missing finite mtop effects [20] also contributes. No additional uncertainties are applied to H production without real b jets, since they are negligible compared with all the other uncertainties involved. Additional uncertainties of 10% in the gg → ZZ* prediction and 0.1% in the qq → ZZ* prediction are considered to account for the uncertainty in the applied K-factors, described in Section 3.
The uncertainty in the luminosity is treated as partially correlated across the data-taking periods. All the other experimental uncertainties are considered uncorrelated: they are estimated independently for each year, and their dominant statistical component is therefore uncorrelated. All theoretical uncertainties are considered correlated among the three data-taking years. A summary of the theoretical systematic uncertainties is reported in Table 3.
This analysis is dominated by statistical uncertainties; the overall impact of the systematic uncertainties is therefore small.

Results
No significant excess of events is observed over the expected backgrounds. Upper limits are set on the HH production cross section times the branching fraction B(HH → 4ℓbb), using the modified frequentist approach (CLs) with the profile likelihood as test statistic [67][68][69][70] in the asymptotic approximation. The limits are compared with the theoretical predictions assuming SM branching fractions for the H decays.
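The CLs construction can be illustrated for a single-bin counting experiment (Poisson counts, no nuisance parameters); this is a toy stand-in for the full profile-likelihood, asymptotic machinery used in the paper, not a reproduction of it:

```python
from scipy.stats import poisson

def cls_upper_limit(n_obs, b, cl=0.95):
    """CLs upper limit on the signal yield s for one Poisson bin.

    CLs(s) = P(n <= n_obs | s + b) / P(n <= n_obs | b); the limit is the
    value of s at which CLs crosses 1 - cl, found by bisection (CLs is
    monotonically decreasing in s).
    """
    clb = poisson.cdf(n_obs, b)

    def cls(s):
        return poisson.cdf(n_obs, s + b) / clb

    lo, hi = 0.0, 10.0
    while cls(hi) > 1.0 - cl:
        hi *= 2.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if cls(mid) > 1.0 - cl else (lo, mid)
    return 0.5 * (lo + hi)

# With zero observed events the CLs limit reduces to -ln(0.05), about 3 signal events.
limit = cls_upper_limit(n_obs=0, b=0.5)
```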
The results are extracted with a binned maximum likelihood fit to the nine BDT distributions in data. Shape templates for the BDT score for signal are obtained from simulated samples and for backgrounds from both simulated samples and data, as described in Section 6. Systematic uncertainties are treated as described in Section 7.
The observed (expected) upper limit on the signal strength modifier µ, defined as the ratio of the HH production rate in the 4ℓbb final state to the SM expectation, is 32.4 (39.6) at 95% CL, as illustrated in Fig. 4 (left).
Upper limits are also set for different hypotheses of anomalous values of the H self-coupling modifier κλ, assuming all the other couplings [71] are equal to their SM values. The result is shown in Fig. 4 (right), where the exclusion is compared with the theoretical prediction for the HH cross section (red line). The analysis constrains κλ to be within the observed (expected) range −8.8 (−9.8) < κλ < 13.4 (15.0) at 95% CL.
The limits set by this analysis are obtained with a data set which is independent from those used for the other channels [11,12].

The results presented in this section are provided in a tabulated form in the HEPDATA record [72] for this analysis.

Summary
The first search for nonresonant Higgs boson (H) pair production in the four-lepton plus bb final state is presented. The search uses a data sample collected in proton-proton collisions at √s = 13 TeV by the CMS detector during the years 2016-2018, corresponding to an integrated luminosity of 138 fb⁻¹. The results are found to be compatible with the standard model (SM) expectations. An observed (expected) upper limit on the HH production cross section is set at 32.4 (39.6) times the SM expected rate at the 95% confidence level. Possible modifications of the H trilinear coupling λHHH with respect to the SM value are investigated. The coupling modifier κλ, defined as λHHH divided by its SM prediction, is constrained to be within the observed (expected) range −8.8 (−9.8) < κλ < 13.4 (15.0) at 95% confidence level.

Acknowledgments
We congratulate our colleagues in the CERN accelerator departments for the excellent performance of the LHC and thank the technical and administrative staffs at CERN and at other CMS institutes for their contributions to the success of the CMS effort. In addition, we gratefully acknowledge the computing centres and personnel of the Worldwide LHC Computing Grid and other centres for delivering so effectively the computing infrastructure essential to our analyses. Finally, we acknowledge the enduring support for the construction and operation of the LHC, the CMS detector, and the supporting computing infrastructure provided by the following funding agencies:

References
[7] ATLAS Collaboration, "Search for Higgs boson pair production in the γγbb final state using pp collision data at √s = 8 TeV from the ATLAS detector", Phys. Rev. Lett. 114 (2015) 081802, doi:10.1103/PhysRevLett.114.081802, arXiv:1406.5053.