Measurement of the Higgs boson production via vector boson fusion and its decay into bottom quarks in proton-proton collisions at √s = 13 TeV

A measurement of the Higgs boson (H) production via vector boson fusion (VBF) and its decay into a bottom quark-antiquark pair (bb) is presented using proton-proton collision data recorded by the CMS experiment at √s = 13 TeV and corresponding to an integrated luminosity of 90.8 fb⁻¹. Treating the gluon-gluon fusion process as a background and constraining its rate to the value expected in the standard model (SM) within uncertainties, the signal strength of the VBF process, defined as the ratio of the observed signal rate to that predicted by the SM, is measured to be µ^{qqH}_{Hbb} = 1.01^{+0.55}_{−0.46}. The VBF signal is observed with a significance of 2.4 standard deviations relative to the background prediction, while the expected significance is 2.7 standard deviations. Considering inclusive Higgs boson production and decay into bottom quarks, the signal strength is measured to be µ^{incl.}_{Hbb} = 0.99^{+0.48}_{−0.41}, corresponding to an observed (expected) significance of 2.6 (2.9) standard deviations.


Introduction
Following the discovery of a scalar boson with a mass near 125 GeV by the ATLAS and CMS Collaborations [1][2][3] at the CERN LHC, extensive studies of the properties of this particle have been pursued. A multitude of measurements performed thus far in various production and decay modes support the hypothesis that the discovered particle is the Higgs boson (H) described by the standard model (SM) that arises as a consequence of the Brout-Englert-Higgs mechanism of electroweak (EW) symmetry breaking [4][5][6][7][8][9].
The SM Higgs boson with a mass of 125 GeV decays most frequently into a bottom quark-antiquark (bb) pair with a branching fraction of about 58% [10,11]. However, it is challenging to explore this decay mode experimentally. In the dominant gluon-gluon fusion (ggH) production mode, the H → bb signal is overwhelmed by a background consisting of bb pairs produced through the strong interaction, referred to as quantum chromodynamics (QCD) multijet events. At the LHC, a moderate sensitivity to the H → bb decay in the ggH process can be achieved by exploiting Lorentz-boosted production of the Higgs boson [12,13]. The most promising production mechanism to study the H → bb decay is the Higgs boson production in association with a leptonically decaying W or Z boson (VH). Although the VH production cross section is more than an order of magnitude smaller than that of ggH, leptonic decays of W and Z bosons provide a handle to reduce backgrounds, thereby making the H → bb decay accessible for detection. The VH production mode has contributed the largest sensitivity to the observation of the H → bb decay by the CMS and ATLAS Collaborations. The ATLAS Collaboration has measured the H → bb signal yield relative to the SM prediction to be µ^{incl.}_{Hbb} = 1.02 ± 0.12 (stat) ± 0.14 (syst), corresponding to a significance of 6.7 standard deviations (σ) [14], using 139 fb⁻¹ of pp collision data collected at a center-of-mass energy (√s) of 13 TeV. The measurement by CMS is µ^{incl.}_{Hbb} = 1.04 ± 0.14 (stat) ± 0.14 (syst) and corresponds to a significance of 5.6 σ [15], obtained by combining VH with other production modes using the data collected at √s = 7, 8, and 13 TeV in 2011-2017. Both the ATLAS and CMS Collaborations have also pursued studies of the Higgs boson production in association with top quark-antiquark pairs, followed by the H → bb decay. The rate of this process relative to the SM prediction has been measured to be 0.35^{+0.36}_{−0.34} by the ATLAS Collaboration [16] and 0.33 ± 0.26 by the CMS Collaboration [17].
This paper considers the vector boson fusion (VBF) production mechanism for the detection of the H → bb signal. The VBF production of the Higgs boson has the second-largest cross section at the LHC and attracts particular attention because it involves large momentum transfer and provides a sensitive probe of momentum-dependent anomalous couplings [18].
The VBF production followed by the H → bb decay, qqH → qqbb, gives rise to a four-jet final state as depicted in Fig. 1. Two of the jets, from the H → bb decay, typically lie in the central region of the detector. The other two jets, from the light quarks, are produced mainly in the forward and backward directions relative to the beam line and, consequently, have a large rapidity separation between them, as well as a high dijet invariant mass. We refer to the latter as VBF jets. As the interaction proceeds via the exchange of colorless particles (V = W or Z), the color connection between the outgoing light-flavor quarks is suppressed, leading to a relatively small amount of hadronic activity in the rapidity interval between the VBF jets and the b-tagged jets originating from the Higgs boson decay. These distinct features allow for the suppression of the large QCD multijet background and the identification of the signal process.
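To make the two VBF jet observables concrete, the following minimal Python sketch computes the pseudorapidity separation and the invariant mass of a jet pair from (p_T, η, ϕ, mass) values. The tuple layout and the function names are illustrative assumptions and are not taken from the CMS software.

```python
import math

def four_vector(pt, eta, phi, mass):
    """Convert (pT, eta, phi, mass) in GeV to Cartesian (E, px, py, pz)."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return energy, px, py, pz

def dijet_mass(jet1, jet2):
    """Invariant mass of the two-jet system; each jet is (pT, eta, phi, mass)."""
    e1, px1, py1, pz1 = four_vector(*jet1)
    e2, px2, py2, pz2 = four_vector(*jet2)
    m2 = (e1 + e2) ** 2 - (px1 + px2) ** 2 - (py1 + py2) ** 2 - (pz1 + pz2) ** 2
    return math.sqrt(max(m2, 0.0))  # guard against tiny negative rounding errors

def delta_eta(jet1, jet2):
    """Absolute pseudorapidity separation of two jets."""
    return abs(jet1[1] - jet2[1])

# Hypothetical forward/backward jet pair (massless, back to back in phi).
forward = (100.0, 2.0, 0.0, 0.0)
backward = (100.0, -2.0, math.pi, 0.0)
print(delta_eta(forward, backward), dijet_mass(forward, backward))
```

For massless jets the result agrees with the closed form m_jj² = 2 p_T1 p_T2 (cosh ∆η − cos ∆ϕ), which shows directly why a large rapidity gap implies a large dijet mass.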
The previous measurement of the qqH → qqbb process by the CMS Collaboration is based on a data set of proton-proton (pp) collisions at √s = 8 TeV, corresponding to an integrated luminosity of about 20 fb⁻¹ [19]. The signal strength was observed to be µ^{qqH}_{Hbb} = 2.8^{+1.6}_{−1.4} with a significance of 2.2 σ, while the expected significance was 0.8 σ. The ATLAS Collaboration recently reported a measurement of the qqH → qqbb process using data collected in pp collisions at √s = 13 TeV, corresponding to an integrated luminosity of 127 fb⁻¹ [20]; the signal strength was measured to be µ^{qqH}_{Hbb} = 0.95^{+0.38}_{−0.36}, corresponding to an observed (expected) significance of 2.6 (2.8) σ.
In this paper, we present a measurement of the qqH → qqbb process using data collected by CMS at √s = 13 TeV. The paper is organized as follows. In Section 2 the main features of the CMS detector are outlined. Section 3 describes the reconstruction of physics objects relevant for this analysis. Data sets and simulated samples used in the study are presented in Section 4, and Section 5 describes the triggers employed. Details of the offline analysis, including the event selection and categorization, the improvement of the jet transverse momentum (p_T) resolution by regression techniques, background estimation methods, and the signal extraction procedure, are discussed in Section 6. Section 7 details the main systematic uncertainties affecting the analysis. The final results are discussed in Section 8 and a brief summary is given in Section 9. The tabulated results are provided in a HEPData record [21].

The CMS detector
The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T along the beam direction. Within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter (HCAL), each composed of a barrel and two endcap sections. The ECAL consists of 75 848 lead tungstate crystals, which provide coverage in pseudorapidity |η| < 1.48 in the barrel (EB) region and 1.48 < |η| < 3.00 in the two endcap (EE) regions. Preshower detectors consisting of two planes of silicon sensors interleaved with a total of 3 radiation lengths of lead are located in front of each endcap detector. Forward calorimeters extend the η coverage provided by the barrel and endcap detectors. Muons are detected in gas-ionization chambers embedded in the steel flux-return yoke outside the solenoid.
Events of interest are selected using a two-tiered trigger system [22]. The first level (L1), composed of custom hardware processors, uses information from the calorimeters and the muon subdetectors to select events at a rate of around 100 kHz within a fixed time interval of less than 4 µs. The second level, known as the high-level trigger (HLT), consists of a farm of processors running a version of the full event reconstruction software optimized for fast processing, and reduces the event rate to around 1 kHz before data storage [23].
A more detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in Ref. [24].

Event reconstruction
The reconstruction of the physics objects at CMS is based on the particle-flow (PF) algorithm [25], which aims to reconstruct and identify each individual particle (PF candidate) in an event, with an optimized combination of information from the various elements of the CMS detector.
For each event, hadronic jets are clustered from these reconstructed particles using the infrared- and collinear-safe anti-k_T algorithm [26,27] with a distance parameter of 0.4. The jet momentum is determined as the vectorial sum of all particle momenta in the jet, which is typically within 5 to 10% of the true momentum over the entire range of kinematically allowed p_T and the detector acceptance. Additional pp interactions within the same or nearby bunch crossings (pileup) increase the track multiplicity and calorimetric energy depositions, which potentially affects the jet momentum. To mitigate this effect, charged particles identified as originating from pileup vertices are discarded and an offset correction is applied to account for the remaining contributions. Jet energy corrections are derived from simulation to match the measured response of the reconstructed jets to that of particle-level jets on average. In-situ measurements of the momentum balance in dijet, photon+jets, Z+jets, and multijet events are used to account for any residual differences in the jet energy scale between data and simulation [28]. The jet energy resolution amounts typically to 15-20% at 30 GeV, 10% at 100 GeV, and 5% at 1 TeV [28]. Additional selection criteria are applied to remove jets potentially dominated by anomalous contributions from various subdetector components or reconstruction failures.
At the HLT, identification of the jets resulting from the hadronization of b quarks is performed with the combined secondary vertex algorithm (CSV) [29,30] and with its improved version using a deep machine learning technique (DEEPCSV) [30]. These algorithms combine the information from track impact parameters and secondary vertices identified within a given jet, and provide a continuous discriminator output. A jet with a CSV or DEEPCSV discriminant value above a certain threshold is considered to be b-tagged. The efficiency for tagging b jets and the rate at which light-flavor, gluon, and charm jets are misidentified as b jets depend on the chosen threshold, as well as on the jet p_T and η. The thresholds are chosen to ensure an online selection of b jets with an efficiency of 60-70%, corresponding to a misidentification rate of 2-3%. Identification of b jets in the offline analysis is performed with the DEEPJET tagger [31,32], which combines features of the individual jet constituents, properties of the reconstructed secondary vertices, and the global jet variables into a multiclass deep neural network (DNN) targeting three separate classes for the b jets: i) jet with one B hadron, ii) jet with two or more B hadrons, and iii) jet with semileptonic decays of B hadrons. The sum of the three b tag class DNN outputs quantifies the consistency of a jet with the b jet hypothesis. A jet is considered to be b-tagged in the offline selection if this DNN output, referred to as the b tag score, exceeds a certain threshold. These thresholds are chosen such that the b jet selection efficiency is 70-80%, depending on the data-taking period, while the probability of light and gluon jets being misidentified as b jets is about 1% and the probability of charm jets being misidentified as b jets is about 10%.
To minimize the discrepancy in the b tagging performance between data and simulation both at the HLT and in the offline analysis, separate jet flavor dependent calibration factors are applied to each jet in simulated events as a function of jet p_T, η, and the b tag discriminator score. These corrections have been derived using control samples of the semileptonic tt decays, Z+jets events, and the inclusive sample of QCD multijet events [30]. Determination of corrections at the HLT is facilitated by the control and prescaled single- and multi-jet triggers.

Data set and simulated samples
This analysis uses data collected by the CMS experiment in pp collisions at √s = 13 TeV corresponding to an integrated luminosity of 90.8 fb⁻¹. The total analyzed data volume comprises two sets recorded under different experimental conditions. The first set was collected in 2016 and corresponds to an integrated luminosity of 36.3 fb⁻¹. The second set was collected in 2018, when the LHC delivered pp collisions at higher instantaneous luminosity compared to 2016, and corresponds to an integrated luminosity of 54.5 fb⁻¹. No trigger path suitable for this analysis was available during the data-taking periods in 2017.
The analysis relies on simulated Monte Carlo (MC) samples for the estimation of the signal acceptance and efficiency and determination of the trigger efficiency.Additionally, the contributions from various subdominant background processes are determined from simulation.
The qqH → qqbb signal is generated using POWHEG 2.0 [33][34][35] at next-to-leading order (NLO) precision in the QCD coupling constant α_S [36]. A dipole parton shower model [37] incorporated in PYTHIA 8.212 [38] is used to emulate the initial-final state color flow, taking into account the color connection between the incoming and the outgoing partons. An alternative qqH → qqbb sample is also prepared using the POWHEG matrix element generator interfaced with HERWIG 7 (version 7.1) [39] for fragmentation and hadronization. This sample is used only to assess the systematic uncertainty related to the choice of the showering and hadronization model. The ggH production of the Higgs boson with at least two accompanying jets has a nonnegligible contribution to the kinematic phase space considered in this analysis. The ggH process is generated using the MINLO event generator at next-to-NLO (NNLO) [40,41] precision in α_S [42], including finite top quark mass effects. An NLO process that may include loops needs to be hadronized with a parton shower model beyond LO. As a result, jet matching must take into account additional hard radiative jets. This is implemented, for this analysis, by employing the FxFx scheme [43]. Contributions from the weak gauge boson associated (VH) and top-quark-pair associated (ttH) production of the Higgs boson are found to be negligible.
The dominant continuum background for this analysis is QCD multijet production. To assess properties of these events and validate the analysis strategy, QCD multijet events are generated with MADGRAPH5_aMC@NLO [44] at leading order (LO) precision in α_S. The matrix element is matched to the parton showers generated by PYTHIA using the MLM prescription [45]. Generated samples of QCD multijet production are also used to derive corrections applied to simulated samples to account for differences in the trigger efficiency between data and simulation.
The dominant resonant background is the inclusive production of Z bosons (Z+jets). About 70% of the time, Z bosons decay via a quark-antiquark pair of the same flavor, including a 15% branching fraction into a pair of bottom quarks. Hence, the main component of the resonant background in the event sample corresponds to the Z → bb decay mode, although there are contributions from charm and light-flavor quark jets being misidentified as b jets. There are two different mechanisms of Z+jets production: via quark-antiquark annihilation (qq → Z) and fusion of W bosons (WW → Z). Though the latter has a much lower rate, the event topology is the same as for the signal. W bosons can be produced in association with jets in similar ways. The inclusive W production is generated using MADGRAPH5_aMC@NLO at LO precision in α_S. The same generator is used to produce mutually exclusive qq → Z and WW → Z samples. The qq → Z process is simulated considering Feynman diagrams involving direct electroweak coupling of the Z boson to quark-antiquark pairs. In the generation of the WW → Z process, only the Feynman diagrams that contain the WWZ triple gauge boson coupling are considered in the computation of the matrix element. As in the case of the qqH → qqbb signal, the WW → Z process is simulated using a dipole parton shower model implemented in PYTHIA.
For the Z+jets samples, correction factors have been applied to match the generator-level p_T distributions with analytic predictions available with the highest order accuracy in the perturbative expansion [46]. Two individual correction factors are applied to the qq → Z process in each p_T bin: the first emulates the spectrum predicted by perturbative QCD with NNLO accuracy, and the second is a further reweighting that incorporates the higher-order EW effects. For the WW → Z mechanism, only higher-order EW correction factors are applied.
Other important background contributions in the signal region arise from inclusive single top quark (t/t+X) and top quark pair (tt+X) production. These are modeled by POWHEG [47,48] at the NLO QCD precision. For the tt process, all possible combinations of the decay modes of the two W bosons from the top quark and antiquark decays are considered, leading to hadronic, leptonic, and semileptonic final states.
All the simulated samples except the HERWIG signal sample are interfaced with PYTHIA 8.212 for parton showering and fragmentation with the standard p_T-ordered parton shower scheme. The underlying event is modeled with PYTHIA, using the CP5 tune [49] for both years. The parton distribution functions (PDFs) are taken from the NNPDF3.0 [50] and NNPDF3.1 [51] sets for the 2016 and 2018 samples, respectively. The response of the CMS detector is modeled using the GEANT4 [52] package. The event reconstruction is performed with the same algorithms as are used for the data. Additional pileup interactions in each bunch crossing are generated with PYTHIA and added to the simulated samples following a Poisson distribution with the mean value determined in the data. The simulated events are weighted such that the pileup distribution in the simulation matches the one observed in the data.
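The pileup weighting described above amounts to a per-bin ratio of normalized distributions. The sketch below illustrates the idea under the assumption that both distributions are available as histograms of the number of pileup interactions with identical binning; the function name and the toy numbers are hypothetical.

```python
def pileup_weights(data_hist, sim_hist):
    """Per-bin weights w = P_data / P_sim from two pileup histograms.

    Both inputs are lists of counts with identical binning. Bins that are
    empty in simulation get weight 0, since no simulated event populates them.
    """
    if len(data_hist) != len(sim_hist):
        raise ValueError("histograms must share the same binning")
    data_total = float(sum(data_hist))
    sim_total = float(sum(sim_hist))
    weights = []
    for n_data, n_sim in zip(data_hist, sim_hist):
        if n_sim > 0:
            weights.append((n_data / data_total) / (n_sim / sim_total))
        else:
            weights.append(0.0)
    return weights

# Toy example: simulation has too few high-pileup events.
data = [10, 30, 60]
sim = [25, 50, 25]
weights = pileup_weights(data, sim)
print(weights)
```

Applying the weights to the simulated histogram reproduces the shape of the data distribution while leaving the total simulated yield unchanged, provided every populated data bin is also populated in simulation.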

Triggers
Events are selected with dedicated L1 and HLT selections optimized separately for 2016 and 2018. At the L1 stage, events are required to have at least three jets with p_T above certain thresholds that were optimized according to the instantaneous luminosity. The p_T thresholds of 90, 76, and 64 GeV (100, 80, and 70 GeV) were imposed for 2016 (2018). The presence of a fourth jet is not required at the L1 stage.
An event is accepted by the HLT if it contains at least four jets reconstructed online with the PF algorithm. Jets are required to have p_T greater than 92, 76, 64, and 15 (105, 88, 76, and 15) GeV in 2016 (2018). Two complementary online requirements (HLT paths), as explained below, are also implemented to select events in each of the two sets. In the following, we refer to these HLT paths as HLT Tight and HLT Loose.
The HLT Tight path selects events with at least one b-tagged jet among the six leading jets ordered by p_T. The b tagging is performed with the CSV (DEEPCSV) algorithm in 2016 (2018). The working point chosen for the online b tagging in the HLT Tight path corresponds to a b jet efficiency of roughly 60 to 70% and a misidentification rate for light-flavor quark and gluon jets of 3 to 4%. In addition to the b tagging selection, a VBF selection is implemented in the HLT path, which considers only the four p_T-leading jets. Among these four jets, the one with the highest CSV (DEEPCSV) discriminant is considered to be a jet from the Higgs boson decay, and out of the three remaining jets, the jet pair with the largest ∆η_jj is chosen as the pair of VBF-tagged jets. The last remaining jet is considered to be the other jet from the decay of the Higgs boson. The HLT Tight path imposes very stringent conditions on the VBF-tagged jets: they must be separated in pseudorapidity by ∆η_jj > 4.1 (3.5) and have invariant mass m_jj > 500 (460) GeV in 2016 (2018). For the other two jets, which are assigned to the H → bb decay, the separation in azimuthal angle is required to be ∆ϕ_bb < 1.6 (1.9) for 2016 (2018).
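The online jet assignment described above can be sketched in a few lines of Python. The dictionary-based jet representation, the key names, and the function name are assumptions made for illustration; the sketch covers only the assignment logic, not the b tagging or the kinematic requirements of the HLT paths.

```python
from itertools import combinations

def assign_hlt_jets(jets):
    """Assign the four pT-leading jets to the H -> bb and VBF roles.

    jets: list of dicts with keys 'pt', 'eta', and 'btag' (online discriminant).
    Returns ((b_jet_1, b_jet_2), (vbf_jet_1, vbf_jet_2)).
    """
    leading = sorted(jets, key=lambda j: j["pt"], reverse=True)[:4]
    if len(leading) < 4:
        raise ValueError("at least four jets are required")
    # Step 1: the jet with the highest discriminant is the first b jet.
    b_first = max(leading, key=lambda j: j["btag"])
    rest = [j for j in leading if j is not b_first]
    # Step 2: of the three remaining jets, the pair with the largest
    # pseudorapidity separation is the VBF-tagged pair.
    vbf_pair = max(combinations(rest, 2),
                   key=lambda pair: abs(pair[0]["eta"] - pair[1]["eta"]))
    # Step 3: the last remaining jet is the second b jet.
    b_second = next(j for j in rest
                    if j is not vbf_pair[0] and j is not vbf_pair[1])
    return (b_first, b_second), vbf_pair

# Hypothetical event with five jets; the softest one is ignored.
event = [
    {"pt": 120.0, "eta": 0.3, "btag": 0.90},
    {"pt": 100.0, "eta": -2.8, "btag": 0.10},
    {"pt": 90.0, "eta": 3.0, "btag": 0.05},
    {"pt": 40.0, "eta": -0.5, "btag": 0.60},
    {"pt": 20.0, "eta": 4.0, "btag": 0.99},
]
b_jets, vbf_jets = assign_hlt_jets(event)
print([j["pt"] for j in b_jets], [j["pt"] for j in vbf_jets])
```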
In contrast to the HLT Tight path, the HLT Loose path selects events with at least two b-tagged jets but imposes comparatively lenient requirements on the VBF-tagged jets: m_jj > 240 (200) GeV and ∆η_jj > 2.3 (1.5) for 2016 (2018), while ∆ϕ_bb < 2.1 (2.8) in 2016 (2018) is required for b jets. Requirements imposed by the HLT Tight and HLT Loose paths in the 2016 and 2018 data-taking periods are summarized in Table 1.
The efficiency of the trigger selection is measured with the tag-and-probe method [53] using events selected with control triggers. Trigger scale factors, representing the ratio of the trigger efficiency in data to that in simulated events, are measured as a function of jet p_T and |η| for each leg of the four-jet requirement using control and prescaled single-jet triggers. For the online b tagging, the scale factors are also derived as a function of the DEEPJET discriminants of the offline tagged jets employing auxiliary and prescaled four-jet triggers. The applied corrections range from 2 to 7% for each leg of the four-jet requirement and from 3 to 10% per jet for the online b tag requirement.
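As a simple illustration of the scale factor definition, the sketch below computes an efficiency as the fraction of probes passing the trigger leg and takes the data-to-simulation ratio in a single (p_T, |η|) bin. The counts are invented for the example, and the binning and uncertainty treatment of the actual measurement are not modeled.

```python
def efficiency(n_pass, n_total):
    """Fraction of probe jets that pass the trigger requirement."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return n_pass / n_total

def trigger_scale_factor(data_pass, data_total, sim_pass, sim_total):
    """Ratio of the trigger efficiency in data to that in simulation."""
    return efficiency(data_pass, data_total) / efficiency(sim_pass, sim_total)

# Hypothetical probe counts in one (pT, |eta|) bin.
sf = trigger_scale_factor(900, 1000, 950, 1000)
print(round(sf, 4))
```

A scale factor of about 0.95, as in this toy case, would correspond to a correction of roughly 5%, which lies within the per-leg range quoted above.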
The absolute trigger efficiencies, computed for the inclusive simulated signal samples, are 3.1 (2.3) and 3.5 (2.5)% in 2016 (2018) for the HLT Tight and HLT Loose paths, respectively. The inefficiencies of the trigger paths are mainly driven by the relatively high p_T thresholds imposed on the three leading jets.

Analysis procedures
Events recorded online by the HLT paths described in Section 5 are reconstructed with improved information related to the detector conditions and calibrations.

Event selection
An event is discarded if it contains any isolated muon or electron identified through selection criteria that correspond to a selection efficiency of 95% and a misidentification rate of 1-2%. This requirement suppresses contributions from tt and t/t events with leptonic decays of the W bosons. The event is selected if it contains four jets with p_T greater than 95, 80, 65, and 30 GeV (110, 90, 80, and 30 GeV) in 2016 (2018). Reconstructed physics objects are matched to the corresponding trigger objects. At least two jets are required to be b-tagged with the DEEPJET algorithm as described in Section 3. Out of the four p_T-leading jets, the two jets with the highest DEEPJET b tag scores are used to reconstruct the H → bb decay candidate. After the Higgs boson candidate is selected, the two remaining jets are considered as VBF-tagged jet candidates. A jet with p_T < 50 GeV must also pass an identification criterion designed to reduce the number of selected jets originating from pileup interactions [54].
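The jet-level part of this selection can be illustrated with the following Python sketch, which applies the quoted p_T thresholds to the four leading jets and then splits them into the Higgs boson candidate and the VBF-tagged pair by b tag score. The jet dictionaries and function name are assumptions, and the lepton veto, the b tagging requirement, the trigger matching, and the pileup jet identification are deliberately left out.

```python
# Offline pT thresholds (GeV) for the four leading jets, as quoted in the text.
PT_THRESHOLDS = {
    2016: (95.0, 80.0, 65.0, 30.0),
    2018: (110.0, 90.0, 80.0, 30.0),
}

def select_candidates(jets, year):
    """Return (b_jet_candidates, vbf_jet_candidates) or None if the event fails.

    jets: list of dicts with keys 'pt' and 'btag' (offline b tag score).
    """
    leading = sorted(jets, key=lambda j: j["pt"], reverse=True)[:4]
    if len(leading) < 4:
        return None
    # Each leading jet must exceed the threshold for its pT rank.
    for jet, threshold in zip(leading, PT_THRESHOLDS[year]):
        if jet["pt"] <= threshold:
            return None
    # The two jets with the highest b tag scores form the H -> bb candidate;
    # the remaining two are the VBF-tagged jet candidates.
    by_score = sorted(leading, key=lambda j: j["btag"], reverse=True)
    return by_score[:2], by_score[2:]

# Hypothetical event.
event = [
    {"pt": 120.0, "btag": 0.2},
    {"pt": 100.0, "btag": 0.9},
    {"pt": 70.0, "btag": 0.1},
    {"pt": 35.0, "btag": 0.8},
]
print(select_candidates(event, 2016))
print(select_candidates(event, 2018))
```

The same toy event passes the 2016 thresholds but fails those of 2018, where the third jet is required to exceed 80 GeV.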
Depending on the data-taking period and the trigger they pass, events are split into four nonoverlapping samples that are analyzed independently. Events from 2016 (2018) that pass the HLT Tight path are assigned to the Tight 2016 (Tight 2018) sample; events that fail the HLT Tight path but pass the HLT Loose path comprise the Loose 2016 (Loose 2018) sample. The selection criteria applied to the VBF- and b-tagged jets in each sample are:
• Tight 2016: ∆ϕ_bb < 1.6, ∆η_jj > 4.2, m_jj > 500 GeV;
• Loose 2016: ∆ϕ_bb < 2.1, ∆η_jj > 2.5, m_jj > 250 GeV;
• Tight 2018: ∆ϕ_bb < 1.6, ∆η_jj > 3.8, m_jj > 500 GeV; and
The lower section of Table 1 outlines the selection criteria used in the offline analysis.
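The sample assignment and the offline requirements listed above can be summarized in a short Python sketch. The function and dictionary names are illustrative assumptions. Only the three sets of criteria that appear in the list above are encoded; no values are assumed for the Loose 2018 sample.

```python
# Offline requirements per analysis sample, as listed in the text.
# The Loose 2018 criteria are not given above and are therefore not included.
OFFLINE_CUTS = {
    ("Tight", 2016): {"dphi_bb_max": 1.6, "deta_jj_min": 4.2, "mjj_min": 500.0},
    ("Loose", 2016): {"dphi_bb_max": 2.1, "deta_jj_min": 2.5, "mjj_min": 250.0},
    ("Tight", 2018): {"dphi_bb_max": 1.6, "deta_jj_min": 3.8, "mjj_min": 500.0},
}

def assign_sample(year, pass_hlt_tight, pass_hlt_loose):
    """Tight takes precedence; Loose holds events failing Tight but passing Loose."""
    if pass_hlt_tight:
        return ("Tight", year)
    if pass_hlt_loose:
        return ("Loose", year)
    return None

def passes_offline(sample, dphi_bb, deta_jj, mjj):
    """Apply the offline requirements of the given sample (mjj in GeV)."""
    cuts = OFFLINE_CUTS[sample]
    return (dphi_bb < cuts["dphi_bb_max"]
            and deta_jj > cuts["deta_jj_min"]
            and mjj > cuts["mjj_min"])

sample = assign_sample(2016, pass_hlt_tight=False, pass_hlt_loose=True)
print(sample, passes_offline(sample, dphi_bb=1.2, deta_jj=3.0, mjj=400.0))
```

Giving the Tight path precedence is what makes the four samples mutually exclusive, since an event firing both paths is never counted in a Loose sample.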

Regression of b jet energy
The energy of a b-tagged jet is often underestimated because of the unmeasured neutrino produced in the semileptonic decay of a b hadron. This is partially remedied by the energy regression of a b jet [55] that improves the mass resolution of the Higgs boson candidate constructed from the two b-tagged jets. The regression is based on a DNN trained on simulated b jets from tt events. The algorithm's input features include properties of the secondary vertices and tracks associated with the jet, jet constituents, soft muons and electrons from the semileptonic decays of b hadrons, as well as additional variables related to the jet energy. After the application of the b jet energy regression, the jet energy scale and resolution of b jets are corrected in simulated events to match the jet energy scale and resolution observed in data. Corrections are devised using events where a b-tagged jet recoils against a Z boson that decays into leptons. The Z boson p_T is reconstructed with high precision due to the excellent resolution of the lepton momentum. As the p_T of the Z boson is balanced with the jet p_T, the ratio of the momentum of the reconstructed jet to the Z boson momentum is used to measure the jet energy scale and resolution in the data and simulated samples and derive corresponding corrections to the energy scale and resolution of b jets in the simulated events. The correction factors constitute about 0.5-2% (1-4%) of the jet p_T for the resolution (scale) of the regressed jets. As an illustration, Fig. 2 shows the effect of b jet energy regression on the invariant mass m_bb of the reconstructed b jet pair in simulated qqH → qqbb signal events for the Tight  [56] is used to fit the distributions, where µ and σ are the peak position and half width of the core part of the CB function, respectively.
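The Z boson balance idea can be illustrated with a deliberately simplified Python sketch: the mean of the jet-to-Z p_T ratio serves as an estimate of the energy scale and its relative spread as an estimate of the resolution, and a data-to-simulation scale correction follows from the ratio of the means. The use of a plain mean and standard deviation, the function names, and the toy numbers are assumptions and do not reproduce the actual calibration procedure.

```python
import statistics

def balance_summary(jet_pts, z_pts):
    """Return (scale, relative resolution) from the jet/Z pT balance.

    scale: mean of pT(jet) / pT(Z); resolution: standard deviation / mean.
    """
    ratios = [jet / z for jet, z in zip(jet_pts, z_pts)]
    scale = statistics.fmean(ratios)
    resolution = statistics.stdev(ratios) / scale
    return scale, resolution

def scale_correction(data_summary, sim_summary):
    """Multiplicative correction bringing the simulated scale to that in data."""
    return data_summary[0] / sim_summary[0]

# Toy inputs in GeV (invented for the example).
sim = balance_summary([90.0, 100.0, 110.0], [100.0, 100.0, 100.0])
data = balance_summary([88.2, 98.0, 107.8], [100.0, 100.0, 100.0])
print(sim, data, scale_correction(data, sim))
```

In this toy case the data response is 2% lower than in simulation, so the simulated jet p_T would be scaled by 0.98, comparable in size to the scale corrections quoted above.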

Event categorization
To discriminate the signal from the major background sources, a multivariate analysis (MVA) technique is applied; the key properties of the qqH → qqbb signal are combined in a boosted decision tree (BDT) algorithm implemented with the TMVA package [57]. The BDT response is then used to separate the signal from the background processes. Two different MVA strategies are applied in the Tight and Loose samples.
The Tight 2016 and Tight 2018 samples are characterized by lower background yields and higher signal-to-background ratios. In particular, the contributions from the resonant Z → bb background and ggH production are significantly suppressed by more stringent requirements imposed on the VBF-tagged jets. The largest background in these event classes originates from the QCD multijet production. Two separate binary classifiers are trained to discriminate the qqH → qqbb signal from the QCD multijet background for each individual year. The following variables are used as inputs to the BDT:
• Properties of the VBF-tagged jets: invariant mass (m_jj), difference in the pseudorapidity (∆η_jj), and difference in the azimuthal angle (∆ϕ_jj);
• The minimal opening angle between the momentum vector of any of the two VBF-tagged jets and the momentum vector of the dijet system composed by the VBF-tagged jet pair (α_jj);
• The quark-gluon discriminator [58,59] score of the two VBF-tagged jets. This is a likelihood discriminator constructed using the information on the charged and the neutral constituents of a given jet and the jet shape variables to distinguish between the jets originating from quarks and from gluons;
• The DEEPJET b-tagging scores of the two selected b-tagged jets assigned to the H → bb decay;
• The ratio of the magnitude of the transverse momentum vector of the selected four-jet system to the scalar p_T-sum of the four selected jets (|∑ p⃗_T| / ∑ p_T);
• The longitudinal component of the momentum vector of the selected four-jet system (∑ p_z);
• The difference in the azimuthal angle between the Higgs boson candidate and the dijet system composed of the VBF-tagged jets (|ϕ_bb − ϕ_jj|); and
• The multiplicity and p_T sum of extra jets with p_T > 30 GeV and |η| < 2.4, excluding the selected b-tagged and VBF-tagged jets.
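Two of the inputs listed above, the p_T balance of the four-jet system and its longitudinal momentum, are simple functions of the jet kinematics. The Python sketch below shows one way to compute them, assuming each jet is given as a (p_T, η, ϕ) tuple; the function names are hypothetical.

```python
import math

def pt_balance(jets):
    """|vector sum of pT| / scalar sum of pT for jets given as (pT, eta, phi)."""
    sum_px = sum(pt * math.cos(phi) for pt, _, phi in jets)
    sum_py = sum(pt * math.sin(phi) for pt, _, phi in jets)
    scalar_sum = sum(pt for pt, _, _ in jets)
    return math.hypot(sum_px, sum_py) / scalar_sum

def sum_pz(jets):
    """Longitudinal momentum of the jet system, using pz = pT * sinh(eta)."""
    return sum(pt * math.sinh(eta) for pt, eta, _ in jets)

# Two back-to-back jets balance exactly; two collinear jets do not balance at all.
balanced = [(100.0, 0.0, 0.0), (100.0, 0.0, math.pi)]
collinear = [(50.0, 1.0, 0.3), (70.0, -1.0, 0.3)]
print(pt_balance(balanced), pt_balance(collinear), sum_pz(collinear))
```

The ratio is bounded between 0 and 1, with values close to 0 indicating that the selected jets account for essentially all of the transverse momentum in the event.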
The training is performed using half of the simulated VBF events and 2.5% of the data sample after event selection as a proxy for the dominant QCD multijet background; the rest of the simulated VBF sample, as well as another 2.5% of the data, is used for the validation of the training. Dedicated studies were performed to assess the impact of using data events in the BDT training. In the first study, several other disjoint data subsets corresponding to 5% of the total sample were employed in the training and validation of the training. In the second study, the subset of data used in the training was removed from the data set used for the extraction of signal. The impact of these alternative analysis strategies on the measured signal rate and its uncertainty was found to be below 2%. Figure 3 shows the distributions of the BDT output score (D) normalized to unity for the VBF, ggH, and Z → bb processes and the data as an approximation to the QCD multijet background in the Tight 2016 and Tight 2018 samples. Based on their BDT scores, events are classified into multiple exclusive categories, targeting the VBF, ggH, and Z+jets processes. Details of the event categorization and the naming convention of the event categories are given in Table 2. The BDT thresholds defining each category are optimized by maximizing the quantity S/√B, where S is the number of expected events of the targeted process and B is the number of QCD multijet background events approximated as the observed data events in the m_bb interval populated by the targeted process. The m_bb interval of 80-100 GeV is used for the Z+jets process and 104-146 GeV for the VBF and ggH processes. In total, 18 categories are introduced, three in each of the two Tight analysis samples, and six in each of the two Loose analysis samples.
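The threshold optimization can be illustrated with a minimal Python sketch that scans candidate BDT cut values and keeps the one maximizing S/√B. It assumes a single lower threshold, weighted simulated signal events, and unweighted data events already restricted to the relevant m_bb interval; the actual analysis defines several categories per sample, which this sketch does not attempt to reproduce.

```python
import math

def best_threshold(signal_scores, signal_weights, data_scores, candidates):
    """Return (threshold, S/sqrt(B)) maximizing the figure of merit.

    S: sum of signal weights above the threshold.
    B: number of data events above the threshold (proxy for QCD multijet).
    Candidates leaving no background events are skipped.
    """
    best = None
    for cut in candidates:
        s = sum(w for x, w in zip(signal_scores, signal_weights) if x > cut)
        b = sum(1 for x in data_scores if x > cut)
        if b == 0:
            continue
        figure_of_merit = s / math.sqrt(b)
        if best is None or figure_of_merit > best[1]:
            best = (cut, figure_of_merit)
    return best

# Toy scores (invented): signal peaks at high score, background at low score.
signal = [0.9, 0.8, 0.7, 0.2]
weights = [1.0, 1.0, 1.0, 1.0]
background = [0.1, 0.2, 0.3, 0.4, 0.75, 0.85]
print(best_threshold(signal, weights, background, [0.0, 0.5]))
```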
The aim of introducing distinct categories sensitive to the production of the Z boson is twofold. Firstly, these categories are intended to establish the signal from the Z → bb production, thereby validating the analysis techniques employed in this study. Secondly, the tail of each m_bb distribution in the Z+jets sample extends to the region partially populated by the signal events, thus affecting the precision of the measurement. It is therefore important to constrain the background from the Z+jets process with dedicated categories, thus improving the sensitivity to the signal. Further, separate categories targeting the ggH process improve the sensitivity of the analysis to inclusive Higgs boson production.
The numbers of selected events in data along with the expected background and signal yields in each analysis category are detailed in Tables 3 and 4. The signal contribution is significantly smaller than the backgrounds in all analysis categories.

Signal extraction
The test statistic chosen to determine the signal yield is based on the profile likelihood ratio [60]. The signal is extracted from the simultaneous binned maximum likelihood fit of the m_bb distribution in all categories obtained from the data. In each category, the m_bb distribution is fitted with a superposition of three parametric analytic functions accounting for: (i) the continuum background, dominated by QCD multijet events, (ii) the resonant Z+jets background, and (iii) the signal. Contributions from other processes are found to have a negligible effect on the results. The addition of separate templates accounting for the W+jets, tt, and t/t backgrounds changes the extracted signal yield and its uncertainty by less than 1%. The aggregate contribution from other decays of the Higgs boson (H → cc, H → ττ) and from the other production modes (ttH and VH) is evaluated to be almost two orders of magnitude lower than the contributions from the VBF and ggH processes; hence their contributions are neglected in the signal extraction procedure. The expected yields of the VBF, ggH, and Z+jets processes in a given category  background are individually modeled with a superposition of a one-sided Crystal Ball (CB) function [56] for the resonant part and a second-order Bernstein polynomial accounting for the contribution from events where the jets are misassigned to the decay of the Z or the Higgs boson. The parameters of the CB function and the Bernstein polynomial are extracted from the fit to the m_bb spectrum in the respective simulated sample, and separately for each sample.
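For readers unfamiliar with these shapes, the Python sketch below evaluates an unnormalized one-sided Crystal Ball function with a low-mass power-law tail, a second-order Bernstein polynomial, and their weighted sum. The standard textbook parameterization of the CB function is assumed, as are the 80-200 GeV mass range, the mixing fraction, and all parameter values; none of these are taken from the fits in this analysis.

```python
import math

def crystal_ball(x, mu, sigma, alpha, n):
    """Unnormalized one-sided CB: Gaussian core, power-law tail below the peak.

    alpha > 0 sets where the tail starts (in units of sigma); n is the tail power.
    """
    t = (x - mu) / sigma
    if t > -alpha:
        return math.exp(-0.5 * t * t)
    a = (n / alpha) ** n * math.exp(-0.5 * alpha * alpha)
    b = n / alpha - alpha
    return a * (b - t) ** (-n)

def bernstein2(x, coeffs, lo, hi):
    """Second-order Bernstein polynomial on [lo, hi] with coefficients (c0, c1, c2)."""
    u = (x - lo) / (hi - lo)
    c0, c1, c2 = coeffs
    return c0 * (1 - u) ** 2 + 2 * c1 * u * (1 - u) + c2 * u ** 2

def resonant_shape(x, frac_cb, cb_params, bern_coeffs, lo=80.0, hi=200.0):
    """Unnormalized weighted sum of the CB peak and the Bernstein component."""
    return (frac_cb * crystal_ball(x, *cb_params)
            + (1.0 - frac_cb) * bernstein2(x, bern_coeffs, lo, hi))

# Illustrative parameters only: peak at 125 GeV with a 12 GeV core width.
params = (125.0, 12.0, 1.5, 5.0)
print(resonant_shape(125.0, 0.8, params, (0.3, 0.2, 0.1)))
```

The constants a and b in the tail are fixed by requiring the function and its first derivative to be continuous at the transition point, so the tail adds only the two parameters alpha and n to the Gaussian core.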
For the signal, as well as the Z+jets processes, the variations of the shapes across the different event categories within the same analysis sample are found to be well within the statistical uncertainties of the MC samples, and significantly smaller than the variations caused by the experimental uncertainties associated with the b jet energy scale and resolution. For the VBF and ggH processes, which have similar shapes, a common analytic function is used to model the m bb distribution in each analysis sample, namely Tight 2016, Loose 2016, Tight 2018, and Loose 2018. The same approach is pursued for the qq → Z and WW → Z processes. Figure 5 illustrates the modeling of the m bb shape in samples of the simulated VBF, ggH, and Z+jets events assigned to the Tight 2016 analysis sample.
For all analysis samples, the fitted peak positions of the CB functions describing the m bb distribution in signal are found to be consistent within fit uncertainties with the nominal value of the Higgs boson mass. For the resonant Z → bb background, we observe a bias in the peak position of about 3.0-3.5 GeV above the nominal value of the Z boson mass. The effect is caused by the relatively high p T thresholds imposed on jets in the HLT and offline selections. These thresholds, optimized primarily for the selection of H → bb decays, result in an upward shift of the peak position in the Z → bb sample, where a larger fraction of low-p T b jets is expected compared to H → bb decays. A dedicated simulation study has demonstrated that no bias is present in the Z → bb sample if no trigger requirements are applied and the jet p T thresholds are relaxed in the offline selection.
The shape and normalization of the continuum background are estimated from the fit to data. The shape of the continuum background is modeled individually in each event category i by a product of exponential and polynomial functions of order n. The family of functions given by Eq. (1) includes the case corresponding to the zeroth-order polynomial. In each category, the choice of the polynomial function is guided by the combined fit of two sideband regions, 80 < m bb < 104 GeV and 146 < m bb < 200 GeV. Sequential fits with increasing polynomial order are performed, followed by a Fisher F-test [61] to select the function with the fewest parameters necessary to fit the data. The contribution from the Z+jets background is accounted for by performing fits with the superposition of two functions:
• the tested analytic function from the family defined by Eq. (1), modeling the continuum background dominated by the QCD multijet process;
• a combination of the CB function and a second-order Bernstein polynomial for the Z+jets background, as described earlier in this section.
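The sequential order selection can be illustrated as follows. This is a sketch under assumptions: it uses the textbook form of the F statistic for nested fits, and it takes the critical value `f_critical` from the caller (for example from an F-distribution table) rather than computing it. The function names and the numbers in the usage note are illustrative only and are not taken from the analysis.

```python
def f_statistic(chi2_low, npar_low, chi2_high, npar_high, nbins):
    """F statistic comparing a lower-order fit with a higher-order one.

    Numerator: chi2 improvement per added parameter.
    Denominator: chi2 per degree of freedom of the higher-order fit.
    """
    dof_high = nbins - npar_high
    numerator = (chi2_low - chi2_high) / (npar_high - npar_low)
    denominator = chi2_high / dof_high
    return numerator / denominator

def select_order(fits, nbins, f_critical):
    """Return the index of the lowest order that is not significantly improved.

    fits: list of (npar, chi2) tuples ordered by increasing polynomial order.
    """
    for k in range(len(fits) - 1):
        (p_low, c_low), (p_high, c_high) = fits[k], fits[k + 1]
        if f_statistic(c_low, p_low, c_high, p_high, nbins) < f_critical:
            return k
    return len(fits) - 1
```

For instance, with made-up fit results `[(1, 250.0), (2, 110.0), (3, 108.0)]` in 100 bins and a critical value of 3.94, the first step is a large improvement while the second is not, so the second function in the list is retained.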
First, the selection of functions for the modeling of the continuum background is performed in the categories targeting the Z+jets process: Loose Z1 and Loose Z2 in the 2016 and 2018 samples. Fits in each of the four aforementioned categories are performed by varying the normalization of the Z+jets process in an unconstrained manner. An additional degree of freedom is introduced to account for the unconstrained normalization of the Z+jets process. A dedicated statistical test demonstrated that the measured normalizations of the Z+jets process across the four categories are consistent. The p-value quantifying the compatibility of these measurements is found to be 0.33. Once the functional form of the continuum background modeling in each category targeting the Z+jets process is selected based on the F-test result, a combined fit is performed in these four categories with a common, unconstrained Z+jets normalization parameter.
The selection of functions from the family given in Eq. (1) in the categories targeting VBF and ggH processes is done by performing fits with a Gaussian constraint imposed on the normalization of the Z+jets component. This constraint is obtained from the combined fit in the four categories targeting the Z+jets process.
Ideally, the final result should not depend on the choice of the parametric fit function, but in reality, the choice of the function to model the background may bias the extraction of the signal. Therefore, a dedicated study is performed for each category to assess a possible bias in the signal extraction. For each of the selected functions, various alternative models of the continuum background are used to generate an ensemble of pseudo-data, including the injected signal and the contribution from the Z+jets process. The resulting m bb spectra obtained from the pseudo-data are fitted using the nominal model of the continuum background, and the distribution of the extracted signal yield is compared to the injected yield. The study is performed using the following alternative models:
1. Functions from the same family, given by Eq. (1), but with polynomials one or two orders higher than the nominal function.

Inverse polynomials, I(m
The injected signal yield is varied between 50 and 200% of the yield predicted by the SM to investigate a possible dependence of the bias on the signal strength. The background model chosen yields a maximum potential bias that does not exceed 10% of the statistical uncertainty in the fitted signal rate in each individual event category, and has negligible impact on the measurement. The systematic uncertainty related to the choice of the function used to model the continuum background has a subdominant effect on the final results. The functional forms employed in the modeling of the continuum background in each event category are reported in Table 5.
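The bias criterion can be expressed compactly. The sketch below assumes that the bias is summarized as the mean of (fitted − injected)/σ over the pseudo-experiments and compared with the 10% threshold quoted above; the exact summary statistic used in the analysis is not specified here, and the function names are hypothetical.

```python
import statistics

def relative_bias(fitted, injected, sigma):
    """Mean of (fitted - injected) / sigma over an ensemble of pseudo-experiments."""
    pulls = [(f - injected) / s for f, s in zip(fitted, sigma)]
    return statistics.mean(pulls)

def passes_bias_criterion(fitted, injected, sigma, max_fraction=0.10):
    """True if the absolute mean bias stays within max_fraction of sigma."""
    return abs(relative_bias(fitted, injected, sigma)) <= max_fraction
```

In practice the check would be repeated for each alternative background model and each injected signal strength, keeping the largest absolute bias found.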

Systematic uncertainties
Several systematic uncertainties affect the overall normalizations and shapes of the m bb distributions for signal and background. Systematic uncertainties are incorporated in the signal extraction procedure via nuisance parameters with Gaussian or log-normal probability density functions (pdfs), which are treated according to the frequentist paradigm. These pdfs are included as additional factors in the likelihood function, which is maximized in the fit. All uncertainties are divided into two categories. Theoretical uncertainties arise from the limited precision in the computation of the inclusive and differential cross sections of the modeled processes and are fully correlated between event categories and data-taking periods. Experimental uncertainties comprise (i) uncertainties related to the imperfect simulation of the detector response and consequent inaccurate modeling of the reconstruction of physics objects and observables in the simulated samples, (ii) the uncertainty in the integrated luminosity estimation, and (iii) the uncertainty in the modeling of the pileup interactions. The dominant uncertainties affecting this measurement are briefly summarized in the following.
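As an illustration of how a log-normal nuisance parameter can enter a yield, the sketch below uses a widely used parameterization in which the yield is multiplied by κ^θ and θ is constrained by a unit Gaussian. This convention is an assumption for the purpose of illustration and is not taken from the text; the function names and the value κ = 1.05 are hypothetical.

```python
def lognormal_yield(nominal, kappa, theta):
    """Yield scaled by kappa**theta; theta = +1 (-1) is a +1 (-1) sigma shift.

    The yield stays positive for any theta, which is the main reason for
    preferring a log-normal over a Gaussian constraint on normalizations.
    """
    return nominal * kappa ** theta

def gaussian_penalty(theta):
    """Negative log of a unit Gaussian constraint term, up to a constant."""
    return 0.5 * theta * theta

# Example: a 5% normalization uncertainty on a nominal yield of 100 events.
shifted_up = lognormal_yield(100.0, 1.05, 1.0)
shifted_down = lognormal_yield(100.0, 1.05, -1.0)
```

The penalty term is added to the negative log-likelihood, so that pulling θ away from zero is only favored when it improves the agreement with data.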

Theoretical uncertainties
• Parton shower uncertainty in PYTHIA: The uncertainty related to the modeling of the parton shower in PYTHIA is estimated by varying the scales of the initial- and final-state QCD radiation. This uncertainty has a sizable effect on the modeling of jet-related variables used as inputs to the BDT, thereby affecting the acceptance of signal and background events in different event categories. The impact on the predicted event yields varies between 2 and 10% across event categories, with larger effects observed for categories targeting VBF production. The modeling of key BDT input variables, such as m jj and ∆η jj , is validated with a control sample of events in which a Z boson is produced in association with at least two jets and decays into a pair of muons.
Distributions of these variables in data and simulated samples agree within instrumental and theoretical uncertainties, including the uncertainty related to the modeling of the parton shower in PYTHIA. Variations in the m bb distribution within these uncertainties are found to be negligible in the simulated signal and Z+jets background samples. Inclusion of these shape-altering effects in the statistical inference procedure changes the uncertainty in the extracted rates of the signal and resonant Z → bb background by less than 0.5%.
• Parton shower and hadronization model for VBF production: Simulation of the color connection between the incoming and outgoing partons in the VBF process is particularly sensitive to the choice of the event generator. The related uncertainty is assessed by comparing the dipole parton shower model implemented in PYTHIA with the alternative one provided by the HERWIG event generator. Yields of the VBF signal in each event category are compared between the two models and the differences are used to define a double-sided uncertainty with the PYTHIA prediction taken as the central estimate. The choice of the event generator only marginally affects the shapes of the m bb distributions in different event categories. The corresponding shape variations in the m bb spectra modify the expected uncertainty in the extracted rate of the VBF process by less than 1%. The choice of the generator also impacts the simulation of forward jets in the WW → Z sample. The validation study performed using the control sample of Z boson decays into muons, described above, demonstrates that properties of spectator jets accompanying the production of a Z boson are well modeled by PYTHIA. Small discrepancies between data and simulation observed in distributions of key variables are fully covered by variations of the scales of the initial- and final-state QCD radiation. Therefore, no additional uncertainty is assigned to the parton shower modeling in the simulated WW → Z sample.
• Scale variations: Uncertainties arising from missing higher-order QCD corrections in the theoretical computations affect the overall production rates as well as the kinematics of the simulated processes. The impact of these uncertainties on the cross sections of VBF and ggH production is estimated by varying the renormalization and factorization scales, yielding uncertainties of ±2.1% and +4.6 −6.7%, respectively [11]. The impact on the acceptance of VBF and ggH events across the event categories is evaluated by applying event weights that reflect the effect of the scale variations on the kinematics of the simulated events. The variations in event yields across categories are typically between 1 and 4%, with larger effects observed in categories targeting the VBF process.
• Uncertainties in PDFs and α S : The PDF and α S uncertainties in the cross sections of the VBF and ggH processes are ±0.4% and ±3.2%, respectively [11].
• Uncertainty in the branching fraction: The theoretical uncertainty in the branching fraction of the Higgs boson decay to bottom quarks is 0.65% at m H = 125 GeV [11].
• Uncertainties from Z+jets NLO K-factors: The MC samples of qq → Z and WW → Z events are simulated with MADGRAPH5_aMC@NLO at LO accuracy with MLM matching. To improve the modeling of these processes, additional p T -dependent K-factors are applied following the prescription in Ref. [46] to match the differential distribution of the Z boson p T as predicted by the NNLO QCD and NLO EW calculations. The K-factors related to the NLO EW corrections are applied to both qq → Z and WW → Z samples. The K-factors associated with the NNLO QCD corrections are applied only to the qq → Z sample. The uncertainties in these K-factors are incorporated via six nuisance parameters. Three of these nuisance parameters correspond to the NLO EW corrections and affect both qq → Z and WW → Z samples, whereas another three parameters are related to the NNLO QCD corrections and have an impact only on simulated qq → Z events. Variations of these nuisance parameters change the acceptance of Z+jets events in different event categories by 1-7%. Uncertainties related to missing higher-order QCD corrections in the simulated WW → Z sample are evaluated by varying the renormalization and factorization QCD scales, modifying the acceptance of Z+jets events by 1-3% across different event categories.

Experimental uncertainties
• Jet energy scale and resolution corrections: The uncertainties in the jet energy scale and resolution originate from different sources with limited correlations [28,59]. The uncertainties depend on the jet kinematics and are typically larger in the forward regions. Variations of the jet four-momenta caused by these uncertainties change the acceptance in the various event categories by 5 to 10%.
• b tagging: The uncertainties in the b tagging efficiency are evaluated with control samples of semileptonic tt decays, Z+jets events, and inclusive QCD multijet events [30]. The uncertainties associated with the selection of the working point of the DEEPJET score for the tagger vary between 4 and 8% depending on jet flavor, p T , and η. The DEEPJET discriminants of the two b-tagged jets are also used as an input to the BDT classifiers. An assessment of control samples of QCD multijet and leptonic tt events revealed only small differences in the DEEPJET discriminant distribution between data and simulated events above the threshold defining the working point. These differences are found to have subdominant effects on the response of the BDT classifiers, leading to a migration of events between categories at the percent level.
• Trigger efficiency: The corrections to the trigger efficiency are discussed in Section 5. The uncertainties in the trigger scale factors are predominantly statistical. The impact of these uncertainties on the acceptance of signal and background events in various analysis categories is estimated to range from 5 to 10%.
• Corrections to b jet energy regression: The scale and smearing corrections to the regressed b jet energy are derived as described in Section 6.2. The uncertainties in the corrections are propagated to the parameters of the CB functions modeling the core of the m bb spectrum in the simulated samples of H → bb and Z → bb decays. Furthermore, these uncertainties change the shape of several observables used as inputs to the BDT classifiers, and consequently affect the estimate of the acceptance in various event categories.
• Integrated luminosity: The uncertainty in the luminosity measurement has both correlated and uncorrelated components, which vary across different data-taking periods [62,63]. The uncertainties for the individual years are 1.2 and 2.5% for 2016 and 2018, respectively. The uncertainty in the combined 2016 and 2018 integrated luminosity is 1.7% (calculated using the full correlation scheme and per-year luminosities of 36.3 fb −1 and 54.5 fb −1 , respectively).
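As a rough cross-check of the combined luminosity uncertainty, the per-year values can be propagated to the sum under two limiting assumptions. The sketch below replaces the full correlation scheme, which is not reproduced here, with a single hypothetical correlation coefficient `rho` between the two years; the function name is illustrative.

```python
import math

def combined_relative_uncertainty(lumis, rel_uncs, rho):
    """Relative uncertainty in the summed luminosity of two periods.

    rho is an effective correlation coefficient between the two periods
    (0 = uncorrelated, 1 = fully correlated); a simplifying assumption.
    """
    a, b = (lumi * unc for lumi, unc in zip(lumis, rel_uncs))
    variance = a * a + b * b + 2.0 * rho * a * b
    return math.sqrt(variance) / sum(lumis)

lumis = (36.3, 54.5)        # fb^-1 for 2016 and 2018, as quoted in the text
rel_uncs = (0.012, 0.025)   # 1.2% and 2.5%

uncorrelated = combined_relative_uncertainty(lumis, rel_uncs, 0.0)
fully_correlated = combined_relative_uncertainty(lumis, rel_uncs, 1.0)
```

The two limits evaluate to roughly 1.6% and 2.0%, so the quoted 1.7% lies between them, as expected for a partially correlated combination.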
• Pileup modeling: The number of primary interactions per bunch crossing varies with the instantaneous luminosity during the data-taking operation. In order to match the distribution of pileup in data with that in the simulated samples, weights are applied that are determined by studying minimum-bias data sets. A normalization uncertainty is derived by recomputing the pileup weights with the total inelastic cross section varied by ±4.6% around its nominal value [64].
• Pileup jet identification: This uncertainty is estimated by comparing the pileup jet identification score [54] in events with a Z boson decaying into a pair of electrons or muons and one balanced jet in data and simulation. The assigned uncertainty depends on p T and |η|, and is designed to cover all differences between data and simulation in the score distribution.
The impact of the most significant uncertainties is presented in Table 6.

Results
This analysis is primarily sensitive to the VBF Higgs boson production followed by H → bb decay. The outcome of the measurement depends on how the contribution from the ggH process is accounted for, and three scenarios are presented that differ in the way this process is treated in the signal extraction procedure. In each scenario, the best fit value of the parameters of interest and their confidence level (CL) intervals are extracted following the procedure described in Section 3.2 of Ref. [65].

Measurement of inclusive Higgs boson production
In the measurement of the inclusive Higgs boson production rate, the ggH process is considered as part of the signal. The fit is performed with an unconstrained signal strength modifier µ incl.Hbb that simultaneously scales the yields of VBF and ggH events in all categories. The parameter µ incl.Hbb is the product of the inclusive production cross section of the Higgs boson and the H → bb branching fraction relative to the SM expectation. In the measurement, we allow the overall normalization of the Z+jets process, µ Zbb , to vary in an unconstrained manner.
The m bb spectrum in the ith category is fitted with an analytic function that includes the following category-dependent components:
• N QCD i : the normalization of the QCD multijet background, extracted from the fit.
• F QCD i (m bb |⃗ α i ): analytic parametric function, normalized to unity, modeling the shape of the QCD multijet background. The parameters ⃗ α i of the function are obtained from the fit. The types of functions used to model the QCD multijet background are discussed in Section 6.4.
• N qqH,ggH,Zbb i ( ⃗ θ): predicted yields of the VBF, ggH, and Z+jets processes; these yields depend on the nuisance parameters ⃗ θ that incorporate systematic uncertainties in the fit.
• F Hbb,Zbb i (m bb | ⃗ θ S ): analytic functions, normalized to unity, modeling the shape of the m bb distribution in the samples of H → bb and Z → bb decays. The parameters of the analytic functions, described in Section 6.4, are influenced by the nuisance parameters ⃗ θ S associated with the uncertainties in the scale and resolution of the b jet energy regression.
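A form of the fit function consistent with the components listed above is sketched below. It is a reconstruction from the surrounding definitions, assuming that µ incl.Hbb multiplies the summed VBF and ggH yields and that µ Zbb multiplies the Z+jets yield; the typesetting of the symbols is likewise an assumption.

```latex
F_i(m_{bb}) =
    N_i^{\mathrm{QCD}} \, F_i^{\mathrm{QCD}}(m_{bb} \,|\, \vec{\alpha}_i)
  + \mu_{Zbb} \, N_i^{Zbb}(\vec{\theta}) \, F_i^{Zbb}(m_{bb} \,|\, \vec{\theta}_S)
  + \mu^{\mathrm{incl.}}_{Hbb} \,
    \bigl[ N_i^{qqH}(\vec{\theta}) + N_i^{ggH}(\vec{\theta}) \bigr] \,
    F_i^{Hbb}(m_{bb} \,|\, \vec{\theta}_S)
```

Because each shape function is normalized to unity, the coefficients in front of them are directly the fitted event yields in category i.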
The H → bb signal, including contributions from the VBF and ggH processes, is observed with a statistical significance of 2.6 σ, compared to the expected significance of 2.9 σ. The sensitivity of the measurement is driven by the Tight categories, while the Loose categories constrain the Z+jets background. The expected uncertainty in µ incl.Hbb is determined from the fit to the Asimov dataset [66], where the cross sections of Z and Higgs boson production are set to the values predicted by the SM, whereas the parameters of the pdfs describing the continuum background are set to the values obtained from fits to sideband regions. The expected systematic and statistical uncertainties are both found to be about 10% smaller than the respective observed ones. As a consequence, the expected significance is about 10% higher than the observed one, despite the central observed value of µ incl.Hbb being close to unity. The linear correlation coefficient between µ incl.Hbb and µ Zbb is found to be 0.37. In the fit, yields of H → bb and Z → bb events are both anti-correlated with the yield of the large continuum background. Hence, a positive correlation between µ incl.Hbb and µ Zbb is expected and confirmed by the fit result. Figures 6 and 7 show the results of the fit in the Tight 2016 and Tight 2018 categories, and Fig. 8 shows those in the Loose 2016 Z2 and Loose 2018 Z2 categories. In these figures, χ 2 quantifies the consistency of the m bb distribution observed in data with the fitted parametric analytic function, ndf is the number of degrees of freedom in the fit, and the p-value is defined as the probability of observing a χ 2 value larger than the actual one.

Measurement of VBF production when ggH production is constrained to SM expectations
The measurement of the exclusive VBF production rate is performed with the contribution from the ggH process constrained within theoretical and experimental uncertainties to the SM expectation. In this case, the analytic function employed to fit the m bb spectrum in category i is modified accordingly. The fit is performed with two unconstrained parameters: the signal strength modifier for the VBF process (µ qqH Hbb ) and µ Zbb . The measurement yields µ qqH Hbb = 1.01 +0.39 −0.24 (syst) ± 0.39 (stat). The VBF signal is observed with a significance of 2.4 σ. The expected significance is 2.7 σ. The best fit values of the signal strength modifiers for the different processes are also shown in Fig. 10.
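The systematic and statistical components quoted above can be checked against the total uncertainty given in the abstract and the summary (+0.55 −0.46). The short sketch below assumes that the two components combine in quadrature; the function name is illustrative.

```python
import math

def total_uncertainty(syst, stat):
    """Combine systematic and statistical components in quadrature."""
    return math.hypot(syst, stat)

# Components of the VBF signal strength uncertainty quoted in the text.
total_up = total_uncertainty(0.39, 0.39)
total_down = total_uncertainty(0.24, 0.39)

print(f"+{total_up:.2f} -{total_down:.2f}")
```

Rounded to two decimals, the result reproduces the total uncertainty quoted for µ qqH Hbb , which supports the quadrature assumption.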

Independent measurement of VBF and ggH production
An independent measurement of the VBF and ggH production rates is performed by fitting the m bb spectra with three unconstrained parameters, µ Zbb , µ qqH Hbb , and µ ggH Hbb , with the last one being the signal strength for the ggH process. In this measurement, the m bb spectrum in the ith category is fitted with a function in which the VBF and ggH yields are scaled by separate signal strength modifiers.
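The three scenarios differ only in how the VBF and ggH yields are scaled. The sketch below evaluates an assumed form of the per-category model at a single value of m bb , with shape values passed in as plain numbers; the function names and argument layout are hypothetical, and the nuisance-parameter dependence of the yields is omitted.

```python
def fit_model(n_qcd, f_qcd, n_z, f_z, n_vbf, n_ggh, f_h,
              mu_z, mu_vbf, mu_ggh):
    """Expected density at one m_bb value for a single category.

    f_qcd, f_z, f_h are the unit-normalized shape functions evaluated
    at that m_bb value; n_* are the corresponding yields.
    """
    return (n_qcd * f_qcd
            + mu_z * n_z * f_z
            + (mu_vbf * n_vbf + mu_ggh * n_ggh) * f_h)

def inclusive(mu_incl, **kwargs):
    """Inclusive scenario: one modifier scales both VBF and ggH."""
    return fit_model(mu_vbf=mu_incl, mu_ggh=mu_incl, **kwargs)

def vbf_with_sm_ggh(mu_vbf, **kwargs):
    """VBF scenario: ggH yield kept at its SM expectation (nominal value)."""
    return fit_model(mu_vbf=mu_vbf, mu_ggh=1.0, **kwargs)
```

Calling `fit_model` directly with independent `mu_vbf` and `mu_ggh` corresponds to the third scenario; at `mu_vbf = mu_ggh = 1` all three coincide.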

Summary
A measurement of the Higgs boson (H) production via the vector boson fusion (VBF) process and its decay to a bottom quark-antiquark pair (bb) was performed on proton-proton collision data sets collected by the CMS experiment at √ s = 13 TeV corresponding to a total integrated luminosity of 90.8 fb −1 . The analysis employs boosted decision trees (BDTs) to discriminate the signal against major background processes: QCD-induced multijet production and Z+jets events. The BDTs exploit kinematic properties of the VBF jets, information on the b-tagged jets assigned to the H → bb decay, and global event shape variables. Based on the BDT response, multiple event categories are introduced, targeting the VBF, gluon-gluon fusion (ggH), and Z+jets processes to achieve maximum sensitivity to the signal. While the VBF categories have the highest signal-to-background ratio, the Z+jets categories constrain the largest resonant background. The ggH categories enhance the sensitivity to the inclusive production of the Higgs boson in association with two jets.
The VBF Higgs boson production rate has been measured in its decay to bottom quark-antiquark pairs with the ggH contribution constrained within the theoretical and experimental uncertainties to the standard model prediction. The signal strength of the VBF Higgs production, followed by the H → bb decay, defined as the rate of the signal process relative to the value predicted in the standard model, is measured to be µ qqH Hbb = 1.01 +0.55 −0.46 . The signal was observed with a significance of 2.4 standard deviations, compared to the expected significance of 2.7 standard deviations. In addition, inclusive Higgs boson production in association with two jets, followed by H → bb decay, was measured by treating the ggH contribution as part of the signal. The inclusive signal strength was measured to be µ incl.Hbb = 0.99 +0.48 −0.41 , corresponding to an observed (expected) significance of 2.6 (2.9) standard deviations. The measurements are consistent within uncertainties with the prediction from the standard model.

[17] CMS Collaboration, "Measurement of the ttH and tH production rates in the H → bb decay channel with 138 fb −1 of proton-proton collision data at √ s = 13 TeV", CMS Physics Analysis Summary CMS-PAS-HIG-19-011, 2023.

Figure 1: Representative Feynman diagram of the leading order (LO) VBF production of a Higgs boson, followed by its decay to a pair of b quarks.

Figure 2: The invariant mass m bb of the b jet pair in simulated qqH → qqbb events before (orange dashed line) and after (blue dashed line) the application of the b jet energy regression in the Tight 2016 (left) and Loose 2016 (right) samples. A one-sided Crystal Ball function [56] is used to fit the distributions, where µ and σ are the peak position and half width of the core part of the CB function, respectively.

Figure 3: The unit normalized distributions of the VBF BDT outputs in data and simulated samples in the Tight 2016 (left) and Tight 2018 (right) analysis samples. Data events (points), dominated by the QCD multijet background, are compared to the VBF (red solid line), ggH (blue dashed line), and Z+jets (green hatched area) processes.

In the Loose samples, where larger contributions from ggH and Z+jets events are expected compared to the Tight sample, a multiclass BDT is trained to separate four processes simultaneously: (1) VBF Higgs boson production, (2) ggH Higgs boson production, (3) Z+jets, and (4) QCD multijet events. The training is performed using simulated samples of VBF, ggH, and Z+jets events and 5% of the data in the Loose 2016 and Loose 2018 samples. The BDTs trained in the Loose sample use the same set of input variables as the binary BDT classifiers

Figure 4: The unit normalized distributions of the BDT outputs: D ggH (upper), D VBF (middle), and D Z (lower) in data and simulated samples in the Loose 2016 (left) and Loose 2018 (right) analysis samples. Data events (points), dominated by the QCD multijet background, are compared to the VBF (red solid line), ggH (blue dashed line), and Z+jets (green hatched area) processes.

are estimated from simulation. The m bb distributions for the VBF and ggH signal and Z+jets

Figure 5: The m bb distributions from simulation with overlaid parametric fits (solid blue lines) for the Tight 2016 analysis sample. Left: The fitted m bb distribution in the signal combining the VBF (yellow histogram) and ggH (orange) contributions. The black points refer to the total Higgs boson contribution from VBF and ggH production modes. Right: The fitted m bb distribution in simulated Z+jets background (black points) combining the WW → Z (dark green histogram) and qq → Z (light green histogram) production modes. The black points refer to the total Z+jets contribution from qq → Z and WW → Z modes. The dotted lines represent the second-order Bernstein polynomial components used to approximate the contributions from the wrong jet pairing.

Figure 6: The m bb distributions in three event categories: Tight 2016 1 (left), Tight 2016 2 (center), and Tight 2016 3 (right). The black points indicate data, the blue solid curve corresponds to the fitted nonresonant component of the background, dominated by QCD multijet events, and the shaded (cyan) band represents the ±1 σ uncertainty band. The total signal-plus-background model includes contributions from Z → bb, H → bb, and the nonresonant component; it is represented by the magenta curve. The lower panel compares the distribution of the data after subtracting the nonresonant component with the resonant contributions of the Z → bb background (red curve) and H → bb signal (green curve).

Figure 9 shows the m bb spectrum combining all 18 analysis categories. Each category enters the combination with a weight S/(S + B), where S is the total H → bb signal yield (VBF and

Figure 9: The m bb distribution after the combination of all categories in the analysis, each weighted with S/(S + B). A complete description is given in Fig. 6.
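The S/(S + B) weighting used for the combined spectrum can be sketched as follows. This is a minimal illustration assuming one signal yield S and one background yield B per category and a common binning; whether the yields are evaluated in a mass window and whether the weights are normalized is not specified here, so neither is done. The function name is hypothetical.

```python
def combine_weighted(histograms, signal, background):
    """Sum per-category histograms with weights S / (S + B).

    histograms: list of per-category bin-content lists with identical binning.
    signal, background: per-category expected yields S and B.
    Returns the list of weights and the combined bin contents.
    """
    weights = [s / (s + b) for s, b in zip(signal, background)]
    nbins = len(histograms[0])
    combined = [0.0] * nbins
    for weight, hist in zip(weights, histograms):
        for k in range(nbins):
            combined[k] += weight * hist[k]
    return weights, combined
```

Categories with a higher signal purity therefore dominate the combined spectrum, which makes the resonant contributions easier to see than in an unweighted sum.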

Figure 10: The best fit values of the signal strength modifier for the different processes. The horizontal bars in blue and red colors represent the ±1 σ total uncertainty and its systematic component. The vertical dashed line shows the standard model prediction.

The fit yields

Figure 11: Left: The best fit values of the signal strength modifier for the different processes. The horizontal bars in blue and red colors represent the ±1 σ total uncertainty and its systematic component, and the vertical dashed line shows the SM prediction. Right: The two-dimensional likelihood scan of µ qqH Hbb and µ ggH Hbb . The red (blue) solid and dashed lines correspond to the observed (expected) 68 and 95% CL contours in the (µ qqH Hbb , µ ggH Hbb ) plane. The SM prediction and the observed best fit values are indicated by the blue and red crosses, respectively.

Table 1: The HLT and offline selection requirements in the four analyzed samples.

Table 1). As a consequence, in the Loose 2018 sample, a larger part of phase space, where m jj and ∆η jj are particularly sensitive to the VBF signal, is removed in comparison with the Loose 2016 sample, thus making the BDT in the Loose 2018 sample less effective at discriminating between the VBF signal and background processes than in the Loose 2016 sample.

Table 2: Event categorization used in the analysis for a total of 18 categories. The names of the categories are given in the first column. The BDT score boundaries defining each category are given in the second column and the targeted process is indicated in the third column.
ggH < 0.50, 0.80 < D VBF < 0.85 | VBF
Loose V2 | D ggH < 0.50, 0.85 < D VBF | VBF
Loose Z1 | D ggH < 0.50, D VBF < 0.80, 0.60 < D Z < 0.75 | Z+jets
Loose Z2 | D ggH < 0.50, D VBF < 0.80, 0.75 < D Z | Z+jets
Loose Z1 | D ggH < 0.55, D VBF < 0.50, 0.60 < D Z < 0.70 | Z+jets
Loose Z2 | D ggH < 0.55, D VBF < 0.50, 0.70 < D Z | Z+jets

Table 3: Event yields for various categories of the analyzed 2016 data corresponding to 36.3 fb −1 , compared to the expected number of events from the simulated samples of signal and background other than the QCD multijet process. The quoted uncertainties are statistical only.

Table 4: Event yields for various categories of the analyzed 2018 data corresponding to 54.5 fb −1 , compared to the expected number of events from the simulated samples of signal and background other than the QCD multijet process. The quoted uncertainties are statistical only.

Table 5: The functional forms used to fit the continuum component of the background in various analysis categories. The notation "exp" stands for the exponential function, "exp•pol1 (pol2)" denotes the product of an exponential function and a first-order (second-order) polynomial.

Table 6: The impact of the dominant systematic uncertainties on the observed signal strength for inclusive Higgs boson production followed by decay to bottom quarks.