Search for the decay of the Higgs boson to a pair of light pseudoscalar bosons in the final state with four bottom quarks in proton-proton collisions at $\sqrt{s}$ = 13 TeV

A search is presented for the decay of the 125 GeV Higgs boson (H) to a pair of new light pseudoscalar bosons (a), followed by the prompt decay of each a boson to a bottom quark-antiquark pair, H $\to$ aa $\to$ $\mathrm{b\bar{b}b\bar{b}}$. The analysis is performed using a data sample of proton-proton collisions collected with the CMS detector at a center-of-mass energy of 13 TeV, corresponding to an integrated luminosity of 138 fb$^{-1}$. To reduce the background from standard model processes, the search requires the Higgs boson to be produced in association with a leptonically decaying W or Z boson. The analysis probes the production of new light bosons in a 15 $\lt$ $m_\mathrm{a}$ $\lt$ 60 GeV mass range. Assuming the standard model predictions for the Higgs boson production cross sections for pp $\to$ WH and ZH, model independent upper limits at 95% confidence level are derived for the branching fraction $\mathcal{B}$(H $\to$ aa $\to$ $\mathrm{b\bar{b}b\bar{b}}$). The combined WH and ZH observed upper limit on the branching fraction ranges from 1.10 for $m_\mathrm{a} =$ 20 GeV to 0.36 for $m_\mathrm{a} =$ 60 GeV, complementing other measurements in the $\mu\mu\tau\tau$, $\tau\tau\tau\tau$ and bb$\ell\ell$ ($\ell = $ $\mu$, $\tau$) channels.


Introduction
A Higgs boson with a mass of 125 GeV was discovered [1][2][3] in 2012 by the ATLAS and CMS Collaborations at the CERN LHC.Studies [4,5] of this particle, here denoted H, have indicated its compatibility with the Higgs boson of the standard model (SM) of particle physics.Notwithstanding this compatibility, present experimental limits allow a significant branching fraction of the H boson to beyond-the-SM particles up to 16% [5].Observation of such H boson decays could point the way for the SM to be extended to a more fundamental theory.
Many models of physics beyond the SM predict an extended Higgs sector.Examples include two-Higgs-doublet models (2HDM) [6], the 2HDM with a scalar singlet field (2HDM+S) [7], and the next-to-minimal supersymmetric SM (NMSSM) [8].In the 2HDM+S model, the Higgs sector contains seven physical scalar and pseudoscalar particles.One of the scalar states can be identified with the 125 GeV H boson while one of the pseudoscalar states, denoted a, can be light enough to allow H → aa decays.These decays can occur with a significant branching fraction (≳10%) if they are kinematically allowed [7].For virtually all 2HDM+S scenarios, the dominant pseudoscalar decay mode is a → bb (with b a bottom quark) so long as the pseudoscalar mass m a is at least twice the bottom quark mass.
In this paper, we present a search for the Higgs boson decay H → aa → bbbb.The decays are assumed to be prompt, i.e., not leading to long-lived new particles.The analysis is based on a sample of proton-proton (pp) collision events at a center-of-mass energy of 13 TeV recorded with the CMS detector in 2016-2018, corresponding to an integrated luminosity of 138 fb −1 .The search makes use of V = W or Z boson associated production, in which the V boson decays leptonically, W → ℓν or Z → ℓ + ℓ − , with ℓ an electron or muon.The leptonic signature suppresses SM background processes such as those with four b jets produced through quantum chromodynamics (QCD) interactions (multijet background).
Backgrounds to H → aa → bbbb from SM processes arise from several sources.The dominant background arises from tt +jets production in which one or both top quarks decay leptonically.The next-largest backgrounds are from Z +jets and W +jets production with a V boson that decays leptonically: the main background process in this case is from V + bb production plus one or two mistagged b jets.Smaller backgrounds arise from various electroweak processes.These include events with diboson VV (′) or triboson VVV (′) production, events with a single top quark, and rare electroweak processes such as events with ttV or tt γ production.The SM associated production of the Higgs boson in the VH channel, with H → bb, is also expected to comprise a background component in this search.Background from multijet production arises when jets or other reconstructed objects are erroneously identified as leptons or when genuine leptons are produced through hadron decays.
Tabulated results for the study are provided in the HEPData record for this analysis [28].

Detector and trigger
A detailed description of the CMS detector, along with a definition of the coordinate system and pertinent kinematic variables, is presented in Ref. [29].Briefly, a cylindrical superconducting solenoid with an inner diameter of 6 m provides a 3.8 T axial magnetic field.Within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter (HCAL).The tracking detectors cover the pseudorapidity range |η| < 2.5.The ECAL and HCAL, each with a barrel and two endcap sections, cover |η| < 3.0.Forward calorimeters extend the coverage to |η| < 5.0.Muons are detected within |η| < 2.4 by gas-ionization chambers embedded in the steel flux-return yoke outside the solenoid.The detector is nearly hermetic, permitting accurate measurements of missing transverse momentum.The CMS trigger system is described in Refs.[30,31].For this analysis, signal event candidates are selected using a combination of triggers requiring either at least one electron or muon, or, with lower transverse momentum (p T ) thresholds on the leptons, at least two electrons or muons.The trigger efficiency for signal events is measured in data and is found to exceed 95% for events satisfying the event selection criteria described below.

Event simulation
With the exception of the multijet background, which is determined from data, the background from SM processes is evaluated from simulation using Monte Carlo (MC) event generators.The tt +jets background is simulated at next-to-leading order (NLO) accuracy with the POWHEG2.0 [32][33][34] generator.A reweighting is applied to the top quark p T to improve the modeling in the tt +jets simulation.The reweighting factors are defined as the ratio of the top quark p T measured in data, unfolded to the parton level, with respect to the theory predictions at NNLO, and are taken from Ref. [35].The V+jets background is simulated at leading order (LO) accuracy with the MADGRAPH5 aMC@NLO generator (version 2.2.2 for the simulation of the 2016 data and version 2.4.2 for the 2017-2018 data) [36], with cross sections normalized to NLO.For the V +jets processes, we apply NLO electroweak corrections as a function of the V boson p T at the generator level, as documented in Ref. [37].Diboson (triboson) processes are simulated at LO (NLO) accuracy with the PYTHIA8.212(POWHEG2.0)generator.Single top quark production and the rare electroweak processes are simulated at NLO with a combination of the POWHEG2.0 and MADGRAPH5 aMC@NLO generators.Samples generated at NLO with the MADGRAPH5 aMC@NLO generator employ the FxFx matching and merging scheme [38], while those generated at LO with the same generator use the MLM matching and merging scheme [39].
Signal events, namely events with associated VH production followed by the decay H → aa, are simulated at LO using the MADGRAPH5 aMC@NLO (version 2.6.5)generator.The cross sections are normalized to the values presented in Refs.[40][41][42][43] and include NNLO QCD and NLO electroweak corrections.The W and Z bosons are required to decay through the e, µ, or τ leptonic channels, while the pseudoscalar bosons are required to decay via a → bb.We require at least one lepton (e or µ) with p T greater than 10 GeV and |η| less than 2.5: this results in a filter efficiency of about 61% for the WH channel and 75% for the ZH channel.Signal event samples are generated for pseudoscalar mass values m a = 15, 20, 25, 30, 40, 50, and 60 GeV.
The parton distribution function (PDF) set used in the simulations is NLO NNPDF3.1 for the NLO samples and LO NNPDF3.1 for the LO samples [44].Parton showering and hadronization are described for all MC samples using the PYTHIA8 generator.For the simulation of the 2016 data, the CUETP8M1 tune [45,46] is used.For the 2017-2018 data, we use the CP5 tune [47].The detector response is evaluated using GEANT4 [48].The contribution of additional pp interactions within the same or nearby bunch crossing (pileup) is modeled by adding simulated minimum bias events to the MC samples, with the distribution of additional interactions per event adjusted to agree with observation.

Event reconstruction
The particle-flow (PF) algorithm [49] is used to reconstruct and identify individual photons, electrons, muons, charged hadrons, and neutral hadrons (PF candidates) by combining information from individual subdetectors in an optimal fashion.To improve the quality of the reconstruction, requirements beyond the PF criteria are imposed on electron and muon candidates.For electron candidates, these requirements are based on the shower shape of the associated energy cluster in the ECAL, the angular matching of the electron track candidate with the cluster, and the consistency of the track with originating from the primary pp interaction vertex (PV) [50].Electrons compatible with photon conversion in the detector material are removed.For muon candidates, these requirements are based on the number of energy deposits recorded for the candidate in the silicon tracker and muon system, the fit quality of the candidate, and the consistency of the candidate with originating from the PV [51].Electron candidates are restricted to |η| < 2.5 and muon candidates to |η| < 2.4.
The PV is taken to be the vertex corresponding to the hardest scattering in the event, evaluated using tracking information alone as described in Section 9.4.1 of Ref. [52].The PV is required to lie within 24 cm of the center of the detector in the direction along the beam axis and within 2 cm in the plane transverse to that axis.
Leptons from V boson decays are generally isolated from the jets in an event, unlike leptons from sources such as hadron decays.We therefore require electron and muon candidates to be isolated.An isolation variable I iso is defined as the sum of the scalar p T of charged hadron, neutral hadron, and photon PF candidates within a cone of radius ∆R = √ (∆η) 2 + (∆ϕ) 2 = 0.3 around the electron or muon candidate direction, where ϕ is the azimuthal angle.A correction is applied to account for the contribution of pileup.For electrons, this correction is performed using the jet area method described in Ref. [53].For muons, the correction is performed by subtracting half the sum of scalar p T of charged-particle tracks within the cone, but using only those tracks that do not originate from the PV.We require the ratio of I iso to the p T value of the muon to be less than 0.15.For electrons, the I iso selection requirements are optimized via a multivariate analysis, and are chosen to be 0.059 (0.029 + 0.506p T ) and 0.057 (0.044 + 0.963p T ) in the ECAL barrel and endcap, respectively, for the 2016 (2017-2018) data-taking period.
The combined reconstruction, identification, and isolation efficiency for electrons and muons is determined in data using Z boson decays with a tag-and-probe method [54].The efficiencies range from 70 to 90% for electrons and from 90 to 97% for muons, depending on the lepton p T and η.
Jets are reconstructed by clustering PF candidates using the anti-k T jet algorithm [55,56] with a distance parameter of 0.4.Charged-particle tracks not associated with the PV are excluded from the clustering.The jet energies are corrected as explained in Ref. [57] to account for the nonlinear response of the detector and for the expected contributions of neutral particles from pileup.The jets are required to satisfy p T > 20 GeV and |η| < 2.5.The number of selected jets in an event is denoted N j .
Bottom quark jets are identified by applying the DEEPCSV algorithm [58] to the selected jets (the p T threshold is maintained at 20 GeV).The medium working point of this algorithm is used, for which the tagging efficiency is 68% for b quark jets with p T ≈ 30 GeV.The corresponding misidentification probability for light jets (i.e., gluon and light quark jets) is 1.6%, while that for charm quark jets is 13%.The number of b-tagged jets in an event is denoted N b .
The missing transverse momentum vector ⃗ p miss T is defined as the projection onto the plane perpendicular to the beam axis of the negative vector sum of the momenta of all reconstructed PF candidates in an event.Its magnitude is referred to as p miss T .Calibration and pileup effects are accounted for as described in Ref. [59].The p miss T variable is used to measure the undetected momentum in an event, such as is carried by neutrinos in leptonic W boson decays or as can arise spuriously from the misreconstruction of jet p T .
For the WH channel we also consider the transverse mass m T , defined by where p T,ℓ is the transverse momentum of the lepton and ϕ p miss T ,ℓ is the azimuthal angle between ⃗ p miss T and the p T,ℓ vector.For WH events, m T represents the W boson mass calculated using transverse momentum only [60].

Baseline selection
A set of baseline selection criteria are applied in order to identify events that are consistent with the signal processes considered.
Single-lepton triggers are used to identify events in the WH channel.The p T threshold for electrons was 27 GeV in 2016 and 32 GeV in 2017-2018.The corresponding thresholds for muons were 24 and 27 GeV, respectively.Dilepton triggers are used to identify events in the ZH channel.For electrons, the p T thresholds on the two leptons were 23 and 12 GeV, while for muons they were 17 and 8 GeV, for all data-taking periods.For muons, the triggers in 2017-2018 differ from those in 2016 by the additional requirement that the dimuon invariant mass be larger than 3.8 GeV.The triggers require the leptons to satisfy stringent identification criteria and to be isolated from other tracks and energy deposits in the calorimeters.
For the WH channel, events must contain exactly one electron or muon, and no other electron (muon) with p T above 15 (10) GeV.For the 2016 (2017-2018) data, electrons must have p T > 30 (35) GeV and muons p T > 25 (30) GeV.Higher thresholds are used for the 2017-2018 data to account for the increase in the online trigger thresholds.To suppress the multijet background, WH events must satisfy p miss T > 25 GeV and m T > 50 GeV.
For the ZH channel, events must contain exactly one e + e − or µ + µ − pair and no other electron or muon with p T above 15 (10) GeV.For electron pairs, one of the two legs must have p T > 25 GeV and the other p T > 15 GeV.For muon pairs, the corresponding requirements are p T > 20 GeV and p T > 10 GeV.To reduce the background from tt +jets events, the invariant mass Table 1: Signal region (SR) and control region (CR) requirements in (N b , N j ) for the WH and ZH channels, where N b is the number of b-tagged jets in an event and N j is the total number of jets in an event.

Label
(N b , N j ) For both WH and ZH events, we require N j ≥ 3 and N b ≥ 2.

Analysis overview
The selected events are categorized in terms of N b and N j .Events in the signal region (SR) are divided into two categories depending on whether they have N b = 3 (denoted the 3b category) or N b = 4 (denoted the 4b category).The 4b category corresponds to events in which all four b jets in H → aa → bbbb decays have been reconstructed.The 3b category accounts for events in which two b quarks are merged in a single jet or one of the b quarks is rather soft and fails to be reconstructed.The latter case is expected to have a small impact on the Higgs boson kinematics.
For WH events, the signal region is defined by (N b , N j ) = (3b, 3-4j) and (4b, 4j), where 3-4j refers to events with either exactly three or exactly four jets.For ZH events, the SR is defined by (3b, ≥3j) and (4b, ≥4j).These regions are summarized in Table 1.The H boson is formed from the four b-tagged jets in the 4b category and from the three b-tagged jets in the 3b category.
For WH events, the W boson is defined by the electron or muon and the ⃗ p miss T vector.For ZH events, the Z boson is defined by the e + e − or µ + µ − pair.
A multivariate discriminator based on a boosted decision tree (BDT) [61] is used to help separate signal and background events.Different BDTs are constructed for the WH and ZH channels, and the 3b and 4b event categories.The input variables for the BDT are: • p H T : the vector sum of the transverse momenta of the three or four b-tagged jets forming the H boson candidate; • m H : the invariant mass of the three or four b-tagged jets forming the H boson candidate; • p V T : for W boson candidates, the magnitude of the vector sum of the p T of the electron or muon and p miss T ; for Z boson candidates, the p T of the dilepton pair; • H T : the scalar sum of the p T of the three or four b-tagged jets that define the H boson candidate; • ⟨∆R(b, b ′ )⟩: the separation in the η-ϕ plane between any two b-tagged jets in an event, averaged over all such combinations; • |∆ϕ(V, H)|: the azimuthal angle between the directions of the W or Z and the H boson candidates; • |∆ϕ(j, p miss T )| min : the smallest azimuthal angle between the ⃗ p miss T direction and a jet; • p ℓ T : for WH events, the p T of the electron or muon; for ZH events, the p T of the leading electron or muon; • p miss T : the magnitude of the missing transverse momentum; • m T (defined only for WH events): the transverse mass defined in Section 4; | min (defined only for the 4b category): the minimum difference between two dijet masses formed from all possible combinations with the four b-tagged jets.The four b-tagged jets can be grouped in three different ways to form a pair of a → bb candidates.The ∆m min bb variable represents the grouping that results in the smallest difference between the masses of the two pseudoscalar candidates.
The BDTs are trained on signal events, combining all pseudoscalar boson mass points, and on background events, combining all considered SM processes.The shapes of the BDT distributions are not strongly dependent on the assumed value of m a , allowing the same BDT to be used for all signal mass values.
We perform a maximum likelihood fit of the signal and background components to the observed distribution of the BDT discriminant.The fit is performed simultaneously across the SRs and control regions (CRs).The CRs are described in Section 5.3.The templates for signal events are derived from simulation.The templates for background events are determined as described in Section 5.3.

Background modeling and control regions
A major source of background in our study is from tt +jets production, as mentioned in the introduction.To evaluate this background, simulated tt +jets events are categorized based on the parton-level flavor content of additional particle-level jets not originating from the decay of the tt +jets system.Events that contain at least one additional particle-level jet matched to a b hadron are labeled as tt +bb .Events that contain at least one additional particle jet matched to a c hadron and no b hadron are labeled as tt +cc .The tt +cc and tt +bb categories, with heavy-flavor (HF) production, are referred to as the "tt +HF " background.This background mostly arises from events with gluon splitting, g → cc or bb.The remaining tt +jets events are referred to as the "tt +LF " background, where "LF" refers to light flavor.The tt +LF category also includes events with no additional particle-level jets.One source of this background is tt +qq (q = u, d, s) events in which light-flavored jets are erroneously identified as b jets.
The template for the tt +LF background is taken from simulation, both for its shape and normalization.In contrast to tt +LF events, the rate of tt +HF events is not well predicted by simulation because it is dominated by the gluon splitting process, which is not yet well understood to adequate precision [62].Therefore, while the shapes of the templates and the relative rate of tt +bb to tt +cc production are taken from simulation, the overall normalization is determined from the fit to data.
The W +jets and Z +jets backgrounds mostly arise when a W or Z boson is produced in association with b quark jets.Like tt +HF production, the rates of these processes are difficult to model [63,64].Thus, for the W+jets background to the WH channel and the Z+jets background to the ZH channel, the shapes of the templates are taken from simulation while the nor-malizations are determined in the fit.There is a small Z+jets background to the WH channel.The template for this background is taken from simulation for both the shape and normalization.The W+jets background to the ZH channel is negligible and is therefore ignored.
The CRs used to constrain the normalizations of the tt +HF , W+jets, and Z+jets processes are obtained by selecting events with one less b-tagged jet with respect to the SRs.This requirement enables a collection of events predominantly governed by these background processes, while simultaneously depleting the signal content but closely aligning with the SRs.The (N b , N j ) criteria for the CRs are as listed in Table 1.In the (2b, 3j) and (2b, 4j) categories, any variable that depends on N b is constructed by adding "fake" b tags using the untagged jet(s) with highest DEEPCSV discriminator value(s).For the WH channel, we define a W/tt +LF CR in the (2b, 3j) category and a tt +LF CR in the (2b, 4j) category.For the ZH channel, we define a Z+jets CR in the (2b, 3j) and (2b, 4j) categories.The modeling of the tt +HF background is validated in signal-depleted regions of the SR bins corresponding to BDT scores ≲0.1.The modeling of the various backgrounds has been studied through comparison with observations in the CRs.The degree of agreement for shape variations between CR data and pre-fit estimates is validated to within ≈10%.
For the small backgrounds from diboson, triboson, single-top quark, rare electroweak processes, and the SM VH process with H → bb, both the shape and normalization are taken from simulation.Because of the presence of two isolated leptons, the multijet background to the ZH channel is negligible and is ignored.In the WH channel, for which only one isolated lepton is required, the multijet background is expected to be small.We estimate this background contribution from data, as described below.
To evaluate the multijet background to the WH channel, we select data events by inverting the lepton isolation requirements of Section 4. All other selection criteria are the same as for signal events.We call this sample the inverted-isolation sample.The multijet background template is given by the measured BDT distribution of events in the inverted-isolation sample, normalized by a scale factor F calculated in a low-p miss T data sideband.The low-p miss T sideband is defined by p miss T < 25 GeV and m T < 50 GeV.We choose a sideband in p miss T and m T because, for multijet background events, these variables are largely uncorrelated with the lepton isolation variable.
We thus consider the following four quantities: • D A : The BDT distribution of multijet background events in the SR; • D B : The BDT distribution of multijet background events in the inverted-isolation sample; • N C : The number of multijet background events in the low-p miss T sideband that satisfy the lepton isolation requirements; • N D : The number of multijet background events in the low-p miss T sideband that fail the lepton isolation requirements.
The multijet background estimate is given by D A = F D B with F ≡ N C /N D .The contributions of nonmultijet background events to D B , N C , and N D are subtracted using simulation before calculating D A .Because the multijet background is expected to be small compared to other backgrounds, and because the statistical precision in this background estimate is limited, the estimate is assigned a conservative uncertainty of 50%.The closure or self-consistency of the estimate is tested in the 3b and 4b CRs, where the contribution of multijet events is small.The predicted multijet background in these CRs is found to be in reasonable agreement with the data, to within the assigned uncertainty, once the nonmultijet contributions in data have been subtracted.
The same procedure is used to account for the multijet background contributions to the tt +HF and W+jets CRs for the WH channel.

Systematic uncertainties
The overall uncertainty in the determination of the integrated luminosity over the 2016-2018 data-taking period is 1.6% [65][66][67].Systematic uncertainties related to the event simulation include those associated with the PDFs, the value of the strong coupling strength α S , and the renormalization and factorization scales.We evaluate these uncertainties by altering the PDFs and scales of the simulated background samples according to the PDF4LHC prescription of Ref. [68].Their effect on the yields is estimated to be 4-5%.The uncertainties in the signal cross section arising from the perturbative QCD calculations, PDF, and the α S scales have a negligible effect on the shape of the signal.Their effects on the yield correspond to 0.7 and 1.9%, respectively, for WH production, and to 3.8 and 1.6%, respectively, for ZH production [40], and are taken into account by introducing nuisance parameters with log-normal distributions into the fit.We assign the following theoretical uncertainties on the cross sections for those simulated backgrounds whose normalization is not determined in the fit: tt +LF events (6%) [69], single top quark events (5%) [70,71], diboson and triboson events (50%) [72,73], Z+jets events in the WH channel (2%) [74], and rare electroweak processes (15%).
Tables 2 and 3 summarize the other systematic uncertainties that are considered in the analysis.For those, the shape variations are taken into account in the template fit, while their effect on the background and signal yield has been evaluated for illustration purposes.For signal events, the estimate of these uncertainties are shown for two choices of the pseudoscalar mass: m a = 20 and 60 GeV.
The uncertainties related to the trigger efficiency and the lepton identification and isolation requirements are evaluated using data (this latter uncertainty, at 2%, is the same for electrons and muons).The uncertainties associated with the lepton energy scale, the jet energy scale, and the jet energy resolution are evaluated as a function of lepton or jet p T and η.The uncertainty associated with the lepton energy scale is found to be negligible.In Tables 2 and 3, the uncertainty quoted for the jet energy scale is the maximum that is found for the various p T and η bins.A related uncertainty, relevant for WH channel, is in the energy scale of unclustered p miss T , where unclustered p miss T refers to signals in the detector that are not associated to reconstructed physics objects, namely muons, electrons, photons, hadronically decaying taus and jets [59].The uncertainties in the efficiencies for b tagging and for the mistagging of c quark and light-flavored jets are evaluated as described in Ref. [58].Sources of systematic uncertainty that affect the data-to-simulation corrections of the b-tagging efficiency and misidentification rate are the jet energy scale, the cross contamination between light flavor or gluon jets with heavy flavor jets, and the statistical fluctuations in the data and MC.These uncertainties are split into 15 p T and η bins and consist of 9 independent sources.Except for their statistical component, they are treated as correlated between the three data-taking periods.To evaluate the uncertainty associated with pileup, the total inelastic pp cross section is varied by ±5% [75].The uncertainty from the mismodeling of the top quark p T is assessed for the tt +jets MC by comparing results with and without reweighting.
To account for the finite sizes of the simulated event samples, each bin of the simulated signalplus-background templates is allowed to vary within its statistical uncertainty, independently from the other bins in the distribution.A bin-by-bin statistical uncertainty is considered by assigning a Poisson nuisance parameter per bin to distributions in those samples [76].

Maximum likelihood fit
A maximum likelihood fit of the signal and background terms to the observed distribution of the BDT discriminant is performed simultaneously across the SRs and CRs.The fits are performed separately for the WH and ZH channels and also with the WH and ZH channels combined.
For the WH channel, there are two SRs and two CRs, as indicated in the upper portion of Table 1, for a total of four independent data regions.In addition, the electron and muon samples are treated as separate event categories in the fit.We also treat each of the three years of data collection separately.There are therefore 24 data regions that are fitted simultaneously.The fitted parameters are the signal strength modifier, a global scale factor for tt +HF production, a global scale factor for W+jets production, and various nuisance parameters, where the nuisance parameters represent the systematic uncertainties from Section 5.4.The signal strength modifier µ is defined by the equation: s = µs 0 , where s is the observed signal yield and s 0 is the expected signal yield for a SM VH production cross section times a "benchmark" branching fraction B(H → aa → bbbb ) = 1.
For the ZH channel, there are also two SRs and two CRs, as indicated in the lower portion of Table 1, leading to 24 independent data regions in the fit as for the WH channel.The fitted parameters are the signal strength modifier, a global scale factor for tt +HF production, a global scale factor for Z+jets production with three b-tagged jets, a global scale factor for Z+jets production with four b-tagged jets, and several nuisance parameters.We implement separate scale factors for Z+jets production with three and four b-tagged jets because the relative rates of Z boson production in association with two or more b quarks are not yet well established [63,64].For W+jets events, a common scale factor is used for the three and four b-tagged categories be-     cause the large statistical uncertainties do not allow discrimination between the two categories.
Systematic uncertainties in the signal and background normalizations are treated as log-normal distributions.The SR and CR distributions in data and the background estimates are used in the maximum likelihood fit to search for H → aa → bbbb decays.

Results
No evidence is found for a significant excess of events above the SM background prediction.Model independent upper limits are placed, at 95% confidence level (CL), on the branching fraction B(H → aa → bbbb ) as a function of m a .The results for the signal are normalized to the NNLO SM cross sections of the Higgs boson production σ(pp → WH) = 1.37 pb and σ(pp → ZH) = 0.86 pb.The results are dominated by the statistical uncertainties.The dom-    inant systematic uncertainties are associated with the b jet identification.For m a = 60 GeV, the b-tagging uncertainties on the yield vary between 6 and 9% for the major background processes.For the signal process, this uncertainty is around 8%.Other uncertainties lie below 5%.
Figures 1 and 2 show the post-fit shapes of the BDT score variable for the WH and ZH channels, respectively, where "post-fit" means that the parameter values determined in the likelihood fit are used.The event yields for the various SM backgrounds are presented in Tables 4 and 5 for the 3b and 4b categories, respectively.The results are presented in the bins of the BDT variable shown in Figs. 1 and 2 (five bins for the 3b category and four bins for the 4b category).For purposes of illustration, signal yields corresponding to the expectation for SM VH production and B(H → aa → bbbb ) = 1, for m a = 20 or 60 GeV are also shown.The background uncertainties account for both systematic and statistical sources.

Process
e [1,3] e [4,4] e [5,5] µ upper limits on the quantity σ VH B(H → aa → bbbb )/σ SM as a function of m a , for the WH and ZH channels separately as well as their combination.The limits are obtained using the CL s criterion and an asymptotic approximation to the distribution of the profiled likelihood ratio test statistic [77,78], assuming SM predictions for the Higgs boson production cross section.
The limit curves are plotted by interpolating smoothly across the mass points.
In the WH channel, the observed upper limit on the B(H → aa → bbbb ) is found to be 0.48 (1.59) for m a = 60 (20) GeV, compared to an expected limit of 0.51 ± 0.16 (1.73 ± 0.56).In the ZH channel, the corresponding observed upper limit is 0.48 (1.40), with an expected result of 0.38 ± 0.13 (1.16 ± 0.38).Our search is optimized for m a values between 20 and 60 GeV, with signal sensitivity falling rapidly below 30 GeV.For m a ≲ 30 GeV, the sensitivity is degraded in both the WH and ZH channels because of the signal efficiency loss when reconstructing collimated b jet pairs from highly Lorentz-boosted pseudoscalar bosons.

Summary
A search for exotic decays of the 125 GeV Higgs boson (H) to a pair of new light pseudoscalar bosons (a), followed by decay to four b quark jets, H → aa → bbbb, is presented, using data recorded with the CMS detector.The analysis is based on an integrated luminosity of 138 fb [11] ATLAS Collaboration, "Search for Higgs boson decays into two new low-mass spin-0 particles in the 4b channel with the ATLAS detector using pp collisions at √ s = 13 TeV", Phys.Rev  [17] ATLAS Collaboration, "Search for new phenomena in events with at least three photons collected in pp collisions at √ s = 8 TeV with the ATLAS detector", Eur.Phys.J. C 76 (2016) 210, doi:10.1140/epjc/s10052-016-4034-8,arXiv:1509.05051.

Figure 1 :
Figure 1: Post-fit BDT distributions in the WH channel extracted with the m a = 60 GeV signal hypothesis.Signal regions for the 3b (upper) and 4b (lower) event categories are shown separately for the electron (left) and muon (right) channels.The dotted lines WH 20 GeV , WH 60 GeV , illustrate the shapes of the signal template normalised to the SM cross section times a branching fraction B(H → aa → bbbb ) = 1 and scaled by the factors indicated in the figure.The horizontal error bars indicate the bin width.

Figure 2 :
Figure 2: Post-fit BDT distributions in the ZH channel extracted with the m a = 60 GeV signal hypothesis.Signal regions for the 3b (upper) and 4b (lower) event categories are shown separately for the electron (left) and muon (right) channels.The dotted lines ZH 20 GeV and ZH 60 GeV , illustrate the shapes of the signal template normalised to the SM cross section times a branching fraction B(H → aa → bbbb ) = 1 and scaled by the factors indicated in the figure.The horizontal error bars indicate the bin width.
Figures1 and 2show the post-fit shapes of the BDT score variable for the WH and ZH channels, respectively, where "post-fit" means that the parameter values determined in the likelihood fit are used.The event yields for the various SM backgrounds are presented in Tables4 and 5for the 3b and 4b categories, respectively.The results are presented in the bins of the BDT variable shown in Figs.1 and 2(five bins for the 3b category and four bins for the 4b category).For purposes of illustration, signal yields corresponding to the expectation for SM VH production and B(H → aa → bbbb ) = 1, for m a = 20 or 60 GeV are also shown.The background uncertainties account for both systematic and statistical sources.Figures 3 present model independent

− 1 Figure 3 :
Figure3: Model independent 95% CL upper limits on σ VH B(H → aa → bbbb )/σ SM for the WH channel (upper), the ZH channel (middle), and the combination of both channels (lower), where "a" is a new pseudoscalar particle decaying through a → bb, and σ SM is the SM Higgs boson production cross section.

Table 2 :
Summary of systematic uncertainties and their effect on the background and signal event yields in the WH channel.Uncertainties that are negligible are indicated with a dash (-).

Table 3 :
Summary of systematic uncertainties and their effect on the background and signal event yields in the ZH channel.Uncertainties that are negligible are indicated with a dash (-).

Table 5 :
Signal-plus-background fit results for the 4b WH and ZH signal regions extracted with the m a = 60 GeV signal hypothesis.The lepton flavor (e or µ) and BDT bin range (in square brackets) are indicated in the column headings.Signal yields corresponding to the expectation for SM VH production and B(H → aa → bbbb ) = 1 are shown for the m a = 20 and 60 GeV hypotheses.The background uncertainties account for both systematic and statistical sources.