Search for beyond the standard model Higgs bosons decaying into a $\mathrm{b\overline{b}}$ pair in pp collisions at $\sqrt{s} =$ 13 TeV

A search for Higgs bosons that decay into a bottom quark-antiquark pair and are accompanied by at least one additional bottom quark is performed with the CMS detector. The data analyzed were recorded in proton-proton collisions at a centre-of-mass energy of $\sqrt{s} =$ 13 TeV at the LHC, corresponding to an integrated luminosity of 35.7 fb$^{-1}$. The final state considered in this analysis is particularly sensitive to signatures of a Higgs sector beyond the standard model, as predicted in the generic class of two Higgs doublet models (2HDMs). No signal above the standard model background expectation is observed. Stringent upper limits on the cross section times branching fraction are set for Higgs bosons with masses up to 1300 GeV. The results are interpreted within several MSSM and 2HDM scenarios.


Introduction
In the standard model (SM), a Higgs boson at a mass of 125 GeV has a large coupling to b quarks via Yukawa interactions. Its production in association with b quarks and subsequent decay into b quarks at the CERN LHC is, however, difficult to detect because of the high rate of heavy-flavour multijet production. There are, nevertheless, models beyond the SM that predict an enhancement of Higgs boson production in association with b quarks, which motivate the search for such processes.
Prominent examples of models beyond the SM are the two Higgs doublet model (2HDM) [1], which contains two scalar Higgs doublets, as well as one particular realization within the minimal supersymmetric extension of the SM (MSSM) [2]. These result in two charged Higgs bosons, H ± and three neutral ones, jointly denoted as φ. Among the latter are, under the assumption that CP is conserved, one CP -odd (A), and two CP -even (h, H) states, where h usually denotes the lighter CP -even state. For the purpose of this analysis, -1 -JHEP08(2018)113 the boson discovered in 2012 with a mass near 125 GeV [3][4][5][6] is interpreted as h, whose mass is thus constrained to the measured value. The two heavier neutral states, H and A, are the subject of the search presented here.
In the 2HDM, flavour changing neutral currents at tree level can be suppressed by introducing discrete symmetries, which restrict the choice of Higgs doublets to which the fermions can couple. This leads to four types of models with natural flavour conservation at tree level: • type-I : all charged fermions couple to the same doublet; • type-II : up-type quarks (u, c, t) couple to one doublet, down-type fermions (d, s, b, e, µ, τ ) couple to the other. This structure is also implemented in the MSSM; • lepton-specific: all charged leptons couple to one doublet, all quarks couple to the other; • flipped : charged leptons and up-type quarks couple to one doublet, down-type quarks to the other.
While until now the type-I and -II models have been most intensively tested, the flipped model is remarkably unexplored from the experimental side. The A/H → bb decay mode is ideally suited to constrain this model due to the large branching fraction of the Higgs boson into b quarks. The CP -conserving 2HDMs have seven free parameters. They can be chosen as the Higgs boson masses (m h , m H , m A , m H ± ), the mixing angle between the CP -even Higgs bosons (α), the ratio of the vacuum expectation values of the two doublets (tan β = v 2 /v 1 ), and the parameter that potentially mixes the two Higgs doublets (m 12 ). For cos(β−α) → 0, the light CP -even Higgs boson (h) obtains properties indistinguishable from the SM Higgs boson with the same mass in all four types of models listed above [1].
The MSSM Higgs sector has the structure of a type-II 2HDM. The additional constraints given by the fermion-boson symmetry fix all mass relations between the Higgs bosons and the angle α at tree level, reducing the number of parameters at this level to only two. These parameters are commonly chosen as the mass of the pseudoscalar Higgs boson, m A , and tan β. After the Higgs boson discovery at the LHC, MSSM benchmark scenarios have been refined to match the experimental data and to reveal characteristic features of certain regions of the parameter space [7,8]. Considered in this analysis are the m mod+ h , the hMSSM [9], the light stau ( τ ), and the light stop ( t) scenarios [7]. The m mod+ h scenario is a modification of the m max h scenario, which was originally defined to give conservative exclusion bounds on tan β in the LEP Higgs boson searches [10][11][12]. It has been modified such that the mass of the lightest CP -even state, m h , is compatible with the mass of the observed boson within ±3 GeV [13,14] in a large fraction of the considered parameter space [7]. The hMSSM approach [9,15,16] describes the MSSM Higgs sector in terms of just m A and tan β, given the experimental knowledge of m Z and m h . It defines a largely model-independent scenario, because the predictions for the properties of the MSSM Higgs bosons do not depend on the details of the supersymmetric sector [17].

JHEP08(2018)113
Further variations of the supersymmetric sector are implemented in the light τ and light t scenarios [7], which are also designed such that the light scalar h is compatible with the measured Higgs boson mass [18].
For tan β values larger than one, the couplings of the Higgs fields to b quarks are enhanced both in the flipped and the type-II models, and thus also in the MSSM. Furthermore, there is an approximate mass degeneracy between the A and H bosons in the MSSM for the studied range of m A . For the 2HDM scenarios considered in this paper, such a degeneracy will be imposed. These effects enhance the combined cross section for producing these Higgs bosons in association with b quarks by a factor of up to 2 tan 2 β with respect to the SM. The decay A/H → bb is expected to have a high branching fraction, even at large values of the Higgs boson mass and |cos(β − α)| [19].
In the φ → bb decay mode, searches have initially been performed at LEP [10] and by the CDF and D0 Collaborations [31] at the Tevatron collider. At the LHC, the only analyses in this channel with associated b jets have also been performed by the CMS Collaboration using the 7 and 8 TeV data [32,33]. In the absence of any signal, limits on the pp → bφ(→ bb) + X cross section have been derived in the 90-900 GeV mass range. The combined 7 and 8 TeV data analyses translate into upper bounds on tan β between 14 and 50 in the Higgs boson mass range of 100-500 GeV, assuming the m mod+ h scenario of the MSSM.
The ATLAS and CMS Collaborations have performed extensive 2HDM interpretations of measurements in different production and decay channels, in particular also in the A → Zh, h → bb decay mode [34][35][36]. The ATLAS interpretation [35] also covers the flipped scenario, and the 2HDM interpretations reported in this paper are compared to these.
With the proton-proton (pp) collision data set corresponding to an integrated luminosity of 35.7 fb −1 collected at a centre-of-mass energy of √ s = 13 TeV in 2016, the sensitivity to key model parameters with respect to previous CMS searches is significantly extended. The analysis focuses on neutral Higgs bosons A and H with masses m A/H ≥ 300 GeV that are produced in association with at least one b quark and decay to bb, as shown by the diagrams in figure 1. The signal signature therefore comprises final states characterized by at least three b quark jets ("b jets"), and the dominant background is multijet production. A fourth b jet is not explicitly required, since due to the process topology the majority of the signal events are found to have at most three b jets within the acceptance of this analysis. Events are selected by dedicated triggers that identify b jets already during data taking. This helps significantly to suppress the large rate of multijet production, while maintaining sensitivity to the signal process. The analysis searches for a peak in the invariant mass distribution, M 12 , of the two b jets with the highest transverse momentum p T values, which originate from the Higgs boson decay in about 66% of all cases at m A/H = 300 GeV, increasing up to 75% for m A/H ≥ 700 GeV. The dominant background is the production of heavy-flavour multijet events containing either three b jets, or two b jets plus a third jet originating from either a charm quark, a light-flavour quark, or a gluon, which is misidentified as a b jet.

The CMS detector
The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Within the field volume, the inner tracker is formed by a silicon pixel and strip tracker. It measures charged particles within the pseudorapidity range |η| < 2.5. The tracker provides a transverse impact parameter resolution of approximately 15 µm and a resolution on p T of about 1.5% for particles with p T = 100 GeV. Also inside the field volume are a crystal electromagnetic calorimeter, and a brass and scintillator hadron calorimeter. Forward calorimetry extends the coverage provided by the barrel and endcap detectors up to |η| < 5. Muons are measured in gasionization detectors embedded in the steel flux-return yoke, in the range |η| < 2.4, with detector planes made using three technologies: drift tubes, cathode strip chambers, and resistive-plate chambers. Matching muons to tracks measured in the silicon tracker results in a p T resolution between 1 and 10%, for p T values up to 1 TeV. A detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in ref. [37].

Event reconstruction and simulation
A particle-flow algorithm [38] aims to reconstruct and identify all particles in the event, i.e. electrons, muons, photons, and charged and neutral hadrons, with an optimal combination of all CMS detector systems. The reconstructed vertex with the largest value of summed physics-object p 2 T is taken to be the primary pp interaction vertex. The physics objects chosen are those that have been defined using information from the tracking detector, including jets, the associated missing transverse momentum, which is taken as the negative vector sum of the p T of those jets, and charged leptons.
Jets are clustered from the reconstructed particle-flow candidates using the anti-k T algorithm [39,40] with a distance parameter of 0.4. Each jet is required to pass dedicated quality criteria to suppress the impact of instrumental noise and misreconstruction. Contributions from additional pp interactions within the same or neighbouring bunch crossing -4 -

JHEP08(2018)113
(pileup) affect the jet momentum measurement. To mitigate this effect, charged particles associated with other vertices than the reference primary vertex are discarded before jet reconstruction [41], and residual contributions (e.g. from neutral particles) are accounted for using a jet-area based correction [42]. Subsequent jet energy corrections are derived from simulation, and are confirmed with in situ measurements of the energy balance in dijet, multijet, and Z/γ + jet events [43].
For the offline identification of b jets, the combined secondary vertex (CSVv2) algorithm [44] is used. This algorithm combines information on track impact parameters and secondary vertices within a jet into an artificial neural network classifier that provides separation between b jets and jets of other flavours.
Simulated samples of signal and background events were produced using different event generators and include pileup events. The MSSM Higgs boson signal samples, pp → bbφ+X with φ → bb, were produced at leading order (LO) in the 4-flavour scheme with pythia 8.212 [45]. Comparing this prediction to computationally expensive next-to-leading order (NLO) calculations [46] generated using MadGraph5 amc@nlo in version 2.3.0 [47,48], we find a very good agreement in the shapes of the leading dijet invariant mass distribution, M 12 , while the selection efficiency is up to 10% lower when using the NLO prediction. We correct the NLO effect by applying mass-dependent correction factors to the LO signal samples and assign a corresponding systematic uncertainty in the final results. Multijet background events from quantum chromodynamics (QCD) processes have been simulated with the MadGraph5 amc@nlo event generator [49,50] using the 5-flavour scheme and MLM merging [51]; they are used for studying qualitative features but not for a quantitative background prediction. The NNPDF 3.0 [52] parton distribution functions (PDFs) are used in all generated samples. For all generators, fragmentation, hadronization, and the underlying event have been modelled using pythia with tune CUETP8M1 [53]. The response of the CMS detector is modelled with the Geant4 toolkit [54].

Trigger and event selection
A major challenge to this search is posed by the huge hadronic interaction rate at the LHC. This is addressed with a dedicated trigger scheme [55], especially designed to suppress the multijet background. Only events with at least two jets in the range of |η| ≤ 2.4 are selected. The two leading jets are required to have p T > 100 GeV, and an event is accepted only if the absolute value of the difference in pseudorapidity, ∆η, between any two jets fulfilling the p T and η requirements, is less than or equal to 1.6. The tight online requirements on the opening angles between jets are introduced to reduce the trigger rates, while preserving high efficiency in the probed mass range of the Higgs bosons. At trigger level, b jets are identified using the CSVv2 algorithm with slightly tighter requirements than for the offline analysis. At least two jets in the event must satisfy the online b tagging criteria.
The efficiency of the jet p T requirements in the trigger is derived from data collected with prescaled single-jet triggers with lower threshold. The efficiency in data and simulation is measured as a function of jet p T and η. The differences between the two are corrected for in the analysis of the simulated samples. The online b tagging efficiencies relative to -5 -JHEP08(2018)113 the offline b tagging selection are obtained from data using prescaled dijet triggers with a single b-tag requirement. A tag-and-probe method is employed to determine the efficiency as a function of p T and η of the jets. Both leading jets are required to pass offline selection criteria including b tagging requirements similar to the final event selection described below. The second-leading b jet must always pass the online b tagging requirement to ensure that it has fired the trigger. The fraction of the first leading b jets that also satisfy the online b tagging requirements is a direct measure of the relative online b tagging efficiency. Relative efficiencies are found to range from above 80% for p T ≈ 100 GeV to around 50% for p T ≈ 900 GeV, averaged over η.
The offline selection requires at least two jets with p T > 100 GeV and another one with p T > 40 GeV, which all need to satisfy |η| ≤ 2.2. The η selection is applied to benefit from optimal b tagging performance. The three leading jets have to pass the CSVv2 b tagging requirement of the medium working point [44]. This working point features a 1% probability for light-flavour jets (attributed to u, d, s, or g partons) to be misidentified as b jets, and has a b jet identification efficiency of about 70%. The separation between the two leading jets in η has to be less than 1.55, and a minimal pairwise separation of ∆R > 1 between each two of the three leading jets is imposed to suppress background from bb pairs arising from gluon splitting. This sample is referred to as "triple b tag" sample in the following.

Signal modeling
A signal template for the M 12 distribution is obtained for each Higgs boson mass considered by applying the full selection to the corresponding simulated signal data set, for nominal masses in the range of 300-1300 GeV. The sensitivity of this analysis does not extend down to cross sections as low as that of the SM Higgs boson. Thus, a signal model with a single mass peak is sufficient. This is in contrast to the φ → τ τ analysis [25], where the signal model comprises the three neutral Higgs bosons of the MSSM, one of which is SM-like.
The signal efficiency for each Higgs boson mass point is obtained from simulation and shown in figure 2. A scale factor for the efficiency of the kinematic trigger selection has been derived with data from control triggers, as described in section 4, and is applied as a weight for each event. Correction factors to account for the different b tagging efficiencies in data and simulation [44] are also applied. The total signal efficiency ranges between 0.5 and 1.4% and peaks around 500 GeV. The efficiency first increases due to the kinematic selection and then decreases for masses beyond 500 GeV due to the requirement of three b-tagged jets, and the fact that the b tagging efficiency decreases at high jet p T .
For nominal masses between 300 and 500 GeV, each signal shape is parameterized by a bifurcated Gaussian function, which has different widths on the right-and left-hand side of the peak position, continued at higher masses with an exponential function to describe the tail. The function has five parameters. The signal of the 600 GeV mass point requires one additional Gaussian function on each side of the peak position to be able to describe the tails of the distribution. This function has nine parameters in total. For nominal masses in the range 700-1300 GeV, a Bukin function as defined in appendix A, which  has five parameters, is used. All parameterizations provide a very good modelling of the M 12 spectra.
The distributions of the invariant mass of the two leading b jets, M 12 , of the signal templates and parameterizations of the probability density function for different Higgs boson masses are shown in figure 3. The natural width expected for an MSSM Higgs boson in the considered mass and tan β region is negligible compared to the detector resolution.

JHEP08(2018)113
For example, in the m mod+ h scenario at a mass of 600 GeV and tan β = 60, the natural width of the mass peak is found to be only about 19% of the full width at half maximum of the reconstructed mass distribution. The shape of the mass distribution is thus dominated by the experimental resolution, and the possibility of the two leading jets used to compute M 12 not being the daughters of the Higgs boson, which we refer to as wrong jet pairing. Pronounced tails towards lower masses are attributed to cases of incomplete reconstruction of the Higgs daughter partons, for example due to the missing momentum of neutrinos in semileptonic decays of hadrons containing bottom and charm quarks. The wrong jet pairing gives rise to tails in both directions. For the lower mass points, however, the tails towards lower masses are suppressed because of the jet p T threshold.

Background model
The main background for this analysis originates from multijet production, with at least two energetic jets containing b hadrons, and a third jet that satisfies the b tagging selection but possibly as a result of a mistag. Top quark-antiquark production exhibits a shape very similar to the multijet process. It is found to be negligible, but nevertheless is implicitly covered by our background model.
The relevant features of the multijet background are studied in a suitable control region (CR) in data, which is obtained from the triple b tag selection by imposing a b-tag veto on the third leading jet. This veto rejects jets that would satisfy a loose b tagging requirement, defined by a 10% probability for light-flavour jets to be misidentified as b jets, and has a b jet identification efficiency of about 80%. This CR has no overlaps with the triple b tag signal region (SR), while it preserves similar kinematic distributions for the three leading jets. In addition, the signal contamination in the CR is negligible.
A suitably chosen analytic function is used to model the multijet background. This function is extensively validated in the b tag veto CR. In order to improve the background description and reduce the potential bias related to the choice of the background model, the M 12 distribution is divided into the three overlapping subranges [200,650], [350,1190], and [500, 1700] GeV. Their borders are chosen to largely cover the signal shapes of the associated mass points of [300, 500], [500, 1100], and [1100, 1300] GeV, respectively (as discussed in section 5).
In the first subrange, the selection criteria introduce a kinematic edge (turn-on) in the M 12 distribution. The chosen function is a product of two terms. The first term is a turn-on function, represented by a Gaussian error function in the form of: and the parameters p 0 and p 1 describe the slope and point of the turn-on, respectively. The falling part of the spectrum is described by an extension of the Novosibirsk function originally used to describe a Compton spectrum [56], defined as: -8 -

JHEP08(2018)113
where p 2 is a normalization parameter, p 3 the peak value of the distribution, p 4 and p 5 are the parameters describing the asymmetry of the spectrum, and p 6 is the parameter of the extended term. The variable σ 0 is defined as: In the second and third subranges, we choose a nonextended Novosibirsk function (p 6 ≡ 0) without turn-on factor. Figure 4 shows the fits of the chosen functions to the CR data, which have been prescaled to give similar event count as in the SR. In the first subrange, M 12 = [200, 650] GeV, the turn-on effect due to the jet p T threshold at trigger level is clearly visible. In the other two mass subranges, the spectrum shows only the expected falling behaviour with M 12 . The values of the parameters p 0 and p 1 used to model the turn-on obtained in the CR are also used for the SR fit since the turn-on behaviour in the two regions is found to be very similar. The other function parameters are allowed to vary independently in the CR and SR fits.
Different families of alternative probability density functions such as Bernstein polynomials and the so-called dijet function as defined in ref. [57] are studied to estimate the possible bias from the choice of the background model. For each family, a systematic bias on the extraction of a signal with mass m A/H is determined: the alternative function is fit to the observed data, from which toy experiments are drawn. Using the nominal background model in the respective subrange, a maximum-likelihood fit of signal and background is performed for each pseudo-experiment. The difference in the extracted and injected number of signal events is divided by the statistical uncertainty of the fit. The resulting pull distribution is considered to represent the systematic bias on the signal strength due to the choice of the background function and our insufficient knowledge of the background processes. We infer a bias of 100, 20, and 25% in units of the statistical uncertainty of the signal strength for the first, second, and third subranges, respectively.

Systematic uncertainties
The following systematic uncertainties in the expected signal and background estimation affect the determination of the signal yield or its interpretation within the MSSM or generic 2HDM models.
The signal yields are affected by the following uncertainties: • a 2.5% uncertainty in the estimated integrated luminosity of the data sample [58]; • the uncertainty in the online b tagging efficiency scale factor, which results in an overall uncertainty in the range of 0.8-1.3% for Higgs boson masses of 300-1300 GeV; • a 5% uncertainty in the correction of the selection efficiency comparing to the NLO prediction; • the effect due to the choice of PDFs and the value of α s (1-6%), following the recommendations of the LHC Higgs Cross Section Working Group [59] when interpreting the results in benchmark models; • the uncertainty in the normalization and factorization scales (1-10%) when interpreting the results in benchmark models.
Uncertainties affecting the shape as well as the normalization of the signal templates are: • the uncertainty in the jet trigger efficiencies, ranging between subpercent values and 7% per jet depending on its η and p T ; -10 -

JHEP08(2018)113
• the uncertainty in the offline b tagging efficiency (2-5% per jet depending on its transverse momentum) and the mistag scale factors (<0.3%); • the jet energy scale (JES) and jet energy resolution (JER) uncertainties (1-6%): their impact is estimated by varying the JES and JER in the simulation within the measured uncertainties; • the uncertainty in the total inelastic cross section of 4.6% assumed in the pileup simulation procedure [60].
For the background estimation, the bias on the extracted signal strength, as reported in section 6, is considered as an additional bias term to the background fitting function. This poses the largest uncertainty for the analysis.

Results
The number of potential signal events is extracted by performing a maximum-likelihood fit of the signal plus background parameterizations to the M 12 data distribution. Initially, a fit with only the background parameterizations is performed. Results of this background-only fit in all three subranges are given in figure 5. A good description of the data is observed. The normalized differences between data and fit together with the post-fit uncertainties are shown for each subrange.
In a second step, a combined fit of signal and background to the data is performed. No significant excess over the background-only distribution is observed and upper limits at 95% confidence level (CL) on the cross section times branching fraction σ(pp → bA/H + X)B(A/H → bb) are derived. For the calculation of exclusion limits, the modified frequentist criterion CL s [61][62][63] is adopted using the RooStats package [64]. The test statistic is based on the profile likelihood ratio. Systematic uncertainties are treated as nuisance parameters and profiled in the statistical interpretation using lognormal priors for uncertainties affecting the signal yield, while Gaussian priors are used for shape uncertainties.
Model-independent upper cross section times branching fraction limits are shown as a function of the mass of the A/H bosons in figure 6 up to a mass of 1300 GeV. The visible steps in the expected and observed limits at 500 and 1100 GeV are due to the transitions between the mass subranges as explained in section 6. The limits range from about 20 pb at 300 GeV, to about 0.4 pb at 1100 GeV. The limits are also summarized in table 1 in appendix B.

Interpretation within the MSSM
The cross section limits shown in figure 6 are translated into exclusion limits on the MSSM parameters tan β and m A . The cross sections for b + A/H associated production as obtained with the four-flavour NLO [65,66] and the five-flavour NNLO QCD calculations implemented in bbh@nnlo [67] were combined using the Santander matching scheme [68].

Interpretation within the 2HDM
Cross sections and branching fractions for the bbH and bbA processes within different 2HDM models have been computed at NNLO using SusHi version 1.6.1 [74], 2hdmc version 1.7.0 [75] and lhapdf version 6.1.6 [76]. The 2HDM parameters have been set according to the "Scenario G" proposed in ref. [77]. Specifically, the heavier Higgs bosons are assumed to be degenerate in mass (m A = m H = m H ± ), and the mixing term has been set to m 2 12 = 0.5m 2 A sin 2β. The choice of such an MSSM-like parameterization allows using the same signal samples as for the MSSM analysis.
The results for the type-II and flipped models are displayed in figure 8 as upper limits for tan β as a function of cos(β − α). Observed upper limits derived from the ATLAS A → Zh analysis [24] at a centre-of-mass energy of 13 TeV are shown as well. The results for the flipped model presented here provide competitive upper limits in the central region of cos(β − α) and strong unique constraints on tan β. Figure 9 shows the upper limits for tan β as a function of cos(β − α) in the type-II and flipped models for m A/H = 500 GeV.

Summary
A search for a heavy Higgs boson decaying into a bottom quark-antiquark pair and accompanied by at least one additional bottom quark has been performed. The data analyzed correspond to an integrated luminosity of 35.7 fb −1 , recorded in proton-proton collisions at a centre-of-mass energy of √ s = 13 TeV at the LHC. For this purpose, dedicated triggers using all-hadronic jet signatures combined with online b tagging were developed. The signal is characterized by events with at least three b-tagged jets. The search has been -13 - performed in the invariant mass spectrum of the two leading jets that are also required to be b-tagged.
No evidence for a signal is found. Upper limits on the Higgs boson cross section times branching fraction are obtained in the mass region 300-1300 GeV at 95% confidence level. They range from about 20 pb at the lower end of the mass range, to about 0.4 pb -14 -JHEP08(2018)113

A Definition of Bukin function
The Bukin function as implemented in ROOT version 6.06/01 [78] is defined as: where ρ i = ρ 1 and x i = x 1 for M 12 ≤ x 1 , ρ i = ρ 2 and x i = x 2 when M 12 ≥ x 2 , and: The parameters x p and σ p are the peak position and width, respectively, and ξ is an asymmetry parameter.