Search for a charged Higgs boson decaying into top and bottom quarks in events with electrons or muons in proton-proton collisions at $\sqrt{s} =$ 13 TeV

A search is presented for a charged Higgs boson heavier than the top quark, produced in association with a top quark, or with a top and a bottom quark, and decaying into a top-bottom quark-antiquark pair. The search is performed using proton-proton collision data collected by the CMS experiment at the LHC at a center-of-mass energy of 13 TeV, corresponding to an integrated luminosity of 35.9 fb$^{-1}$. Events are selected by the presence of a single isolated charged lepton (electron or muon) or an opposite-sign dilepton (electron or muon) pair, categorized according to the jet multiplicity and the number of jets identified as originating from b quarks. Multivariate analysis techniques are used to enhance the discrimination between signal and background in each category. The data are compatible with the standard model, and 95% confidence level upper limits of 9.6-0.01 pb are set on the charged Higgs boson production cross section times branching fraction to a top-bottom quark-antiquark pair, for charged Higgs boson mass hypotheses ranging from 200 GeV to 3 TeV. The upper limits are interpreted in different minimal supersymmetric extensions of the standard model.


Introduction
Since the discovery of a Higgs boson [1][2][3] with a mass of 125 GeV [4,5], the ATLAS and CMS Collaborations have actively searched for additional neutral and charged Higgs bosons. Most theories beyond the standard model (SM) of particle physics enrich the SM Higgs sector; a simple extension is the assumption of the existence of two Higgs doublets [6][7][8][9]. Such models are collectively labeled as two-Higgs-doublet models (2HDM), and are further classified into four categories according to the couplings of the doublets to fermions. In Type-I models, only one doublet couples to fermions, while in Type-II models one doublet couples to the up-type quarks and the other to the down-type quarks and the charged leptons. In lepton-specific models one doublet couples only to the leptonic sector and the other couples to quarks, while in flipped models the first doublet couples specifically to the down-type quarks and the second one to the up-type quarks and charged leptons.
The two-doublet structure of the 2HDM Higgs sector gives rise to five physical Higgs bosons through spontaneous symmetry breaking: a charged pair (H$^\pm$) and three neutral bosons, namely the light (h) and heavy (H) scalar Higgs bosons, and one pseudoscalar boson (A). Supersymmetric (SUSY) models have a Higgs sector based on 2HDMs [10][11][12][13][14][15]. Among the SUSY models, a popular one is the minimal supersymmetric extension to the SM (MSSM) [16,17], whose Higgs sector is described by a Type-II 2HDM. In the MSSM, the production and decay of these particles are described at tree level by two free parameters, which can be chosen as the mass of the charged Higgs boson ($m_{\mathrm{H}^\pm}$) and the ratio of the vacuum expectation values of the neutral components of the two Higgs doublets (tan β).
Some variants of the 2HDM achieve consistency with the 125 GeV Higgs boson via a Gildener-Weinberg scalon scenario which stabilizes the Higgs boson mass and alignment [18].
Charged Higgs bosons with a mass below the top quark mass are dominantly produced in top quark decays, whereas charged Higgs bosons with a mass larger than the top quark mass are produced in association with a top quark. Charged Higgs boson production at finite order in perturbation theory is accomplished in association with a top and a bottom quark in the so-called four-flavor scheme (4FS) and in association with a top quark in the five-flavor scheme (5FS) [19], as illustrated in Fig. 1.
In this paper, only charged Higgs bosons with a mass larger than the mass of the top quark (heavy charged Higgs bosons) are considered, and charge-conjugate processes are implied. The signal is produced in the 4FS, and the eventual presence of a 5FS production is accounted for in the search region definition. The normalization of the signal processes accounts for both the 4FS and the 5FS.
The decay of a heavy charged Higgs boson can occur through several channels. Among them, $\mathrm{H}^+ \to \tau^+\nu_\tau$ and $\mathrm{H}^+ \to \mathrm{tb}$ have the highest branching fractions at low (about 200 GeV) and high (about 1 TeV) $m_{\mathrm{H}^\pm}$, respectively, for a large range of tan β values and a large variety of theoretical models [20].
The detection of a charged Higgs boson would unequivocally point to physics beyond the SM. Model-independent searches for charged Higgs bosons are of utmost interest for the CERN LHC program because they allow one to disentangle the Higgs sector physics from the specificity and complexity of the theoretical model by assuming unity branching fraction in each mode.
Direct searches for charged Higgs bosons have been performed by the CERN LEP and the Fermilab Tevatron experiments, and indirect constraints on H$^\pm$ production have been set from flavor physics measurements [21][22][23][24][25][26][27][28][29][30]. Searches for a charged Higgs boson decaying into a top and a bottom quark have been performed by the D0, ATLAS, and CMS Collaborations in proton-antiproton collisions at a center-of-mass energy of $\sqrt{s} = 1.96$ TeV [31] and in proton-proton (pp) collisions at $\sqrt{s} = 8$ TeV [32, 33] and $\sqrt{s} = 13$ TeV [34]. In this paper we improve the sensitivity to model-independent production of a charged Higgs boson, as well as the sensitivity to relevant MSSM scenarios. The ATLAS and CMS Collaborations have also conducted searches for the production of a charged Higgs boson in the $\tau^+\nu_\tau$ [32, 35-37], cs [38], and cb [39] decay channels at $\sqrt{s} = 8$ and 13 TeV.
Searches for charged Higgs bosons produced via vector boson fusion and decaying into W and Z bosons, as predicted by models containing Higgs triplets [40][41][42], and searches for additional neutral heavy Higgs bosons decaying to a pair of third-generation fermions tt, bb, and τ + τ − [42-46] extend the program of the ATLAS and CMS Collaborations to elucidate the extended Higgs sector beyond the SM.
This paper describes a search for a heavy charged Higgs boson produced in association with a top quark or with a top and a bottom quark and decaying into a top and a bottom quark, performed using pp collision data collected at $\sqrt{s} = 13$ TeV in 2016. The data correspond to an integrated luminosity of 35.9 fb$^{-1}$. The final state contains two W bosons, one from the decay chain of the heavy charged Higgs boson and the other from the decay of the associated top quark. One or both of the W bosons can decay into leptons, producing single-lepton and dilepton final states, respectively. The leptonic decays of tau leptons from the W boson decay are considered as well. The single-lepton final state is characterized by the presence of one isolated lepton (e, µ) that is used to trigger the event, while the dilepton final state contains events with two isolated opposite-sign leptons (e$^+$e$^-$, e$^\pm$µ$^\mp$, µ$^+$µ$^-$). This leads to the suppression of several backgrounds. The signal process (tbH$^+$ + tH$^+$) furthermore has a large b jet multiplicity; an additional classification of the events is therefore achieved based on the number of jets identified as originating from b quarks.
Multivariate analysis (MVA) techniques are used to enhance the discrimination between signal and background. Signal-rich regions are analyzed together with signal-depleted regions in a maximum likelihood fit to the MVA classifier outputs, which simultaneously determines the contributions from the tbH$^+$ + tH$^+$ signal and the backgrounds.
Model-independent upper limits on the product of the charged Higgs boson production cross section and the branching fraction into a top-bottom quark-antiquark pair, $\sigma_{\mathrm{H}^\pm}\,\mathcal{B}(\mathrm{H}^\pm \to \mathrm{tb})$, as a function of $m_{\mathrm{H}^\pm}$, are presented in this paper. Results are also interpreted in specific MSSM benchmark scenarios, where many free parameters of the model are fixed to values corresponding to interesting phenomenological assumptions.

The CMS detector
The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter (HCAL), each composed of a barrel and two endcap sections. Forward calorimeters extend the pseudorapidity (η) coverage provided by the barrel and endcap detectors. Muons are detected in gas-ionization chambers embedded in the steel flux-return yoke outside the solenoid. Events of interest are selected using a two-tiered trigger system [47]. The first level, composed of specialized hardware processors, uses information from the calorimeters and muon detectors, while the second level consists of a farm of processors running a version of the full event reconstruction software optimized for fast processing. A more detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in Ref. [48].

Event simulation
Signal events are simulated using the MADGRAPH5_aMC@NLO 2.3.3 [49] generator at next-to-leading order (NLO) precision in perturbative quantum chromodynamics (QCD) using the 4FS for a range of $m_{\mathrm{H}^\pm}$ hypotheses between 200 and 3000 GeV; the complete list of masses is [200, 220, 250, 300, 350, 400, 500, 650, 800, 1000, 1500, 2000, 2500, 3000] GeV. The 4FS is expected to provide a better description of the observables, while shape effects from 5FS production are expected to be negligible, because any additional b quarks would be radiated with low transverse momentum by the beam remnants [20].
Normalization effects induced by the presence of 5FS production are accounted for by computing the MSSM production cross sections for the heavy charged Higgs boson signals both in the 4FS and 5FS; the two cross sections are then combined to obtain the total cross section using the Santander matching scheme [19] for different values of tan β. The 4FS and 5FS cross sections differ for all mass points by about 20%, and the Santander-matched cross section lies in between the two; typical values are of the order of 1 pb for a mass of 200 GeV, down to about $10^{-4}$ pb for a mass of 3 TeV [20, 50-54].
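The combination of the two schemes can be illustrated with a short sketch. It assumes the commonly quoted form of the Santander matching, a weighted average with weight w = ln(m_H± / m_b) − 2; the paper itself does not spell out the formula. The bottom quark mass of 4.75 GeV, the function name, and the input cross sections are hypothetical illustration values, not results from the analysis.

```python
import math

def santander_matched_xsec(sigma_4fs, sigma_5fs, m_hplus, m_b=4.75):
    """Weighted average of 4FS and 5FS cross sections (pb); masses in GeV.

    Assumption: the weight is w = ln(m_H+ / m_b) - 2, so the 5FS result
    gains importance as the charged Higgs boson mass increases.
    """
    w = math.log(m_hplus / m_b) - 2.0
    return (sigma_4fs + w * sigma_5fs) / (1.0 + w)

# Hypothetical 4FS and 5FS inputs that differ by 20%.
sigma_low = santander_matched_xsec(0.8, 1.0, 200.0)
sigma_high = santander_matched_xsec(0.8, 1.0, 3000.0)
```

With a positive weight, the matched value always lies between the two inputs and moves toward the 5FS value at higher mass.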
Branching fractions $\mathcal{B}(\mathrm{H}^+ \to \mathrm{tb})$ are computed in the chosen scenarios with the HDECAY 6.52 package [55]. These cross sections are used in Section 7 only for the model-dependent results, and do not affect the model-independent results.
The main background to this analysis originates from SM top quark pair production. Other backgrounds are the production of W and Z/γ * with additional jets (referred to as V+jets), diboson and triboson processes, single top quark production, tt production in association with W, Z, γ, or H bosons (collectively labeled tt+V), as well as four top quark production (tttt) and QCD multijet events.
The MADGRAPH5_aMC@NLO 2.2.2 generator [49] is used at leading order (LO), with the MLM jet matching and merging [59], to generate vector boson events in association with jets, single top quark events in the s-channel, and four top quark production. The associated production of tt events with a vector boson and with a γ is simulated at NLO using MADGRAPH5_aMC@NLO 2.2.2 with FxFx jet matching and merging [60].
In all cases, the NNPDF3.0 [61] set of parton distribution functions (PDFs) is used, and the parton showers and hadronization processes are performed by PYTHIA 8.212 [62] with the CUETP8M1 [63] tune for the underlying event, except for the tt sample where the tune CUETP8M2T4 [64] provides a more accurate description of the kinematic distributions of the top quarks and of the jet multiplicity.
The simulated tt events are further separated based on the flavor of additional jets that do not originate from the top quark decays in the event and are labeled according to their content in b- and c-originated hadrons. The tt+b(b) (tt+c(c)) label is attributed to the events that have at least one b jet (c jet and no b jet) from the event generator within the acceptance. Events that do not belong to any of the above processes are enriched in light-flavor jets and therefore denominated as tt+LF. This partition of the simulated tt sample is based on matching heavy-flavor generator-level jets to the originating partons and hadrons and is introduced to account for different systematic uncertainties affecting the corresponding cross section predictions. The procedure is detailed in Refs. [77,78].
All generated events are passed through a detailed simulation of the CMS apparatus, based on GEANT4 v9.4 [79]. The effects of additional pp interactions occurring in the same or in neighboring bunch crossings (pileup) are modelled by adding simulated minimum bias events to all simulated processes. In the data collected in 2016 an average of 23 pp interactions occurred per LHC bunch crossing. In simulation, the difference in the number of true interactions is accounted for by reweighting the simulated events to match the data in the multiplicity distribution of pileup interactions.
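The pileup reweighting described above can be sketched as a per-bin ratio of normalized distributions. This is a minimal illustration under stated assumptions: the function name and the four-bin histograms are hypothetical, and the treatment of empty simulation bins (weight zero) is a choice made here, not one documented in the paper.

```python
def pileup_weights(data_counts, mc_counts):
    """Per-bin weights: normalized data divided by normalized simulation."""
    n_data = sum(data_counts)
    n_mc = sum(mc_counts)
    weights = []
    for d, m in zip(data_counts, mc_counts):
        # A bin without simulated events cannot be reweighted; use zero.
        weights.append((d / n_data) / (m / n_mc) if m > 0 else 0.0)
    return weights

# Hypothetical pileup multiplicity histograms with four bins.
data_hist = [10, 30, 40, 20]
mc_hist = [25, 25, 25, 25]
weights = pileup_weights(data_hist, mc_hist)
```

Applying each weight to the simulated events in its bin reproduces the shape of the data distribution while preserving the total simulated yield.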

Event reconstruction
Events are reconstructed using the particle-flow (PF) algorithm [80], which aims to reconstruct and identify each individual particle in an event, with an optimized combination of information from the various elements of the CMS detector. The energy of photons is obtained from the ECAL measurement. The energy of electrons is determined from a combination of the electron momentum at the primary interaction vertex as determined by the tracker, the energy of the corresponding ECAL cluster, and the energy sum of all bremsstrahlung photons spatially compatible with originating from the electron track. The momentum of muons is obtained from the curvature of the corresponding track. The energy of charged hadrons is determined from a combination of their momentum measured in the tracker and the matching ECAL and HCAL energy deposits, corrected for zero-suppression effects and for the response function of the calorimeters to hadronic showers. Finally, the energy of neutral hadrons is obtained from the corresponding corrected ECAL and HCAL energy. The reconstructed vertex with the largest value of summed physics-object squared transverse momentum ($p_{\mathrm{T}}^2$) is taken to be the primary pp interaction vertex [81]. The physics objects are the jets, clustered using the jet finding algorithm [82,83] with the tracks assigned to the vertex as inputs, and the associated missing transverse momentum ($p_{\mathrm{T}}^{\text{miss}}$), taken as the negative vector sum of the $p_{\mathrm{T}}$ of those jets.
Electrons are identified using an MVA-based identification algorithm [84]. Working points are defined [85] by setting thresholds for the classifier values to mitigate efficiency losses for high-$p_{\mathrm{T}}$ electrons observed particularly in high-mass signal events; such working points are labeled Tight (≈88% efficiency for tt events) and Loose (≈95% efficiency for tt events). They result in an efficiency in selecting high-mass signal events of ≈90%, approximately flat across the electron high-$p_{\mathrm{T}}$ range. Muon identification uses the algorithm described in Ref. [86] and two working points, referred to as Medium and Loose, with efficiencies of about 97 and 100%, respectively. Thresholds in $p_{\mathrm{T}}$ and η for electrons and muons depend on whether they are used for selecting or vetoing events and are detailed in Section 5.
Electrons and muons are required to be isolated from other particles. Their relative isolation is measured as the ratio between the scalar $p_{\mathrm{T}}$ sum of selected PF particles within a cone of radius $\Delta R(p_{\mathrm{T}}(\ell))$ and the $p_{\mathrm{T}}$ of the particle; $\Delta R$ is defined as $\sqrt{(\Delta\eta)^2 + (\Delta\phi)^2}$, where $\Delta\eta$ and $\Delta\phi$ are the distances in pseudorapidity and azimuthal angle. The $\Delta R(p_{\mathrm{T}}(\ell))$ cone decreases with the lepton $p_{\mathrm{T}}$ [87, 88] according to the formula
$$\Delta R(p_{\mathrm{T}}(\ell)) = \frac{10\,\text{GeV}}{\min\left(\max\left(p_{\mathrm{T}}(\ell),\, 50\,\text{GeV}\right),\, 200\,\text{GeV}\right)}.$$
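As a minimal sketch of the isolation definition, the cone radius and the relative isolation can be written as follows. The function names and the representation of neighbouring PF particles as (p_T, ΔR) pairs are assumptions made for illustration; the selection of which PF particles enter the sum is not reproduced here.

```python
def mini_iso_cone(pt_lepton):
    """Cone radius: 10 GeV divided by the lepton pT clamped to [50, 200] GeV."""
    clamped_pt = min(max(pt_lepton, 50.0), 200.0)
    return 10.0 / clamped_pt

def relative_isolation(pt_lepton, neighbours):
    """Scalar pT sum of neighbours inside the cone, divided by the lepton pT.

    neighbours: list of (pt, delta_r) pairs, with delta_r measured from the lepton.
    """
    cone = mini_iso_cone(pt_lepton)
    return sum(pt for pt, delta_r in neighbours if delta_r < cone) / pt_lepton

# Hypothetical 100 GeV lepton with two nearby particles; only the first is inside the cone.
iso = relative_isolation(100.0, [(5.0, 0.05), (10.0, 0.15)])
```

The cone is therefore 0.2 for leptons below 50 GeV, shrinks as 10 GeV / p_T between 50 and 200 GeV, and stays at 0.05 above 200 GeV.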
Efficiencies in triggering, reconstruction, identification, and isolation of leptons are estimated both in data and simulation. Those efficiencies are used to determine correction factors, depending on p T and η, and are applied to simulated events on a per-lepton basis.
Jets are reconstructed from the PF particles clustered by the anti-k T algorithm [82,83] with a clustering radius of 0.4. To mitigate the effect of pileup interactions, charged hadrons that do not arise from the primary vertex are excluded from the clustering. Furthermore, jets originating from pileup interactions are removed by means of an MVA identification algorithm [89]. The jet momentum is then corrected in simulated events to account for multiple effects, including the extra energy clustered in jets arising from pileup. In situ measurements of the momentum balance in dijet, photon+jet, Z+jet, and multijet events are used to determine any residual differences between the jet energy scale in data and in simulation, and appropriate corrections are applied [90]. Jets are selected if they satisfy p T > 40 GeV and |η| < 2.4. Loose identification criteria are applied to the jets, in order to distinguish them from well-identified stable particles. Finally, jets are required to be separated from the selected leptons by ∆R > 0.4.
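The kinematic jet requirements and the lepton separation can be sketched as below. This is an illustration only: the dictionary-based object representation and the function names are assumptions, and the pileup and identification criteria mentioned in the text are not modelled.

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance, with the azimuthal difference wrapped into [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)

def select_jets(jets, leptons):
    """Keep jets with pT > 40 GeV, |eta| < 2.4, and Delta R > 0.4 from every lepton."""
    selected = []
    for jet in jets:
        if jet["pt"] <= 40.0 or abs(jet["eta"]) >= 2.4:
            continue
        if all(delta_r(jet["eta"], jet["phi"], lep["eta"], lep["phi"]) > 0.4
               for lep in leptons):
            selected.append(jet)
    return selected

# Hypothetical event: one lepton and four jet candidates.
leptons = [{"pt": 50.0, "eta": 0.5, "phi": 1.0}]
jets = [
    {"pt": 80.0, "eta": -1.0, "phi": 2.5},  # passes all requirements
    {"pt": 30.0, "eta": 0.0, "phi": -2.0},  # fails the pT threshold
    {"pt": 60.0, "eta": 0.6, "phi": 1.1},   # overlaps with the lepton
    {"pt": 90.0, "eta": 3.0, "phi": 0.0},   # outside the eta acceptance
]
good_jets = select_jets(jets, leptons)
```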
Jets from the hadronization of b quarks are identified (b tagged) using the combined secondary vertex algorithm [91]. For the chosen threshold of the tagging algorithm, the mistagging probability, i.e., the fraction of jets arising from the fragmentation of light partons (u, d, s, and g) or from c quarks that are misidentified by the algorithm as b jets, is approximately 1 and 15%, respectively, while the efficiency to correctly identify a b jet is about 70%. The difference in b tagging and mistagging efficiencies between data and simulation is corrected by applying correction factors dependent on jet $p_{\mathrm{T}}$ and η.
The missing transverse momentum vector is defined as the projection of the negative vector sum of the momenta of all reconstructed PF particles in an event onto the plane perpendicular to the beams. Its magnitude is referred to as p miss T . The p miss T reconstruction is improved by propagating the effect of the jet energy corrections to it. Further filtering algorithms are used to reject events with anomalously large p miss T resulting from instrumental effects [92].
Hadronically decaying τ leptons ($\tau_\mathrm{h}$) are reconstructed using the hadron-plus-strips algorithm [93], based on the identification of the individual τ decay modes. The $\tau_\mathrm{h}$ candidates are required to be separated from reconstructed electrons and muons by $\Delta R > 0.4$. Tau candidates are further selected by means of a multivariate discriminator combining isolation and lifetime information [93]. Jets originating from the hadronization of quarks and gluons misidentified as $\tau_\mathrm{h}$ are suppressed by requiring that the $\tau_\mathrm{h}$ candidate is isolated. The $\tau_\mathrm{h}$ identification efficiency depends on $p_{\mathrm{T}}^{\tau_\mathrm{h}}$ and $\eta^{\tau_\mathrm{h}}$, and is on average 50% for $p_{\mathrm{T}}^{\tau_\mathrm{h}} > 20$ GeV, with a probability of approximately 1% for hadronic jets to be misidentified as a $\tau_\mathrm{h}$. The isolation variable is constructed from the PF particles inside a cone of $\Delta R = 0.3$. The effect of neutral PF candidates from pileup vertices is estimated using charged hadrons associated with those vertices and subtracted from the isolation variable.

Event selection and classification
Events are selected with single-lepton triggers characterized by transverse momentum (p T ) thresholds of 27 (24) GeV for electrons (muons). Additionally, several trigger paths with higher p T thresholds and looser identification requirements are included to maximize efficiency for high-p T electrons (muons), resulting in an overall efficiency in the plateau region close to 95 (100)%. Correction factors quantifying the difference between trigger efficiencies in data and simulated events are evaluated using a tag-and-probe technique [84, 86, 94, 95].
Events are required to have at least one electron (muon) with $p_{\mathrm{T}} > 35$ (30) GeV satisfying tighter identification and isolation criteria than the online requirements, effectively corresponding to the saturation point of the online trigger efficiencies. As briefly discussed in Section 1, the first classification is achieved by separating the events into five single-lepton and dilepton regions (e$^\pm$, µ$^\pm$, e$^+$e$^-$, e$^\pm$µ$^\mp$, µ$^+$µ$^-$). In the single-lepton category, only events with exactly one lepton are accepted, whereas the presence of any additional lepton passing the loose identification requirements with $p_{\mathrm{T}} > 10$ GeV vetoes the event. Moreover, the presence of a $\tau_\mathrm{h}$ candidate with $p_{\mathrm{T}} > 20$ GeV and |η| < 2.3 vetoes the event. In the dilepton category, we accept events with exactly two oppositely charged leptons (electrons or muons); the second lepton is required to have $p_{\mathrm{T}} > 10$ GeV and pass looser identification criteria than the leading lepton. To reduce the Z/γ$^*$ background, we reject events with two leptons of the same flavor and opposite charge with an invariant mass $m_{\ell\ell}$ less than 12 GeV or between 76 and 106 GeV.
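The dilepton mass veto can be illustrated with a short sketch. It assumes massless leptons when computing the invariant mass and treats the window boundaries as exclusive; both are choices made here for illustration, as are the function names and the dictionary-based lepton representation.

```python
import math

def invariant_mass(l1, l2):
    """Dilepton invariant mass in GeV, in the massless-lepton approximation."""
    m2 = 2.0 * l1["pt"] * l2["pt"] * (
        math.cosh(l1["eta"] - l2["eta"]) - math.cos(l1["phi"] - l2["phi"])
    )
    return math.sqrt(max(m2, 0.0))

def passes_mll_veto(l1, l2):
    """False if a same-flavor, opposite-charge pair lies in a vetoed mass window."""
    same_flavor_opposite_sign = (
        l1["flavor"] == l2["flavor"] and l1["charge"] * l2["charge"] < 0
    )
    if not same_flavor_opposite_sign:
        return True
    mass = invariant_mass(l1, l2)
    return not (mass < 12.0 or 76.0 < mass < 106.0)

# Hypothetical back-to-back leptons with a mass close to that of the Z boson.
mu_plus = {"pt": 45.6, "eta": 0.0, "phi": 0.0, "flavor": "mu", "charge": 1}
mu_minus = {"pt": 45.6, "eta": 0.0, "phi": math.pi, "flavor": "mu", "charge": -1}
electron = dict(mu_minus, flavor="e")
```

In this example the µ⁺µ⁻ pair is rejected, while the e∓µ± pair with identical kinematics is kept because the veto applies only to same-flavor pairs.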
The final states examined in this paper include neutrinos from the W boson decays; events are therefore required to have p miss T > 30 GeV. Additionally, in the single-lepton final state, events in which the p miss T is compatible with mismeasurement of electron or jet energy are rejected by requiring the azimuthal angle separation between the p miss T and any jet in the event to be ∆φ > 0.05.
Tree-level signal production processes are characterized by having five (three) jets at leading order in the single-lepton (dilepton) final state. The tt background has a lower jet multiplicity in the corresponding regions, but additional jets may be produced through initial- and final-state radiation. Requiring a high multiplicity of reconstructed jets improves the discrimination of signal events from the background, while the regions depleted in signal processes constrain background estimates using data. Consequently, in the single-lepton and dilepton event regions, the presence of at least four and two jets, respectively, is required. The SM top quark pair production has final states similar to the charged Higgs boson signal production with fewer b quarks at tree level, while additional gluon splitting contaminates the high b jet multiplicity regions. Consequently, one or more of these jets is required to be b-tagged.
For a large H$^+$ mass range, the highest significance for both the single-lepton and dilepton final states is found in the regions having higher $N_{\text{jets}}$ and $N_{\text{b jets}}$. The only exceptions are the H$^+$ signals with a mass around 200 GeV, where the low $N_{\text{jets}}$ and $N_{\text{b jets}}$ regions have higher sensitivity than the high multiplicity ones. Finally, events with two same-sign leptons are used to form control regions for the multijet background estimation.
A set of discriminant variables is selected to enhance the signal and background separation in each category and is summarized in Table 1.
Kinematic and topological shapes have different discrimination power for the different mass hypotheses of the charged Higgs boson. Each discriminant variable is studied and included in an MVA classifier if it improves the discrimination, or otherwise discarded. For both single-lepton and dilepton regions, the $H_{\mathrm{T}}$ distribution, defined as the scalar sum of the $p_{\mathrm{T}}$ of the selected jets, is one of the most sensitive variables. Additionally, the largest $p_{\mathrm{T}}$ among the b jets, the $p_{\mathrm{T}}^{\text{miss}}$, the minimum invariant mass between the lepton and the b jets, the maximum $\Delta\eta$ between two b-tagged jets, the smallest $\Delta R$ separation of the b jets, and the $p_{\mathrm{T}}$-weighted average of the b tagging discriminator calculated using the non-b-tagged jets are used as input variables to the MVA discriminators. Information about the event topology is incorporated via event shape variables, such as the centrality, which is defined as the ratio of the sum of the transverse momenta of all jets to their total energy, and the second Fox-Wolfram moment [96] calculated using all jets.
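Two of the input variables defined above, the scalar jet p_T sum and the centrality, can be sketched directly from their definitions. The jet representation and function names are hypothetical, and the Fox-Wolfram moment is omitted because its normalization convention is not given in the text.

```python
def scalar_ht(jets):
    """H_T: scalar sum of the transverse momenta of the selected jets (GeV)."""
    return sum(jet["pt"] for jet in jets)

def centrality(jets):
    """Ratio of the summed jet transverse momenta to the summed jet energies."""
    return scalar_ht(jets) / sum(jet["energy"] for jet in jets)

# Hypothetical jets: one forward (energy well above pT) and one central.
example_jets = [{"pt": 100.0, "energy": 200.0}, {"pt": 50.0, "energy": 50.0}]
ht_value = scalar_ht(example_jets)
centrality_value = centrality(example_jets)
```

Values of the centrality close to one correspond to events whose jets are predominantly central.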
In the single-lepton final states, the following variables are also included: the invariant mass of the three jets with largest p T , the transverse mass of the system constituted by the lepton and the p miss T , the angular separation between the lepton and the system constituted by the b jet pair with the smallest ∆R separation between the b jets, and the average separation between the b jet pairs. The event selection for the dilepton final state takes advantage of the presence of the second lepton. The lepton with largest p T (leading lepton) characterizes the decay of a Lorentz-boosted top quark that originates from the massive charged Higgs boson in the signal hypothesis. The following variables are also considered: the ∆R between the leading lepton and the leading b-tagged jet, the momentum of the leading lepton, the lepton p T asymmetry, the mass of the lepton+b-tagged jet system with the largest p T , and the smallest of the transverse masses constructed with the leading b jet and each of the two W boson hypotheses, where the W bosons are reconstructed using the p miss T and the lepton momenta.
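For the transverse mass of the lepton and missing transverse momentum system, a minimal sketch is given below. It assumes the standard massless definition, m_T = sqrt(2 p_T(ℓ) p_T^miss (1 − cos Δφ)); the paper does not state the formula explicitly, and the function name is hypothetical.

```python
import math

def transverse_mass(pt_lepton, phi_lepton, met, phi_met):
    """Transverse mass (GeV) of a lepton and the missing transverse momentum."""
    delta_phi = phi_lepton - phi_met
    mt2 = 2.0 * pt_lepton * met * (1.0 - math.cos(delta_phi))
    return math.sqrt(max(mt2, 0.0))

# Hypothetical back-to-back configuration, where the transverse mass is maximal.
mt_back_to_back = transverse_mass(40.0, 0.0, 40.0, math.pi)
mt_collinear = transverse_mass(40.0, 1.0, 40.0, 1.0)
```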
Separate classifiers are constructed for the single-lepton and dilepton final states, using different technologies in order to fully exploit the different sets of features described above. For each of the suitable discriminating variables, it has been verified that the simulation models data correctly. Figure 2 shows some of the most important input variables in exemplary signal-region subcategories for the single-lepton (≥5j/≥2b) and dilepton final states (≥3j/≥1b). For all the classifiers described below each signal and background sample is randomly divided into three equally populated parts; one third is used for training the classifiers, one third is used for testing the performance of the classifiers, and one third is used for evaluating the classifier in the context of the maximum-likelihood fit detailed in Section 7. The backgrounds are dominated by tt events, but all other SM contributions are also included in the training. Both in the single-lepton and the dilepton regions, the training process and possible sources of over- or under-training are verified by means of statistical tests.
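The random three-way division of each sample can be sketched as follows. The fixed seed and the interleaved slicing are illustrative choices; the actual procedure used in the analysis is not specified beyond the description above.

```python
import random

def split_three_ways(events, seed=0):
    """Shuffle a copy of the events and return (training, testing, evaluation) thirds."""
    shuffled = list(events)
    random.Random(seed).shuffle(shuffled)
    return shuffled[0::3], shuffled[1::3], shuffled[2::3]

# Hypothetical sample of nine event identifiers.
training, testing, evaluation = split_three_ways(range(9))
```

Each event ends up in exactly one of the three parts, so the events entering the final fit are statistically independent of those used for training.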
A boosted decision tree (BDT) [97,98] classifier is trained using the TMVA package [99] to discriminate between signal and background in the single-lepton regions. The dependence of the kinematic signature on m H ± is accounted for by having a separate training for each m H ± hypothesis. The training process is optimized by targeting a region enriched in signal events by requiring N jets ≥ 5 and N b jets ≥ 2 (training region). The binned output distribution of the BDT classifier is calculated in all the single-lepton subcategories corresponding to the training region plus the (4j/≥3b) region and used in the maximum likelihood fit. In the other single-lepton subcategories, the inclusive event yields are used in the fit to infer additional information on the background normalization.
The dilepton final states exploit a novel technology based on deep neural network (DNN) classifiers [97], parametrized as a function of m H ± [100]. The TENSORFLOW (v1.4.0) backend [101] and the KERAS (v2.1.1) frontend [102] are used to train the classifier. The parametrization of the signal events as a function of m H ± enables a unique training for each signal mass hypothesis. The training process is optimized in the region enriched in signal events by requiring N jets ≥ 3 and N b jets ≥ 1. The jet and b-tagged jet multiplicities are used in extending the training parametrization to capture the characteristics of the signal and background processes in the different regions. In the regions characterized by a single b jet we use the non-tagged jet with the highest value of the b tagging discriminator as the second b jet for the purpose of computing the input variables. The binned DNN output is used in the maximum likelihood fit in all the dilepton subcategories to further enhance the separation between the different background processes.
The bin size for the MVA output in each of the subcategories of the analysis is chosen with a variable binning strategy such that the statistical uncertainty in signal and background event yields separately is less than 20% in each bin. In order to avoid possible biases in the binning strategy induced by the statistical fluctuations in the simulated samples, the bin boundaries are defined based on the events used for the MVA training.
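One possible way to realize such a variable binning is a greedy merge of fine bins, starting from the high-score end, until both the signal and the background satisfy the 20% requirement. The sketch below is an assumption about how this could be done, not the algorithm used in the analysis; the relative uncertainty is taken as sqrt(Σw²)/Σw, and all names and inputs are hypothetical.

```python
import math

def relative_uncertainty(sum_w, sum_w2):
    """Relative statistical uncertainty of a weighted yield."""
    return math.sqrt(sum_w2) / sum_w if sum_w > 0 else float("inf")

def merge_bins(signal, background, max_rel_unc=0.20):
    """Merge fine bins, ordered by MVA score, into coarser bins.

    signal, background: lists of (sum_w, sum_w2), one entry per fine bin.
    Returns inclusive (first, last) fine-bin index ranges, in ascending order.
    """
    merged = []
    sig = [0.0, 0.0]
    bkg = [0.0, 0.0]
    last = len(signal) - 1
    for i in range(len(signal) - 1, -1, -1):
        sig[0] += signal[i][0]
        sig[1] += signal[i][1]
        bkg[0] += background[i][0]
        bkg[1] += background[i][1]
        if (relative_uncertainty(*sig) < max_rel_unc
                and relative_uncertainty(*bkg) < max_rel_unc):
            merged.append((i, last))
            last = i - 1
            sig = [0.0, 0.0]
            bkg = [0.0, 0.0]
    if last >= 0:
        # Leftover low-score bins are absorbed into the lowest merged bin.
        if merged:
            merged[-1] = (0, merged[-1][1])
        else:
            merged.append((0, last))
    return merged[::-1]

# Hypothetical unweighted counts, for which sum_w2 equals sum_w.
counts = [100, 10, 10, 10, 30]
fine_bins = [(float(n), float(n)) for n in counts]
binning = merge_bins(fine_bins, fine_bins)
```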

Background estimation and systematic uncertainties
The leptonic decay of one or two of the W bosons in the tt process represents the main background of the analysis for both the single-lepton and dilepton final states. The tt production, as discussed in Section 3, is separated into tt+LF, tt+b(b), and tt+c(c) processes. The last two processes are commonly referred to as tt+heavy flavor (HF). The categorization strategy described in Section 5 populates the low b jet multiplicity regions with the tt+LF processes, while the regions enriched with the signal are characterized by a larger contribution from the tt+HF processes. Smaller background contributions arise from single top quark production, vector boson production in association with jets, multiboson production processes, tt production in association with electroweak bosons (W, Z, γ, H), and tttt production. Different sources of experimental and theoretical uncertainties are modelled as nuisance parameters in the fit and they are allowed to change the event yield, the migration of events among regions, and the distribution of the MVA output in each category [103]. Uncertainties that purely affect the yield within a category (rate uncertainties) are modelled via a nuisance parameter with a log-normal probability density function, while changes in shapes (shape uncertainties) are performed using a polynomial interpolation with a Gaussian constraint, and they can also change the event yields. All the sources of systematic uncertainty applied to the analysis are discussed below.
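The effect of a log-normal rate uncertainty on an event yield can be illustrated with the parametrization commonly used in LHC likelihood fits, in which the yield scales as κ^θ for a nuisance parameter θ with a unit Gaussian constraint. This form is an assumption here, since the paper refers to Ref. [103] for the details; the function name and the numerical values are illustrative only.

```python
def scaled_yield(nominal, kappa, theta):
    """Yield after shifting a log-normal nuisance parameter by theta standard deviations.

    kappa is one plus the relative uncertainty, e.g. 1.025 for a 2.5% effect.
    """
    return nominal * kappa ** theta

# Hypothetical nominal yield of 100 events with a 2.5% rate uncertainty.
up = scaled_yield(100.0, 1.025, +1.0)
nominal = scaled_yield(100.0, 1.025, 0.0)
down = scaled_yield(100.0, 1.025, -1.0)
```

A convenient property of this parametrization is that the yield remains positive for any value of θ.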
The uncertainty in the integrated luminosity measurement of the 2016 dataset amounts to 2.5% [104]. The uncertainty in the evaluation of the pileup in simulation is accounted for by varying the total inelastic pp cross section by ±5% and propagating the effect of the variation to the final yields. The difference between the nominal and the altered distributions is taken as the uncertainty and treated as a shape variation in the fit. Both the integrated luminosity and the pileup uncertainties are separately treated as fully correlated among all processes.
Each reconstructed jet is corrected via calibration factors in order to account for the response of the detectors, with dependencies on the geometry, the pileup conditions, and the kinematic properties of the jet [89]. The uncertainties in the jet energy scale and resolution are propagated by varying the jet momenta and, consequently, the missing transverse momentum. The events are reanalyzed in order to extract the appropriate rate and shape variations for the final distributions. An additional uncertainty accounts for the effect of the unclustered energy on p miss T . Each of these uncertainties is treated as fully correlated among all processes.
The b tagging and mistagging uncertainties are obtained by varying the corresponding per-jet correction factors within their uncertainties [91]. The mistag efficiency uncertainties for jets originating from light partons (u, d, s, and g) are considered to be uncorrelated with the b tagging efficiency uncertainties, while the c quark jet mistag rate uncertainties are varied simultaneously with the b tagging efficiencies. The b tagging and mistagging efficiency uncertainties are conservatively doubled whenever they are extrapolated outside the $p_{\mathrm{T}}$/η range over which the correction factors were derived. Different sources of uncertainties are varied as independent nuisance parameters. The portion of the b tagging efficiency uncertainty that is correlated with the jet energy scale is evaluated within the overall jet energy scale uncertainty by shifting the b tagging scale factors in the same direction as the jet energy scale shift; the procedure reflects the correlation in the derivation of the correction factors.
The uncertainties in the lepton selection efficiency correction factors due to trigger, identification, and isolation efficiencies are applied depending on the lepton $p_{\mathrm{T}}$ and $\eta$. Propagating the correction factors to the MVA output affects only the overall normalization and not the shape. The variations due to the identification, isolation, and trigger efficiencies are therefore summed in quadrature and included as a single rate uncertainty amounting to 3 (4)% for electrons (muons), treated as correlated among all the final regions.
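The combination of the lepton efficiency variations into a single rate uncertainty can be sketched as follows. This is a minimal illustration only: the per-source values are hypothetical (the text quotes only the combined 3 (4)% figures), and the function name `quadrature_sum` is an assumption made for this example.

```python
import math


def quadrature_sum(relative_uncertainties):
    """Combine independent relative uncertainties by summing them in quadrature."""
    return math.sqrt(sum(u ** 2 for u in relative_uncertainties))


# Hypothetical per-source relative uncertainties, chosen only for illustration;
# they are NOT taken from the analysis.
electron_sources = {"identification": 0.02, "isolation": 0.02, "trigger": 0.01}

electron_rate_uncertainty = quadrature_sum(electron_sources.values())
print(f"combined electron rate uncertainty: {electron_rate_uncertainty:.1%}")
```

With these illustrative inputs the combined value comes out at the 3% level, which is the order of magnitude quoted for electrons.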
Small discrepancies between data and simulation are observed in control regions enriched in processes involving a vector boson with additional jets. The $\mathrm{Z}/\gamma^{*}$ and W+jets $H_{\mathrm{T}}$ distributions are matched to data using corrections derived in a region close to the mass of the Z boson and in the zero b jet control region, respectively. The uncertainties in the derivation of correction factors for the $\mathrm{Z}/\gamma^{*}$ and W+jets processes in the $H_{\mathrm{T}}$ distribution are accounted for in the final results. They are assumed to be uncorrelated between the two processes and correlated among the analysis regions.
The QCD multijet production is a minor background to the analysis, amounting to about 1% of the total background across all the signal regions, and is therefore neglected in the fit once the simulated prediction has been verified. For the single-lepton regions, the simulation has been checked in an orthogonal set of events requiring that the $p_{\mathrm{T}}^{\text{miss}}$ is aligned with the jets, while for the dilepton regions, the QCD multijet production is verified in the same-sign dilepton control regions for each category defined by $N_{\text{jets}}$ and $N_{\text{b jets}}$.
Theoretical uncertainties related to the PDFs are applied as rate uncertainties to the simulated background samples and account for both the acceptance and the cross section mismodelling [105]. Uncertainties from factorization and renormalization scales in the inclusive cross sections are considered independently for each process for which they are non-negligible. They are estimated by varying each scale independently from the others by factors of 0.5 and 2 with respect to the default values. The matching of the POWHEG NLO $\mathrm{t\bar{t}}$ matrix element calculation with the PYTHIA parton shower (PS) is varied by shifting the parameter $h_{\text{damp}} = 1.58^{+0.66}_{-0.59}\, m_{\mathrm{t}}$ [106] within the uncertainties. The damping factor $h_{\text{damp}}$ is used to limit the resummation of higher-order effects by the Sudakov form factor to below a given $p_{\mathrm{T}}$ scale [106].
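As a small worked example of the quoted matching-parameter range, the sketch below converts the central value and asymmetric uncertainties of the damping parameter into absolute scales. The top quark mass of 172.5 GeV is an assumption made here for illustration; it is not stated in this section.

```python
# Assumed top quark mass in GeV (illustrative; not quoted in this section).
M_TOP_GEV = 172.5

# Central value and asymmetric uncertainties of h_damp in units of m_t,
# as quoted in the text.
H_DAMP_CENTRAL = 1.58
H_DAMP_PLUS = 0.66
H_DAMP_MINUS = 0.59


def hdamp_range(m_top_gev):
    """Return (down, nominal, up) values of h_damp in GeV."""
    nominal = H_DAMP_CENTRAL * m_top_gev
    up = (H_DAMP_CENTRAL + H_DAMP_PLUS) * m_top_gev
    down = (H_DAMP_CENTRAL - H_DAMP_MINUS) * m_top_gev
    return down, nominal, up


hdamp_down, hdamp_nominal, hdamp_up = hdamp_range(M_TOP_GEV)
print(f"h_damp: {hdamp_down:.1f} / {hdamp_nominal:.1f} / {hdamp_up:.1f} GeV")
```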
An additional source of uncertainty arises from the modeling of additional jets by the event generator in top quark pair production. This uncertainty is estimated in each bin of jet and b jet multiplicity, based on the simulated $\mathrm{t\bar{t}}$ samples which are enriched or depleted in initial- and final-state radiation. The initial-state radiation PS scale is multiplied by factors of 2 and 0.5 in dedicated simulated samples, whereas the final-state radiation PS scale is scaled up by $\sqrt{2}$ and down by $1/\sqrt{2}$ [63,106]. For each PS scale and $h_{\text{damp}}$ perturbation, the uncertainty is evaluated as the relative deviation with respect to the nominal event rates. A nuisance parameter is added for each category defined by $N_{\text{jets}}$ and $N_{\text{b jets}}$ and considered uncorrelated among regions with different $N_{\text{jets}}$ and also uncorrelated between the single-lepton and dilepton final states.
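The relative-deviation estimate described above can be sketched as below. The category labels and event yields are hypothetical placeholders, and `relative_deviation` is a name introduced only for this illustration; the actual yields come from the dedicated simulated samples.

```python
def relative_deviation(varied_yield, nominal_yield):
    """Relative change of a varied event rate with respect to the nominal one."""
    if nominal_yield <= 0:
        raise ValueError("nominal yield must be positive")
    return (varied_yield - nominal_yield) / nominal_yield


# Hypothetical yields per (N_jets, N_bjets) category; not taken from the analysis.
nominal_yields = {("5j", "3b"): 1000.0, ("6j", "3b"): 400.0}
isr_up_yields = {("5j", "3b"): 1080.0, ("6j", "3b"): 380.0}

# One rate uncertainty per category, each driven by its own nuisance parameter.
isr_up_uncertainty = {
    category: relative_deviation(isr_up_yields[category], nominal_yields[category])
    for category in nominal_yields
}

for category, value in isr_up_uncertainty.items():
    print(category, f"{value:+.1%}")
```

Keeping one entry per category mirrors the choice of a separate nuisance parameter for each jet and b jet multiplicity bin.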
The normalization of the $\mathrm{t\bar{t}}$+HF processes, as determined by theoretical calculations [107] and experimental measurements, is affected by an uncertainty of 50% that is applied as a rate uncertainty, in addition to the other $\mathrm{t\bar{t}}$ cross section uncertainties described above. This procedure allows the signal-depleted regions to determine the overall normalization factor, which includes the production cross section, detector acceptance, and reconstruction efficiencies.
The limited size of the background and signal simulated samples results in statistical fluctuations of the nominal yield prediction. The content of each bin of each final discriminant distribution is varied by its statistical uncertainty. The Barlow-Beeston lite approach [108,109] is applied by assigning, for each bin, the combined statistical uncertainty of all simulated samples to the process dominating the background yield in that bin. Since all bins are statistically independent, each variation is treated as uncorrelated with any other variation.
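The per-bin bookkeeping behind this simplified treatment can be sketched as follows: in each bin, the statistical uncertainties of all simulated samples are combined in quadrature and attached to the dominant process. This is a minimal sketch under stated assumptions, not the fit implementation; the process names, yields, and uncertainties are hypothetical, and `assign_bin_uncertainties` is a name introduced for this example.

```python
import math


def assign_bin_uncertainties(yields, stat_errors):
    """For each bin, return (dominant_process, combined_stat_uncertainty).

    yields and stat_errors map a process name to a list of per-bin values.
    """
    processes = list(yields)
    n_bins = len(yields[processes[0]])
    assignments = []
    for i in range(n_bins):
        # Combined statistical uncertainty of all simulated samples in bin i.
        combined = math.sqrt(sum(stat_errors[p][i] ** 2 for p in processes))
        # Process with the largest predicted yield in bin i.
        dominant = max(processes, key=lambda p: yields[p][i])
        assignments.append((dominant, combined))
    return assignments


# Hypothetical two-bin example; values are illustrative only.
example_yields = {"ttbar": [100.0, 20.0], "single_top": [10.0, 30.0]}
example_errors = {"ttbar": [3.0, 1.2], "single_top": [4.0, 0.5]}

bin_assignments = assign_bin_uncertainties(example_yields, example_errors)
for index, (process, error) in enumerate(bin_assignments):
    print(f"bin {index}: {error:.2f} assigned to {process}")
```

Using one combined variation per bin, rather than one per process per bin, keeps the number of nuisance parameters equal to the number of bins.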
A summary of the effects of the systematic uncertainties on the event yields prior to the fit to data, summed over all final states and regions, is provided in Table 2.

Results
Table 2: Effects of the systematic uncertainties as the variation (in percent) of the event yields prior to the fit to data, summed over all final states and regions. The column Shape reports whether a given uncertainty is considered a shape uncertainty or a rate uncertainty.

The statistical interpretation is based on a simultaneous fit of the MVA output discriminators and event yields in the different signal regions described in Section 5. The parameter of interest reflecting the signal normalization, $\sigma_{\mathrm{H}^{\pm}}\mathcal{B}(\mathrm{H}^{\pm} \to \mathrm{tb}) = \sigma(\mathrm{pp} \to \mathrm{H}^{+}\bar{\mathrm{t}}\mathrm{b} + \mathrm{pp} \to \mathrm{H}^{+}\bar{\mathrm{t}})\,\mathcal{B}(\mathrm{H}^{+} \to \mathrm{t}\bar{\mathrm{b}}) + \sigma(\mathrm{pp} \to \mathrm{H}^{-}\mathrm{t}\bar{\mathrm{b}} + \mathrm{pp} \to \mathrm{H}^{-}\mathrm{t})\,\mathcal{B}(\mathrm{H}^{-} \to \bar{\mathrm{t}}\mathrm{b})$, and the nuisance parameters specified in Section 6 are encoded in the negative log-likelihood function and profiled in the minimization process. The log-likelihood ratio is used as the test statistic to assess the agreement of data with the background-only hypothesis or the presence of the signal, and the asymptotic approximation is used in the statistical analysis [103,110]. The statistical method used to report the results is the CL$_{\mathrm{s}}$ modified frequentist criterion [111,112]. Figure 3 shows the event yields in the subcategories of the analysis after a background-only fit to data. In the regions where the shape of the MVA classifier output is used, the yields are obtained by integrating the distribution and the correlations across the bins are accounted for in the quoted uncertainties. The contribution of a hypothetical charged Higgs boson with a mass of 500 GeV and $\sigma_{\mathrm{H}^{\pm}}\mathcal{B}(\mathrm{H}^{\pm} \to \mathrm{tb}) = 10$ pb is also displayed. In the same configuration, Fig. 4 shows the MVA (BDT and DNN) outputs in exemplary signal-region subcategories for the single-lepton (5j/≥3b) and dilepton (3j/3b) final states.
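For orientation, the profiled log-likelihood ratio and the CL$_{\mathrm{s}}$ criterion mentioned above can be written in the form commonly used for LHC upper limits. This is a sketch under the assumption that the standard definitions apply; the exact conventions used in the analysis are those of Refs. [103,110-112]. Here $\mu$ denotes the signal normalization, $\theta$ the nuisance parameters, and hats denote the values that maximize the likelihood $\mathcal{L}$.

```latex
\begin{align}
  q_{\mu} &= -2 \ln
    \frac{\mathcal{L}(\text{data} \mid \mu, \hat{\theta}_{\mu})}
         {\mathcal{L}(\text{data} \mid \hat{\mu}, \hat{\theta})},
    \qquad 0 \le \hat{\mu} \le \mu, \\
  \mathrm{CL_s}(\mu) &=
    \frac{P\left(q_{\mu} \ge q_{\mu}^{\text{obs}} \mid \text{signal with strength } \mu \text{ plus background}\right)}
         {P\left(q_{\mu} \ge q_{\mu}^{\text{obs}} \mid \text{background only}\right)}.
\end{align}
```

In this convention a signal normalization $\mu$ is excluded at 95% confidence level when $\mathrm{CL_s}(\mu) < 0.05$, and the quoted upper limit is the value of $\mu$ at which $\mathrm{CL_s}(\mu) = 0.05$.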

The data agree with the background distributions and no significant excess is observed. Exclusion limits are set at 95% confidence level (CL) on $\sigma_{\mathrm{H}^{\pm}}\mathcal{B}(\mathrm{H}^{\pm} \to \mathrm{tb})$ for $m_{\mathrm{H}^{\pm}}$ hypotheses between 200 and 3000 GeV. The observed (expected) upper limits with single-lepton and dilepton final states combined are shown in Fig. 5 (left) and listed in Table 3. The single-lepton and dilepton regions have comparable sensitivity in the low-mass regime (≈200 GeV) while the single-lepton regions become increasingly dominant at higher values of the mass hypothesis; Figure 5 [17] is designed to give a mass of approximately 125 GeV for the light CP-even 2HDM Higgs boson over a wide region of the parameter space. The $M_{h}^{125}(\tilde{\chi})$ scenario [113] is characterized by small gaugino and Higgs/higgsino superpotential masses which are also close to each other; this results in a significant mixing parameter between higgsinos and gauginos and in a compressed electroweakino mass spectrum. The phenomenology of the $M_{h}^{125}(\tilde{\chi})$ scenario therefore resembles the Type-II 2HDM with MSSM-inspired Higgs couplings compatible with $m_{h} \approx 125$ GeV for large masses of the pseudoscalar boson, A. Figure

Summary
A search is presented for a charged Higgs boson decaying into a top-bottom quark-antiquark pair when produced in association with a top quark or a top and a bottom quark. The analyzed proton-proton collision data, collected at $\sqrt{s} = 13$ TeV with the CMS detector at the LHC, correspond to an integrated luminosity of 35.9 fb$^{-1}$. The search uses events with a single isolated electron or muon or an opposite-sign electron or muon pair. Events are categorized according to the jet multiplicity and the number of jets identified as containing a b-hadron decay. Multivariate techniques are used to discriminate between signal and background events, the latter being dominated by $\mathrm{t\bar{t}}$ production. Results are presented for a charged Higgs boson with a mass larger than the top quark mass. Upper limits at 95% confidence level of 9.6-0.01 pb are set on the product of the charged Higgs boson production cross section and the branching fraction into a top-bottom quark-antiquark pair, for charged Higgs boson mass hypotheses ranging from 200 GeV to 3 TeV.

institutes for their contributions to the success of the CMS effort. In addition, we gratefully acknowledge the computing centers and personnel of the Worldwide LHC Computing Grid for delivering so effectively the computing infrastructure essential to our analyses.