Search for the flavor-changing neutral current interactions of the top quark and the Higgs boson which decays into a pair of b quarks at $\sqrt{s}=$ 13 TeV

A search for flavor-changing neutral currents (FCNC) in events with the top quark and the Higgs boson is presented. The Higgs boson decay to a pair of b quarks is considered. The data sample corresponds to an integrated luminosity of 35.9 fb$^{-1}$ recorded by the CMS experiment at the LHC in proton-proton collisions at $\sqrt{s}=$ 13 TeV. Two channels are considered: single top quark FCNC production in association with the Higgs boson (pp $\to$ tH), and top quark pair production with FCNC decay of the top quark (t $\to$ qH). Final states with one isolated lepton and at least three reconstructed jets, among which at least two are associated with b quarks, are studied. No significant deviation is observed from the predicted background. Observed (expected) upper limits at 95% confidence level are set on the branching fractions of top quark decays, $\mathcal{B}$(t $\to$ uH) $<$ 0.47% (0.34%) and $\mathcal{B}$(t $\to$ cH) $<$ 0.47% (0.44%), assuming a single nonzero FCNC coupling.


Introduction
A recently discovered fundamental particle has properties that are consistent with the standard model (SM) predictions for the Higgs boson, H [1][2][3][4]. In the SM, flavor-changing neutral currents (FCNC) are forbidden at tree level and are strongly suppressed in loop corrections by the Glashow-Iliopoulos-Maiani (GIM) mechanism [5] with the SM branching fraction of t → qH predicted to be O(10 −15 ) [6][7][8]. Several extensions of the SM incorporate significantly enhanced FCNC behavior that can be directly probed at the CERN LHC [8,9]. The FCNC processes that correspond to tH interactions are described by the following effective Lagrangian: where g is the weak coupling constant, P L and P R are chirality projectors in spin space, κ Hqt is the effective coupling, f L Hq and f R Hq are left-and right-handed complex chiral parameters with a unitarity constraint of | f L Hq | 2 + | f R Hq | 2 = 1. The tH FCNC interaction is studied in this analysis in two channels: the associated production of a single top quark with the Higgs boson (ST), as well as in FCNC decays of top quarks in tt semileptonic events (TT). In this analysis, H → bb decays are considered. This is the first time that the analysis of the ST mode is presented. Representative Feynman diagrams of the studied processes are shown in Fig. 1. Earlier analyses by the ATLAS [10, 11] and CMS [12] Collaborations have probed κ Hqt in top quark decays in tt events. The ATLAS search at center-of-mass energy of 13 TeV investigated the t → qH decay with the Higgs boson decaying to two photons to set observed (expected) upper limits at 95% confidence level (CL) on the branching fractions B(t → uH) and B(t → cH) of 0.24% (0.17%) and 0.22% (0.16%), respectively [11]. The CMS analysis at √ s = 8 TeV utilized the Higgs boson decays into either boson or fermion pairs to set observed (expected) upper limits of 0.55% (0.40%) and 0.40% (0.43%) on B(t → uH) and B(t → cH), respectively [12].
For the signal processes, we consider the cross section times branching fraction with a specific signature for single top quark t(→ + νb)H(→ bb) and pair production t(→ + νb)t(→ u/cH(→ bb)), with = e, µ, or τ. The analysis also considers the charge-conjugate process. The predicted cross section at 13 TeV for single top quark and antiquark FCNC production in association with the Higgs boson under the assumption of coupling strengths κ Hut = 1, κ Hct = 0 (κ Hct = 1, κ Hut = 0) is 13.8 (1.90) pb, where the cross section calculation is based on the leading order (LO) set of NNPDF 2.3 parton distribution functions (PDFs) [13]. In the case of the production of tt semileptonic events with top quark FCNC decay, the predicted cross section is 37.0 pb and is independent of the type of the coupling. By exploiting a simultaneous analysis

Monte Carlo simulation
The generation of simulated signal events is done at LO with MADGRAPH5 aMC@NLO 2.3.3 [16,17]. Up to two additional partons are simulated by the Monte Carlo (MC) generator in the initial hard process for the top quark pair production mode. The MLM [18] matching scheme is used to match additional partons in the matrix-element calculations to the parton-shower description. No additional partons are included in the generation of events for the single top quark production process, as such inclusion would contain contributions from the top quark pair production process. A systematic variation in the normalization of the single top production process by 10% is considered in order to account for the differences in the generation of additional radiation of the two signal production modes. The Lagrangian terms from Eq. (1) are implemented by means of the FEYNRULES package [19] using the universal FEYNRULES output format [20]. The complex chiral parameters are fixed to f R Hq = 1 and f L Hq = 0. The SM top quark pair production is the dominant background process and is simulated to next-to-leading order (NLO) using POWHEG v2 [21][22][23][24]. The predicted cross section for this process is 832 +20 −29 (scale) ± 35(PDF) pb, as calculated with the TOP++ 2.0 program at next-to-nextto-leading order (NNLO), including soft-gluon resummation to next-to-next-to-leading-log order (see Ref. [25] and references therein), and assuming a top quark mass of m t = 172.5 GeV. Two systematic uncertainties that are shown in the prediction are considered. These are independent variations of the factorization and renormalization scales, µ F and µ R , and variations of the PDF and α s .
Single top quark production in the t channel is simulated with POWHEG v2 in the four-flavour scheme, while events for single top quark production in association with W bosons are generated with POWHEG v1 in the five-flavour scheme (5FS). The predicted NLO cross sections are 217 +9 −8 [26, 27] and 71.7 ± 3.9 pb [28], respectively. Single top quark production in the s channel is done at NLO with the MADGRAPH5 aMC@NLO generator in 5FS with a predicted cross section of 10.3 ± 0.4 pb. The uncertainties in the quoted cross sections correspond to independent variations of µ F and µ R , as well as to variations of the PDF and α s . Small contributions to the overall predicted background arise from several additional processes: W boson production and the associated production of tt with W and Z, both generated with MADGRAPH5 aMC@NLO, and from Drell-Yan and the associated production of tt with a Higgs boson generated with the MADGRAPH5 aMC@NLO and POWHEG v1 [29], respectively.
In the simulation of signal and background processes, the initial-and final-state radiation (ISR and FSR), as well as the fragmentation and hadronization of quarks, are modeled using PYTHIA 8.212 [30] with the underlying event tune CUETP8M1 [31]. For tt generation, the first emission is done at the matrix-element level with POWHEG v2. Generation of tt and single top quark production in the t channel uses the underlying event tune CUETP8M2T4 [32]. In the generation of all background processes the NNPDF3.0 PDF [33] set is used.
The detector response is simulated using GEANT4 v9. 4 [34]. In order to model the effect of multiple interactions per event crossing (pileup), generated minimum bias events were added to the simulated data. The number of extra multiple interactions were matched to agree with the rate observed in data. The number of pileup interactions in data is estimated from the measured bunch-to-bunch instantaneous luminosity and the total inelastic cross section (69.2 mb) [14].

Event selection
The particle-flow (PF) algorithm [35] reconstructs and identifies each individual particle with an optimized combination of information from the various elements of the CMS detector. The energy of photons is directly obtained from the ECAL measurement, corrected for zerosuppression effects. The energy of electrons is determined from a combination of the electron momentum at the primary interaction vertex as determined by the tracker, the energy of the corresponding ECAL cluster, and the energy sum of all bremsstrahlung photons spatially compatible with originating from the electron track. The momentum of muons is obtained from the curvature of the corresponding track. The energy of charged hadrons is determined from a combination of their momentum measured in the tracker and the matching ECAL and HCAL energy deposits, corrected for zero-suppression effects and for the response function of the calorimeters to hadronic showers. Finally, the energy of neutral hadrons is obtained from the corresponding corrected ECAL and HCAL energy.
Jets are reconstructed by clustering PF candidates using the anti-k T algorithm [36, 37] with a distance parameter of 0.4. The jet momentum is determined as the vectorial sum of all particle momenta in the jet, and is found from simulation to be within 5 to 10% of the true momentum over the whole transverse momentum (p T ) spectrum and detector acceptance [38]. An offset correction is applied to jet energies to take into account the contribution from pileup. Jet energy corrections are derived from simulation and are confirmed with in situ measurements of the energy balance in dijet, multijet, γ+jet, and leptonic Z+jet events. Additional selection criteria are applied to each event to remove spurious jet-like features originating from isolated noise patterns in certain HCAL regions. The missing transverse momentum ( p miss T ) in an event is defined as the magnitude of the transverse projection of the vector sum of the momenta of all reconstructed PF candidates in an event.
The reconstructed vertex with the largest value of summed physics-object p 2 T is taken to be the primary pp interaction vertex. The physics objects are the jets, clustered using the jet finding algorithm [36, 37] with the tracks assigned to the vertex as inputs, and the associated p miss T , taken as the negative vector p T sum of those jets, to represent the neutral particles.
This analysis selects events with exactly one isolated lepton (electron or muon). Events with one electron (muon) are recorded using a trigger that required at least one electron (muon) with p T > 32 (24) GeV selected within the detector acceptance (|η| < 2.1). Electron (muon) candidates are selected offline with |η| < 2.1 with p T > 35 (30) GeV. Electrons that are reconstructed in the transition region between the barrel and endcap regions of the ECAL, 1.44 < |η| < 1.57, are removed. Leptons are required to be isolated in terms of a relative isolation variable, I rel . This variable is defined as the ratio of the scalar p T sum of photons, charged hadrons, and neutral hadrons within a cone of angular radius ∆R = √ (∆η) 2 + (∆φ) 2 = 0.3 (0.4) of the reconstructed lepton candidate, where φ is azimuthal angle in radians, to the lepton p T . This isolation variable only includes the charged hadrons that emerge from the same vertex as the selected lepton and is corrected for energy deposits from neutral particles produced in pileup interactions. For electron (muon) candidates, I rel must be less than 0.06 (0.15). In order to suppress background processes with multilepton final states, events with additional leptons passing the looser isolation requirement of I rel < 0.25 and p T > 10 GeV are rejected.
At least three jets are required to be present in the event with p T > 30 GeV and |η| < 2.4. As signal events contain three b quarks produced in the final state at the tree level, we require at least two jets are identified as b quark jets by the combined secondary vertex v2 (CSVv2) b tagging algorithm [39]. This requirement corresponds to the selection of jets with the CSVv2 discriminant value greater than 0.85, and provides a b jet efficiency of ≈70%, with a misidentification rate for c jets and jets originating from light quarks and gluons of ≈10% and ≈1%, respectively.

Event reconstruction and multivariate analysis
In order to optimize sensitivity to the signal event selection, events are split into five categories based on the total number of reconstructed jets and on the number of b-tagged jets. Categories with exactly three jets of which two or three are identified as b jets are denoted as b2j3 and b3j3, respectively. Similarly, categories with at least four jets of which two, three, or four are identified as b jets are specified as b2j4, b3j4, and b4j4, respectively The longitudinal momentum of the neutrino is determined by assigning p miss T to the neutrino, and constraining the ν mass to the known mass of the W boson. With the use of the energy and momenta of all particles, a full kinematic reconstruction of the event is performed for several signal and background hypotheses: ST, TT, and background tt events, where one of the top quarks decays semileptonically, and the other one hadronically. The reconstruction is performed for all possible permutations of the b-tagged jets to be associated with the decay products of the Higgs boson or the top quark, and both solutions for the longitudinal momentum of the neutrino are considered. The reconstructed kinematic variables for each permutation are then fed into a multivariate analysis that uses a boosted decision tree (BDT) [40] approach, as implemented in the toolkit for multivariate analysis TMVA [41]. The input BDT variables that are used for the ST and TT hypotheses correspond to the reconstructed invariant mass of two b jets associated with the Higgs boson decay, the reconstructed invariant mass of a b jet (m bb ), lepton and neutrino associated with the top quark decay (m(t )), its transverse momentum (p T (t )), ∆R between the reconstructed Higgs boson and the top quark. In case of the hypothesis of the background tt events the following variables are used: m(t ), m(t h ), ∆R(t , t h ), and p T (t ), where t h corresponds to the reconstructed top quark hadronic decay from one b-tagged and two non b-tagged jets. The BDT classifier is trained to distinguish the correct from the wrong b jet assignments. The training and validation of the BDT is performed on statistically independent simulated samples. All reconstructed b jets in the event are considered, and the permutation with the highest BDT score is chosen as the correct one. The measured algorithm efficiency for correct assignment of the b-tagged jets to the jets reconstructed at generator level after applying the analysis selection criteria is ≈75%.
Kinematic variables from the event reconstruction are used to construct several BDTs to suppress backgrounds. The BDTs are trained for each jet multiplicity category to identify signal events that are generated either for κ Hut (Hut) or κ Hct (Hct) coupling against the sum of all background events. Separate trainings of the BDT for Hut and Hct are done in order to take into account the differences in kinematic properties of the reconstructed objects in the ST production mode, as well as the differences in the measured b tagging efficiencies for a charm and an up quark in the TT production channel. The most important variables that discriminate between signal and background events are: the charge of the lepton (considered only for the BDT that uses Hut signal events), the CSVv2 discriminant value of the b jet with the lowest p T from the Higgs boson decay, m bb , and the output discriminant value of the BDT used in the b jet assignment procedure. Distributions for these variables in data and MC simulation in the b3j3 category are presented in Fig. 2. The b4j4 category is not considered for Hut due to negligible improvement in the final sensitivity.
The simulated tt background events are split into subcategories defined by the flavor of additional particle-level jets produced in association with the top quark pair. These classes are referred to as tt+bb, tt+cc, and tt+lf, (where lf stands for light flavor). The tt+lf category includes events where no additional pair of b or c jets occurs. The other background processes are summed up and shown together in the prediction.
The final observable used to extract signal events is defined as the BDT score distribution in each jet category corresponding to either Hut or Hct signal training. Figures 3 and 4 show the comparison between data and simulation for this observable after the fit to data with all background processes constrained to the SM expectation.

Estimation of systematic uncertainties
Sources of systematic uncertainty that affect both the normalization and shape of the predicted signal and background events are considered in the analysis. All systematic uncertainties are treated as nuisance parameters in the derivation of the exclusion limit.
The dominant systematic uncertainty arises from the application of the b tagging requirement. The shape of the CSVv2 discriminant, the b tagging efficiency, and the misidentification rate in simulation are corrected to reproduce the data distributions [39]. The uncertainties that are associated with these correction factors are the statistical uncertainty due to the limited data sample from which the correction factors were derived, and the systematic uncertainty arising from the purity estimate of the sample as predicted by simulation. The overall effect of this systematic uncertainty leads to a variation of ≈8-30% in simulated event yields, with the largest effect observed in event categories with a large number of b-tagged jets.
The uncertainty associated with the choice of renormalization and factorization scales in the matrix element is estimated by changing each scale separately by a factor of 1/2 and 2. To estimate the systematic uncertainty at the parton-shower level, several special simulated samples of events are considered, where the scales used to determine the ISR and FSR emissions are varied. The uncertainty associated with the choice of PDF is estimated by using several PDFs and assigning the maximum differences as the quoted uncertainty, following the PDF4LHC prescription with the MSTW2008 68% CL NNLO, CT10 NNLO, and NNPDF2.3 5f FFN PDF sets (see Ref. [42] and references therein, as well as Refs. [13,43,44]). The overall uncertainty asso-  ciated with the simulation of the background processes contributes up to ≈20% in the variation of event yields.
Following the prescription in POWHEG [32], the matching of the high-p T partons, from matrixelement calculations and parton-shower emission, is regulated by damping the emission by the factor m 2 t /(p 2 T + m 2 t ). Additional simulated samples for tt are used that correspond to the variation of this factor within the considered uncertainty. For the tt and single top quark tchannel simulated samples the additional systematic uncertainties associated with the amount of multiparton interactions and color reconnection [45,46] are considered. These uncertainties were determined by varying them according to the uncertainties reported for the underlying event tune CUETP8M2T4, and lead to a systematic effect of ≈1-5%.
The uncertainty associated with the calibration of the jet energy scale and the jet energy resolution contributes up to ≈8% in the variation of the final event yields [47]. The identification, isolation, and trigger efficiency correction uncertainties for reconstructed leptons contribute up to 0.5% of the total uncertainty in the predicted yield. An uncertainty of 2.5% is assigned to the measured integrated luminosity value of the considered data sample [14].
The number of simulated pileup events is corrected to match the measured number of events in  Figure 4: The BDT discriminant distributions for different jet categories for Hct training after the fit to data. All background processes are constrained to the SM expectation in the fit. The shaded area corresponds to the total uncertainty in the predicted background. The data-tosimulation ratio is also shown. The distributions for the signal processes are normalized to the total number of events in the predicted background to ease the comparison of the shapes of the distributions. data. The uncertainty on the total inelastic cross section is taken as 4.6%. Its overall contribution to the total systematic uncertainty is found to be negligible.
The p T spectrum of individual top quarks in data is found to be softer than predicted by the simulation. A correction for the top quark p T spectrum in simulation is applied and the difference between the initial and the corrected shapes is taken as an additional systematic uncertainty [48]. This uncertainty also has a negligible impact on the final distributions.
Additionally, a systematic uncertainty of 50% in the predicted cross sections for tt+bb and tt+cc processes is assumed [49,50].

Results
A comparison between the number of selected events in data and simulation is shown in Tables 1 and 2. A 95% CL upper limit is computed for the production cross section of tH FCNC events times branching fractions of top quark semileptonic decay and Higgs boson decay to b quarks that uses the asymptotic approximation of the CL s method [51,52]. The profile likelihood ratio test statistic [53] is defined as q(µ) = −2 ln(L(µ,θ µ )/L(μ,θ)), where L is a binned likelihood function, µ is a signal strength modifier, θ is a set of nuisance parameters,θ µ is a set of nuisance parameters that maximize L for a given µ,μ andθ are the values of the corresponding parameters which simultaneously maximize L. Uncertainties due to normalization are included through nuisance parameters with log-normal prior distributions, while shape uncertainties are included with Gaussian prior distributions. The expected and observed 95% CL upper limits are derived on the signal production cross section separately for each event category, as well as for their combination (Fig. 5). In the latter case, a simultaneous binned maximum-likelihood fit to all categories is performed. The fit takes into account the statistical and systematic uncertainties in the final BDT score distributions in each jet category.
The resultant observed (expected) 95% CL exclusion limits on top quark FCNC decay branching fractions are B(t → uH) < 0.47% (0.34%) and B(t → cH) < 0.47% (0.44%). These upper limits on the branching fractions can be translated into upper limits on the coupling strengths using the relations: where the total top quark width is Γ t = 1.32 GeV [54], and the partial width for the FCNC decay process of the top quark is Γ Hut = Γ Hct = 0.184 GeV for κ Hut = κ Hct = 1. The resultant limits     Fig. 6 and 7, respectively. We define a signal strength parameter as µ = σ/σ sig , where σ is the cross section excluded at 95% CL and σ sig is the predicted cross section for signal. A maximum likelihood fit is performed for the signal strength, and is shown in Fig. 8. Inclusion of the associated production of a single top quark with a Higgs boson in this study provides a ≈20% relative improvement in the expected upper limit on B(t → uH) with respect to the results obtained in an analysis of only tt events with top quark FCNC decays.

Summary
A search for flavor-changing neutral currents in events with a top quark and the Higgs boson, corresponding to a data sample of 35.9 fb −1 collected in proton-proton collisions at √ s = 13 TeV, is presented. This is the first search to probe tH flavor-changing neutral current couplings in both associated production of a top quark with the Higgs boson and in top quark decays. Observed (expected) upper limits at 95% confidence level are set on the branching fractions of top quark decays, B(t → uH) < 0.47% (0.34%) and B(t → cH) < 0.47% (0.44%). These results provide a significant improvement over the previous limits set by CMS in the H → bb channel, as well as represent the best limits for B(t → uH) at CMS. [31] CMS Collaboration, "Event generator tunes obtained from underlying event and multiparton scattering measurements", Eur. Phys. J. C 76 (2016) 155, doi:10.1140/epjc/s10052-016-3988-x, arXiv:1512.00815.
[32] CMS Collaboration, "Investigations of the impact of the parton shower tuning in Pythia 8 in the modelling of tt at √ s = 8 and 13 TeV", CMS Physics Analysis Summary CMS-PAS-TOP-16-021, 2016.