Search for flavor-changing neutral current interactions of the top quark and the Higgs boson decaying to a bottom quark-antiquark pair at $\sqrt{s}$ = 13 TeV

A search for flavor-changing neutral current interactions of the top quark (t) and the Higgs boson (H) is presented. The search is based on a data sample corresponding to an integrated luminosity of 137 fb$^{-1}$ recorded by the CMS experiment at the LHC in proton-proton collisions at $\sqrt{s}$ = 13 TeV. Events containing exactly one lepton (muon or electron) and at least three jets, among which at least two are identified as originating from the hadronization of a bottom quark, are analyzed. A set of deep neural networks is used for kinematic event reconstruction, while boosted decision trees distinguish the signal from the background events. No significant excess over the background predictions is observed, and upper limits on the signal production cross sections are extracted. These limits are interpreted in terms of top quark decay branching fractions ($\mathcal{B}$) to the Higgs boson and an up (u) or a charm quark (c). Assuming one nonvanishing extra coupling at a time, the observed (expected) upper limits at 95% confidence level are $\mathcal{B}$(t $\to$ Hu) $<$ 0.079 (0.11)% and $\mathcal{B}$(t $\to$ Hc) $<$ 0.094 (0.086)%.


Introduction
The discovery of a new scalar boson with properties compatible with the standard model (SM) Higgs boson (H) [1][2][3][4] opened a new window to shed light on beyond-the-SM phenomena at the CERN LHC. The top quark (t) is the most massive known fermion, and its measured Yukawa coupling is of order unity [5]. The large mass of the top quark (m_t) suggests that it may play an important role in electroweak symmetry breaking [6][7][8]. Since the Yukawa coupling of the Higgs boson to fermions is proportional to the fermion mass, it is of particular importance to study the coupling between the top quark and the Higgs boson.
Flavor-changing neutral current (FCNC) interactions at the Hqt vertex (where q represents an up-type quark) are key processes in the search for new physics. Such FCNC processes are forbidden at tree level in the SM and are highly suppressed in loop corrections by the Glashow-Iliopoulos-Maiani mechanism [9], leading to branching fractions of O(10^−17) for t → Hu and O(10^−15) for t → Hc [10][11][12], with the up (u) and charm (c) quark final states, respectively. In contrast, many SM extensions, such as the minimal supersymmetric standard model [13], supersymmetric models with R-parity violation [14], and other two-Higgs-doublet models [15][16][17], predict highly enhanced t → Hq branching fractions [13,[17][18][19][20]. The values can be as high as O(10^−5) for t → Hu and O(10^−3) for t → Hc [10], which are within the experimental sensitivity of analyses performed with the LHC data at 13 TeV.
Several searches for anomalous couplings between the top quark and the Higgs boson have recently been carried out by the ATLAS [21-23] and CMS [24,25] Collaborations. The most recent analysis was performed by CMS using a data sample with an integrated luminosity of 137 fb^−1 in the diphoton decay channel of the Higgs boson, yielding observed limits on the branching fractions of 0.019 and 0.078% for t → Hu and t → Hc, respectively [25].
In this paper, we present an extension of the CMS search for Hqt FCNC interactions in the H → bb final state [24], exploiting a 13 TeV data set corresponding to an integrated luminosity of 137 fb^−1. Proton-proton (pp) collision data collected by CMS during 2017 and 2018 (101 fb^−1) are analyzed separately and combined with the earlier results in the H → bb channel based on the 2016 data-taking period. Events are selected in the final state with exactly one lepton (ℓ = µ or e) and at least three jets, of which at least two must be identified as coming from the hadronization of a bottom quark (b jets). Two signal production modes arising from the anomalous Hqt interaction are considered: the associated production of a single top quark with the Higgs boson from a u or c quark (ST production mode), and the t → Hq FCNC decays of a top quark in top quark-antiquark pair (tt) events (TT production mode), as presented in Fig. 1. In addition to the increased integrated luminosity, the analysis uses advanced techniques, including a deep neural network (DNN) and boosted decision trees (BDTs), to improve the performance of the event reconstruction and the signal extraction, and it places the most stringent limits to date on the t → Hu and t → Hc branching fractions in the H → bb final state. Tabulated results are provided in the HEPData record for this search [26].
The paper is organized as follows. We begin with a brief overview of the CMS detector in Section 2, followed by a description of the signal and background modeling in Section 3. The event selection and the signal extraction using multivariate analysis techniques are detailed in Sections 4 and 5, respectively. The systematic uncertainties affecting the analysis are described in Section 6. Finally, the results of the search are presented in Section 7, and we conclude with a summary in Section 8.

The CMS detector
The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Within the solenoid volume, there are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter (HCAL), each composed of a barrel and two endcap sections. Forward calorimeters extend the pseudorapidity (η) coverage provided by the barrel and endcap detectors. Muons are detected in gas-ionization chambers embedded in the steel flux-return yoke outside the solenoid. Events of interest are selected using a two-tiered trigger system. The first level is composed of custom hardware processors and selects events at a rate of around 100 kHz [27]. The second level, known as the high-level trigger, is implemented in software running on a processor farm and reduces the event rate to around 1 kHz before data storage [28]. A more detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in Ref. [29].

Signal and background modeling
The signal samples are generated at leading order (LO) in perturbative quantum chromodynamics (QCD) with MADGRAPH5 aMC@NLO 2.6.0 (TT production mode) or 2.4.2 (ST production mode) [30]. The Lagrangian with a FCNC coupling (κ_Hqt) is implemented in the FEYNRULES package [31], and the universal FEYNRULES output format [32] is used to generate the signal model. The complex chiral parameters f^L_Hq and f^R_Hq have negligible impact on the kinematics and are fixed to 0 and 1, respectively. Up to two additional partons are generated at the matrix-element level for the TT production mode, while none are considered for the ST mode, as they would induce an overlap between the two processes. The MLM matching prescription [33] is used to interface the hard-process simulation with the parton-shower modeling. The ST and TT processes are generated for two scenarios (Hut or Hct), assuming the presence of only one nonvanishing FCNC coupling at a time (κ_Hut or κ_Hct). The TT signal production cross section is predicted to be 46.5 pb for each coupling, obtained by multiplying the SM tt production cross section by the corresponding branching fraction of the t → Hq decay, computed by MADGRAPH5 aMC@NLO. The predicted tt production cross section is 832 pb, as calculated with the TOP++ program at next-to-next-to-LO (NNLO) in perturbative QCD, including soft-gluon resummation to next-to-next-to-leading-logarithmic order [34]. The cross section of the subdominant ST production mode is 13.8 (1.9) pb when setting the Hut (Hct) coupling to unity, computed by MADGRAPH5 aMC@NLO with LO precision. It is known that the top quark transverse momentum (p_T) spectra in tt production using LO and next-to-LO (NLO) predictions differ from the higher-order theoretical prediction.
The distribution and normalization of the top quark p_T spectrum in the TT signal events are therefore corrected, first from LO to NLO by utilizing the simulated SM tt samples described below, and then to the NNLO QCD + NLO electroweak prediction [35,36].
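In practice, such a spectrum correction amounts to applying a per-event weight given by the ratio of the target (higher-order) to the source (generated) top quark p_T distribution. A minimal histogram-based sketch, assuming binned spectra as inputs (binning and function names are illustrative, not the exact CMS prescription):

```python
import bisect

def make_reweighter(bin_edges, target_counts, source_counts):
    """Build a per-event weight function from binned pT spectra: the weight
    is the ratio of the normalized target to source content in the bin
    containing the event's top quark pT."""
    t_norm = sum(target_counts)
    s_norm = sum(source_counts)
    ratios = [(t / t_norm) / (s / s_norm)
              for t, s in zip(target_counts, source_counts)]

    def weight(pt):
        i = bisect.bisect_right(bin_edges, pt) - 1
        i = min(max(i, 0), len(ratios) - 1)  # clamp to the histogram range
        return ratios[i]

    return weight
```

Events falling outside the binned range are assigned the weight of the nearest edge bin, a common choice to avoid undefined weights in the tails.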
The dominant background process is SM tt production, which is simulated at NLO precision using POWHEG v2 [37-39]. The tt events are categorized based on the flavor of the additional particle-level jets (those not originating from the top quark decays): events where at least one additional b jet is present (ttbb); events where at least one additional c jet is present and that are not categorized as ttbb events (ttcc); and other events (ttLF, where LF stands for light flavor). Similar to the TT signal samples, the top quark p_T spectrum in the SM tt sample is corrected from the NLO to the higher-order prediction. In addition, the single-leptonic and dileptonic decays of the SM ttbb process are corrected for observed underestimations in simulation of about 20% [40], and the corresponding uncertainties derived from the measurement are assigned. The subdominant single top quark background is considered in the t channel, s channel, and in association with a W boson (tW). The t-channel production is simulated using POWHEG v2, while the s-channel production is generated with MADGRAPH5 aMC@NLO. The uncertainties associated with these predictions are described in Section 6. Smaller SM background contributions arise from the following processes: tt production in association with a vector boson (ttZ and ttW) or a Higgs boson (ttH), simulated with MADGRAPH5 aMC@NLO and POWHEG v1, respectively; Drell-Yan and W boson production, simulated with MADGRAPH5 aMC@NLO; and diboson (VV) processes generated with PYTHIA 8.230 [46]. In addition, multijet events are simulated with PYTHIA 8.230 and used in the phase space where the estimated multijet background is nonnegligible.
The parton distribution function (PDF) set used for the event generation is NNPDF3.1 [47], while the handling of the initial- and final-state radiation, the parton showering, and the hadronization is performed by PYTHIA 8.230 with the CP5 tune [48]. Additional inelastic pp interactions within the same or nearby bunch crossings (pileup) are simulated using PYTHIA and added to the generated process of interest. The number of such interactions is estimated based on the measured instantaneous luminosity and a total pp inelastic cross section of 69.2 mb [49]. The events are corrected to remove differences between the simulated and observed pileup distributions. All the outgoing particles are propagated through a detailed simulation of the CMS detector based on the GEANT4 framework [50], which models the interactions with matter, the energy deposits in sensitive detector layers, and the resulting electronic responses.

Event reconstruction and selection
The particle-flow (PF) algorithm [51] aims to reconstruct and identify each individual particle in an event, with an optimized combination of information from the various elements of the CMS detector. The reconstructed vertex with the largest value of summed physics-object p_T^2 is taken to be the primary pp interaction vertex. The energy of electrons is determined from a combination of the electron momentum at the primary interaction vertex as determined by the tracker, the energy of the corresponding ECAL cluster, and the energy sum of all bremsstrahlung photons spatially compatible with originating from the electron track. The muon p_T is obtained from the curvature of the corresponding track built using the tracker and muon chambers. The energy of charged hadrons is determined from a combination of their momentum measured in the tracker and the matching ECAL and HCAL energy deposits, calibrated by the response function of the calorimeters to hadronic showers. The energy of neutral hadrons is obtained from the corresponding corrected ECAL and HCAL energies. Finally, the missing transverse momentum vector p_T^miss is computed as the negative vector p_T sum of all the PF candidates in an event, and its magnitude is denoted as p_T^miss.
Jets are reconstructed from the PF candidates using the anti-k_T clustering algorithm [53,54] with a distance parameter of 0.4. The measured jet momentum in simulation agrees with the true momentum within 5-10% over the entire p_T spectrum and detector acceptance [55]. To mitigate the pileup effect on the jet momentum, tracks identified as originating from pileup vertices are discarded, and an offset correction is applied to account for the remaining contributions. Jet energy corrections are derived from simulation and confirmed with in situ measurements of the energy balance in dijet, multijet, γ+jet, and leptonic Z+jet events.
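The p_T^miss definition above is a simple negative vector sum; a minimal sketch, with each PF candidate reduced to a (p_T, φ) pair for illustration (real PF candidates carry much more information, and calibrations are omitted):

```python
import math

def missing_pt(pf_candidates):
    """Negative vector pT sum over all particle-flow candidates.

    pf_candidates: iterable of (pt, phi) pairs.
    Returns the magnitude p_T^miss and its azimuthal direction.
    """
    px = -sum(pt * math.cos(phi) for pt, phi in pf_candidates)
    py = -sum(pt * math.sin(phi) for pt, phi in pf_candidates)
    return math.hypot(px, py), math.atan2(py, px)
```

For a perfectly balanced event (e.g. two back-to-back candidates of equal p_T), the magnitude is zero up to floating-point precision; an undetected neutrino shows up as a nonzero p_T^miss recoiling against the visible system.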
Only jets with p_T > 30 GeV and |η| < 2.4 are considered, and an additional selection is applied to remove spurious jet-like features originating from isolated noise patterns in certain HCAL regions [56]. Heavy-flavor jets are identified with the DeepCSV algorithm [57] at a working point characterized by a b jet identification efficiency of 68% and misidentification rates of 12% for c jets and 1% for light-quark or gluon jets. An event-by-event correction based on the jet flavor content is applied to account for differences between the simulated and observed DeepCSV distributions [57]. The analysis requires the presence of at least three jets, among which two or more must be identified as b jets.
The analyzed events were recorded using a combination of muon and electron triggers. Events in the muon channel are selected using a trigger requiring at least one muon with p_T > 27 (24) GeV for the 2017 (2018) data-taking period; in the electron channel, the triggers require either at least one electron with p_T > 35 (32) GeV, or at least one electron with p_T > 28 GeV and |η| < 2.1 together with a global scalar sum of jet transverse momenta exceeding 150 GeV. The corrections to the muon trigger efficiencies are derived using the "tag-and-probe" technique [58] in a data sample enriched with Z → µµ events. To calculate the electron trigger efficiencies in a similar way without bias, orthogonal single-muon data sets are used together with tt events in the single-lepton and dilepton channels. Muon candidates are selected offline based on the track fit quality, the number of hits associated with the track, and the track impact parameter [59]. For electron candidates, offline requirements on the shower shape and track identification are applied [60]. The analysis only considers leptons with p_T > 30 GeV and |η| < 2.4. Additionally, electrons reconstructed in the gap between the ECAL barrel and endcap (1.44 < |η| < 1.57) are rejected. To reduce the contribution from leptonic decays of heavy-flavor hadrons, a selection on the isolation parameter I_rel is applied. This variable is defined as the ratio of the scalar p_T sum of photons, charged hadrons, and neutral hadrons within a cone of angular distance ΔR = √((Δη)² + (Δφ)²) = 0.4 (0.3) around the muon (electron) to the lepton p_T. The muon candidates are required to have I_rel < 0.15, while for electron candidates the requirement is defined in bins of p_T and η. Events with additional leptons, defined using looser identification criteria, are rejected to reduce background contributions from nonprompt leptons.
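The relative isolation I_rel can be sketched in a few lines, assuming leptons and PF candidates represented as simple dictionaries (keys and function names are illustrative; the actual computation also applies pileup corrections to the neutral component):

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance with the azimuthal difference wrapped into [-pi, pi]."""
    dphi = (phi1 - phi2 + math.pi) % (2 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)

def rel_isolation(lepton, pf_candidates, cone=0.4):
    """Scalar pT sum of photons and hadrons within a cone around the lepton
    (cone = 0.4 for muons, 0.3 for electrons), divided by the lepton pT."""
    iso_sum = sum(c["pt"] for c in pf_candidates
                  if delta_r(lepton["eta"], lepton["phi"],
                             c["eta"], c["phi"]) < cone)
    return iso_sum / lepton["pt"]
```

A muon would then pass the selection in the text if `rel_isolation(muon, candidates) < 0.15`; an isolated prompt lepton has little surrounding activity, while leptons from heavy-flavor decays sit inside jets and fail the cut.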

Signal extraction
The selected events are divided into five categories based on the jet and b-tagged jet multiplicities in order to optimize the analysis sensitivity. The number of jets can be exactly three or at least four, and the corresponding categories are denoted as j3 or j4, respectively. Events with up to four b jets are analyzed, accounting for misidentification of jets and for additional partons in the matrix-element calculation. Analogously to the jet multiplicity, the categories in terms of b jets are indicated as b2, b3, or b4 when the event has exactly two, three, or four b jets, respectively. Events with five or more b jets are not used because of their large uncertainty and small impact on the analysis. Following these conventions, the jet categories are identified as b2j3, b2j4, b3j3, b3j4, and b4j4. The analysis proceeds in two multivariate steps to identify signal events over the SM prediction. Considering each jet category separately, events are reconstructed using a set of DNNs, one for each relevant signal and background process hypothesis. For each process hypothesis, the DNN selects the best combination of jets in each event. Thus, a given event may have more than one jet assignment, depending on the available process hypotheses. The BDTs are then utilized to discriminate signal events against the SM backgrounds using the kinematic variables reconstructed from the jet combinations of all possible process hypotheses. The obtained distributions for the simulated samples are then fit to the data using a binned maximum likelihood fit.
To maximize the sensitivity to the signal processes and to discriminate against the SM backgrounds, it is necessary to fully characterize the signal and background kinematic properties and gather discriminative features. Thus, one has to correctly assign the reconstructed jets to the generator-level jets in the signal or background simulated samples to build the best possible distributions of the kinematic variables. The event reconstruction is performed under multiple process hypotheses, namely the ST and TT signal hypotheses and the SM tt background hypothesis. Each hypothesis can be distinguished by the number of jets in the final state and their flavor, without considering jets from additional partons or radiation. The signal hypotheses commonly contain three b jets from the Higgs boson and the leptonically decaying top quark, and the TT signal process has an additional up or charm quark jet. For the SM tt hypothesis, there is one b jet from each top quark decay and two additional jets from the hadronic decay of a W boson. Consequently, the ST hypothesis is considered for events with three or more jets, while the TT and tt hypotheses are applied to the events with four or more jets.
Each reconstruction algorithm is constructed as a binary classifier DNN with several possible combinations of three (ST) or four (TT, tt) jets in the event as inputs. For each event, the signal combination is defined as the case in which all reconstructed jets in the combination are matched to the generator-level jets within an angular distance of 0.4. All the other combinations are labeled as background. To reduce the number of wrong combinations considered, the TT and ST signal hypothesis training for events with two (at least three) b jets is performed only for configurations in which the Higgs boson is associated with at least one (two) b jet(s). After optimization of the performance, and taking into account agreement between data and simulation, 32 (69) kinematic variables are constructed for the ST (TT and SM tt) hypotheses and used for the training, such as the angular distance, momentum, and η of jets, dijets, or trijet combinations as well as b tagging discriminants.
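The number of jet combinations the DNN must score grows rapidly with the jet multiplicity. A sketch of the enumeration step only, treating the assigned parton roles as ordered (function name illustrative; the per-hypothesis b tagging restrictions described above, which prune this list, are omitted):

```python
from itertools import permutations

def jet_assignments(jets, n_assign):
    """Enumerate candidate jet-to-parton assignments for one process
    hypothesis: every ordered selection of n_assign jets from the event
    (3 for the ST hypothesis, 4 for the TT and SM tt hypotheses). Each
    assignment is scored by the DNN and the best one is kept per event."""
    return list(permutations(jets, n_assign))
```

For example, a 4-jet event under the ST hypothesis yields 4!/1! = 24 ordered assignments, and a 5-jet event under the TT hypothesis yields 5!/1! = 120; restricting which jets may be assigned to the Higgs boson, as described above, reduces this combinatorial load during training.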
A residual network structure [61,62] is chosen to boost training speed and to prevent information loss that may occur for structures with a large number of hidden layers. In comparison to a simple DNN with the same number of hidden layers, the training of the residual network converges to its best performance more quickly, by up to 50%, for this analysis. The residual network consists of several residual blocks as well as input and output layers. In the residual block, the input data is branched off into two streams, where one stream is passed through two hidden layers while the other bypasses them. The two streams are then added into one tensor and fed into the next residual block. The network consists of 11 hidden layers in total, with five residual blocks.
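The residual block described above can be sketched as a plain forward pass. This is a NumPy illustration of the structure only, not the trained network; layer widths, activation choice, and initialization are arbitrary assumptions:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, b1, w2, b2):
    """One residual block as described in the text: the input branches into
    two streams, one passes through two hidden layers, the other bypasses
    them, and the two streams are summed before the next block."""
    h = relu(x @ w1 + b1)  # first hidden layer
    h = relu(h @ w2 + b2)  # second hidden layer
    return x + h           # skip connection: add the bypass stream

# Illustrative forward pass through five residual blocks (the network in
# the text has 11 hidden layers in total with five such blocks).
rng = np.random.default_rng(0)
x = rng.normal(size=(1, 32))  # e.g. 32 kinematic inputs (ST hypothesis)
for _ in range(5):
    w1 = rng.normal(size=(32, 32)) * 0.1
    w2 = rng.normal(size=(32, 32)) * 0.1
    b1 = b2 = np.zeros(32)
    x = residual_block(x, w1, b1, w2, b2)
```

The skip connection lets gradients flow directly past the hidden layers, which is what yields the faster convergence noted in the text compared with a plain stack of the same depth.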
The classifiers are first trained with all jet categories for the 2017 and 2018 simulated samples, separately. In addition, an exclusive set of classifiers is trained using only the jet permutations from the b4j4 category, because of the limited number of signal combinations and the large difference between the signal and background sample sizes in that category. Once the classifiers are trained with the signal and background combinations, all combinations from the statistically independent data and simulated samples are evaluated with the trained classifiers. The combination with the highest DNN score is selected for each event and process hypothesis. The performance of the classifier is measured in terms of the reconstruction efficiency, defined as the ratio of the number of events where all jets in the selected combination are correctly assigned to the number of events where the reconstructed jets are fully matched to generated jets within an angular distance of 0.4. The computed efficiencies for the ST signal hypothesis are ≈86% and ≈82% for the Hut and Hct couplings, respectively. In the TT signal scenario, the classifier trained with the Hut (Hct) coupling shows an efficiency of ≈79% (≈77%), while for the SM tt hypothesis the efficiency is ≈78%. By implementing the DNN, the reconstruction efficiencies are improved by 5-15% compared to the kinematic fit method used in the 2016 data analysis [24]. While the efficiencies of the exclusive training in the b4j4 category are comparable to the inclusive ones for the ST and TT hypotheses, in this phase space the tt reconstruction efficiency is improved by more than 40%. The efficiencies are computed with samples that are statistically independent of the training ones. The improvements in the reconstruction algorithms benefit the analysis by helping to accurately construct the kinematic distributions. Example kinematic distributions for different process hypotheses are presented in Fig. 2.

Systematic uncertainties
The impact of experimental and theoretical systematic uncertainties is evaluated by means of nuisance parameters, described further in Section 7. The dominant source of experimental uncertainty comes from the imperfect knowledge of the b tagging (mis-)identification rate, which affects both the event yield in each analysis category and the discriminant distributions. Its magnitude is evaluated separately for b jets and light-quark or gluon jets, and is derived based on a tt (Z) enriched sample for the former (latter). The total magnitude of the b tagging uncertainty varies with the jet category and ranges from 12 to 25%.
The uncertainty arising from the jet energy scale (JES) and resolution (JER) also impacts both the distributions and the event yields for the signal and background processes. The JES uncertainties are calculated by splitting them into individual sources, such as pileup, detector response, discrepancies in jet p_T and η, and flavor. The uncertainties are then grouped into seven sources defined according to their correlations and the detector geometry. The uncertainty induced by the JER is assessed for different sources, such as the statistical uncertainty in the JER measurement, jet radiation, the underlying event, out-of-cone showering, and pileup. A detailed description of the determination of these uncertainties can be found in Ref. [55].
The uncertainties in the muon and electron scale factors related to their identification, isolation, and trigger efficiency are assessed when deriving these scale factors with the tag-and-probe technique, taking various systematic sources into account, including the limited size of the sample used, the possible bias from the tag selection, and run-by-run differences. The experimental uncertainties in the integrated luminosity measurements for the data recorded in 2017 and 2018 are estimated to be 2.3 and 2.5% [63,64], respectively, while the integrated luminosity of the combined 2017-2018 data set has an uncertainty of 2.0%. To account for the uncertainty in the pileup reweighting of the simulated events, the value of the total pp inelastic cross section, which is used to estimate the mean number of additional pp interactions, is varied by ±4.6% [49].
The following sources of theoretical uncertainty are taken into account. The leading theoretical uncertainty comes from the arbitrary choice of the renormalization and factorization scales and is obtained by taking the envelope of the distributions obtained with modified scales [65,66]. The scale variation uncertainty ranges between 10 and 17% and constitutes the second-largest uncertainty source overall. The uncertainties in the initial- and final-state radiation are computed by varying the scale used to evaluate α_S for each type of radiation by factors of 0.5 and 2.
The matching of matrix-element partons and parton showers is controlled in POWHEG by a model parameter h_damp. The nominal value of the h_damp parameter is set to 1.379 m_t, with m_t = 172.5 GeV [48]. The uncertainty associated with this prescription is estimated by varying the h_damp value to 0.8738 m_t and 2.305 m_t. The impact of the uncertainty in the PDF is assessed using event weights that represent the uncertainty eigenvector sets of the PDF [67]. Dedicated samples with variations of the underlying event tune CP5 [48] of the SM tt background sample are used to estimate the associated uncertainty.
The calculation of the SM tt production cross section has an uncertainty of 5.5% [37][38][39]. An uncertainty of about 20% in the cross section of the ttbb process is considered to account for the correction of the underestimated simulation [40], and an uncertainty of 50% in the cross section of the ttcc process is included to cover the possible discrepancy between data and simulation. The uncertainties in the normalization of the other SM background yields have been found to be subdominant and are fixed to 10%. The uncertainty coming from the limited size of the simulated samples is also considered as a source of systematic uncertainty by assigning a single nuisance parameter that is constrained by the total uncertainty in each bin of the BDT output distribution [68,69].

Results
The BDT output distributions in Figs. 3 and 4 show no significant excess with respect to the SM background expectations. Upper limits at 95% confidence level (CL) are therefore placed on the product of the signal production cross section and branching fraction. The limits are derived with the asymptotic modified frequentist method (asymptotic CL_s), using a test statistic based on the profile likelihood ratio built from a binned likelihood that is a function of the signal strength and the nuisance parameters [70][71][72], separately for the t → Hu and t → Hc couplings, i.e., assuming one nonvanishing coupling at a time. All the systematic uncertainties described in Section 6 are treated as nuisance parameters in the fit. The uncertainty sources affecting the global normalization are modeled with log-normal probability distributions, while those affecting the shape of the BDT distributions are modeled with Gaussian probability distributions. The BDT distributions from the five jet categories and from the three data-taking periods (the 2016 ones being taken from the analysis described in Ref. [24]) are used to derive the final results. Table 1 shows the number of events per process and jet category, derived from the simultaneous fit for the 2017+2018 data and simulation.
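The binned likelihood underlying the fit can be sketched as follows. This shows only the Poisson terms, with nuisance parameters omitted for brevity; it is not the full asymptotic CL_s machinery of Refs. [70-72]:

```python
import math

def nll(mu, signal, background, observed):
    """Binned Poisson negative log-likelihood for signal strength mu.

    Each bin contributes lam - n*log(lam) with lam = mu*s + b;
    the constant log(n!) term is dropped since it does not affect fits.
    """
    total = 0.0
    for s, b, n in zip(signal, background, observed):
        lam = mu * s + b
        total += lam - n * math.log(lam)
    return total

# The profile likelihood ratio test statistic for an upper limit is then
# q(mu) = 2 * (nll(mu, ...) - nll(mu_hat, ...)), with mu_hat the best fit.
```

With data matching the background prediction, the likelihood prefers mu = 0 over mu = 1; with a signal-like excess, the preference reverses, which is exactly the comparison the test statistic encodes.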
The expected and observed exclusion limits on the product of the cross section and branching fraction for the two couplings are shown in Fig. 5 separately for each jet category as well as their combination. The b3j4 category has the highest sensitivity to both couplings since it corresponds to the canonical jet content of the dominant TT signal contribution, while the SM tt contribution is suppressed due to the requirement of a third b jet. The b2j3 and b4j4 categories show reversed sensitivity between Hut and Hct couplings due to the much higher b jet misidentification probability of c jets compared to u jets.
The exclusion limit on the cross section can be interpreted in terms of an exclusion on the anomalous coupling κ_Hqt, since the signal cross section scales linearly with κ²_Hqt. Using the signal cross sections for κ_Hqt = 1 from Section 3 leads to observed (expected) limits of κ_Hut < 0.074 (0.087) and κ_Hct < 0.081 (0.078) at 95% CL. These can in turn be interpreted as limits on the t → Hq branching fraction through B(t → Hq) = κ²_Hqt Γ^(κ=1)_Hq / (Γ_t + κ²_Hqt Γ^(κ=1)_Hq), with Γ_t = 1.32 GeV at NNLO [73] and Γ^(κ_Hut=1)_Hut = Γ^(κ_Hct=1)_Hct = 0.19 GeV. The observed (expected) 95% CL exclusion limits on the branching fractions are B(t → Hu) < 0.079 (0.11)% and B(t → Hc) < 0.094 (0.086)%. The limits for a single nonvanishing coupling are interpolated by assuming a linear relationship between the branching fractions. The resulting two-dimensional limits for the branching fractions and couplings are shown in Fig. 6. The results presented here improve on those obtained in the same decay channel with the 2016 data set [24] by a factor of 3-6, and are comparable to the observed limits from the 13 TeV analysis in the diphoton channel [25].
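The conversion from coupling limits to branching fraction limits can be reproduced from the widths quoted above; a minimal numerical sketch (function name illustrative):

```python
# Widths quoted in the text: total top quark width at NNLO and the
# partial width Gamma(t -> Hq) for an FCNC coupling kappa_Hqt = 1.
GAMMA_T_SM = 1.32   # GeV
GAMMA_HQ_K1 = 0.19  # GeV

def branching_fraction(kappa):
    """B(t -> Hq) for a given kappa_Hqt; the partial width scales as kappa^2."""
    gamma_hq = kappa**2 * GAMMA_HQ_K1
    return gamma_hq / (GAMMA_T_SM + gamma_hq)

# The observed 95% CL coupling limits reproduce the quoted branching
# fraction limits:
print(f"B(t -> Hu) < {100 * branching_fraction(0.074):.3f}%")  # ~0.079%
print(f"B(t -> Hc) < {100 * branching_fraction(0.081):.3f}%")  # ~0.094%
```

Since κ²Γ^(κ=1) is tiny compared with Γ_t at these coupling values, the relation is effectively linear in κ², which is why the cross section and branching fraction limits scale the same way.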