Exploring the Anomalous Top-Higgs FCNC Couplings at the electron proton colliders

We perform an updated analysis of the searches for the anomalous FCNC Yukawa interactions between the top quark, the Higgs boson, and either an up or charm quark ($\rm tqh,\ q=u,\ c$). We probe the observability of the FCNC top-Higgs couplings through the processes $\rm e^- p\rightarrow \nu_e \bar{t} \rightarrow \nu_e h \bar{q}$ (signal.I) and $\rm e^- p \to \nu_e h b$ (signal.II) at the proposed electron proton (ep) colliders, where the Higgs boson decays to a $\rm b\bar{b}$ pair. We find that at high-luminosity (1 $\rm ab^{-1}$) ep colliders, where the electron beam has a polarisation of $\rm 80\%$ and a typical energy of 60 GeV, the 2$\sigma$ upper limits on $\rm Br(t\to uh)$ are $0.15\times 10^{-2}$ ($2.9\times 10^{-4}$) at the 7TeV@LHeC (50TeV@FCC-eh) for signal.I and $0.15\times 10^{-2}$ ($2.2\times 10^{-4}$) for signal.II. We also estimate how the sensitivity (taking signal.I as an example) would change if the electron beam energy were reduced from 60 GeV to 50 GeV or even 40 GeV for cost reasons. We conclude that the discovery potential is reduced by $8.7\%$ ($29.4\%$) if the electron beam energy changes from 60 GeV to 50 (40) GeV at the 7 TeV LHeC, and by $16.8\%$ ($19.8\%$) at the 50 TeV FCC-eh.


INTRODUCTION
The discovery of the Higgs boson at the Large Hadron Collider (LHC) [1][2] is a major step towards understanding the electroweak symmetry breaking (EWSB) mechanism and marks a new era in particle physics. Precise measurements of the Higgs boson and top quark properties make it possible to search for anomalous flavor changing neutral current (FCNC) Yukawa interactions between them and either an up or charm quark (tqh, q = u, c). In the Standard Model (SM), FCNC processes are forbidden at tree level and strongly suppressed at higher orders by the Glashow-Iliopoulos-Maiani (GIM) mechanism [3]. For instance, the t → qh (q = u, c) branching ratio is of order $10^{-10}$ or even below. In models beyond the SM (BSM), the GIM suppression can be relaxed, yielding effective tqh couplings orders of magnitude larger than those of the SM, which may therefore be detectable with current experimental data. Observation of such anomalous top-Higgs couplings would provide a clear signal of new physics. Examples of such model extensions [4] include the Minimal Supersymmetric Model (MSSM) with or without R-parity violation [5] [31], the Warped Extra Dimensions Model [32] [33], the Alternative Left-Right symmetric Model (ALRM) [34], the Little Higgs with T parity Model (LHT) [35], and the Quark Singlet model (QS) [36][37] [38].
Considering the FCNC Yukawa interactions in the effective field theory (EFT) framework, the SM Lagrangian can be extended simply by allowing the following terms,
$$\mathcal{L} = \kappa_{tuh}\,\bar{t}uh + \kappa_{tch}\,\bar{t}ch + \mathrm{h.c.}, \qquad (1)$$
where $\kappa_{tuh}$ and $\kappa_{tch}$ are real parameters that denote the flavor changing couplings of the Higgs boson to up-type quarks. Since $m_t - m_h$ is larger than $m_c$, $m_u$ and $m_b$, the top quark can, in addition to the usual decay mode $t \to w^{\pm} b$, also decay into a charm or up quark together with a Higgs boson. Similarly, the new tqh interactions affect the width of the Higgs boson through its additional decay into an off-shell top quark that subsequently yields a single w, namely $h \to u(c)(t^* \to wb)$, where $t^*$ denotes the off-shell top quark. The total decay width of the top quark ($\Gamma_t$) is therefore
$$\Gamma_t = \Gamma^{SM}_{t\to w^- b} + \Gamma_{t\to ch} + \Gamma_{t\to uh},$$
and the total decay width of the Higgs boson ($\Gamma_h$) is the normal two-body Higgs decay width in the SM plus the additional three-body terms.
$\Gamma^{SM}_{t\to w^- b}$ is the normal top decay width in the SM. Its analytical formula up to next-to-leading order (NLO) can be found in Ref. [39]. The $t \to u(c)h$ partial decay width, which is proportional to $\kappa^2_{tu(c)h}$, is given in Ref. [40].
The terms related to the Higgs boson three-body decays are estimated numerically.
The branching ratio for $t \to qh$ then follows from these widths and is expressed in terms of $\tau_w = m_w/m_t$ and the Fermi constant $G_F$. With the input parameters given below we obtain $Br(t \to u(c)h) \approx 0.512\,\kappa^2_{tu(c)h}$, which is used to extract the couplings in our analysis. Since the analysis does not distinguish between the $t \to ch$ and $t \to uh$ final states, which have similar acceptances, we use the $t \to uh$ mode as the reference throughout this work unless stated otherwise.
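The relation Br(t → u(c)h) ≈ 0.512 κ² can be inverted to translate a branching-ratio limit into a coupling limit. The following is a minimal sketch of that conversion; the function names are illustrative and not taken from the analysis code, and the coefficient 0.512 is simply the value quoted above.

```python
import math

# Coefficient quoted in the text: Br(t -> u(c)h) ~= 0.512 * kappa^2
BR_COEFF = 0.512


def branching_ratio(kappa: float) -> float:
    """Branching ratio Br(t -> qh) for a given FCNC coupling kappa."""
    return BR_COEFF * kappa ** 2


def coupling_from_br(br: float) -> float:
    """Coupling kappa corresponding to a given branching ratio (as a fraction)."""
    if br < 0.0:
        raise ValueError("branching ratio must be non-negative")
    return math.sqrt(br / BR_COEFF)


if __name__ == "__main__":
    # Branching ratios given in percent, converted to fractions before use
    for br_percent in (0.113, 0.03):
        kappa = coupling_from_br(br_percent / 100.0)
        print(f"Br = {br_percent}%  ->  kappa = {kappa:.3f}")
```

With this relation, a branching ratio of 0.113% corresponds to κ ≈ 0.047 and 0.03% to κ ≈ 0.024, in line with the coupling limits quoted in the discovery potential discussion.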
Searches for the anomalous FCNC top-Higgs coupling have been carried out at the LHC, and direct limits on the branching ratio have been set by the collider experiments.
The most stringent constraint from direct measurements was reported by the CMS collaboration from a combination of the multilepton channel and the diphoton plus lepton channel [41]. The results correspond to 19.5 fb$^{-1}$ of data at a center-of-mass energy of 8 TeV. A 95% confidence level (C.L.) upper limit on Br(t → u(c)h) was set, and an upper limit of Br(t → qh) < $5 \times 10^{-3}$ can be obtained.
Studying the FCNC top-Higgs interactions is therefore important from a theoretical perspective. Many phenomenological studies have been performed based on the experimental data, through the widely studied t → qh decays and the single top-Higgs associated production mode. For instance, Ref. [45] studied the top-Higgs couplings through several production channels and found that the process pp → tjh(tjh) → b νjjj can be extremely useful in providing strong bounds on the tqh couplings of the order of 1 … is not the most favored one. For the h → τ + τ − channel, the τ reconstruction is challenging, which makes this decay channel less favored. In this paper we choose the $h \to b\bar{b}$ mode, which is more interesting than the other channels. In addition, the impact of the light and charm jet mis-identification rates and of the B-tagging efficiency on the discovery potential is quantified in the following discussion. The discovery potential for a 50 TeV high-energy proton beam is also considered.
Our paper is organized as follows. Section 2 presents the analysis and numerical results in detail; its subsections cover the signal and background analysis, the simulation and the LHeC discovery potential. We perform our analysis with both a Cut-based method and a Multivariate Analysis (MVA) based method, and compare their impact on the FCNC coupling discovery potential. The expected limits are then obtained with an event-counting method. Finally, we summarize our conclusions in the last section.

Signal and Background Analysis
The signal production with top-Higgs FCNC couplings considered here is $e^- p \to \nu_e \bar{t} \to \nu_e h \bar{q}$, where q = u, c. The Feynman diagram is plotted in Fig.1. The backgrounds which contain three QED couplings are denoted "bakh" and "bakz" respectively, and the background which contains two QED couplings and two QCD couplings is denoted "bakg". Here and below, j = g, u, ū, d, d̄, c, c̄, s, s̄. One of the most important sources of potentially reducible background is the mis-identification of one or more of the final state light jets as B-jets. These processes also contain two QED couplings and two QCD couplings. We refer to them as "bakvej" (which includes the "bakg" background). Another source of reducible background is single top production. The signal process studied in our paper is essentially single top production at the e − p collider, followed by a particular decay chain.
This means that SM single top production and decay is an important background to the signal under consideration. We refer to these backgrounds as "bakt". In one example of such a process, the produced top quark decays to a w boson and a B-jet. The hadronic decay of the w boson yields non-B-jets which might be mis-tagged as B-jets, making this a dangerous background. We also consider neutral current (NC) multi-jet production backgrounds, which are reducible. We use "bakej" as a simple notation to refer to them. Finally, we show Feynman diagrams for the most important backgrounds in Fig.2.

Simulation
For the collider phenomenology, we use FeynRules [58] to extract the Feynman rules from the Lagrangian in Eq.(1). The model is exported as Universal FeynRules Output (UFO) files [59] and then fed to the Monte Carlo event generator MadGraph@NLO [60] for the generation of event samples. We pass the generated parton level events on to PYTHIA6.4 [61], which handles the initial and final state parton shower, hadronization, heavy hadron decays, etc. When generating the parton level events, we set the renormalization and factorization scales to the Z-boson mass. We use the CTEQ6L [62] parton distribution functions, with α s (the strong coupling constant) evaluated consistently at all stages (PDFs, hard scattering and decays). We include the fluxes of all light-flavored quarks, the gluon and also the b quark inside the proton. We use FASTJET [63] for jet clustering instead of the PYCELL routine in PYTHIA6.4. Jets are clustered with the anti-kt algorithm [64] with a cone radius of $\Delta R(j) = \sqrt{\Delta\eta^2 + \Delta\phi^2} = 0.7$. We do not include any matching algorithm to interface MadGraph with Pythia. Our samples can be reproduced with the default Pythia parameters, except for a special setting of MSTP(11).

For our signal and background processes we also apply a B-jet tagging technique. Tagged B-jets come mainly from three sources. The first is a jet that matches a B-flavored hadron and is tagged as a B-jet. The default B-jet tagging rate is taken to be $\epsilon_b = 60\%$ unless stated otherwise. The second is a jet that matches a C-flavored hadron and is mis-tagged as a B-jet; we accept it as a B-jet only with the mis-tagging probability $\epsilon_c = 10\%$. The third is a light-flavored jet mis-tagged as a B-jet, with the mis-tagging probability $\epsilon_{light} = 1\%$.

To estimate the event rate at parton level for the signal, we apply the basic pre-selections of Eq. (12), in which $\Delta R = \sqrt{\Delta\Phi^2 + \Delta\eta^2}$ is the separation in the rapidity-azimuth plane. These cuts are only the very basic ones and might be changed in the following discussion.
In order to extract the anomalous FCNC tqh coupling effects, we need to simulate the signal contributions precisely together with all the backgrounds. We use the same pre-selections for the backgrounds, as well as the same conventions and parameter sets as described above.
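The separation ΔR and the probabilistic B-tagging described above can be sketched as follows. This is an illustration only, not the analysis code: the function names are hypothetical, and the per-flavour probabilities are the default rates quoted in the text (60% for B-jets, 0.1 for charm jets and 0.01 for light jets).

```python
import math
import random

# Default (mis-)tagging probabilities quoted in the text
TAG_PROB = {"b": 0.60, "c": 0.10, "light": 0.01}


def delta_phi(phi1: float, phi2: float) -> float:
    """Azimuthal difference wrapped into [-pi, pi)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi


def delta_r(eta1: float, phi1: float, eta2: float, phi2: float) -> float:
    """Separation in the rapidity-azimuth plane."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


def is_tagged(flavour: str, rng: random.Random) -> bool:
    """Decide whether a jet of the given true flavour is tagged as a B-jet."""
    return rng.random() < TAG_PROB[flavour]


if __name__ == "__main__":
    rng = random.Random(1)
    n_tagged = sum(is_tagged("b", rng) for _ in range(10000))
    print("tagged fraction of true b jets:", n_tagged / 10000)
    print("Delta R example:", delta_r(0.5, 3.0, -0.2, -3.0))
```

The wrapping in `delta_phi` matters for jets on either side of φ = ±π, where a naive difference would overestimate the separation.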
In Fig.3, we show the cross sections σ as a function of the top-Higgs FCNC couplings $\kappa_{tqh}$ at the 7 TeV, 14 TeV and 50 TeV LHeC for three different cases. The cross section at 50 TeV can be 14 times larger than that at 7 TeV. As noted above, the analysis does not distinguish between the t → ch and t → uh final states, so we use the t → uh mode throughout our analysis. The total cross section for these reactions can thus be split into three contributions,
$$\sigma = a_0 + a_1\,\kappa_{tuh} + a_2\,\kappa^2_{tuh}, \qquad (13)$$
where $a_0$ is the SM prediction, the term $a_1$ linear in $\kappa_{tuh}$ arises from the interference between the SM and the anomalous amplitudes, and the quadratic term $a_2$ is the self-interference of the anomalous amplitudes.
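Because Eq. (13) is quadratic in the coupling, the three coefficients can be determined from cross sections evaluated at three coupling values. The sketch below assumes the symmetric choice κ = 0, +k, −k (the coupling is real, so both signs are allowed). This choice and the function names are assumptions made for illustration, and the numbers in the example are placeholders, not results of the analysis.

```python
def fit_quadratic(sigma_0: float, sigma_plus: float, sigma_minus: float, k: float):
    """Solve sigma = a0 + a1*kappa + a2*kappa^2 from sigma at kappa = 0, +k, -k."""
    a0 = sigma_0
    a1 = (sigma_plus - sigma_minus) / (2.0 * k)
    a2 = (sigma_plus + sigma_minus - 2.0 * sigma_0) / (2.0 * k * k)
    return a0, a1, a2


def cross_section(kappa: float, a0: float, a1: float, a2: float) -> float:
    """Evaluate the decomposition of Eq. (13)."""
    return a0 + a1 * kappa + a2 * kappa ** 2


if __name__ == "__main__":
    # Placeholder cross sections (arbitrary units), for illustration only
    a0, a1, a2 = fit_quadratic(sigma_0=1.50, sigma_plus=1.82, sigma_minus=1.78, k=0.1)
    print("a0, a1, a2 =", a0, a1, a2)
    print("sigma(kappa=0.05) =", cross_section(0.05, a0, a1, a2))
```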

Event Selection and Resonance Reconstruction
We now discuss the event selection step by step in more detail. We first define "cut1": cut1 stands for the basic generator level cuts in Eq. (12). Results are listed in the first column of Table.1 (Table.2) for the 7 TeV (50 TeV) LHeC. The NC multi-jet QCD production, with a cross section of 50.5 (244.6) pb, is the largest contribution at this stage.
Next we define "cut2", a selection that includes the following three conditions. First, a no-lepton selection is imposed to make sure that the selected events do not contain triggered lepton(s) in the final state. This condition rejects the NC multi-jet QCD background efficiently. Secondly, we require that the selected events contain at least three jets. Events that contain exactly three jets account for around 50% of the total, for both the signal and the backgrounds, and more than 60% of the events are kept when we include events with more than three jets. Thirdly, we require that at least two jets are tagged as B-jets. The distribution of the number of tagged B-jets ($N_{Bjet}$) is shown in the right panel of Fig.4. Details can be found in the first four columns.
Although the selections and rejections are quite efficient, the backgrounds are still much larger than the signal. Our next step is to reconstruct the resonances in the selected events, namely the Higgs boson and the top quark. The Higgs boson can be reconstructed easily from the two tagged B-jets in the final state by summing their 4-momenta. If more than two B-jets appear, we reconstruct the Higgs boson from the pair whose invariant mass is closest to the input Higgs mass. Events with more than two B-jets rarely occur in our case, as can be seen in the distribution of $N_{Bjet}$ in Fig.4 [right panel]. The reconstructed Higgs mass is presented in Fig.5 [first two panels]. Again, the signal is shown in the red zone. We display the different backgrounds separately, each normalized to unity. The bakej backgrounds are already quite small after cut3 and are thus not shown in these and the following figures. We find that the reconstructed Higgs mass for the signal peaks strongly at ∼115 GeV, which is 10 GeV lower than the input mass of ∼125 GeV. This is not unexpected: the peak always shows up on the left side of the actual mass. The results also depend on the B-tagging efficiency and the jet mis-identification rates; a more detailed study of this dependence appears in the following.
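The pairing rule described above (take the two tagged B-jets whose invariant mass is closest to the input Higgs mass) can be sketched as follows. The four-vector convention (E, px, py, pz) in GeV and the function names are assumptions made for this illustration.

```python
import math
from itertools import combinations

HIGGS_MASS = 125.0  # input Higgs mass in GeV, as quoted in the text


def invariant_mass(jets) -> float:
    """Invariant mass of a set of four-vectors given as (E, px, py, pz)."""
    e = sum(j[0] for j in jets)
    px = sum(j[1] for j in jets)
    py = sum(j[2] for j in jets)
    pz = sum(j[3] for j in jets)
    m2 = e * e - px * px - py * py - pz * pz
    return math.sqrt(max(m2, 0.0))  # guard against small negative rounding


def reconstruct_higgs(bjets, target: float = HIGGS_MASS):
    """Return (best_pair, mass) for the B-jet pair closest in mass to target."""
    if len(bjets) < 2:
        return None
    best = min(combinations(bjets, 2),
               key=lambda pair: abs(invariant_mass(pair) - target))
    return best, invariant_mass(best)


if __name__ == "__main__":
    # Three toy massless B-jets; the first two are back to back
    jets = [(62.5, 62.5, 0.0, 0.0), (62.5, -62.5, 0.0, 0.0), (30.0, 0.0, 30.0, 0.0)]
    pair, mass = reconstruct_higgs(jets)
    print("selected pair:", pair, "mass:", mass)
```

Note that the target is the input mass, while the reconstructed signal peak sits lower, at about 115 GeV, which is why the later mass window is centred there.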

LHeC Discovery Potential
We aim to find the most efficient selections, which give the best separation between background-related and signal-related events, and to present the search strategies for the anomalous tqh couplings. This requirement is satisfied when the signal significance is maximized. We adopt two different methods in our analysis:
• Cut-based analysis: in this method, we simply apply different cuts on various kinematic variables in order to optimize the signal significance.
• Multivariate analysis: in this method, we employ multivariate techniques that use several different input variables to give better signal to background discrimination, and thus a better signal significance.
The signal significance is denoted by SS. We define the signal to background ratio by $R = S/B$ (16), together with the purity. Here S and B are the numbers of signal and background events, respectively, at the expected luminosity. These definitions are used in Tables 1 and 2.
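As an illustration of how these quantities are evaluated from cross sections, the sketch below converts cross sections to event numbers and computes R = S/B. The significance formula used here, S/√(S+B), is a common convention and is an assumption of this sketch; it is not necessarily the definition adopted in the analysis. The cross sections in the example are placeholders.

```python
import math


def event_count(sigma_fb: float, lumi_fb_inv: float) -> float:
    """Expected number of events for a cross section (fb) and luminosity (fb^-1)."""
    return sigma_fb * lumi_fb_inv


def signal_to_background(s: float, b: float) -> float:
    """Signal to background ratio R = S/B, as in Eq. (16)."""
    return s / b


def significance(s: float, b: float) -> float:
    """ASSUMPTION: SS = S / sqrt(S + B); the paper's own definition may differ."""
    return s / math.sqrt(s + b)


if __name__ == "__main__":
    lumi = 1000.0  # fb^-1, the integrated luminosity used in the text
    s = event_count(1.0, lumi)  # placeholder signal cross section of 1 fb
    b = event_count(3.0, lumi)  # placeholder background cross section of 3 fb
    print("R  =", signal_to_background(s, b))
    print("SS =", significance(s, b))
```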

Optimized selections and Cut-based method Analysis
We first study the Cut-based method. After applying the above pre-selections, we reject events whose reconstructed w boson mass lies in the window [50 GeV, 90 GeV]. This selection is defined to be "cut5". With this requirement, the bakt background is reduced to 0.84 (7.37) fb, of the same order as the signal. We also impose "cut6": the invariant mass of the two B-jets must lie outside the z boson mass window, which reduces, for example, the bakz background. In addition, once a Higgs boson has been identified, its mass is required to lie within a fixed window around 115 GeV. Specifically, we require 100 GeV < $m_h$ < 130 GeV. Cross sections after imposing all these cuts are presented in columns 5-7 of Table.1 and Table.2.
Once all the above selections are applied, the next question is whether further cuts can improve the signal significance. Considering the signal and backgrounds, a forward jet cut is worth examining. The forward jet lies very close to the direction of the incoming proton, as in z or Higgs production via vector boson fusion (VBF) in proton proton collisions at the LHC. In contrast to VBF, where a second jet is separated from the forward jet by a large rapidity gap, in our productions we have the neutrino instead. The more massive the heavy particle is, the less energy the forward jet carries, so it lies closer to the proton direction, i.e., at larger rapidity. This is a useful characteristic if the signal is SM Higgs production, as discussed in Ref. [67], or heavy Higgs production, as discussed in Ref. [68]. Recall, however, that in our case SM Higgs production is one of the backgrounds. We thus expect the candidate light jets to be centrally produced. The selection cut (defined as "cut8") that removes light jets at large rapidity is called the forward jet veto. We require |η j | < 2.5 (3) for 7 (50) TeV. It has a severe impact on processes such as SM Higgs plus jet and z plus jet associated production in the forward rapidity region.
The distribution of the forward jet rapidity is displayed in Fig.5 [third two panels].
The bakz and bakh backgrounds are enhanced in the range |η j | > 2.5 and |η j | > 3 for 7 TeV and 50 TeV respectively. The forward jet veto reduces bakz (bakh) to 0.017 (0.09) fb at 7 TeV and 0.05 (0.33) fb at 50 TeV. This cut also reduces the multi-jet QCD backgrounds. As a short summary, and for clarity, we list the Cut-based selections here:
• cut1: the basic pre-selection cuts in Eq. (12).
• cut5: the reconstructed w boson mass window $m_w$ < 50 GeV or $m_w$ > 90 GeV.
• cut6: the reconstructed z boson mass window $m_z$ < 55 GeV or $m_z$ > 95 GeV.
• cut8: the forward light jet veto cut |η j | < 2.5 for 7 TeV and |η j | < 3 for 50 TeV.
One way to improve the signal significance is simply to impose a stricter cut on the reconstructed w mass, since this is the most efficient way to reduce the bakt background. However, such a cut would also reduce the number of signal events passing the selection. Another way is to enhance the B-tagging efficiency while reducing the jet mis-identification rates. In Table.3 and Table.4 we quantify the impact of the light and charm jet mis-identification rates and of the B-tagging efficiency on the signal significance. This is also one of the updated results of the present study compared to our previous work in Ref. [57]. These fake rates have little impact on the signal events. Finally, we obtain the best signal significance of 8.37 (33.75) with $\epsilon_b$ = 0.8, $\epsilon_c$ = 0.01 and $\epsilon_{light}$ = 0.001, keeping the other input parameters and kinematic cuts unchanged.
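The mass-window and forward-jet selections listed above can be sketched as a single event filter. The `Event` container and its field names are hypothetical, introduced only for this illustration, and the sketch covers only the cuts whose thresholds are stated in the text (cut5, cut6, the Higgs mass window and cut8).

```python
from dataclasses import dataclass
from typing import List

# Forward jet veto thresholds per proton beam energy (TeV), as quoted in the text
ETA_MAX = {7: 2.5, 50: 3.0}


@dataclass
class Event:
    m_w: float                   # reconstructed w boson mass, GeV
    m_z: float                   # reconstructed z boson mass, GeV
    m_h: float                   # reconstructed Higgs mass, GeV
    eta_light_jets: List[float]  # rapidities of the light jets


def passes_cuts(ev: Event, proton_energy_tev: int) -> bool:
    """Apply cut5, cut6, the Higgs mass window and cut8 to one event."""
    cut5 = ev.m_w < 50.0 or ev.m_w > 90.0      # reject the w mass window
    cut6 = ev.m_z < 55.0 or ev.m_z > 95.0      # reject the z mass window
    higgs_window = 100.0 < ev.m_h < 130.0      # window around the 115 GeV peak
    eta_max = ETA_MAX[proton_energy_tev]
    cut8 = all(abs(eta) < eta_max for eta in ev.eta_light_jets)  # forward jet veto
    return cut5 and cut6 and higgs_window and cut8


if __name__ == "__main__":
    ev = Event(m_w=40.0, m_z=110.0, m_h=115.0, eta_light_jets=[2.8])
    print("7 TeV :", passes_cuts(ev, 7))
    print("50 TeV:", passes_cuts(ev, 50))
```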

Optimized variables and TMVA framework Analysis
We now turn to the Multivariate Analysis (MVA) based method. We need as much information as possible in order to set limits on the cross section of the process (or the anomalous couplings), both from the numbers of predicted signal and background events selected by the analysis and from their relative shapes. It is useful to find a final "discriminating distribution" which shows the largest discrimination between signal and background events. Such a distribution can either be a kinematic variable or a new distribution constructed for this task. MVA is a method designed to construct such a variable with high discriminating power by exploiting multiple predefined variables. To perform the MVA, we use the Toolkit for Multivariate Analysis (TMVA) [69] package, which provides a ROOT-integrated environment for the processing, parallel evaluation and application of multivariate classification techniques.
The Boosted Decision Trees (BDT) technique is chosen in our analysis.
Technically, the most important and most challenging step in using TMVA is to define suitable variables. The choice of variables depends not only on the signal and background characteristics but also on a suitable pre-selection. This pre-selection can consist of some user-defined but loose cuts imposed on the Monte Carlo event samples. In our case, in contrast to the full set of strict selections in the Cut-based analysis, we choose, for example, the cut that rejects the w mass window. Although the motivation is to separate the signal and background in advance as much as possible, we should balance the separation power against the statistical uncertainty before finally deciding on the pre-selection.
Based on the topological characteristics of our signal and backgrounds, we define a set of kinematic variables including $\Phi_z$, $\Phi_w$, $\Phi_{top}$; the difference in azimuthal angle between jets ($\Delta\Phi_{B_1B_2}$, $\Delta\Phi_{B_1J_1}$, $\Delta\Phi_{B_2J_1}$) and of the Higgs jet system ($\Delta\Phi_{hJ_1}$); the separation in the Φ − η plane between jets ($\Delta R_{B_1B_2}$, $\Delta R_{B_1J_1}$, $\Delta R_{B_2J_1}$) and of the Higgs jet system ($\Delta R_{hJ_1}$); the difference in |η| between jets ($\Delta\eta_{B_1B_2}$, $\Delta\eta_{B_1J_1}$, $\Delta\eta_{B_2J_1}$) and of the Higgs jet system ($\Delta\eta_{hJ_1}$); the missing transverse energy (MET) $E_T^{missing}$; and the vector sums of the Higgs MET system, the top MET system, the z MET system and the w MET system. The full list of 44 variables pertaining to both background and signal was narrowed down to 32 variables using TMVA in our analysis.
After testing and comparing them, we find that 9 of these 32 variables are the most powerful. To check for overtraining we use the Kolmogorov-Smirnov (KS) test: the training is considered acceptable if the KS probability lies in the range (0.1, 0.9) [70]. In most cases, a KS probability greater than the critical value of 0.01 [71] implies that the samples are not overtrained. In our subsequent study we ensure that overtraining is not an issue over our parameter space.
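The overtraining check compares the classifier output on the training sample with that on an independent test sample. The sketch below computes an unbinned two-sample KS statistic and its asymptotic probability; it is an illustration only and does not reproduce the binned histogram test used inside TMVA and ROOT, so the probabilities it returns will not match TMVA's numbers exactly. The function names are hypothetical.

```python
import math


def ks_statistic(sample_a, sample_b) -> float:
    """Maximum distance between the two empirical cumulative distributions."""
    a, b = sorted(sample_a), sorted(sample_b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(a[i], b[j])
        while i < n and a[i] <= x:  # advance past all values equal to x
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d


def ks_probability(d: float, n: int, m: int) -> float:
    """Asymptotic probability that the two samples share one parent distribution."""
    n_eff = n * m / (n + m)
    lam = (math.sqrt(n_eff) + 0.12 + 0.11 / math.sqrt(n_eff)) * d
    if lam < 0.05:  # the alternating series below is only useful for larger lam
        return 1.0
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                for k in range(1, 201))
    return min(1.0, max(0.0, 2.0 * total))


def looks_overtrained(probability: float, critical: float = 0.01) -> bool:
    """Flag overtraining when the KS probability is at or below the critical value."""
    return probability <= critical


if __name__ == "__main__":
    train = [0.10, 0.22, 0.31, 0.45, 0.52, 0.63]
    test = [0.12, 0.25, 0.29, 0.41, 0.58, 0.66]
    d = ks_statistic(train, test)
    p = ks_probability(d, len(train), len(test))
    print("D =", d, "probability =", p, "overtrained:", looks_overtrained(p))
```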
In Fig.6 we present the output distributions of the signal and background samples for a given classifier (in this case BDT, i.e. Boosted Decision Trees). The signal, shown in the blue area, is for $\kappa_{tuh}$ = 0.1. The red area refers to the total background. The plots are each normalized to equal area, so the more blue area (signal) lies outside the red area (background), the more discriminating the variable is against the background. For the purposes of our analysis, the lump in the region above zero should be worked around.
The BDT output is precisely such a constructed variable, with high discriminating power obtained by exploiting multiple predefined variables. The normalized distributions of the signal and background as functions of these variables are produced automatically by the TMVA framework. We look for distributions that are as widely separated as possible, in order to achieve a good signal to background discrimination.
Table.5 and Table.6 list the results for the 7TeV@LHeC and the 50 TeV case with the BDT output cuts $C_{BDT}$ > 0.09, 0.11, 0.13 and 0.15. The event numbers and SS are evaluated with 1000 fb$^{-1}$ of integrated luminosity at the LHeC. The colliding beams are a 7 (50) TeV proton beam and a 60 GeV electron beam. These parameters are the same as in the Cut-based analysis. Our findings are quoted for $C_{BDT}$ > 0.13, with the non-Bjet mis-tagging rates ($\epsilon_c$ = 0.1 and $\epsilon_{light}$ = 0.01) taken into account.

Discovery Potential
We follow Refs. [72][73] to obtain the sensitivity limits. A chi-square ($\chi^2$) analysis is performed, in which $\sigma_{tot}$ is the cross section containing new physics effects and $\delta = 1/\sqrt{N}$ is the statistical error, with $N = \sigma_B \times L \times \epsilon$. The sensitivity limits on the anomalous tqh coupling can then be obtained as a function of the integrated luminosity. Fig.7 shows the upper limit on Br(t → uh) at 99.9, 95 and 68% C.L. as a function of the integrated luminosity at the LHeC. The red, blue and black curves represent the one, two and five standard deviation bands around the expectation. The limit from the Cut-based analysis is presented in the first two panels and is compared with the TMVA framework analysis, shown in the second two panels. For the 7 TeV LHeC, the Cut-based analysis can probe Br(t → uh) down to 0.113% at the 1σ level with 1000 fb$^{-1}$ of integrated luminosity, corresponding to an upper limit on the $\kappa_{tuh}$ coupling of 0.047. This is slightly worse than the TMVA framework analysis, with which we can probe Br(t → uh) down to 0.093% at the same integrated luminosity, corresponding to an upper limit on the $\kappa_{tuh}$ coupling of 0.042. Both are weaker than at the 50 TeV high energy LHeC, where we can probe Br(t → uh) down to 0.03 (0.022)% at the same integrated luminosity, corresponding to an upper limit on the $\kappa_{tuh}$ coupling of 0.024 (0.02) for the Cut-based (TMVA-based) method. The 50 TeV high energy LHeC would thus improve the search sensitivity relative to the 7 TeV one. We recall that the B-tagging efficiency is $\epsilon_b$ = 0.6 and the jet mis-identification rates take their default values of $\epsilon_c$ = 0.1 and $\epsilon_{light}$ = 0.01.
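To make the luminosity dependence of such limits concrete, the sketch below evaluates a χ² of the form ((σ_tot − σ_B)/(σ_B δ))² with δ = 1/√N and N = σ_B × L × ε. This functional form is an assumption of the sketch (a form commonly used with this definition of δ), as is the approximation σ_tot = σ_B + a₂κ² that neglects the interference term; the numerical inputs are placeholders, not values from the analysis.

```python
import math


def chi_square(sigma_tot: float, sigma_b: float, lumi: float, eff: float = 1.0) -> float:
    """ASSUMED form: chi2 = ((sigma_tot - sigma_b) / (sigma_b * delta))^2."""
    n_events = sigma_b * lumi * eff
    delta = 1.0 / math.sqrt(n_events)  # statistical error, delta = 1/sqrt(N)
    return ((sigma_tot - sigma_b) / (sigma_b * delta)) ** 2


def kappa_limit(a2: float, sigma_b: float, lumi: float,
                chi2_threshold: float, eff: float = 1.0) -> float:
    """Coupling at which chi2 reaches the threshold, for sigma_tot = sigma_b + a2*kappa^2."""
    delta = 1.0 / math.sqrt(sigma_b * lumi * eff)
    return math.sqrt(math.sqrt(chi2_threshold) * sigma_b * delta / a2)


if __name__ == "__main__":
    # Placeholder inputs: background 2 fb, quadratic coefficient 100 fb
    for lumi in (100.0, 1000.0):
        kappa = kappa_limit(a2=100.0, sigma_b=2.0, lumi=lumi, chi2_threshold=3.84)
        print(f"L = {lumi} fb^-1  ->  kappa limit = {kappa:.4f}")
```

Under these assumptions the coupling limit scales as L^(-1/4), so the branching-ratio limit, being proportional to κ², improves as L^(-1/2). The threshold 3.84 is the standard one-parameter χ² value for 95% C.L.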
Finally, a detailed comparison between our study and the limits obtained by the experiments and by phenomenological studies is shown in Table.7 and Table.8.
In Table.7 we present the integrated luminosity (L) needed to reach the upper bounds on Br(t → qh) at 95% C.L. obtained by the experiments. In Table.8 we present the L needed to reach the upper bounds on Br(t → qh) at 95% C.L. (or 3σ in [76]) obtained by the phenomenological studies. This gives an overview of the search potential for tqh anomalous couplings at the LHeC collider(s) compared to studies at the LHC or linear colliders.

CONCLUSION
In this paper we have presented an updated analysis of searches for the anomalous flavor changing neutral current (FCNC) Yukawa interactions between the top quark, the Higgs boson, and either an up or charm quark (tqh, q = u, c). We probe the observability of the FCNC top-Higgs couplings through the process $e^- p \to \nu_e \bar{t} \to \nu_e h \bar{q}$ at the Large Hadron Electron Collider (LHeC). For the 7 TeV LHeC with a 60 GeV electron beam and 1000 fb$^{-1}$ of luminosity, Br(t → qh) can be probed down to 0.113 (0.093)% with the Cut-based (MVA based) analysis at the 95% confidence level. For the 50 TeV LHeC, it can be probed down to 0.03 (0.022)%. Here the B-tagging efficiency and the charm and light jet mis-identification rates are chosen to be 0.6, 0.1, and 0.01 respectively. We also quantify the impact of the light and charm jet mis-identification rates and of the B-tagging efficiency on the discovery potential. For instance, for $\epsilon_b$ = 0.6, $\epsilon_c$ = 0.01 and $\epsilon_{light}$ = 0.001, the signal significance can be improved to more than SS ∼ 20 at the 50 TeV LHeC. The limits are expected to improve if the MVA framework is applied. We also present a detailed comparison between our study and the limits obtained by the experiments and by other phenomenological studies. Our results are better than the most stringent constraints of Br(t → qh) < 0.56% [41] and Br(t → qh) < 0.79% [42][43] at 95% C.L. from the CMS and ATLAS Collaborations. Furthermore, our results are comparable with those of other phenomenological studies. We thus give an overview of the search potential for the anomalous top-Higgs couplings at the LHeC.