Exotic Higgs Decay $h\rightarrow\phi\phi\rightarrow 4b$ at the LHeC

We study the exotic decay of the 125 GeV Higgs boson ($h$) into a pair of light spin-0 particles ($\phi$) which subsequently decay to produce a $4b$ final state. This decay mode is well motivated in the Next-to-Minimal Supersymmetric Standard Model (NMSSM) and extended Higgs sector models. Instead of searching at the Large Hadron Collider (LHC) and the High Luminosity Large Hadron Collider (HL-LHC), which are beset by large Standard Model (SM) backgrounds, we investigate this decay channel at the much cleaner Large Hadron Electron Collider (LHeC). With some simple selection cuts this channel becomes nearly free of background at this $ep$ machine, in stark contrast with the situation at the (HL-)LHC. With a parton level analysis we show that for the $\phi$ mass range $[20,60]\,\text{GeV}$, with $100\,\text{fb}^{-1}$ luminosity the LHeC is generally capable of constraining $C_{4b}^2\equiv\kappa_{V}^2\times\text{Br}(h\rightarrow\phi\phi)\times\text{Br}^2(\phi\rightarrow b\bar{b})$ ($\kappa_{V}$ denotes the $hVV\,(V=W,Z)$ coupling strength relative to the SM value) to the few percent level ($95\%$ CLs). With $1\,\text{ab}^{-1}$ luminosity, $C_{4b}^2$ at the few per mille level can be probed. These sensitivities are much better than the HL-LHC performance and demonstrate the important role the LHeC is expected to play in probing exotic Higgs decay processes, in addition to the already proposed invisible Higgs decay channel.


Introduction
The discovery of the 125 GeV Higgs boson (denoted as $h$) [1,2] not only deepens our understanding of the mechanism of electroweak symmetry breaking but also opens new avenues in the search for physics beyond the Standard Model (SM), which is required to address unexplained theoretical and observational issues such as the naturalness problem, the existence of dark matter and the observed baryon asymmetry of the universe. One such avenue is exotic Higgs decay, which is only loosely constrained by Higgs signal strength measurements. The combination of ATLAS and CMS Run I results constrains the branching ratio of undetected Higgs decays to be smaller than about 34% at 95% C.L., assuming $\kappa_V \le 1$ [4] (here $\kappa_V$ denotes the $hVV$ ($V = W, Z$) coupling strength relative to the SM value, with $\kappa_V \equiv \kappa_W = \kappa_Z$). The ultimate sensitivity to the undetected Higgs decay branching ratio via indirect measurements at the High Luminosity Large Hadron Collider (HL-LHC) is estimated to be $O(5-10\%)$ [5]. On the other hand, because the Higgs boson is expected to have an extremely narrow width, even a rather weak coupling between it and any new light degrees of freedom can naturally induce a sizable exotic decay branching fraction.
One such possibility is $h \to \phi\phi$, where $\phi$ denotes a light spin-0 particle with mass less than about 62.5 GeV, so that this decay channel is kinematically allowed. $\phi$ can be CP-even, CP-odd, or a CP-mixed state. If its mass is greater than $2m_b$, then in most models which approximately obey Yukawa ordering $\phi$ will mainly decay to $b\bar{b}$. This decay channel is well motivated in a wide class of Beyond the Standard Model (BSM) theories [5], such as the Next-to-Minimal Supersymmetric Standard Model (NMSSM), the Higgs singlet extension of the SM, general extended Higgs sector models [6], and Little Higgs models. Quite a few phenomenological studies of this channel at the LHC already exist [7,8,9,10,11], with or without jet substructure techniques.
Due to large QCD backgrounds in the gluon fusion and vector boson fusion channels, the LHC searches generally focus on the $Vh$ associated production channel. However, this channel suffers from large top quark backgrounds. A recent ATLAS analysis [12] using 3.2 fb$^{-1}$ of 13 TeV data made the first attempt to constrain this channel using $Wh$ associated production, but the sensitivity is currently quite weak (even $\text{Br}(h \to \phi\phi \to 4b) = 100\%$ cannot be excluded assuming $\kappa_V = 1$).
The not-so-clean hadron-hadron collision environment motivates us to consider better places to search for this exotic Higgs decay channel. Here we consider using the Large Hadron Electron Collider (LHeC) [13] to explore $h \to \phi\phi \to 4b$. The LHeC is a proposed lepton-hadron collider designed to collide a 60 GeV electron beam with the 7 TeV proton beam of the HL-LHC. It is supposed to run synchronously with the HL-LHC and may deliver an integrated luminosity as high as 1000 fb$^{-1}$ [14]. The electron beam may have a polarization of $-0.9$ [14]. It is worth noting that with such high collision energy and luminosity, the LHeC effectively becomes a Higgs boson factory [14]. With a Higgs boson production cross section of about 200 fb, the LHeC will provide excellent opportunities for precision Higgs physics, because the major QCD backgrounds will be much smaller than at the LHC and the complication due to pile-up will be greatly reduced. Previous studies of Higgs physics at the LHeC include measurements of the bottom Yukawa coupling [15,16,13], anomalous gauge-Higgs couplings [17,18,19], invisible Higgs decay [20] and MSSM Higgs production [21]. Studies of charm Yukawa measurements have been reported in [22]. The impact of double Higgs production at the higher energy $ep$ collider FCC-he on the measurement of the Higgs self-coupling has also been studied [23,24].
To quantitatively estimate the sensitivity of the LHeC to the exotic Higgs decay $h \to \phi\phi \to 4b$, we perform a parton level study of the signal and background in the next section. The signal definition depends on the required number of b-tagged jets. Here, for simplicity and a clear identification of the signal, we require at least 4 b-tagged jets. We provide the expected LHeC sensitivity for $\phi$ masses between 15 GeV and 60 GeV and investigate the robustness of our results under variation of the b-tagging performance and pseudorapidity coverage. We also translate our results into the expected exclusion power in the parameter space of the Higgs singlet extension of the SM. In the last section we present our discussion and conclusion.

Collider Sensitivity
The exotic Higgs decay $h \to \phi\phi \to 4b$ can be simply characterized by the effective interaction Lagrangian $\mathcal{L}_{\text{eff}}$ of Eq. (1) for a new real scalar degree of freedom $\phi$. In that Lagrangian $v = 246$ GeV, $\lambda_h$ and $\lambda_b$ are real dimensionless parameters, and $\mathcal{L}_{\phi\,\text{decay,other}}$ denotes the part of the Lagrangian which mediates the decay of $\phi$ into final states other than $b\bar{b}$. The part $\mathcal{L}_{\text{eff}} - \mathcal{L}_{\phi\,\text{decay,other}}$ has been taken as CP-even without loss of generality. New physics may also modify the $hVV$ ($V = W, Z$) coupling, which affects the Higgs production rate and kinematics. We assume the $hVV$ coupling is purely CP-even. Assuming the narrow width approximation is valid for both $h$ and $\phi$, we can express the collider reach for $h \to \phi\phi \to 4b$ via the quantity $C_{4b}^2 \equiv \kappa_V^2 \times \text{Br}(h \to \phi\phi) \times \text{Br}^2(\phi \to b\bar{b})$ of Eq. (2), for a given value of the $\phi$ mass $m_\phi$.
There are two major Higgs production channels at the LHeC: charged current (CC) and neutral current (NC). Due to the accidentally suppressed electron NC coupling, the NC Higgs cross section is much smaller than that of CC [16]. Therefore in the following we focus only on the CC process, although in a more detailed analysis the NC process should also be included to enhance the overall statistical significance. The signal process of CC Higgs production is $e^- p \to \nu_e\, h\, j \to \nu_e\, \phi\phi\, j \to \nu_e\, b\bar{b}b\bar{b}\, j$. The corresponding Feynman diagram is shown in Fig. 1. The signal signature thus contains at least 5 jets (of which at least 4 are b-tagged) plus missing transverse energy.
The backgrounds can be classified into charged current (CC) deeply inelastic scattering (DIS) backgrounds and photoproduction (PHP) backgrounds. From a parton level point of view, CC DIS backgrounds all have genuine $\slashed{E}_T$ in the final state, which comes from neutrinos produced in the hard scattering or in the decay of heavy resonances ($W, Z, t, h$). For PHP backgrounds, only PHP production of heavy resonances ($W, Z, t, h$) could produce such genuine $\slashed{E}_T$. However, these PHP processes (which involve the on-shell production of heavy resonances) are found to be negligible in the total background. On the other hand, PHP multijet production (including heavy flavor jets) could produce $\slashed{E}_T$ only via energy mismeasurement and neutrinos from hadron decay, which means that it could be suppressed efficiently by a sufficiently large $\slashed{E}_T$ requirement.
The CC backgrounds can be further classified according to the number of heavy resonances ($W, Z, t, h$) produced, whose decays result in a large number of b-tagged jets. We found that if the number of heavy resonances involved in a process is greater than or equal to two, then its contribution to the total background is always negligible. Therefore in the following we consider only the following CC backgrounds: CC multijet, CC $W$+jets, CC $Z$+jets, CC $t$+jets, and CC $h$+jets. Here "multijet" and "jets" contain jets of all flavors ($g, u, d, s, c, b$). Higgs decay to $4b$ via SM processes is also included as a background, in CC $h$+jets. Fig. 2 displays representative Feynman diagrams for these background processes.
To simulate the signal and backgrounds, we implement the effective interaction in Eq. (1) into FeynRules [25]. The generated model file together with the SM is then imported by MadGraph5_aMC@NLO [26]. The Higgs boson mass is taken to be $m_h = 125$ GeV. The $\phi$ mass is scanned over the region [15,60] GeV with a 1 GeV step size. The collider parameters are taken to be $E_e = 60$ GeV and $E_p = 7$ TeV, with the electron beam being $-0.9$ polarized. The signal and background samples are generated by MadGraph5_aMC@NLO at leading order with the NNPDF2.3 LO PDF [27], and the renormalization and factorization scales are set dynamically by the MadGraph default. The NLO QCD corrections to the signal process are known to be small [28]. In the following we take all the signal and background K-factors to be 1, although we expect that the correct background normalization could be obtained from data.
We apply jet energy smearing according to the energy resolution formula $\sigma_E/E = \alpha/\sqrt{E} \oplus \beta$, where $\alpha = 0.45\ \text{GeV}^{1/2}$ and $\beta = 0.03$ [13]. We consider four scenarios of b-tagging performance for jets with $p_T > 20$ GeV, each specified by the tagging efficiency $\epsilon_b$ of b-jets and by the faking probabilities $\epsilon_c$ and $\epsilon_{g,u,d,s}$ of c-jets and $g, u, d, s$-jets respectively. The LHeC detector (including the tracker) is expected to have a very large pseudorapidity coverage [29], and therefore we assume that the b-tagging performance of these scenarios is valid up to $|\eta| < 5$. We will also show the expected sensitivities with the smaller b-tagging pseudorapidity coverages $|\eta| < 4$ and $|\eta| < 3$, which turn out to change only slightly compared to the $|\eta| < 5$ case.
Event analysis is performed with MadAnalysis 5 [30]. The event selection in the 4b-tagging case first requires at least five jets satisfying a set of basic cuts. Events with additional charged leptons are vetoed. To suppress the photoproduction background, we exclude events which can be tagged by an electron tagger, and also require $\slashed{E}_T > E_0$, where $E_0$ denotes the threshold of missing transverse energy.
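The jet energy smearing step mentioned above can be illustrated with a minimal Python sketch. It assumes the common parameterization in which the stochastic term $\alpha/\sqrt{E}$ and the constant term $\beta$ are added in quadrature, a Gaussian response, and a uniform rescaling of the jet four-momentum; the function names and the four-momentum layout `(E, px, py, pz)` are hypothetical and are not taken from the analysis code used in the paper.

```python
import math
import random

ALPHA = 0.45  # stochastic term in GeV^(1/2), value quoted in the text
BETA = 0.03   # constant term, value quoted in the text


def jet_energy_resolution(energy):
    """Absolute resolution sigma_E (GeV) for a jet of the given energy (GeV).

    Assumes sigma_E / E = alpha / sqrt(E) (+) beta, combined in quadrature.
    """
    relative = math.hypot(ALPHA / math.sqrt(energy), BETA)
    return relative * energy


def smear_jet(jet, rng):
    """Return a smeared copy of jet = (E, px, py, pz).

    The energy is drawn from a Gaussian centred on the true energy, and the
    whole four-momentum is rescaled so that the jet direction is unchanged
    (an assumption made for this illustration).
    """
    energy = jet[0]
    smeared = max(rng.gauss(energy, jet_energy_resolution(energy)), 0.0)
    scale = smeared / energy
    return tuple(scale * component for component in jet)


if __name__ == "__main__":
    rng = random.Random(12345)  # fixed seed for reproducibility
    print(smear_jet((100.0, 60.0, 0.0, 80.0), rng))
```

With these parameters a 100 GeV jet has a relative resolution of about 5.4%, dominated by the stochastic term.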
In the following we take $E_0 = 40$ GeV as the default choice and assume that PHP backgrounds can accordingly be suppressed to a negligible level compared to the total background. This rough estimate of the missing energy threshold is inspired by a naive simulation of the direct photoproduction $j+4b$ process.$^2$ A thorough and detailed detector simulation of multijet photoproduction would be needed to determine the best $E_0$ (perhaps in synergy with appropriate missing energy isolation cuts or a cut on the ratio $V_{ap}/V_p$ of transverse energy flow anti-parallel and parallel to the hadronic final state transverse momentum vector [32]), which is however beyond the scope of the present paper. In the cut flow tables below we will also show the signal and total background for the case in which $E_0$ needs to be increased to 60 GeV.
$^2$ According to previous experience [31], in PHP multijet processes the resolved component becomes smaller than the direct component when a hard scale is involved. Therefore, as in [16], we do not expect resolved photoproduction of $j+4b$ to be a leading component of the PHP backgrounds. For direct photoproduction of $j+4b$ (photon virtuality $Q^2 < 1\ \text{GeV}^2$), we find a cross section of about 0.9 fb after the basic cuts and the 4b-tagging requirement. With electron tagging and the $4b$ and $2b$ invariant mass requirements, a further cross section reduction by two orders of magnitude could be expected. Because the total CC backgrounds are at the level of $10^{-3}$ fb to a few times $10^{-4}$ fb depending on $m_\phi$, the PHP backgrounds would become negligible if the $\slashed{E}_T > E_0$ cut, and perhaps missing energy isolation cuts, could bring down the PHP cross section by another two or three orders of magnitude. By our current rough estimate this could be achieved for $E_0 \sim 40-60$ GeV, given that the LHeC detector is supposed to have better resolution and coverage than the LHC detectors.
Table 1. The cross section (in units of fb) of the signal and major backgrounds after application of each cut in the corresponding row. The lepton veto and electron anti-tagging are implicit in the basic cuts. The signal corresponds to $C_{4b}^2 = 1$, $m_\phi = 20$ GeV. Here we assume b-tagging performance scenario (A) and a b-tagging pseudorapidity coverage $|\eta| < 5.0$. $E_0 = 40$ GeV is assumed, except that in the last row, for the signal and total background, we show in parentheses the values corresponding to $E_0 = 60$ GeV.
Table 2. Same as Table 1, but with the signal corresponding to $C_{4b}^2 = 1$, $m_\phi = 40$ GeV.
Table 3. Same as Table 1, but with the signal corresponding to $C_{4b}^2 = 1$, $m_\phi = 60$ GeV.
Then we impose the 4b-tagging requirement: at least 4 b-tagged jets in $|\eta| < 5.0$. The 4 b-tagged jets whose invariant mass is closest to $m_h$ are required to have their invariant mass $m_{4b}$ lie within a mass window. Finally we utilize the event structure of the signal: the 4 b-tagged jets picked out in the previous step are grouped into two pairs such that the absolute value of the invariant mass difference between the two pairs is the smallest among all grouping possibilities.
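The selection of the four b-tagged jets and their grouping into two pairs, as described above, can be sketched in Python as follows. This is an illustration only: the four-momentum layout `(E, px, py, pz)` and the function names are hypothetical, and the actual analysis in the paper is performed with MadAnalysis 5.

```python
import math
from itertools import combinations


def invariant_mass(*jets):
    """Invariant mass (GeV) of a set of jets given as (E, px, py, pz)."""
    energy = sum(j[0] for j in jets)
    px = sum(j[1] for j in jets)
    py = sum(j[2] for j in jets)
    pz = sum(j[3] for j in jets)
    m2 = energy**2 - px**2 - py**2 - pz**2
    return math.sqrt(max(m2, 0.0))  # guard against small negative rounding


def select_four(bjets, m_h=125.0):
    """Pick the 4 b-tagged jets whose invariant mass is closest to m_h."""
    return min(combinations(bjets, 4),
               key=lambda quad: abs(invariant_mass(*quad) - m_h))


def pair_b_jets(four_jets):
    """Group 4 jets into 2 pairs with the smallest pair-mass difference.

    Four jets admit exactly three distinct groupings into two pairs.
    """
    a, b, c, d = four_jets
    groupings = [((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))]
    return min(groupings,
               key=lambda g: abs(invariant_mass(*g[0]) - invariant_mass(*g[1])))


if __name__ == "__main__":
    # Toy massless jets built so that two pairs each have a mass of 40 GeV.
    jets = [(25.0, 25.0, 0.0, 0.0), (40.0, 0.0, 40.0, 0.0),
            (16.0, -16.0, 0.0, 0.0), (10.0, 0.0, -10.0, 0.0)]
    pair1, pair2 = pair_b_jets(select_four(jets))
    print(invariant_mass(*pair1), invariant_mass(*pair2))
```

The mass window requirements on $m_{4b}$ and on the two pair masses would then be applied to the outputs of `select_four` and `pair_b_jets`.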
Then we require the invariant masses $m_{2b,i}$ ($i = 1, 2$) of the two "correctly" grouped b-jet pairs to both lie within a mass window.
We present cut flow tables (Table 1, Table 2, Table 3) for three benchmark masses $m_\phi = 20, 40, 60$ GeV under b-tagging performance scenario (A). Only CC backgrounds are listed because PHP backgrounds are expected to be negligible due to electron tagging and an appropriate missing energy requirement. For the decay of $t, W, Z$ in the backgrounds, the following two cases are both considered and included in our results. One is the decay to a minimal number of partons, i.e. $t \to bq\bar{q}$, $W \to q\bar{q}$, $Z \to q\bar{q}$, with each parton identified as one jet. The other is the case in which one additional $b\bar{b}$ pair is radiated from the decay products of $t, W, Z$. For $t$+jets the second kind of process is found to contribute sizably to the total background. For the $h$+jets background, only $h \to b\bar{b}$ and $h \to 4b$ via tree-level SM processes are considered. Due to limited Monte Carlo statistics, there are slight differences among the three tables in the background values for the first four cuts. The cross sections shown in the tables correspond to the default choice $E_0 = 40$ GeV, except that in the last row of each table, for the signal and total background, we show in parentheses the final cross sections corresponding to $E_0 = 60$ GeV.
From the tables it can be concluded that the $h \to \phi\phi \to 4b$ channel at the LHeC is almost background free: with 100 fb$^{-1}$ luminosity the expected number of background events is at most $O(0.1)$, while the remaining signal cross section is $O(1\ \text{fb})$ for $C_{4b}^2 = 1$. This is in sharp contrast to the situation at the (HL-)LHC, where the signal is buried in large top quark backgrounds. Fig. 3 shows the expected 95% CLs [33,34] exclusion limits and $5\sigma$ discovery reach at the LHeC for $C_{4b}^2$ in the mass range [15,60] GeV, assuming 100 fb$^{-1}$ and 1 ab$^{-1}$ luminosity.
Various b-tagging performance scenarios are considered in the plots, all assuming a b-tagging pseudorapidity coverage $|\eta| < 5$. Because the expected number of background events is quite small, in setting exclusion limits and discovery reach we use the exact formulae of the Poisson distribution for a discrete random variable. This leads to some small discontinuities at certain $m_\phi$ values when the expected limits/reach are interpreted as the limits/reach for the median of the background-only or signal-plus-background hypothesis, as can be seen from the plots. From Fig. 3 it can be seen that for $m_\phi$ in the [20,60] GeV range the LHeC with 100 fb$^{-1}$ luminosity is capable of probing $C_{4b}^2$ to the few percent level, while with 1 ab$^{-1}$ luminosity the LHeC will eventually probe $C_{4b}^2$ down to the few per mille level, both at 95% CLs. We note that for $m_\phi = 20, 40, 60$ GeV, the 95% CLs upper limit on $C_{4b}^2$ is about 0.3% (0.5%), 0.2% (0.4%), 0.1% (0.2%) respectively, for b-tagging scenario (A), assuming $E_0 = 40$ GeV (60 GeV). The result is generally insensitive to the mistag rate of c-jets, because with the requirement of at least 4 b-tagged jets, fake backgrounds do not contribute much to the total background. On the other hand, the final signal rate is approximately proportional to the fourth power of the b-tagging efficiency, so it is relatively important to maintain a high b-tagging efficiency in order to retain more signal events. As expected, the sensitivity drops quickly when $m_\phi$ becomes smaller than about 20 GeV, because the collimation of the $\phi$ decay products renders the resolved analysis inefficient. A jet substructure analysis is needed to improve the sensitivity in this mass region, which we leave for future study. On the other hand, the sensitivity improves as $m_\phi$ increases from about 40 GeV to 60 GeV.
This is mainly because the b-jets from the $h \to \phi\phi \to 4b$ decay in the $m_\phi = 60$ GeV case are more likely to pass the basic cuts (especially the $p_{Tj} > 20$ GeV cut) than in the $m_\phi = 40$ GeV case. Fig. 4 also shows the expected 95% CLs exclusion limits and $5\sigma$ discovery reach at the LHeC for $C_{4b}^2$ in the mass range [15,60] GeV, assuming 100 fb$^{-1}$ and 1 ab$^{-1}$ luminosity. Here the b-tagging performance is fixed to scenario (A), but various b-tagging pseudorapidity coverage conditions are considered. The plots indicate that the sensitivity reach of the LHeC for this channel depends only weakly on the b-tagging pseudorapidity coverage.
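To make the exact-Poisson treatment of the limits more concrete, the following Python sketch computes an expected 95% CLs upper limit for a single-bin counting experiment, taking the observed count to be the median of the background-only distribution. It is a simplified illustration under stated assumptions, not the authors' statistical code: the function names are hypothetical, and the cross sections in the usage example are round numbers chosen to match the orders of magnitude quoted in the text ($O(1\ \text{fb})$ of signal for $C_{4b}^2 = 1$ and about $10^{-3}$ fb of background).

```python
import math


def poisson_cdf(n, mu):
    """P(N <= n) for N ~ Poisson(mu), summed term by term."""
    term = math.exp(-mu)
    total = term
    for k in range(1, n + 1):
        term *= mu / k
        total += term
    return total


def poisson_median(mu):
    """Smallest n with P(N <= n) >= 0.5."""
    n = 0
    while poisson_cdf(n, mu) < 0.5:
        n += 1
    return n


def cls(s, b, n_obs):
    """CLs = P(N <= n_obs | s + b) / P(N <= n_obs | b)."""
    return poisson_cdf(n_obs, s + b) / poisson_cdf(n_obs, b)


def expected_signal_limit(b, cl=0.95, tol=1e-6):
    """Signal yield excluded at the given CL when n_obs is the b-only median."""
    n_obs = poisson_median(b)
    lo, hi = 0.0, 10.0
    while cls(hi, b, n_obs) > 1.0 - cl:  # enlarge bracket if necessary
        hi *= 2.0
    while hi - lo > tol:                 # bisection; CLs decreases with s
        mid = 0.5 * (lo + hi)
        if cls(mid, b, n_obs) > 1.0 - cl:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def c4b2_limit(sigma_sig_fb, sigma_bkg_fb, lumi_fb):
    """Expected limit on C_4b^2, with sigma_sig_fb the signal rate at C_4b^2 = 1."""
    b = sigma_bkg_fb * lumi_fb
    return expected_signal_limit(b) / (sigma_sig_fb * lumi_fb)


if __name__ == "__main__":
    # Illustrative inputs only: 1 fb signal, 1e-3 fb background, 100 fb^-1.
    print(c4b2_limit(1.0, 1.0e-3, 100.0))
```

When the median background count is zero, CLs reduces to $e^{-s}$ and the excluded signal yield is $\ln 20 \approx 3$ events, which for the illustrative inputs above corresponds to a limit on $C_{4b}^2$ at the few percent level. The integer-valued median is also the origin of the small discontinuities mentioned in the text: the limit jumps whenever the median background count changes by one unit.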

Constraints on the Higgs Singlet Extension of the SM
We now consider the interpretation of the expected sensitivity of the LHeC in the context of the Higgs singlet extension of the SM. For simplicity we consider the Higgs singlet extension studied in [35]. In this model, an additional real singlet scalar $S$ is added to the SM. The Lagrangian of the Higgs kinetic and potential terms is extended to include $S$, with a scalar potential that obeys a $Z_2$ symmetry; here $\Phi$ denotes the original SM Higgs doublet. We allow $S$ to acquire a vacuum expectation value and express the Higgs fields in unitary gauge, where $v = 246$ GeV ensures the correct mass generation for the $W, Z$ bosons and SM fermions. The gauge eigenstates can be related to the mass eigenstates $\phi$, $h$ via an orthogonal rotation with mixing angle $\alpha$. It is then convenient to parameterize the model in terms of five more physical quantities: $m_\phi$, $m_h$, $\alpha$, $\tan\beta$ and $v$ ($m_\phi$ and $m_h$ are the masses of $\phi$ and $h$ respectively). The translation formulae between these quantities and the original parameters in the Lagrangian can be found in [35]. We are interested in the case in which the additional Higgs boson is lighter, therefore we fix $m_h = 125$ GeV and allow the three parameters $m_\phi$, $\alpha$, $\tan\beta$ to vary. Here we focus on the more interesting region where $\sin\alpha \to -1$, which also allows for a special direction $\tan\beta = -\cot\alpha$ that results in a vanishing $\text{Br}(h \to \phi\phi)$ [35].
We consider three benchmark values of $m_\phi$ ($m_\phi = 20, 40, 60$ GeV) and plot the current LEP and LHC constraints and the future HL-LHC and LHeC constraints on the $\tan\beta - \sin\alpha$ plane, see Fig. 5. Each point is colored according to its $C_{4b}^2$ value for reference. The factor $\text{Br}^2(\phi \to b\bar{b})$ appearing in the definition of $C_{4b}^2$, Eq. (2), is almost constant at a value of 0.77 in the mass range $m_\phi \in [20, 60]$ GeV [36]. The deep black regions, which correspond to very small $C_{4b}^2$ values, tilt slightly upwards as $|\sin\alpha|$ decreases from about 0.995 to 0.980.
In this range these regions center around the above-mentioned special direction $\tan\beta = -\cot\alpha$, which makes $\text{Br}(h \to \phi\phi)$ vanish [35] and renders its vicinity difficult to probe through exotic Higgs decay searches. The LEP constraints (green dashed line) come from direct searches for additional Higgs bosons and are taken directly from [35]. Points on the right side of the green dashed line are excluded at 95% confidence level. This indicates that the LEP searches force the mixing between the two Higgs bosons to be very small for the scenario in which there is a light Higgs boson in the mass range $m_\phi \in [20, 60]$ GeV. In such a case there cannot be a sizable deviation of the Higgs signal strength due to Higgs mixing. However, the opening of the exotic Higgs decay $h \to \phi\phi$ could lead to a sizable suppression of the 125 GeV Higgs signal strengths. The LHC Run I constraints (white solid line) come from the 125 GeV Higgs signal strength measurements [4]. The regions between the two white solid lines for $m_\phi = 20, 40$ GeV (and the region below the white solid lines for the $m_\phi = 60$ GeV case) are allowed by LHC Run I measurements at the $2\sigma$ level. We translated the HL-LHC projection of the precision of Higgs signal strength measurements [37] into constraints (yellow solid line "HL-LHC ind.") on the parameter space of the Higgs singlet extension of the SM (assuming half theoretical uncertainties, according to [37]). At the (HL-)LHC, $h \to \phi\phi \to 4b$ can be directly probed via $Wh$ associated production, as has been done by ATLAS [12]. However, the current constraint from this method is quite weak and even $C_{4b}^2 = 1$ cannot be excluded. We extrapolate the current constraint [12] to the 3 ab$^{-1}$ HL-LHC, with the very optimistic assumption that all selection efficiencies can be maintained and all systematic uncertainties scale with the square root of luminosity. The corresponding 95% CLs exclusion limit is plotted as the yellow dotted line "HL-LHC dir.(opt.)".
It can be seen that even with this very optimistic assumption the sensitivity of the direct search in the $Wh$ channel is at most comparable to the indirect constraint from the HL-LHC 125 GeV Higgs signal strength measurements. The LHeC 1 ab$^{-1}$ 95% CLs sensitivity is plotted as the red solid lines, assuming b-tagging scenario (A), b-tagging pseudorapidity coverage $|\eta| < 5.0$ and $E_0 = 40$ GeV. The LHeC is expected to exclude the region outside the red solid lines if no new physics exists. It is obvious that the LHeC exclusion capability extends to the deep black region which represents very

Discussion and Conclusion
In this paper we studied the LHeC sensitivity to the exotic Higgs decay process $h \to \phi\phi \to 4b$, in which $\phi$ denotes a spin-0 particle lighter than half of 125 GeV. We performed a parton level analysis and showed that with 1 ab$^{-1}$ luminosity the LHeC is able to exclude $C_{4b}^2$ at the few per mille level (95% CLs) when only statistical uncertainties are included. To maintain the sensitivity, it is important to choose a b-tagging working point with relatively large b-tagging efficiency. The sensitivity depends only weakly on the variation of the b-tagging pseudorapidity coverage from 3 to 5. Using the Higgs singlet extension of the SM as an illustration, we showed that the LHeC direct search for $h \to \phi\phi \to 4b$ will be the most sensitive probe of much of the parameter space of the model in the future, if no lepton colliders are available. This LHeC search will also have a significant impact on the scalar sector of other BSM theories when one of the scalar bosons lies in the mass range $\sim (2m_b, m_h/2)$.
The analysis presented here can be further improved in several respects. The first is a more realistic estimation of the signal and backgrounds including parton shower and more detailed detector effects. Especially for multijet final states, a parton shower correctly merged with the matrix element will be highly desirable. Secondly, we could further utilize samples requiring fewer b-tagged jets or even fewer reconstructed jets, e.g. three b-tagged jets. This technique has already been used in [12] and is expected to further improve the sensitivity, especially in the first stages of data collection when statistics are small. Thirdly, we have only applied a cut-based analysis with very simple variables. A multivariate analysis may deliver an additional gain in sensitivity. Furthermore, the sensitivity in the $m_\phi < 20$ GeV mass range could be improved via a jet substructure analysis, as has been emphasized. Besides these directions of exploration, it should be emphasized that in the current analysis PHP backgrounds are assumed to be negligible compared to CC backgrounds under the condition discussed in Section 2. A more detailed detector simulation is thus needed to pin down the event selection conditions required to suppress PHP backgrounds. We also note that systematic uncertainties have not been included in the present study. However, since the expected number of background events is very small, we expect that the obtained sensitivity (discovery and exclusion reach) would be qualitatively stable against systematic uncertainties, which means that the 1 ab$^{-1}$ LHeC could still do much better than the HL-LHC with respect to the $h \to \phi\phi \to 4b$ search.
Exotic Higgs decays constitute an intriguing and important part of Higgs physics which deserves comprehensive theoretical and experimental investigation. Previous attempts and attention have nearly all been devoted to hadron-hadron collisions or $e^+e^-$ collisions. We demonstrate in this paper that for certain important processes which suffer from large backgrounds in hadron-hadron collisions, it is clearly superior to conduct the search at a concurrent $ep$ collider, if an $e^+e^-$ machine with sufficient center-of-mass energy is not available. In that case, the $ep$ machine can be expected to play an important role in precision Higgs studies, including the study of exotic Higgs decays like $h \to \slashed{E}_T$ [20], $h \to \phi\phi \to 4b$ and other channels beset by jets or $\slashed{E}_T$ [38,39].