Studies of Spin Effects in Charged Higgs Boson Production with an Iterative Discriminant Analysis at the Tevatron and LHC

We report on detailed Monte Carlo comparisons of selection variables to separate tbH+ signal events from the Standard Model ttbar background using an Iterative Discriminant Analysis (IDA) method. While kinematic differences exist between the two processes whenever m(H+) ≠ m(W+), the exploration of the spin difference between the charged Higgs and the W+ gauge boson becomes crucial in the particularly challenging case of near degeneracy of the charged Higgs boson mass with the W+ mass. The TAUOLA package is used to decay the tau leptons emerging from the charged Higgs and W+ boson decays taking the spin difference properly into account. We demonstrate that, even if the individual selection variables have limited discriminant power, the IDA method achieves a significant separation between the expected signal and background. For both Tevatron and LHC energies, the impact of the spin effects and H+ mass on the separation of signal and background has been studied quantitatively. The effect of a hard transverse momentum cut to remove QCD background has been studied and it is found that the spin effects remain important. The separation is expressed in purity versus efficiency curves. The study is performed for charged Higgs boson masses between the W+ mass and near the top mass.


INTRODUCTION
The importance of charged Higgs boson searches has been emphasized in recent years [1,2,3,4] for LEP, a future International Linear Collider (ILC), the Tevatron and the Large Hadron Collider (LHC), as the detection of a charged Higgs boson would be a definite signal for the existence of New Physics beyond the Standard Model (SM). Charged Higgs bosons naturally arise in non-minimal Higgs scenarios, such as Two-Higgs Doublet Models (2HDMs). A Supersymmetric version of the latter is the Minimal Supersymmetric Standard Model (MSSM). It is a Type II 2HDM with specific relations among neutral and charged Higgs boson masses and couplings, dictated by Supersymmetry (SUSY) [5].
The Tevatron collider at Fermilab is currently in its second stage of operation, so-called Run 2, with a center-of-mass (CM) energy of √ s = 1.96 TeV. This machine will be the first one to directly probe charged Higgs boson masses in the mass range up to m H ± ∼ m t . Starting from 2008, the LHC at CERN will be in a position to confirm or rule out the existence of such a particle over a very large portion of both the 2HDM and MSSM parameter space, m H ± < ∼ 400 GeV, depending on tan β, the ratio of the vacuum expectation values of the two Higgs doublets (see the reviews [6,7,8] and a recent study [9]).
At present, a lower bound on the charged Higgs boson mass exists from LEP [10], m H ± > ∼ m W ± , independently of the charged Higgs boson decay Branching Ratios (BRs). This limit is valid within any Type II 2HDM whereas, in the low tan β region (below about 3), an indirect lower limit on m H ± can be derived in the MSSM from the one on m A (the mass of the pseudoscalar Higgs state of the model) through the tree-level relation $m_{H^\pm}^2 \approx m_{W^\pm}^2 + m_A^2$. If the charged Higgs boson mass m H ± satisfies m H ± < m t − m b , where m t is the top quark mass and m b the bottom quark mass, H ± bosons could be produced in the decay of on-shell (i.e., Γ t → 0) top (anti-)quarks t → bH + , the latter being in turn produced in pairs via gg fusion and qq annihilation. This approximation is the one customarily used in event generators when m H ± < ∼ m t . Throughout this study we adopt the same notation as in Ref. [11]: charged Higgs production is denoted by qq, gg → tt → tbH ± if due to (anti-)top decays and by qq, gg → tbH ± if further production diagrams are included. In fact, owing to the large top decay width (Γ t ≃ 1.5 GeV) and due to the additional diagrams which do not proceed via direct tt production [12,13,14], charged Higgs bosons could also be produced at and beyond the kinematic top decay threshold. The importance of these effects in the so-called 'threshold' or 'transition' region (m H ± ≈ m t ) was emphasized in Les Houches proceedings [15,16] as well as in Refs. [11,17,18,19], so that the calculations of Refs. [12,13] (based on the appropriate qq, gg → tbH ± description) are now implemented in HERWIG [20,21,22,23] and PYTHIA [24,25]. A comparison between the two generators was carried out in Ref. [11]. For any realistic simulation of H ± production with m H ± > ∼ m t the use of these implementations is important. In addition, in the mass region near the top quark mass, a matching of the calculations for the qq, gg → tbH ± and gb → tH ± processes might be required [25].
A charged Higgs boson with m H ± < ∼ m t decays predominantly into a τ lepton and a neutrino. For large values of tan β ( > ∼ 5) the corresponding BR is near 100%. For m H ± > ∼ m t , H ± → τ ν τ is overtaken by H ± → tb, but the latter is much harder to disentangle from background than the former. The associated top quark decays predominantly into a W ± boson, or at times a second charged Higgs boson, and a b quark. The reaction is then a promising channel to search for a charged Higgs boson at both the Tevatron (where the dominant production mode is qq) and the LHC (where gg is the leading subprocess). If the H ± → τ ν τ decay channel is used to search for Higgs bosons, then a key ingredient in the signal selection process should be the exploitation of decay distributions that are sensitive to the spin nature of the particle yielding the τ lepton (H ± in the signal or W ± in the background), as advocated in Refs. [26,27,28,29] (see also [30,31]). The τ spin information affects both the energy and the angular distribution of the τ decay products.
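As an illustration of how the τ polarization shapes the energy of the decay products, the simplest hadronic mode τ − → π − ν τ can be written down explicitly. The expressions below are the standard textbook result in the collinear limit (E τ ≫ m τ , pion mass neglected) and are not taken from this study; the symbols θ (pion emission angle in the τ rest frame relative to the τ flight direction), x (pion energy fraction) and P τ (τ polarization) are introduced here only for the illustration.

```latex
\frac{1}{\Gamma}\frac{d\Gamma}{d\cos\theta} = \frac{1}{2}\left(1 + P_\tau \cos\theta\right),
\qquad
x \equiv \frac{E_\pi}{E_\tau} \simeq \frac{1 + \cos\theta}{2}
\quad\Longrightarrow\quad
\frac{1}{\Gamma}\frac{d\Gamma}{dx} \simeq 1 + P_\tau\,(2x - 1), \qquad 0 \le x \le 1 ,
```

With P τ = +1 for τ − from H − decays and P τ = −1 for τ − from W − decays, the pion spectrum is approximately 2x (hard pions) in the signal and 2(1 − x) (soft pions) in the background, which is why the momentum fraction carried by the leading charged pion is expected to be a spin-sensitive variable.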
In the search for a charged Higgs boson signal containing a τ lepton, not only the magnitude of the production cross section is important, but also the efficiency of identifying the τ lepton in the hadronic environment plays a crucial role. Since τ leptons have a very short life-time (∼ 3 × 10 −13 s), they decay within the detectors and can only be identified through their decay products. In about 35% of the cases they decay leptonically and in about 65% of the cases they decay hadronically. Both of these decay modes are usually addressed in charged Higgs boson searches by employing dedicated τ lepton triggers. The identification of taus in hadronic pp collisions has recently been studied, e.g. in Z → τ + τ − events [32], and further details are given in [33].
It is the purpose of this note to outline the possible improvements that can be achieved at the Tevatron and LHC in the search for charged Higgs bosons, with mass below the top mass and including the appropriate description of the spin effects in the H ± → τ ν τ decay. In order to quantify the spin effect an Iterative Discriminant Analysis (IDA) method has been applied, which is a powerful tool to separate signal and background, even in cases such as the one presently under study when several selection variables with limited discriminant power are present.

TEVATRON ENERGY
We start by studying charged Higgs production qq, gg → tbH ± with subsequent decays t → bW , H ± → τ ν τ at the FNAL Tevatron with √ s = 1.96 TeV. In the following we analyze hadronic decays of the W ± boson and τ lepton (W ± → qq ′ , τ → hadrons+ν τ ), which results in the signature 2b+2j+τ jet +p miss t (2 b jets, 2 light jets, 1 τ jet and missing transverse momentum). The most important irreducible background process is qq, gg → tt with the subsequent decays t → bW + and t̄ → b̄W − , one W ± boson decaying hadronically (W ± → qq ′ ) and one leptonically (W ∓ → τ ν τ ), which results in the same final state particles as for the expected signal.

Simulation and Detector Response
The signal process qq, gg → tbH ± is simulated with PYTHIA [24]. The subsequent decays t → bW ± (or its charge conjugate), W ± → qq ′ and H ∓ → τ ν τ are also carried out within PYTHIA, whereas the τ leptons are decayed externally with the program TAUOLA [34,35], which includes the complete spin structure of the τ decay. The background process qq, gg → tt is also simulated with PYTHIA with the built-in subroutines for tt production. The decays of the top quarks and W ± bosons are performed within PYTHIA and that of the τ lepton within TAUOLA.
The momenta of the final b and light quarks from the PYTHIA event record are taken as the momenta of the corresponding jet, whereas for the τ jet the sum of all non-leptonic final state particles as given by TAUOLA is used. The energy resolution of the detector and parton shower and hadronization effects are emulated through a Gaussian smearing $(\Delta(p_t)/p_t)^2 = (0.80/\sqrt{p_t})^2$ of the transverse momentum p t for all jets in the final state, including the τ jet [3]. As is typical for fast simulation studies, no effects of underlying events are simulated. Events are removed which contain jets with less than 20 GeV transverse momentum, corresponding to about |η| > 3. The transverse momentum of the leading charged pion in the τ jet is assumed to be measured in the tracker independently of the transverse momentum of the τ jet. The identification and momentum measurement of the pion is important to fully exploit the τ spin information. In order to take into account the tracker performance we apply Gaussian smearing on 1/p π t with $\sigma(1/p_t^\pi)\,[\mathrm{TeV}^{-1}] = \sqrt{0.52^2 + 22^2/\left((p_t^\pi\,[\mathrm{GeV}])^2 \sin\theta_\pi\right)}$, where θ π is the polar angle of the π. The missing transverse momentum p miss t is constructed from the transverse momenta of all visible jets (including the visible τ decay products) after taking the modelling of the detector into account. The generic detector description is a good approximation for both Tevatron experiments, CDF and D0.
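The detector emulation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used in the study: the function names are hypothetical, the tracker resolution is read as a quadrature sum with sin θ π in the denominator of the second term, and the treatment of a non-positive smeared 1/p t is an arbitrary choice made here.

```python
import math
import random

def smear_jet_pt(pt_gev, rng):
    """Gaussian jet smearing with sigma(pt)/pt = 0.80/sqrt(pt), pt in GeV."""
    sigma = 0.80 * math.sqrt(pt_gev)  # absolute resolution in GeV
    return rng.gauss(pt_gev, sigma)

def pion_inv_pt_resolution(pt_gev, theta):
    """Resolution on 1/pt in TeV^-1 (assumed reading of the quoted formula)."""
    return math.sqrt(0.52**2 + 22.0**2 / (pt_gev**2 * math.sin(theta)))

def smear_pion_pt(pt_gev, theta, rng):
    """Smear 1/pt of the leading pion and convert back to pt in GeV."""
    inv_pt_tev = 1000.0 / pt_gev  # 1/pt expressed in TeV^-1
    smeared = rng.gauss(inv_pt_tev, pion_inv_pt_resolution(pt_gev, theta))
    # Assumption: a non-positive smeared curvature is treated as unmeasurable.
    return 1000.0 / smeared if smeared > 0.0 else float("inf")

def passes_jet_cut(jet_pts_gev, min_pt_gev=20.0):
    """Basic cut: every jet must have pt above 20 GeV."""
    return all(pt > min_pt_gev for pt in jet_pts_gev)

if __name__ == "__main__":
    rng = random.Random(12345)  # fixed seed so the toy output is reproducible
    jets = [smear_jet_pt(pt, rng) for pt in (60.0, 45.0, 35.0, 30.0, 50.0)]
    print(jets, passes_jet_cut(jets))
    print(smear_pion_pt(25.0, math.pi / 2.0, rng))
```

Smearing 1/p t rather than p t mirrors the statement in the text that the Gaussian is applied to 1/p π t ; the resulting p t distribution is therefore slightly asymmetric.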

Expected Rates
For completeness we present a brief discussion of the expected cross section of the charged Higgs boson signature under investigation. The signal cross section has been calculated for tan β = 30 and m H ± = 80, 100, 130 and 150 GeV with PYTHIA, version 6.325, using the implementation described in [25], in order to take the effects in the transition region into account. Furthermore, it has been shown in [11] that the signal cross section for tbH ± agrees with the one from the top-decay approximation tt → tbH ± for charged Higgs boson masses up to about 160 GeV if the same factorization and renormalization scales are used. Thus, we have used everywhere in this study the factorization scale (m t + m H ± )/4 and the renormalization scale m H ± for both signal and background (i.e., those recommended in [25] as most appropriate for the tbH ± signal), since the primary purpose of our study is to single out variables that show a difference between our W ± and H ± data samples that can unambiguously be ascribed to the different nature of the two kinds of bosons (chiefly, their different mass and spin state). In addition, the running b quark mass entering in the Yukawa coupling of the signal has been evaluated at m H ± . This procedure eventually results in a dependence of our background calculations on tan β and, especially, m H ± that is more marked than the one that would more naturally arise as only due to indirect effects through the top decay width. Hence, the cross sections have been rescaled with a common factor such that the total tt cross section is σ prod tt = 5.2 pb [36]. To be more specific, we have first calculated the total cross section σ prod,PYTHIA tt (m H ± ) with the built-in routine for tt production in PYTHIA for all m H ± = 80, 100, 130 and 150 GeV and then calculated from this the respective rescaling factors c(m H ± ) = 5.2 pb/σ prod,PYTHIA tt (m H ± ) for each m H ± .
Then we have calculated the background cross section for m H ± = 80 GeV into the final state with the signature 2b + 2j + τ jet + p miss t by enforcing the respective decay channels in PYTHIA using the built-in routine for tt production and multiplied it with c(80 GeV). In the same manner we have calculated the signal cross sections with the PYTHIA routines for tbH ± production by enforcing the respective decay channels in PYTHIA and multiplying with the rescaling factors c(m H ± ) for m H ± = 80, 100, 130, 150 GeV. The resulting cross sections are given in Table 1 before (σ th ) and after (σ) applying the basic cuts p jets t > 20 GeV and the hard cut p miss t > 100 GeV. For the four signal masses, the tbH ± and tt → tbH ± cross section calculations agree numerically.
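The rescaling procedure amounts to simple arithmetic, sketched below. The function names are hypothetical, and the numbers in the usage example are placeholders chosen only to show the calculation; they are not PYTHIA results and not values from Table 1.

```python
TTBAR_XSEC_PB = 5.2  # total ttbar production cross section used for the normalisation [36]

def rescaling_factor(sigma_pythia_ttbar_pb):
    """c(m_H) = 5.2 pb / sigma_ttbar^PYTHIA(m_H) for one charged Higgs mass."""
    return TTBAR_XSEC_PB / sigma_pythia_ttbar_pb

def rescaled_xsec(sigma_channel_pb, sigma_pythia_ttbar_pb):
    """Rescale a signal or background cross section for the 2b+2j+tau_jet+pt_miss channel."""
    return sigma_channel_pb * rescaling_factor(sigma_pythia_ttbar_pb)

if __name__ == "__main__":
    # Placeholder inputs (hypothetical): total ttbar and channel cross sections in pb.
    sigma_ttbar_pythia = 6.5
    sigma_channel = 0.5
    print(rescaling_factor(sigma_ttbar_pythia))
    print(rescaled_xsec(sigma_channel, sigma_ttbar_pythia))
```

Because the factor is computed separately for each m H ± , the scale-induced mass dependence of the PYTHIA tt cross section cancels in the rescaled rates.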

Event Preselection and Discussion of Discriminant Variables
The expected cross sections of the 2b + 2j + τ jet + p miss t signature are of the same order of magnitude for the signal and background reactions, as shown in Table 1. Thus, the same number of signal and background events is assumed for the analysis of different kinematic selection variables. For the signal 5 · 10 5 events have been simulated with PYTHIA for each charged Higgs mass at the Tevatron energy of 1.96 TeV using the built-in tt routine in the tt → tbH ± approximation, while for the tt background also 5 · 10 5 events have been simulated using the built-in tt routine. Then the basic cuts p jets t > 20 GeV are applied. An additional hard cut on the missing transverse momentum p miss t > 100 GeV is used to suppress the QCD background, as for example demonstrated in Ref. [30]. After the additional anti-QCD cut about 28000 to 42000 signal events, depending on the simulated charged Higgs boson mass, and about 30000 tt background events remain. Other background reactions, for example W+jet production, are expected to be negligible because they have either a much lower production cross section or are strongly suppressed compared to tt background, as quantified for example in Ref. [30]. In addition to the previous study (based on 5000 × BR(τ → hadrons) events each) [33], the present one applies an IDA method [37] to explore efficiencies and purities. As already mentioned, particular attention is devoted to the study of spin sensitive variables in the exploitation of polarization effects for the separation of signal and background events. The following selection variables are considered:
• the transverse momentum of the τ jet, p τ jet t (Fig. 1),
• the transverse momentum of the leading π ± in the τ jet, p π ± t (Fig. 2),
• the ratio p π ± t /p τ jet t (Fig. 3),
• the transverse momentum of the second (least energetic) b quark jet, p b 2 t (Fig. 4),
• the transverse mass in the τ jet + p miss t system, $m_t = \sqrt{2\, p_t^{\tau_{jet}}\, p_t^{miss}\, [1 - \cos(\Delta\phi)]}$, where ∆φ is the azimuthal angle between the τ jet and p miss t (Fig. 5),
• the invariant mass distribution of the two light quark jets and the second b quark jet, m jjb 2 (Fig. 6),
• the spatial distance between the τ jet and the second b quark jet, $\Delta R(\tau, b_2) = \sqrt{(\Delta\phi)^2 + (\Delta\eta)^2}$, where ∆φ is the azimuthal angle between the τ and b jet and ∆η their pseudorapidity difference (Fig. 7), and
• the sum of the (scalar) transverse momenta of all the quark jets, $H_{jets} = p_t^{j_1} + p_t^{j_2} + p_t^{b_1} + p_t^{b_2}$ (Fig. 8).
The distributions of signal and background events are normalized to the same number of 10 4 events, in order to make small differences better visible. The results of the IDA study are shown in Figs. 9 and 10 for the event samples with spin effect in the τ decays for m H ± = 80, 100, 130, 150 GeV and for the reference samples without the spin effect for m H ± = 80 GeV in order to illustrate the spin effect. In all plots of the IDA output variable the number of background events has been normalized to the number of signal events. Two IDA steps have been performed. Figure 9 shows the IDA output variable after the first step, where 90% of the signal is retained when a cut at zero is applied. The signal and background events after this cut are then passed to the second IDA step. Figure 10 shows the IDA output variable distributions after the second step. A cut on these distributions leads to the efficiency and purity (defined as the number of signal events divided by the sum of signal and background events) combinations as shown in the lower right plot in Fig. 10. These combinations define the working point (number of expected background events for a given signal efficiency) and the latter can be optimized to maximize the discovery potential. The difference between the dashed (no spin effects in τ decay) and solid (with spin effects in τ decay) lines for m H ± = 80 GeV in the lower right plot in Fig. 10 stresses again the importance of the spin effects to separate signal and background.
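For orientation, the angular and transverse-momentum selection variables listed earlier in this section can be computed from jet kinematics as in the sketch below. It assumes the standard collider definitions of the transverse mass and of the ∆R distance; the function names and the (p t , η, φ) inputs are hypothetical and are not taken from the analysis code.

```python
import math

def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into (-pi, pi]."""
    d = (phi1 - phi2) % (2.0 * math.pi)
    return d - 2.0 * math.pi if d > math.pi else d

def transverse_mass(pt_tau_jet, phi_tau_jet, pt_miss, phi_miss):
    """m_t = sqrt(2 pt_tau pt_miss (1 - cos(dphi))) for the tau jet + pt_miss system."""
    dphi = delta_phi(phi_tau_jet, phi_miss)
    return math.sqrt(2.0 * pt_tau_jet * pt_miss * (1.0 - math.cos(dphi)))

def delta_r(eta1, phi1, eta2, phi2):
    """Spatial distance sqrt(dphi^2 + deta^2) between two jets."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))

def h_jets(pt_j1, pt_j2, pt_b1, pt_b2):
    """Scalar sum of the transverse momenta of the four quark jets."""
    return pt_j1 + pt_j2 + pt_b1 + pt_b2

def pion_momentum_fraction(pt_pion, pt_tau_jet):
    """Spin-sensitive ratio of leading-pion pt to tau-jet pt."""
    return pt_pion / pt_tau_jet

if __name__ == "__main__":
    # Toy kinematics in GeV and radians, chosen only for illustration.
    print(transverse_mass(40.0, 0.3, 110.0, 2.8))
    print(delta_r(0.5, 0.3, -0.4, 2.9))
    print(h_jets(30.0, 25.0, 50.0, 35.0))
    print(pion_momentum_fraction(28.0, 40.0))
```

Wrapping ∆φ before use matters for jets on either side of the φ = ±π boundary, where a naive difference would overestimate ∆R.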
In order to illustrate the effect of the hard cut on the missing transverse momentum (p miss t > 100 GeV), which is imposed to suppress the QCD background, the final efficiency-purity plot of the IDA analysis is shown in Fig. 11 for m H ± = 80 GeV for two reference samples (red, long dashed: with spin effects in the τ decay; red, dotted: without spin effects) without imposing the hard cut. The black lines (dashed and solid) are for the samples with the hard cut as also shown in the lower right plot in Fig. 10. As expected, the achievable purity for a given efficiency decreases with the hard cut; the spin effects therefore become even more important to separate signal and background. In principle, by choosing the signal reduction rates in the previous IDA iterations, the signal and background rates in the final distributions can be varied appropriately. However, we have checked that a different number of IDA iterations and/or different efficiencies for the first IDA iteration have only a minor effect on the final result.
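The way a cut on a discriminant output translates into an efficiency-purity working point can be sketched as follows. The sketch does not implement the IDA itself: it takes per-event discriminant scores as given inputs (hypothetical here), assumes equally sized signal and background samples as in the text, and uses purity = S/(S + B). All function names are illustrative.

```python
import random

def cut_for_signal_efficiency(signal_scores, efficiency):
    """Score threshold that retains (about) the requested fraction of signal events."""
    ordered = sorted(signal_scores, reverse=True)
    n_keep = max(1, round(efficiency * len(ordered)))
    return ordered[n_keep - 1]

def efficiency_and_purity(signal_scores, background_scores, cut):
    """Signal efficiency and purity S/(S+B) for events with score >= cut."""
    n_sig = sum(1 for s in signal_scores if s >= cut)
    n_bkg = sum(1 for b in background_scores if b >= cut)
    efficiency = n_sig / len(signal_scores)
    purity = n_sig / (n_sig + n_bkg) if (n_sig + n_bkg) > 0 else 0.0
    return efficiency, purity

def purity_efficiency_curve(signal_scores, background_scores, target_efficiencies):
    """List of (efficiency, purity) points, one per requested signal efficiency."""
    curve = []
    for target in target_efficiencies:
        cut = cut_for_signal_efficiency(signal_scores, target)
        curve.append(efficiency_and_purity(signal_scores, background_scores, cut))
    return curve

if __name__ == "__main__":
    rng = random.Random(7)  # toy scores: two overlapping Gaussians, not IDA output
    signal = [rng.gauss(0.5, 1.0) for _ in range(10000)]
    background = [rng.gauss(-0.5, 1.0) for _ in range(10000)]
    # Step 1: keep 90% of the signal, then scan working points on the survivors.
    first_cut = cut_for_signal_efficiency(signal, 0.90)
    signal = [s for s in signal if s >= first_cut]
    background = [b for b in background if b >= first_cut]
    for eff, pur in purity_efficiency_curve(signal, background, [0.2, 0.4, 0.6, 0.8]):
        print(f"efficiency={eff:.2f} purity={pur:.2f}")
```

In the toy usage the same score is reused after the first cut; in the analysis a new discriminant is trained on the surviving events, and the efficiencies of the two steps multiply.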
Figs. 1-8: Distributions of the selection variables for the tbH ± signal and the tt background for √ s = 1.96 TeV (left) and the respective differences between signal and background (right).

LHC ENERGY
The simulation procedure and the emulation of the detector response are the same as those outlined in Sect. 2.1 for the Tevatron, and the preselection and the IDA method are the same as those described in Sects. 2.3 and 2.4, respectively. Hence, only the expected LHC rates are discussed, followed by the description of changes in the distributions of the variables and the final IDA results.
Unlike the case of the Tevatron, where only charged Higgs masses smaller than the top quark mass can be explored, and 2HDM/MSSM signatures practically rely on τ ν τ pairs only, at the LHC the phenomenology is more varied. Here, the search strategies depend strongly on the charged Higgs boson mass. If m H ± < m t − m b (later referred to as a light Higgs boson), the charged Higgs boson can be produced in top (anti-)quark decay. The main source of top (anti-)quarks at the LHC is again tt pair production (σ tt = 850 pb at NLO) [38]. For the whole (tan β, m H ± ) parameter space there is a competition between the bW ± and bH ± channels in top decay keeping the sum BR(t → bW + ) + BR(t → bH + ) at almost unity. The top quark decay to bW ± is however the dominant mode for most of the parameter space. Thus, the best way to search for a (light) charged Higgs boson is by requiring that the top quark produced in the tbH ± process decays to a W ± . While in the case of H ± decays τ 's will be tagged via their hadronic decay producing low-multiplicity narrow jets in the detector, there are two different W ± decays that can be explored. The leptonic signature bbH ± W ∓ → bbτ νlν provides a clean selection of the signal via the identification of the lepton l = e, µ. In this case the charged Higgs transverse mass cannot be reconstructed because of the presence of two neutrinos with different origin. In this channel charged Higgs discovery will be determined by the observation of an excess of such events over SM expectations through a simple counting experiment. In the case of hadronic decays bbH ± W ∓ → bbτ νjj the transverse mass can instead be reconstructed since all neutrinos are arising from the charged Higgs boson decay. This allows for an efficient separation of the signal and the main tt → bbW ± W ∓ → bbτ νjj background (assuming m H ± > ∼ m W ± ). 
The absence of a lepton (e or µ) provides a less clean environment but the use of the transverse mass makes it possible to reach the same mass discovery region as in the previous case and also to extract the charged Higgs boson mass. Both these channels show that after an integrated luminosity of 30 fb −1 the discovery could be possible up to a mass of 150 GeV for all tanβ values in both ATLAS and CMS [9,39,40].
If the charged Higgs is heavier than the top quark, the dominant decay channels are H ± → τ ν and H ± → tb depending on tan β. They have both been studied by ATLAS and CMS [41,42,43,44]. The charged Higgs bosons are produced in the pp → tbH ± channel. For the H ± → tb decay, a charged Higgs boson can be discovered up to high masses (m H ± ∼ 400 GeV) in the case of very large tan β values and this reach cannot be much improved because of the large multi-jet environment. For the H ± → τ ν decay mode this reach is larger due to a cleaner signal despite a lower BR. In this case the 5σ reach ranges from tan β = 20 for m H ± = 200 GeV to tan β = 30 for m H ± = 400 GeV.
For the LHC, signal and background events have been simulated in the same way as for the Tevatron, as described before, but without applying any rescaling factor to match a measured tt cross section. Table 2 lists the resulting cross sections before (σ th ) and after (σ) applying the basic cuts p jets t > 20 GeV and the hard cut p miss t > 100 GeV. The LHC rates allow for the discovery to be less challenging than at the Tevatron in the region m H ± ∼ m W ± , yet the separation of signal events from background remains crucial for the measurement of the charged Higgs mass. The kinematic distributions are shown in Figs. 12 to 19 for √ s = 14 TeV. The choice of variables is identical to the one for the Tevatron and allows for a one-to-one comparison, the differences being due to a change in CM energy (and, to a somewhat lesser extent, due to the leading partonic mode of the production process). The main differences with respect to Figs. 1-8 are that the various transverse momenta and invariant masses have longer high energy tails. In particular, it should be noted that the effect of the spin differences between W ± and H ± events can be explored very effectively also at LHC energies, e.g. the ratio p π ± t /p τ jet t shown in Fig. 14 is very sensitive to the spin effects. These observations lead to the conclusion that the same method using spin differences can be used to separate signal from background at both the Tevatron and the LHC.
The distributions of the IDA output variables for the study at √ s = 14 TeV for two steps with 90% efficiency in the first step are shown in Figs. 20 and 21. These distributions are qualitatively similar to those for the Tevatron (Figs. 9 and 10) and the final achievable purity for a given efficiency is shown in Fig. 21. As for the Tevatron energy a good separation of signal and background events can be achieved with the spin sensitive variables and the IDA method even in case m H ± ∼ m W ± . For heavier H ± masses the separation of signal and background events increases due to the kinematic differences of the event topology.

Fig. 19: Distributions of the total transverse momentum of all quark jets, $H_{jets} = p_t^{j_1} + p_t^{j_2} + p_t^{b_1} + p_t^{b_2}$, for the tbH ± signal and the tt background for √ s = 14 TeV (left) and the respective differences between signal and background (right).

CONCLUSIONS
The discovery of charged Higgs bosons would be a clear sign of physics beyond the SM. In this case study we have investigated charged Higgs boson topologies produced at the current Tevatron and LHC energies and compared them against the irreducible SM background due to top-antitop production and decay. While sizable differences between signal and background are expected whenever m H ± ≠ m W ± , near the current mass limit of about m H ± ≈ 80 GeV the kinematic spectra are very similar between SM decays and those involving charged Higgs bosons. In this case, spin information will significantly distinguish between signal and irreducible SM background. In fact, we have considered hadronic τ ν τ decays of charged Higgs bosons, wherein the τ polarization induced by a decaying (pseudo)scalar object is significantly different from that emerging in the vector (W ± ) decays occurring in the top-antitop case. For a realistic analysis which is not specific for a particular detector, a dedicated Monte Carlo event generation and a simplified multipurpose detector response approximation have been applied. The identification of a hadronic tau-lepton will be an experimental challenge in an environment with typically four jets being present. We have demonstrated how an IDA method can be applied to separate signal and background when the differences between the signal and background distributions are small. Our results show that the IDA method will be equally effective at both the Tevatron and LHC. While only the dominant irreducible tt background has been dealt with in detail, we have also specifically addressed the QCD background. A suitably hard missing transverse momentum cut has been applied to reject such jet activity and we have demonstrated that although the discriminative power is reduced by such a cut, the reduction is small compared to the gain from including the τ polarization effects.
Using the differences in τ polarization between the signal and the dominant SM irreducible tt background is crucial for disentangling the former from the latter.