Stop the Top Background of the Stop Search

The main background for the supersymmetric stop direct production search comes from Standard Model $t\bar t$ events. For the single-lepton search channel, we introduce a few kinematic variables to further suppress this background by focusing on its dileptonic and semileptonic topologies. All are defined to have end points in the background, but not in the signal distributions. They can substantially improve the stop signal significance and mass reach when combined with traditional kinematic variables such as the total missing transverse energy. Among them, our variable $M^W_{T2}$ has the best overall performance because it uses all available kinematic information, including the on-shell mass of both $W$'s. We see a 20%-30% improvement in the discovery significance and estimate that the 8 TeV LHC run with 20 fb$^{-1}$ of data would be able to reach an exclusion limit of 650-700 GeV for direct stop production, as long as the stop decays dominantly to the top quark and a light stable neutralino. Most of the mass range required for the supersymmetric solution of the naturalness problem in the standard scenario can be covered.


Introduction
A main goal of the Large Hadron Collider (LHC) experiments is to understand electroweak symmetry breaking. In the Standard Model (SM), it is achieved by the vacuum expectation value (VEV) of a scalar Higgs field. However, a fundamental scalar field receives quadratically divergent radiative contribution to its mass-squared and suffers from the hierarchy problem. One of the most promising solutions to the hierarchy problem is supersymmetry (SUSY) which introduces a superpartner to every SM field, so that the quadratically divergent corrections to the Higgs mass-squared can be canceled between SM particles and their superpartners. Supersymmetry has been extensively searched for at colliders, and so far we have not found any evidence for it. The latest LHC search results constrain the masses of the gluino and (light generation) squarks in the minimal supergravity [1,2,3] or constrained minimal supersymmetric standard model (MSSM) [4] to be greater than about 1 TeV [5,6]. At face value, it may imply a serious fine-tuning of the electroweak scale if SUSY exists. However, as the largest radiative correction to the Higgs mass in the SM comes from the top quark loop, only the top superpartners (stops) need to be light enough to cancel the top loop contribution [7,8]. The gluino and first two generation squarks can be heavier than 1 TeV without a naturalness problem, at least at one-loop level [7,8,9,10,11,12]. Therefore, searching for the top superpartners at the LHC offers the most important test of whether SUSY provides a natural solution to the hierarchy problem.
Many third generation squark searches at the LHC rely on gluino production, with subsequent decay to stops or sbottoms [13,14,15,16,17]. This is because the production cross section for the gluino is much larger than for direct stop or sbottom production, as long as the gluino is not much heavier. However, since the naturalness of the electroweak breaking scale does not require the gluino to be light enough to be copiously produced at the 7 or 8 TeV LHC, a more robust stop search would only rely on direct stop pair production. In this paper we focus on the stop search in this channel in a standard R-parity conserving SUSY scenario. Here the stop decays to a top quark and the lightest supersymmetric particle (LSP), which is assumed to be a neutralino. Although a light neutralino is not required by naturalness, it avoids the stable charged particle problem and provides a natural candidate for dark matter. The signal we are looking for is $t\bar t + E_T^{\rm miss}$, where the missing transverse energy $E_T^{\rm miss}$ comes from the pair of neutralino LSPs which escape the detector. We also assume that the mass difference between the stop and the LSP is substantially larger than the top quark mass. Otherwise the signal will strongly overlap with the SM backgrounds, which would require different search strategies [18,19,20,21,22].
In fact, the $t\bar t + E_T^{\rm miss}$ signal also occurs in many other extensions of the SM that include a dark matter candidate. It is quite natural to mitigate the hierarchy problem with a relatively light top "partner" that decays to the top quark and the dark matter particle. This is possible if the partner is also charged under the symmetry that protects the stability of the dark matter particle. Examples are the little Higgs models with T-parity [23,24,25,26], models with an exotic fourth generation and dark matter [27], models with gauged baryon and lepton numbers [28], and so on. Consequently, the same search applies to many different models, but the mass reach depends on each model's top partner production cross section. Studies of the $t\bar t + E_T^{\rm miss}$ signal for new physics have been performed by many groups in various (fully hadronic, single-lepton) channels in recent years [25,27,29,30,31,32,33,34,35].
The ATLAS collaboration at the LHC has done such a search in the single-lepton channel based on 1.04 fb$^{-1}$ of data [36]. In the single-lepton channel, one $W$ from the top decays leptonically and the other $W$ decays hadronically. Requiring one lepton in the final state suppresses QCD multijet backgrounds tremendously while still retaining a significant $W$ branching fraction. The final state signal consists of four (or more) jets (including two b-jets), one lepton, and missing transverse energy.
Besides the standard transverse momentum $p_T$ and pseudo-rapidity $|\eta|$ requirements for each object, the two main variables used for separating the signal and backgrounds are the missing transverse energy $E_T^{\rm miss}$ and the transverse mass $M_T$ constructed from the lepton and $E_T^{\rm miss}$ [36]. The signal events are expected to have large $E_T^{\rm miss}$ from the top-partner decays, along with a neutrino from the leptonic $W$. A hard cut on $E_T^{\rm miss}$ very effectively reduces the SM backgrounds. The cut on $M_T$ removes the backgrounds where the $E_T^{\rm miss}$ is mostly due to a single neutrino from a $W$ decay, because the $M_T$ distribution has an end point at $M_W$ in such cases. The existing ATLAS study focused on a fermionic top partner and can exclude this partner's mass up to 420 GeV [36]. There was no sensitivity to the SUSY stop with this limited amount of data because of the much smaller cross section for scalar particles. Last year's run already delivered more than 5 fb$^{-1}$ of data. This year, the LHC is expected to deliver even more luminosity at a higher center of mass energy of 8 TeV. Given the importance of the stop (and other top-partner) search, it is desirable to extend the mass reach using current and future data.
The single-lepton channel analysis of the ATLAS top partner search paper [36] found that the largest background remaining after their cuts on $E_T^{\rm miss}$ and $M_T$ is the dileptonic $t\bar t$. In these background events, both $W$'s decay leptonically, but one of the leptons is not reconstructed, is outside the detector acceptance, or is a $\tau$ lepton (which may be misidentified as a jet). Each event contains at least two neutrinos that can produce a large $E_T^{\rm miss}$ and also make it easier to pass the $M_T$ cut. The additional jets come from QCD initial state radiation (ISR). The next-to-largest background comes from the semileptonic $t\bar t$ and $W$+jets. The other backgrounds are small after the $E_T^{\rm miss}$ and $M_T$ cuts.
To improve the search reach, we designed kinematic variables to identify $t\bar t$ backgrounds, focusing on their decay topology. We find that the signal significance and mass reach can indeed be substantially improved with the help of these variables. This paper is organized as follows. In the next section we discuss several such variables. In section 3, we compare the performance of the basic set of cuts with that of cuts including the new variables, identifying an economical set.

Kinematic Variables for the $t\bar t$ Backgrounds
We study the LHC search for the pair-production of stops, $pp \to \tilde t_1 \tilde t_1^*$, with $\tilde t_1 \to t + \tilde\chi^0_1$.$^{1}$ We focus on the signal's one-lepton decay channel, in which one top quark decays leptonically, $t \to W^+ b \to \ell^+ \nu_\ell b$, and the other one decays hadronically, $\bar t \to \bar b j j$ (or the other way around). The signal contains four jets, one lepton, and missing transverse energy. According to the latest ATLAS $t\bar t + E_T^{\rm miss}$ search [36], the largest SM background after the $E_T^{\rm miss}$ and $M_T$ cuts is $t\bar t$ in the dileptonic channel with one lost lepton and two additional jets from ISR that fake the hadronic $W$. In this section we try to identify some kinematic variables based on these background event topologies.
Before we discuss the new kinematic variables, we first examine the distributions of the signal and main backgrounds in some traditional kinematic variables. In addition to the total missing transverse energy $E_T^{\rm miss}$ and the transverse mass$^{2}$ $M_T$ used in the ATLAS analysis [36], we also include the often-used effective mass $m_{\rm eff}$, which is defined as the scalar sum of the four leading jet $p_T$'s, the lepton $p_T$ and $E_T^{\rm miss}$. Signal and background events are generated using MadGraph5 [37] and showered in PYTHIA [38]. We use PGS [39] to perform the fast detector simulation, after modifying the code to implement the anti-$k_t$ jet-finding algorithm with the distance parameter $R = 0.4$ [40]. We simulated the events at 7 TeV center of mass energy so that we can cross check our results with the ATLAS paper [36]. The signal production cross section is normalized to the value calculated at NLO+NLL [41], and the background production cross section for $t\bar t$ is normalized to the value $\sigma_{t\bar t}(m_t = 173~{\rm GeV}, 7~{\rm TeV}) = 163^{+7+9}_{-5-9}$ pb, calculated approximately at NNLO [42]. In our studies, the leptonic decays of the top quarks include $\tau^\pm$ leptons. We adopt the same basic selection cuts on the objects in the final state as in Ref. [36] by requiring exactly one isolated electron or muon.
$^{1}$ We focus on the light mass eigenstates, and calculate the reach based on 100% decay to $t + \tilde\chi^0$.
$^{2}$ The transverse mass is defined by the formula $M_T = \sqrt{2\, p_T^\ell\, E_T^{\rm miss} \left(1 - \cos(\phi_\ell - \phi_{E_T^{\rm miss}})\right)}$, where $p_T^\ell$ is the $p_T$ of the lepton and $\phi_\ell$ and $\phi_{E_T^{\rm miss}}$ are the azimuthal angles of the lepton and of the missing transverse momentum.
The variable $M_{T2}$ [43,44] can be a natural variable to identify this type of background event, the dileptonic $t\bar t$ with one lost lepton. ($M_{T2}$ has been proposed to reduce $t\bar t$ and $W^+W^-$ backgrounds in the di-lepton search channel [45,46].) The $M_{T2}$ for a given event can be interpreted as the minimal mother particle mass compatible with the postulated event topology and an assumed daughter particle mass [47]. The $M_{T2}$ is bounded from above by the mass of the mother particles in the decay chains if the assumed mass for the daughter particles is equal to (or less than) their true mass. By looking at the diagram in Fig. 2, we can define $M_{T2}$ and its generalizations or variations with the top quark as the mother particle for our backgrounds. Our observables for the leading dileptonic background are the 2 b-jets + one lepton + $E_T^{\rm miss}$ subsystem. In fact, the next-to-leading semileptonic $t\bar t$ background also contains exactly the same subsystem if one disregards the jets from the $W$ decay, so the same variables may be used to bound this background too. On the other hand, the $\tilde t \tilde t^*$ signal has an additional source of missing energy from the missing $\tilde\chi$ particles. Consequently the corresponding variables can take larger values.
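The two traditional variables used alongside $E_T^{\rm miss}$ in this section can be written down in a few lines. The following is a minimal sketch, not the analysis code: the function names and the plain-number inputs (GeV and radians) are assumptions made here for illustration, and the lepton is treated as massless.

```python
import math

def transverse_mass(lep_pt, lep_phi, met, met_phi):
    """M_T of a (massless) lepton and the missing transverse momentum, in GeV."""
    dphi = lep_phi - met_phi
    # max() guards against a tiny negative argument from rounding.
    return math.sqrt(max(0.0, 2.0 * lep_pt * met * (1.0 - math.cos(dphi))))

def effective_mass(jet_pts, lep_pt, met):
    """m_eff: scalar sum of the four leading jet pT's, the lepton pT and E_T^miss."""
    leading_four = sorted(jet_pts, reverse=True)[:4]
    return sum(leading_four) + lep_pt + met

# A W at rest decaying in the transverse plane: lepton and neutrino are
# back to back with pT = M_W / 2 each, so M_T sits at its end point M_W.
m_w = 80.4
mt_endpoint = transverse_mass(m_w / 2.0, 0.0, m_w / 2.0, math.pi)

# Only the four hardest of the five jets enter m_eff.
meff_example = effective_mass([120.0, 80.0, 60.0, 40.0, 30.0], 50.0, 200.0)
```

The first example illustrates why a cut on $M_T$ above $M_W$ removes events whose $E_T^{\rm miss}$ comes from a single leptonic $W$ decay: that configuration saturates the end point, and any other orientation of the decay gives a smaller value.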
In all $M_{T2}$-type variables, a minimization is performed over all possible ways of dividing $E_T^{\rm miss}$ between the two decay chains. More explicitly, the minimization is over all possible pairs of 4-momenta, each with an assumed mass, whose vector sum has transverse components that match $E_T^{\rm miss}$. The difference between variables comes in the assignment of visible and missing momentum to the two decay chains, along with invariant mass or $M_T$ constraints imposed on the hidden 4-momenta. In the following, we define three $M_{T2}$-type variables with background endpoints roughly at the top mass. These new variables are not expected to be completely independent, so their performances will be evaluated in the next section.
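The minimization over ways of splitting the missing transverse momentum can be illustrated with a crude numerical scan. This is a sketch under stated assumptions, not the efficient algorithm of Ref. [47]: the function names, the tuple layout `(mass, px, py)` for the visible systems, and the zooming grid search are all choices made here for illustration, and the result is only approximate.

```python
import math

def mt_squared(vis_m, vis_px, vis_py, inv_m, qx, qy):
    """Squared transverse mass of a visible system and an invisible particle."""
    e_vis = math.sqrt(vis_m**2 + vis_px**2 + vis_py**2)
    e_inv = math.sqrt(inv_m**2 + qx**2 + qy**2)
    return vis_m**2 + inv_m**2 + 2.0 * (e_vis * e_inv - vis_px * qx - vis_py * qy)

def mt2_grid(vis_a, vis_b, ptmiss, inv_m_a, inv_m_b, n=20, zooms=12):
    """Approximate M_T2 by scanning the invisible pT (qx, qy) of side a.

    vis_a, vis_b: (mass, px, py) of the visible system on each side.
    ptmiss: (px, py) to be shared between the two invisible particles.
    """
    def objective(qx, qy):
        side_a = mt_squared(*vis_a, inv_m_a, qx, qy)
        side_b = mt_squared(*vis_b, inv_m_b, ptmiss[0] - qx, ptmiss[1] - qy)
        return max(side_a, side_b)

    # Start from an even split and a window wide enough to cover the event scale.
    cx, cy = 0.5 * ptmiss[0], 0.5 * ptmiss[1]
    half = (math.hypot(ptmiss[0], ptmiss[1]) + math.hypot(vis_a[1], vis_a[2])
            + math.hypot(vis_b[1], vis_b[2]) + inv_m_a + inv_m_b + 1.0)
    best = objective(cx, cy)
    for _ in range(zooms):
        step = half / n
        bx, by = cx, cy
        for i in range(-n, n + 1):
            for j in range(-n, n + 1):
                qx, qy = cx + i * step, cy + j * step
                val = objective(qx, qy)
                if val < best:
                    best, bx, by = val, qx, qy
        cx, cy = bx, by   # re-center on the best point found so far
        half *= 0.25      # and zoom in
    return math.sqrt(max(best, 0.0))

# Kinematic check: two tops at rest, each decaying in the transverse plane
# to a massless b and a W treated as the invisible particle. The result
# cannot exceed the top mass, because the true split is an allowed split.
m_top, m_w = 173.0, 80.4
p_star = (m_top**2 - m_w**2) / (2.0 * m_top)
mt2_value = mt2_grid((0.0, p_star, 0.0), (0.0, p_star, 0.0),
                     (-2.0 * p_star, 0.0), m_w, m_w)
```

The same routine covers different assignments of visible and invisible particles simply by changing what is passed in: which momenta are combined into each visible system, which masses are assumed for the invisible particles, and which transverse momentum is treated as missing.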
The first variable is basically the $M_{T2}$ of the $t\bar t \to b W^+ \bar b W^-$ subsystem, which is denoted as $M^b_{T2}$. Interpreted in the original $M_{T2}$ context, it assumes a "missing on-shell $W$" on each side of the decay chain. Since the lepton momentum results from the $W$ decay, we add it to the $E_T^{\rm miss}$. It is defined as
\begin{equation}
M^b_{T2} = \min_{\vec p^{\,T}_1 + \vec p^{\,T}_2 = \vec E_T^{\rm miss} + \vec p_T^{\,\ell}} \left\{ \max\left[ M_T(p_{b_1}, p^T_1),\; M_T(p_{b_2}, p^T_2) \right] \right\}, \tag{1}
\end{equation}
where the $W$ mass is assigned for both $p^T_1$ and $p^T_2$, and the jet masses of $p_{b_1}$ and $p_{b_2}$ are calculated from their four-momenta.
To select the two candidate b-jets, we divide all events into three categories. The first category contains exactly two b-tagged jets among the four leading-$p_T$ jets, and we can use Eq. (1) directly. For the second category, containing exactly one b-tagged jet, we take each of the two leading non-b-tagged jets in turn as the other b-jet candidate and keep the smaller of the two $M^b_{T2}$ values. For the third category, with zero, three, or four b-tagged jets, we assume that the two candidate b-jets are contained in the leading three jets and we ignore b-tagging information. There are three different combinations, among which we take the smallest as the final value of $M^b_{T2}$.
For the second variable, we do not add the observed lepton momentum to $E_T^{\rm miss}$. Instead we define an asymmetric $M_{T2}$ [48,49] by combining the 4-momenta of the lepton and a b-jet into one effective particle. The missing neutrino on that side is treated as massless. On the other side, the visible particle is the other b-jet (with its mass calculated from its four-vector), and the invisible particle is an on-shell $W$. This variable is defined as
\begin{equation}
M^{b\ell}_{T2} = \min_{\vec p^{\,T}_1 + \vec p^{\,T}_2 = \vec E_T^{\rm miss}} \left\{ \max\left[ M_T(p_{b_1} + p_\ell, p^T_1),\; M_T(p_{b_2}, p^T_2) \right] \right\},
\end{equation}
where $p^T_1$ is massless and the $W$ mass is assigned for $p^T_2$. The two b-jet candidates are chosen by the same procedure as in the previous case. There are two ways to pair the lepton with one of the two b-jets, and the combination which produces a smaller $M^{b\ell}_{T2}$ is chosen.
Neither of the two $M_{T2}$ variables defined above fully utilizes the information available for the background event topology: the two intermediate $W$ bosons are on-shell and one of them produces the observed lepton together with a neutrino.
We can define a new kinematic variable as the minimal mother particle mass (the top quark mass in this case) which can be compatible with all the transverse momentum and mass-shell constraints of that topology for a given event. Here, the top quark mass is not explicitly used, only implicitly bounded by the event. This is in the same spirit as interpreting $M_{T2}$ as the minimal mother particle mass compatible with the minimal kinematic constraints [47], except that all mass-shell constraints on the cascade decay chain are used.$^{4}$ One might expect to get a variable which is more sensitive to this background topology because of the additional kinematic information applied in the definition. Specifically, the variable $M^W_{T2}$ (where the superscript $W$ indicates that the on-shell intermediate $W$ information is included when combining the lepton and neutrino) can no longer be cast into the "maximum of the two sides' $M_T$" form, but is instead defined directly as the minimization
\begin{equation}
M^W_{T2} = \min\left\{ m_y \ \text{consistent with:} \
\begin{array}{l}
\vec p^{\,T}_1 + \vec p^{\,T}_2 = \vec E_T^{\rm miss}, \\
p_1^2 = 0, \\
(p_1 + p_\ell)^2 = p_2^2 = M_W^2, \\
(p_1 + p_\ell + p_{b_1})^2 = (p_2 + p_{b_2})^2 = m_y^2
\end{array}
\right\}.
\end{equation}
$^{4}$ The mass-shell constraints are not sufficient to fully reconstruct each event.
The diagram, along with the signal and background distributions, is shown in Fig. 5. We use the same method as before to pick the two b-jets, and a method similar to that for $M^{b\ell}_{T2}$ is used to choose which b-jet gets paired with the visible lepton. Calculating this variable can be done efficiently in a similar way to the $M_{T2}$ calculation in Ref. [47], by generalizing the method there to this case. For perfect measurements, this variable for the dileptonic $t\bar t$ backgrounds is less than the true top quark mass, since the top mass should be compatible with all background events. On the other hand, the signal events do not need to satisfy such a bound, because of their different topology and additional missing massive particles $\tilde\chi$. For some of the signal events we may not even be able to find a compatible mass, because we apply the variable to a wrong topology with the wrong mass-shell conditions. The background distributions indeed lie mostly below the top quark mass, while a significant number of signal events have no solution below 500 GeV and they are included in the last bin.
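The three-category procedure for picking the two b-jet candidates, shared by all three variables, can be sketched as follows. The jet representation as `(pt, is_btagged)` tuples, the function names, and the `variable(i, j)` callback are hypothetical and introduced here only for illustration; the sketch assumes the event already passes the requirement of at least four jets.

```python
from itertools import combinations

def bjet_candidate_pairs(jets):
    """Return the index pairs to try as the two b-jet candidates.

    jets: list of (pt, is_btagged) tuples; indices refer to this list.
    """
    # Indices of the four leading jets, ordered by decreasing pT.
    leading = sorted(range(len(jets)), key=lambda k: jets[k][0], reverse=True)[:4]
    tagged = [k for k in leading if jets[k][1]]
    untagged = [k for k in leading if not jets[k][1]]

    if len(tagged) == 2:
        # Category 1: exactly two b-tags, use them directly.
        return [tuple(tagged)]
    if len(tagged) == 1:
        # Category 2: pair the tagged jet with each of the two leading untagged jets.
        return [(tagged[0], other) for other in untagged[:2]]
    # Category 3 (zero, three or four b-tags): ignore b-tagging and
    # try all pairs drawn from the three leading jets.
    return list(combinations(leading[:3], 2))

def smallest_over_pairs(jets, variable):
    """Evaluate variable(i, j) for every candidate pair and keep the smallest."""
    return min(variable(i, j) for i, j in bjet_candidate_pairs(jets))

# One b-tagged jet among the four leading jets gives two candidate pairs.
example_jets = [(100.0, True), (90.0, False), (80.0, False), (70.0, False)]
example_pairs = bjet_candidate_pairs(example_jets)
```

Taking the smallest value over the candidate pairs keeps the background end point intact: whenever the correct pair is among the candidates, the chosen value is no larger than the value for the correct assignment.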
One can see from the plots in Figs. 3, 4, and 5 that a cut on these variables around the top quark mass could be an effective way to suppress the main background. It is not clear a priori which one will have the best performance when the experimental smearing and detector resolution effects are taken into account, and whether there is still enough independent information among them so that a combination of them can give some further improvement. In the next section we will make a critical comparison of the performances of these variables and their combinations.

Performances of New Kinematic Variables
To quantify the power of these kinematic variables, we optimized a simple cut-and-count experiment involving three stop masses (400, 500, 600) GeV, with the neutralino mass being fixed at 100 GeV and 100% branching ratio of stop decaying to top plus neutralino. We simulated the signal and background events at 7 TeV to compare with the existing ATLAS study. Although the LHC will run at 8 TeV this year, the relative performance of each kinematic variable will not be affected much. We will comment on the 8 TeV case in the next section.
We translate this probability into a Gaussian-equivalent significance ($\sigma$) in terms of standard deviations. This approaches $S/\sqrt{B}$ for large signal and background, but by handling the small-number statistics, we avoid extreme cuts. By finding cuts that maximize this significance reach, we can estimate the power of any set of kinematic variables.
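A cut-and-count significance of this kind can be sketched with the standard library alone. The details below are assumptions made for illustration rather than the procedure used for the tables: the probability is taken to be the Poisson probability for the background alone to fluctuate up to at least the rounded expected $S + B$, cuts with no remaining simulated background are skipped, and all function names are hypothetical.

```python
import math
from statistics import NormalDist

def poisson_tail(n_obs, mean):
    """P(N >= n_obs) for N ~ Poisson(mean), summing the upper tail directly."""
    if n_obs <= 0:
        return 1.0
    k = n_obs
    term = math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))
    total = 0.0
    # Keep summing while terms still matter, or while they are still growing.
    while term > 1e-17 * total or k <= mean:
        total += term
        k += 1
        term *= mean / k
    return min(total, 1.0)

def gaussian_significance(n_obs, background):
    """One-sided Gaussian-equivalent significance of n_obs over the background."""
    p = poisson_tail(n_obs, background)
    if p >= 1.0:
        return float("-inf")
    if p <= 0.0:
        return float("inf")
    return -NormalDist().inv_cdf(p)

def scan_cut(sig_values, bkg_values, sig_weight, bkg_weight, thresholds):
    """Scan 'value > threshold' cuts on one variable; return (best Z, best cut)."""
    best = (float("-inf"), None)
    for cut in thresholds:
        s = sig_weight * sum(v > cut for v in sig_values)
        b = bkg_weight * sum(v > cut for v in bkg_values)
        if b <= 0.0:
            continue  # assumption: do not trust cuts with no simulated background left
        z = gaussian_significance(round(s + b), b)
        if z > best[0]:
            best = (z, cut)
    return best

# Toy numbers only (not from the paper): per-event weights convert simulated
# events into expected yields at the target luminosity.
toy_signal = [250.0, 300.0, 350.0, 400.0]
toy_background = [100.0, 120.0, 150.0, 160.0, 180.0, 210.0]
best_z, best_threshold = scan_cut(toy_signal, toy_background, 5.0, 10.0,
                                  [100.0, 170.0, 200.0])
```

Summing the upper tail directly, instead of computing one minus the lower tail, avoids losing precision when the probability is very small, which is the regime of interest for a discovery significance.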
To evaluate the performances of the new kinematic variables defined in the previous section, we include them with a basic set of cuts on ($E_T^{\rm miss}$, $M_T$, $m_{\rm eff}$). Within the basic set of variables, the $E_T^{\rm miss}$ is the most powerful in discriminating the signal from the background. A cut of $M_T > 150$ GeV is imposed to remove the $W$+jets background, and after this cut, the semileptonic $t\bar t$ background is virtually eliminated. We found that further increasing the $M_T$ cut will hurt the signal significance.
Table 2: Cuts optimized for significance to discover 500 GeV and 600 GeV stops with 100 GeV neutralinos for 20 fb$^{-1}$ at 7 TeV. Again, all runs began with $E_T^{\rm miss} > 150$ GeV and include a fixed $M_T > 150$ GeV cut (not shown), where there are 2115 and 1938 simulated events for 500 GeV and 600 GeV stops and the same number of background events as in Table 1. Cuts on $E_T^{\rm miss}$ and $M^W_{T2}$ still do almost as well as optimization over all variables, but here these additional cuts can improve $S/B$.
Other than the improvement in the discovery sensitivity via $S/\sqrt{B}$, we also note from Tables 1 and 2 that the improvement in $S/B$ is even more dramatic after including the new variables defined in this paper. So, the systematic errors for the actual experimental searches can be further reduced.

We also tried a few small variations of these new variables and did not obtain better results.

Conclusions
The LHC will be running at 8 TeV in 2012 and is anticipated to achieve a larger integrated luminosity. We expect an even higher mass reach for the stop search compared with the numbers obtained in the previous section based on 7 TeV. To estimate the exclusion or discovery sensitivity for the stop with a 20 fb$^{-1}$ luminosity, we calculated the stop signal cross sections at tree level using MadGraph5 and applied the same K-factor as at the 7 TeV LHC to take into account the QCD NLO corrections. The same procedure is applied to the $t\bar t$ backgrounds to obtain the approximate NLO production cross section at the 8 TeV LHC. The total $t\bar t$ cross section is calculated to be 231.8 pb. With the help of the new kinematic variables discussed in this paper together with the basic variables $E_T^{\rm miss}$, $M_T$ and $m_{\rm eff}$, we found that for $m_{\tilde t} = 650$ GeV and $m_{\tilde\chi^0} = 100$ GeV with ${\rm Br}(\tilde t \to t + \tilde\chi^0) = 100\%$, the stop can show up at the $4\sigma$ level if we ignore the systematic errors. The 95% C.L. exclusion reach can go up to around 700 GeV. If there is no excess found in the 8 TeV run, it will dent the hope of a non-fine-tuned SUSY solution to the hierarchy problem, unless the stop has more exotic signatures in non-standard scenarios, such as a degenerate spectrum or R-parity violation.
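The rescaling from 7 TeV to 8 TeV described above can be written compactly. The symbols $K$, $\epsilon$ (the selection efficiency of a given set of cuts) and $N$ (the expected event yield) are notation introduced here for illustration only, and the relations assume that the K-factor obtained at 7 TeV carries over unchanged to 8 TeV:

```latex
\begin{align}
K &= \frac{\sigma^{\rm NLO}_{7\,{\rm TeV}}}{\sigma^{\rm tree}_{7\,{\rm TeV}}}, &
\sigma^{\rm NLO}_{8\,{\rm TeV}} &\approx K\,\sigma^{\rm tree}_{8\,{\rm TeV}}, &
N &= \sigma^{\rm NLO}_{8\,{\rm TeV}} \times \epsilon \times \int\! L\,dt .
\end{align}
```

Here $\sigma^{\rm tree}$ stands for the tree-level MadGraph5 cross section and $\int\! L\,dt$ for the integrated luminosity; the same relations apply separately to the signal and to the $t\bar t$ background.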
In this paper we have focused on suppressing the $t\bar t$ backgrounds for the search for direct stop production. However, Standard Model $t\bar t$ production is a major background for a wide range of new physics searches at the LHC. The kinematic variables proposed here could also be useful in improving searches for other new physics where $t\bar t$ constitutes the main background and the signals have a large missing transverse momentum.
Comparing the performances of different variables shows that in general the more kinematic information a variable contains, the more discriminant power it can possess. For more specific new physics searches where both the signal and main background event topologies are known, it is worth designing kinematic variables which carry as much information of the signal and/or the background events as possible to achieve the maximal discrimination between them. The strategy of constructing new kinematic variables discussed in this paper could be readily generalized to other cases.