Buckets of tops

Reconstructing hadronically decaying top quarks is a key challenge at the LHC, affecting a long list of Higgs analyses and new physics searches. We propose a new method of collecting jets in buckets, corresponding to top quarks and initial state radiation. This method is particularly well suited for moderate transverse momenta of the top quark, closing the gap between top taggers and traditional top reconstruction. Applying it to searches for supersymmetric top squarks we illustrate the power of buckets.


Introduction
An important difference between the Tevatron and the LHC is that the latter can produce and study top quarks in great numbers [1]. This allows us to investigate all top production mechanisms in detail, including their QCD structure. After the discovery of a Higgs-like resonance [2,3], studying its coupling to the top quark will play a particularly critical role in our understanding of the Higgs sector. This is made most obvious in the renormalization group evolution of the Higgs potential to large energy scales [4]. The direct measurement of the top Yukawa coupling clearly hinges on top quark identification and reconstruction. At the same time, we have reason to suspect that new physics that solves the hierarchy problem and lives at sufficiently high energy scales tends to couple strongly to top quarks [5]. This motivates us to search for new physics in the LHC top sample, for example by searching for resonance structures in top pair production or for top pairs produced in association with missing transverse momentum.
Historically, the study of top pair production has largely been restricted to semileptonic decays of the two top quarks. The reason is that the lepton effectively removes the overwhelming QCD background. However, purely leptonic top pairs not only come at a much smaller rate, they also include two neutrinos, challenging any analysis based on the observed missing transverse momentum. A major challenge in top physics at the LHC is how to gain access to the purely hadronic decays of top quarks.
Identifying hadronic top decays using a jet algorithm was part of the original proposal of jet substructure analyses [6][7][8][9]. Some of the early jet substructure algorithms were designed to target hadronic top decays [10][11][12][13][14][15]. While Higgs taggers [16] should clearly have a high priority within the LHC experiments, working top taggers are the perfect laboratory to test how well substructure approaches work in practice.
The moment we go beyond searches for heavy resonances, the main problem of all top taggers is the size of the initial fat jet. For example, using a Cambridge-Aachen jet of size R = 1.5 as the starting point of the HEPTopTagger [17][18][19][20][21][22][23] limits the momentum range of reconstructable top quarks to p_T,t ≳ 200 GeV. Essentially all other top tagging approaches require an even higher boost. Increasing the size of the fat jet to R = 1.8 raises several QCD and combinatorics issues [24]. The big question for using hadronic top analyses as part of Higgs searches or top partner searches is how to further reduce this top momentum threshold.
In this paper we propose an alternative method for an efficient top reconstruction at moderate momentum. It targets the transverse momentum regime, p T,t = 100 − 350 GeV , (1.1) in the fully hadronic decay mode. Starting from an event with a high multiplicity of jets, we assign all jets into three groups or 'buckets'. The buckets are chosen based on a metric in terms of invariant masses, defining two top buckets and a third bucket containing the extra hadronic activity like initial state radiation (ISR). While initially this search strategy does not prefer boosted top quarks, we will see how such events are eventually preferred from a combinatorics perspective.
In section 2 we start with a simple algorithm for reconstructing tops in buckets. We test this algorithm for hadronically decaying top pairs as well as W +jets and pure QCD jets backgrounds. Additional handles will help us separate the top signal from the backgrounds. In section 3, we modify the simple algorithm to take advantage of the b quarks and W bosons that are present in top decays but not in the QCD backgrounds. This improved bucket algorithm is optimized to efficiently find and reconstruct top pairs with moderate p T . In section 4 we apply our bucket algorithm to stop pair searches. It should be noted that our analysis is intended as "proof of concept," and so we do not consider sub-dominant effects such as jet smearing, QCD scale variation, or small background channels.

Simple bucket algorithm
In this section, we start with a simple algorithm to identify and reconstruct hadronically decaying top pairs. While an improved algorithm will be presented in the next section, this simple version captures many of the key concepts we will employ later. The overall scheme is fairly straightforward: by assumption every jet originates from one of the two tops or from initial state radiation, so we assign every jet to one of three 'buckets'. Jets in buckets B 1 and B 2 correspond to top decays, while all remaining jets are placed in B ISR . We cycle through every permutation of jet assignments to minimize the distance between the invariant masses of the jets in B 1 and B 2 and the top mass. The metric is chosen to ensure that bucket B 1 reconstructs the top mass better than bucket B 2 .
Here and throughout the remainder of the paper, all Standard Model (SM) samples are generated with Alpgen+Pythia [25][26][27]. We do not consider the effects of varying the factorization and renormalization scales on our analysis, though we have verified that even large variations in the overall QCD cross section do not significantly degrade our results. We use matrix element matching [28] to correctly describe jet radiation over the entire phase space. This includes up to tt+2 jets, W+4 jets and 3 − 5 QCD jets, with the top cross sections normalized to next-to-next-to-leading order [30][31][32][33]. These background samples are the same as those used in the HEPTopTagger study of ref. [29], so the resulting efficiencies can be compared directly. The single top background is neglected, as it contributes only a small number of events after our selection criteria. Jets are reconstructed using the Cambridge/Aachen algorithm [34,35] with size R = 0.5 in FastJet [36][37][38]. Note that all our results are relatively insensitive to the choice of jet algorithm.
We require all leptons to be hard and isolated: p_T,ℓ > 10 GeV with no track of another charged particle within ∆R < 0.5 of the lepton. We consider only jets with p_T > 25 GeV and |η| < 2.5. Our analysis is based on a simple calorimeter simulation with granularity 0.1 × 0.1 in (η, φ). We sum the 4-momenta of all particles in each cell and rescale the resulting 3-momentum to make the cells massless. We have not applied any smearing algorithm to the reconstructed objects.
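As an illustration, this calorimeter treatment can be sketched in a few lines of Python. The cell indexing, the four-vector convention (E, px, py, pz), and the function name are our own choices, not part of the analysis code:

```python
import math

def to_towers(particles, cell=0.1):
    """Group particle four-vectors (E, px, py, pz) into (eta, phi) cells of
    size cell x cell, sum the four-momenta per cell, then rescale each cell's
    3-momentum so the tower is massless (|p| = E), as described in the text."""
    cells = {}
    for E, px, py, pz in particles:
        p = math.sqrt(px * px + py * py + pz * pz)
        eta = 0.5 * math.log((p + pz) / (p - pz))  # pseudorapidity
        phi = math.atan2(py, px)
        key = (math.floor(eta / cell), math.floor(phi / cell))
        e, x, y, z = cells.get(key, (0.0, 0.0, 0.0, 0.0))
        cells[key] = (e + E, x + px, y + py, z + pz)
    towers = []
    for E, px, py, pz in cells.values():
        p = math.sqrt(px * px + py * py + pz * pz)
        s = E / p  # stretch the 3-momentum so the tower mass is exactly zero
        towers.append((E, s * px, s * py, s * pz))
    return towers
```

The massless-tower convention mirrors a standard experimental calorimeter treatment: the cell energy is kept and only the 3-momentum direction and magnitude are adjusted.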
Even though the algorithm presented in this section is in principle applicable to events with any number of jets, we preselect events with five or more jets to reduce the QCD backgrounds. Because we are interested in hadronically decaying tt pairs we veto isolated leptons. The restricted sample, denoted as t_h t̄_h, has a cross section of 104 pb at the LHC with √s = 8 TeV. One last word concerning underlying event and pile-up: unlike methods involving jet substructure [7][8][9], our bucket reconstruction relies on standard jets with moderately large multiplicities, so aside from jet energy scale uncertainties we do not expect specific experimental or theoretical challenges. We rely on the same trigger strategy as the experimental top pair analyses in refs. [20][21][22][23]. The trigger efficiencies might drop slightly for lower top momenta. However, the current ATLAS analyses, for example, do not make use of a bottom trigger at all, which leaves some room for future improvement.
Bucket definition. As the goal of the bucket algorithm is to identify tops by sorting jets into categories that resemble tops, we need a metric to determine the similarity of a collection of jets to a top. For simple buckets B_i it is

∆_B_i = |m_B_i − m_t| , (2.1)

where m_B_i is the invariant mass of the sum over all four-vectors in the bucket. For each event with five or more jets we permute over all possible groupings of the jets into three buckets {B_1, B_2, B_ISR}. We then select the combination that minimizes a global metric defined as

∆ = ω ∆_B_1 + ∆_B_2 . (2.2)

The factor ω > 1 stabilizes the grouping of jets into buckets. In this work we take ω = 100, effectively decoupling ∆_B_2 from the metric. As a consequence we always find ∆_B_1 < ∆_B_2, i.e. B_1 is the bucket with an invariant mass closer to the top mass than that of bucket B_2. Other values of ω might eventually turn out to be more appropriate for different applications.
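The permutation step can be sketched in Python. The linear form of the metric, the reference mass value m_t = 172.5 GeV, and all function names below are our own illustrative choices; the published eq.(2.2) may differ in detail:

```python
from itertools import product
import math

M_TOP = 172.5   # GeV, assumed top mass reference value
OMEGA = 100.0   # weighting factor omega quoted in the text

def inv_mass(jets):
    """Invariant mass of the summed four-vectors (E, px, py, pz)."""
    E, px, py, pz = (sum(c) for c in zip(*jets))
    return math.sqrt(max(E * E - px * px - py * py - pz * pz, 0.0))

def assign_buckets(jets):
    """Cycle through every assignment of jets to {B1, B2, BISR} and keep the
    one minimizing Delta = OMEGA * |m_B1 - m_t| + |m_B2 - m_t|.  B1 is
    relabeled so that it is always the bucket closer to the top mass."""
    best_delta, best_buckets = None, None
    for labels in product((0, 1, 2), repeat=len(jets)):
        buckets = [[j for j, l in zip(jets, labels) if l == k] for k in range(3)]
        if not buckets[0] or not buckets[1]:
            continue  # both top buckets must contain at least one jet
        d1 = abs(inv_mass(buckets[0]) - M_TOP)
        d2 = abs(inv_mass(buckets[1]) - M_TOP)
        if d2 < d1:  # enforce the convention Delta_B1 < Delta_B2
            buckets[0], buckets[1] = buckets[1], buckets[0]
            d1, d2 = d2, d1
        delta = OMEGA * d1 + d2
        if best_delta is None or delta < best_delta:
            best_delta, best_buckets = delta, buckets
    return best_buckets
```

For n jets the exhaustive scan covers 3^n assignments, which is cheap for the five-to-seven jet multiplicities relevant here.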
As the first selection cut we require the invariant masses of both top buckets, B_1 and B_2, to lie in the window

155 GeV < m_B_1,2 < 200 GeV . (2.3)

The lower limit selects events above the Jacobian peak for top decays. We will see that this selection improves the top signal over QCD background S/B by about a factor of two. All buckets passing eq.(2.3) we categorize by their number of jets: buckets including three or more jets (3j-buckets) and those including two jets (2j-buckets). Selecting only events with two 3j-buckets improves the signal-to-background ratio by a factor of five.
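The mass window and the jet-multiplicity labeling amount to a simple classification per bucket, sketched below (the function name and return convention are our own):

```python
def bucket_category(mass, n_jets):
    """Classify one top bucket after the mass window cut of eq.(2.3):
    returns '3j' or '2j' for buckets inside 155 GeV < m_B < 200 GeV,
    and None for buckets failing the window."""
    if not 155.0 < mass < 200.0:
        return None
    return '3j' if n_jets >= 3 else '2j'
```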
Jet selection. For tagging two tops in fully hadronic mode, we might naively require at least six reconstructed jets. In practice, with a threshold of p T,j > 25 GeV this condition is too strict. To improve our efficiency we need to consider the case where one of the jets from top pair decays is missing. It is also worth noting that even requiring six jets does not guarantee that we collect all six decay products of the top pair. Frequently, some of the observed jets come from initial state radiation instead [24].
In the left panel of figure 1 we plot the parton level p_T distributions of the six decay partons from inclusive hadronic top pairs. There we see that the four hardest decay partons are not affected by the threshold p_T,j > 25 GeV. In contrast, the softest distribution only peaks around 25 GeV, so roughly half the events do not pass our threshold for the sixth parton. In the central and right panels we show the normalized distributions of the 5th and 6th hardest partons for events in which at least five jets pass p_T,j > 25 GeV. Note that the 5th parton's p_T can lie below the threshold even for 5-jet events, due to jets from initial state radiation. Table 1 shows the number of events in the hadronic t_h t̄_h sample after several cuts on the jet multiplicity, and the percentage of events with the 5th or 6th parton-level top decay jet above p_T,j > 25 GeV. In about half of the events with at least six jets the sixth top-decay parton falls below the p_T threshold. Adding the two columns tells us that more than 90% of all events capture five of the six top decay products. Requiring only five instead of six jets increases the fraction of events where we miss only one of the top decay products to almost half. For a moderate top p_T threshold our central values for the efficiencies are not strongly affected, but hadronization as well as detector effects might lead to significant shifts due to the steeply falling p_T,j spectrum.
W reconstruction. After placing each of the jets in the event into one of three buckets (B 1 , B 2 , or B ISR ) we require the 3j-buckets to contain a hadronically decaying W candidate.
In the rare case of one bucket consisting of more than three jets, we combine them into exactly three jets using the C/A algorithm and then look for a W candidate. As in the HEPTopTagger [18,19] we define a mass ratio cut, eq.(2.4), requiring that at least one combination of jets k, l in the bucket i reproduces the W-to-top mass ratio within a fixed tolerance. Events with 2j-buckets by construction cannot satisfy eq.(2.4). In addition, in such events one of the W decay jets is typically the softest jet and does not pass the p_T threshold, so the W reconstruction could not succeed regardless. In our first, naive approach we categorize all events with two valid top buckets into three types:
• (t_w, t_w): both top buckets have W candidates as defined by eq.(2.4),
• (t_w, t_−) or (t_−, t_w): only the first or second top bucket has a W candidate,
• (t_−, t_−): neither top bucket has a W candidate.
The t_w or t_− status is ordered as (B_1, B_2), where B_1 is defined as the bucket closest in mass to the top. Buckets classified as t_w have to be 3j-buckets, while t_− buckets can be either 3j or 2j.
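A minimal sketch of the W candidate search is given below. We assume a ratio cut of the form |m_kl/m_B − m_W/m_t| < 0.15; the tolerance and the exact form of the cut are our illustrative reading, not necessarily the published eq.(2.4):

```python
import math
from itertools import combinations

M_W, M_T = 80.4, 172.5  # GeV, assumed reference masses

def inv_mass(jets):
    """Invariant mass of the summed four-vectors (E, px, py, pz)."""
    E, px, py, pz = (sum(c) for c in zip(*jets))
    return math.sqrt(max(E * E - px * px - py * py - pz * pz, 0.0))

def has_w_candidate(bucket, tol=0.15):
    """Return True if some jet pair (k, l) in the bucket reproduces the
    ratio m_W/m_t within tol (assumed form of the mass ratio cut)."""
    m_b = inv_mass(bucket)
    if m_b <= 0.0:
        return False
    for k, l in combinations(bucket, 2):
        if abs(inv_mass([k, l]) / m_b - M_W / M_T) < tol:
            return True
    return False
```

Scanning all pairs automatically covers the case where the soft W decay jet was merged elsewhere, since only one valid pair is required.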
To extract hadronic top pair events from the QCD background we can compare the different categories at Monte-Carlo truth level. Starting from S/B ∼ 0.005 after the lepton veto, selecting only (t_w, t_w) events yields the highest value, S/B ∼ 0.09. This corresponds to an improvement of S/B by almost a factor of 20. To improve beyond this level, we need to require at least one, preferably two, b-tags to control the mostly Yang-Mills and light-flavor QCD background.

Table 2. Numbers of events for simple buckets with one b-tag on each bucket, passing various levels of top reconstruction, as described in the text. Events that do not reconstruct tops to within R_1, R_2 < 0.5 are not shown, but make up the remaining percentage of events for each category.

JHEP08(2013)086
b-tags. To further reduce the QCD background we exploit b-tags. We assume b-tagging and light-flavor mis-tagging efficiencies (ε_b, ε_mis) of (70%, 1%), and fully account for combinatorial factors in the background. For the tt+jets signal the effect of mis-tagging is sub-leading and can be ignored.
To avoid combinatorics we could impose b-tagging only for the most likely b-jet in a bucket based on the W condition factors to improve S/B, as suggested in ref. [24]. In this algorithm we do not take this option because it reduces the signal efficiency. We prefer to keep the maximum fraction of signal events especially for the case that both signal and the main background include tt events, such as the top partner searches discussed below.
In any top-tagging algorithm we are interested not only in extracting the signal from the backgrounds, but also in accurately reconstructing the original top momenta. As a measure of our reconstruction accuracy we use the geometric distance in the (η, φ) plane between the bucket momentum and the closer top parton momentum p_t obtained from Monte-Carlo truth,

R_i = √( (η_B_i − η_t)² + (φ_B_i − φ_t)² ) . (2.5)

We consider the reconstruction successful when R_i < 0.5. In the following, we indicate the percentage of events in which both buckets reconstruct the top momenta (R_1 < 0.5 and R_2 < 0.5), events where only B_1 reconstructs the top momentum (R_1 < 0.5 < R_2), and events where only B_2 reconstructs the top momentum (R_2 < 0.5 < R_1). The last case allows for events where the second bucket (with its worse top mass reconstruction) actually gives the better top direction.
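Eq.(2.5) and the matching criterion can be sketched as follows; the φ wrapping convention and the helper names are our own choices:

```python
import math

def delta_r(a, b):
    """Geometric distance of two (eta, phi) directions, eq.(2.5), with the
    phi difference wrapped into (-pi, pi]."""
    deta = a[0] - b[0]
    dphi = math.atan2(math.sin(a[1] - b[1]), math.cos(a[1] - b[1]))
    return math.sqrt(deta * deta + dphi * dphi)

def matches_top(bucket_dir, truth_tops, cut=0.5):
    """True if the bucket direction lies within R_i < cut of the closer
    Monte-Carlo truth top direction."""
    return min(delta_r(bucket_dir, t) for t in truth_tops) < cut
```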
For (t_w, t_w) events where each bucket contains exactly one b-jet, the top momentum reconstruction is generally good. As seen in table 2, about 75% of the events reconstruct both top directions well. As expected from the discussion above, a significant fraction of signal events only give (3j,2j)-buckets. When a W candidate is not found, but each bucket contains a b-tag and lies in the top mass window of eq.(2.3), the momentum reconstruction is good only for the t_w bucket; in these events about half of the t_w buckets with a W candidate reconstruct the top direction well. All this points to using the b-tag information to improve our reconstruction algorithm. This will be the starting point of the improved algorithm in the next section.

Bottom-centered buckets
In section 2 we have seen that we need at least two b-tags per event to control the QCD background. However, in the simple algorithm each bucket does not always contain exactly one b-jet, and the reconstruction is not particularly effective for (t_w, t_w) events. The obvious solution is to define buckets around b-tagged jets, i.e. to start each bucket with a bottom jet (which are usually among the hardest jets in the event) and add light-flavor jets to it. In this section we define buckets starting with the requirement that B_1 and B_2 each contain exactly one b-jet, and restrict the possible permutations of jet assignments to B_1, B_2, and B_ISR accordingly. Otherwise, we use the same distance measures defined in eq.(2.1) and eq.(2.2) and select the {B_1, B_2, B_ISR} giving the minimum ∆. Figure 2 shows the bucket masses m_B_1, m_B_2 and m_B_ISR. For both the tt and QCD samples the m_B_1 distributions peak at m_t by construction. The distribution is narrower for the signal. The dip in the m_B_2 distributions at m_t is due to the large weighting factor ω in eq.(2.2), which defines the bucket with mass closest to m_t to be B_1. Compared with the m_B_1 distributions, the m_B_2 distributions are broad but still tend to peak towards m_t.
As mentioned above, the analysis of the top buckets constructed around the b-tagged jets is the same as the simple algorithm described in section 2, including the bucket mass cut of eq.(2.3). In table 3 we show the corresponding results. Starting with two b-jets improves the number of (t_w, t_w) events by almost 50%. Roughly 70% of (t_w, t_w) events reconstruct both tops well, essentially unchanged from the earlier analysis. One kind of event which is now correctly accounted for is the case where the simple algorithm finds two b-jets in the same bucket and still obtains a bucket mass in the correct range.
Asking for two b-tags within at least five jets at the very beginning produces large combinatorial factors for mis-tagging QCD background events. As a result the backgrounds double in each category and S/B degrades for (t w ,t w ) events.
While there is no obvious way to improve the (t_w, t_w) category of events, table 3 shows that a significant number of events come out as (t_w, t_−) or (t_−, t_w), that is, with only one bucket containing a W candidate. For these events the QCD background is not huge, S/B ∼ 3, so we will try to improve our treatment of this fraction of events.

b/jet Buckets. In section 2 we found that it is not rare for the softest top decay jet to fall below the jet p_T threshold. Attempts to reconstruct two tops in (3j,3j)-buckets will then fail. In 94% of these cases the softest of the six top decay partons comes from the W decay. Restricted to events where the sixth parton falls below 25 GeV this fraction increases to 98.5%, i.e. whenever the sixth parton is missing, the surviving two jets are almost always the bottom and the harder W decay jet. In figure 3 we first show the invariant mass of the b and the harder W decay product, m_bj_1, at parton level. We see a clear peak and an endpoint around √(m_t² − m_W²) ≈ 153 GeV [39]. For events where the softer W decay jet falls below the p_T threshold the peak becomes more pronounced.
The question is: can we use the predicted peak in the m bj distribution to identify tops in 2j-buckets? If the third missing top decay jet indeed fails the p T threshold we expect the top momentum to be close to the b/jet momentum. The left panels of figure 4 show the difference between the parton-level top momentum and the b/jet system in terms of (p T,bj − p T,t )/p T,bj ≡ ∆p T /p T,bj and ∆R. If we assume an R separation of around 0.5 as a quality measure the result looks promising. Similarly, we should be able to reconstruct the transverse momentum of the top at least at the 20% level without including the softest W decay jet. In comparing our bucket method to top taggers, it should be emphasized that for the bucket method we allow for a missing W decay jet rather than replacing the lighter W decay jet by a QCD jet [24].
For 2j-buckets, which we know do not include all three top decay products, we replace the top mass in the distance measure of eq.(2.1) by the peak position of the m_bj distribution,

∆^bj_B = |m_B − 145 GeV| . (3.1)

The peak value of 145 GeV is read off figure 3 and should eventually be tuned to data. Because t_w buckets already reconstruct the top momentum we keep them. For top buckets in the (t_w, t_−), (t_−, t_w), and (t_−, t_−) categories which do not contain a W candidate we re-assign jets, replacing eq.(2.1) with the new distance measure. In addition, we need to remove the top mass selection cut eq.(2.3). This way combinations of b quarks and jets which do not fall into the window of eq.(2.3) are kept. The new reconstruction algorithm reads
• (t_w, t_w): keep these buckets as they are,
• (t_w, t_−) or (t_−, t_w): reconstruct the failed bucket using all non-t_w jets, minimizing ∆^bj_B,
• (t_−, t_−): use all jets to minimize ∆^bj_B_1 + ∆^bj_B_2.
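The re-assignment of a failed bucket can be sketched as follows. For brevity we pair the bucket's b-jet with a single best candidate jet; the full algorithm permutes all non-t_w jets. The function name, the single-pairing simplification, and the hard-coded window values are our own illustrative choices:

```python
import math

TARGET = 145.0  # GeV, m_bj peak position read off figure 3

def inv_mass(jets):
    """Invariant mass of the summed four-vectors (E, px, py, pz)."""
    E, px, py, pz = (sum(c) for c in zip(*jets))
    return math.sqrt(max(E * E - px * px - py * py - pz * pz, 0.0))

def reconstruct_t_minus(b_jet, candidate_jets):
    """Pair the bucket's b-jet with the candidate jet minimizing
    Delta_bj = |m_bj - 145 GeV| (eq.(3.1)), then apply the mass window of
    eq.(3.2).  Returns (jet, m_bj) on success, None otherwise."""
    best = min(candidate_jets, key=lambda j: abs(inv_mass([b_jet, j]) - TARGET))
    m = inv_mass([b_jet, best])
    return (best, m) if 75.0 < m < 155.0 else None
```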
Note that for reconstructing b/jet-buckets we use jets both from the t_− bucket and from the ISR bucket.

Table 4. Number of events reconstructed using the b/jet-buckets for (t_w, t_−), (t_−, t_w) and (t_−, t_−) events. The numbers for (t_w, t_w) events are unchanged from table 3.
Compared to the original algorithm we have adapted the metric for assigning jets to top buckets in the t_− category. What remains is to replace the top mass window of eq.(2.3) with appropriate b/jet values. In the right panel of figure 3 we show the b/jet bucket mass distributions m_bj for signal and background. For the signal they agree well with the expectation from the left panel of figure 3. For a top candidate we require at least one b/jet pair satisfying

75 GeV < m_bj < 155 GeV . (3.2)
We show the signal and background efficiencies of this new reconstruction algorithm in table 4, along with the percentage of correct top reconstruction. The numbers need to be compared to table 3. First, we see that the number of events which contain valid top buckets in the correct mass window, albeit including one 2j-bucket, has significantly increased. In the (t w ,t − ) category roughly half of all events reconstruct both tops well, in spite of missing one of the six decay jets. The number of (t − ,t w ) events passing this reconstruction algorithm drops significantly when compared to table 3. Most of these events contain one b-jet and one non-b-tagged jet in B 1 . However, the b-jet in this category of events is typically a merger of a b and the third jet from the top decay. Thus, while the bucket itself has an invariant mass near the top, it contains neither a W candidate nor a b-jet that can be combined with another jet in the event to pass the selection criteria in eq.(3.2). Even in the (t − ,t − ) category where neither of the two buckets include a reconstructed W candidate the fraction of well reconstructed top pairs reaches almost 30%.
To study the quality of the top reconstruction in more detail we show the difference between the bucket momentum and the parton level top momentum in terms of ∆R and ∆p_T/p_T in the right two panels of figure 4. The buckets constructed around b-jets are shown in black. The results of replacing the t_− buckets using the b/jet algorithm are shown in red. For the original buckets we see a narrow peak at zero, which corresponds to complete top momentum reconstruction in events where we fail to find a W candidate only because of overlapping jets. Such events, which are in the minority, often fail the reconstruction based on the ∆^bj_B metric. As a result, the narrow peak at zero is not present in the second reconstruction method.
For t_− buckets the b/jet algorithm consistently reconstructs the top direction significantly better than the original method. In contrast, changing t_w buckets to b/jet-buckets does not improve the momentum reconstruction. We also checked that the b/jet momentum provides a better top momentum reconstruction than using the bottom momentum alone. The results in table 4 indicate that the efficiency as well as the background rejection of our algorithm allow for a systematic study of hadronic top pairs. However, the fraction of events with not-quite-perfect reconstruction of the top directions (R_i > 0.5 for i = 1, 2) is somewhat worrisome. From top tagging we know that a certain fraction of relatively poorly reconstructed tops cannot be avoided [24], but that fraction should be small. What we need is a self-consistency requirement, or quality measure (QMM), similar to only accepting reconstructed tops with p_T,t > 200 GeV in a top tagger [18,19].

Once we identify a top bucket we can use two observables to define such a QMM: the top momentum and the geometric size of the hadronic top decay. The latter is defined differently for t_w buckets and t_− buckets. In the first case we have access to all pair-wise ∆R distances between the three top decay products. We define R_bjj as the maximum of the three ∆R separations of the top decay products. For t_− buckets we only have one distance, namely R_bj between the bottom and the hardest light-flavor jet.
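The size observable is simply the maximal pairwise separation of the bucket constituents, e.g. (helper names are our own):

```python
import math
from itertools import combinations

def delta_r(a, b):
    """Distance of two (eta, phi) directions with phi wrapped into (-pi, pi]."""
    deta = a[0] - b[0]
    dphi = math.atan2(math.sin(a[1] - b[1]), math.cos(a[1] - b[1]))
    return math.sqrt(deta * deta + dphi * dphi)

def bucket_extent(directions):
    """R_bjj for a t_w bucket: the maximum pairwise Delta R among the
    (eta, phi) directions of the top decay products.  For a two-body t-
    bucket the same expression reduces to the single distance R_bj."""
    return max(delta_r(a, b) for a, b in combinations(directions, 2))
```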
In figure 5 we show the correlation between these two observables, first for parton level simulations in the left column. For both kinds of buckets we see a clear correlation, the main difference being that most t_w buckets have relatively low transverse momenta. For t_− buckets, which require the softest top decay jet to fall below p_T,j = 25 GeV, the distribution extends to larger transverse momenta, where the initial boost of the top can compensate for the decay momentum of the softest jet.
The second column shows the reconstructed observables for t_w and t_− buckets, requiring that the buckets reconstruct the parton level top direction within ∆R < 0.5. The correlation between size and transverse momentum is the same as expected from the parton level simulation. However, we clearly see that either large transverse momenta, p_T,t ≳ 100 GeV, or small sizes, ∆R_bj(j) ≲ 2.5, are preferred. This is particularly true for t_− buckets. The reason is that a slight boost of the top quarks generates a geometric separation between the transverse back-to-back tops and the forward ISR jets. Combinations of jets from different buckets are then separated in their typical transverse mass values. This gives us a handle on combinatorics and improves the top reconstruction even when one of the top decay products is missing.
Conversely, buckets passing as tops but giving a poor directional reconstruction reside at low transverse momenta and large size, as can be seen in the third column of figure 5. To veto these buckets we have a choice of criteria in the two-dimensional R_bj(j) vs. p_T,t plane. We choose the condition

p^rec_T,t > 100 GeV (3.3)

at the level of the buckets to increase the fraction of well reconstructed or matched top quarks in both bucket categories. This choice results in the highest efficiency of well-reconstructed tops in both t_w and t_− buckets. Alternative conditions in terms of R_bj(j) or in the two-dimensional planes shown in figure 5 could replace eq.(3.3) in specific analyses. For example, a stricter cut would result in a higher purity of well-reconstructed tops.
To illustrate the power of the bucket algorithm we compute the efficiency for reconstructing a single top as well as a top pair as a function of the transverse momenta of the tops. The left panel of figure 6 shows the efficiency for a bucket tag as a function of the true transverse momentum of the top. The baseline is all fully hadronic tt events in the Standard Model with five or more jets and two b-tags. A possible mis-measurement of p_T,t, in particular at low transverse momenta, explains the tail of events below the consistency criterion p^rec_T,t > 100 GeV. We see that the tagging efficiency increases rapidly right at threshold. Above p_T,t = 150 GeV more than 90% of the tagged top quarks can be matched to a true top within R_i < 0.5. For p_T,t = 100 − 150 GeV about 80% can be so matched. For t_w and t_− buckets the number of unmatched tops becomes negligible above 250 GeV. Adding t_w and t_− buckets, the total efficiency of our algorithm is 60-70% for 150 < p_T,t < 350 GeV.
In the central panel of figure 6 we show the tagging efficiency for two top quarks as a function of the average true transverse momentum p̄_T = (p_T,t1 + p_T,t2)/2. The total efficiency is split between (t_w, t_w) events (black), (t_w, t_−) or (t_−, t_w) events (red), and (t_−, t_−) events (green). For each of these categories we also show the well reconstructed tops only. As expected, the (t_w, t_w) events are reconstructed with an encouragingly high efficiency and an essentially negligible number of non-matched tops. For the other two categories the fraction of unmatched tops is slightly larger, but well under control. For the comparison with the HEPTopTagger approach we show the corresponding efficiencies for the same event sample in the right panel.

Figure 6. Left: efficiency for a single bucket tag as a function of the true transverse momentum of the top, shown for t_w buckets (black) and t_− buckets (red). Dashed lines indicate events matched to a top quark within R_i < 0.5. Center: efficiencies for two bucket tags as a function of the average true p_T,t. We show (t_w, t_w) events in black, (t_w, t_−)/(t_−, t_w) in red, and (t_−, t_−) events in green. Dashed lines again indicate reconstructed tops matched to parton level tops. Right: corresponding efficiency for two HEPTopTagger [17][18][19] tags. In all cases the last bin includes all events above 450 GeV.
Also note that the efficiency for (t_w, t_w) events is slightly higher than the square of the single-bucket t_w efficiency. This is because, once one top in an event is reconstructed, the second top becomes easier to find due to combinatorial factors. Similar correlations occur in the (t_w, t_−), (t_−, t_w) and (t_−, t_−) categories. The total double top tag efficiency for p_T,t = 150 − 350 GeV is close to the single tag efficiency: 50-70%. As we always search for two tops (otherwise we regard the event as un-reconstructed), the total double tag efficiency and the total bucket tag efficiency must be closely related, as long as the individual p_T,t and averaged p̄_T distributions are similar. We should note that some of the unmatched tops may still be correct tags, as QCD effects change the direction of the true top relative to the top decay products at parton and particle level.
Unlike for a typical top tagger, illustrated in the right panel, the efficiency of the buckets does not reach a plateau at large transverse momentum. Once the top decay jets start merging at the scale of the C/A jet size the method fails, so for example R_C/A = 0.5 leads to a drop above p_T,t ∼ m_t/R ∼ 350 GeV. Towards smaller top momenta the requirement of eq.(3.3) limits the efficiency by removing tops poorly reconstructed due to combinatorics. By construction, the bucket method targets the intermediate regime 150 GeV < p_T,t < 350 GeV, where it should serve as a very useful tool in Higgs searches as well as new physics searches.
The resulting cross sections of reconstructed tops with the consistency selection cut eq.(3.3) are summarized in table 5. The total double top tag efficiency for the t_h t̄_h+jets sample with five jets, of which two are b-tagged, is 26%. The mis-tagging efficiency for finding two valid top buckets in pure QCD events (five jets, two mis-tagged as b-jets) is of the order of 5%. Note that we only simulate up to five hard jets for QCD events. While simulating six hard jets at the matrix element level might in principle affect the number of (t_w, t_w) events, our analysis is largely based on events with one t_− bucket, which should be well described by our simulation.

Table 5. Number of events reconstructed using the b/jet-buckets for (t_w, t_−), (t_−, t_w) and (t_−, t_−) events with the p^rec_T,t > 100 GeV cut.
To estimate the effect of QCD scale variations on our results, we assume that the relative ratios of the n-jet QCD backgrounds scale uniformly with this variation. Using this staircase relation [40,41], we estimate the effect of a 10% change for each power of α_s by increasing each event weight depending on its number of jets. We confirm that the background rates in table 5 increase by roughly 55%, while the normalized shape in the right panel of figure 3 does not visibly change. This reflects the fact that our analysis mainly probes background events with at least five jets. While our assumed change in normalization for the merged leading order approach is smaller than what one might expect from fixed-order QCD computations, the key question is how well the predictions, for example from Alpgen or Sherpa, describe the normalization and the shapes of the observed QCD backgrounds at the LHC. If the experimental determination of the QCD backgrounds turns out significantly more precise than the theoretical predictions, a consistent scale choice in the simulation should, within a certain range, be considered a Monte-Carlo tuning parameter rather than a measure of uncertainty [40,41]. In any case, the QCD background normalization should not be a problem for a proof-of-concept analysis in the hadronic top pair channel; the parameters in our algorithm might have to be adjusted, but the ratio S/B ≳ 1 will still be fine. Moreover, in this first study we do not conclusively show that theoretical uncertainties due to higher order corrections have no effect on all our relevant distributions. Instead, we assume that such changes will be mild enough not to affect the kinematic signal extraction significantly. A dedicated experimental study, currently under way, will have to show how the measured QCD backgrounds can indeed be suppressed by our bucket selection algorithm.
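The staircase reweighting described above can be sketched as a per-multiplicity rescaling of event weights. The dictionary interface and the exact form of the per-jet factor are illustrative assumptions, not the paper's actual implementation:

```python
def staircase_reweight(weights_by_njet, delta=0.10):
    """Scale each n-jet event weight by (1 + delta)**n, mimicking a uniform
    10% shift per power of alpha_s across the staircase of jet
    multiplicities.  weights_by_njet maps jet multiplicity -> event weight."""
    return {n: w * (1.0 + delta) ** n for n, w in weights_by_njet.items()}
```

Since the reconstruction demands at least five jets, the surviving background is dominated by the five-jet bin, where a 10% per-jet shift accumulates to an O(50-60%) change in rate, consistent with the roughly 55% increase quoted above.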

Stops from buckets
As a demonstration of our algorithm for top reconstruction, we apply it to scalar top searches. Searches for supersymmetric, or more general, top partners are becoming more and more central in ATLAS and CMS. They constrain the allowed stop masses to m_t̃ ≳ 600 GeV [42-45]. Theoretically, many analysis strategies have been suggested.

Table 6. Cross sections for the top background and for stop pairs with masses of 500, 600, and 700 GeV after selection cuts and application of the b/jet bucket analysis. We assume stops decaying exclusively to 100 GeV neutralinos. The significance for 600 GeV stops is given for an integrated luminosity of 25 fb−1.
In this section, we assume scalar top pair production followed by decay into tops and the lightest neutralino χ̃_1^0 with 100% branching ratio. For all model points we set the lightest neutralino mass to m_χ̃_1^0 = 100 GeV. Cross sections at the LHC assuming √s = 8 TeV are shown in table 6. To generate the signal for stop masses of 500, 600, and 700 GeV we use Herwig++ [56,57]. We normalize the production cross section to the Prospino results at next-to-leading order [58-62].
Since the reconstruction techniques described in the previous section are also applicable to tops from stop decays, we expect good top reconstruction. To reduce the non-top background we first apply a set of simple selection cuts: we require at least five jets, two of them b-tagged, large missing momentum /E_T > 150 GeV, and no isolated leptons. The results are summarized in table 6. Because QCD has no intrinsic source of missing momentum, and W+jets has a small rate and contains a lepton, we ignore these backgrounds in this paper and assume mostly tt̄ backgrounds with large missing transverse momentum, typically the result of mis-measurement or τ decay.
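A minimal sketch of this preselection, assuming a flat Python event record with hypothetical field names and an assumed 25 GeV jet threshold (neither is specified at this point in the text):

```python
# Hypothetical event record:
#   {"jets": [{"pt": ..., "btag": ...}, ...],
#    "met": ...,                 # missing transverse momentum in GeV
#    "isolated_leptons": [...]}  # empty list if no isolated lepton
# Field names and the 25 GeV jet threshold are illustrative assumptions.
def passes_preselection(event):
    """At least five jets, two of them b-tagged, /E_T > 150 GeV,
    and a veto on isolated leptons."""
    jets = [j for j in event["jets"] if j["pt"] > 25.0]
    n_btags = sum(1 for j in jets if j["btag"])
    return (len(jets) >= 5
            and n_btags >= 2
            and event["met"] > 150.0
            and not event["isolated_leptons"])
```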
Based on the algorithm developed in this paper we require two top buckets with b/jet re-ordering. We denote the two reconstructed bucket momenta as p_t1 and p_t2. After the missing momentum cut the main background is semi-leptonic top pairs, which means that one of the two tagged tops in the background sample is mis-tagged.
The advantage of an analysis based on fully hadronic top decays is that both tops are fully reconstructable [18-23]. We use the bucket momenta to compute m_T2(p_t1, p_t2, /E_T) [63,64]. Its distributions for the tt̄ background and the stop pair signal are shown in figure 7.
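For illustration, m_T2 can be evaluated by a brute-force scan over splittings of the missing transverse momentum; dedicated minimisers exist and should be used in practice, and the conventions below (visible systems given as (m, px, py), massless invisible particles) are our own simplification:

```python
import math

def m_transverse(p, q):
    """Transverse mass of a visible system p = (m, px, py) paired with a
    massless invisible particle of transverse momentum q = (qx, qy)."""
    m, px, py = p
    et_vis = math.sqrt(m * m + px * px + py * py)
    et_inv = math.hypot(q[0], q[1])
    mt_sq = m * m + 2.0 * (et_vis * et_inv - px * q[0] - py * q[1])
    return math.sqrt(max(mt_sq, 0.0))

def mt2(p1, p2, met, steps=200, qmax=1000.0):
    """Brute-force m_T2: scan splittings met = q1 + q2 of the missing
    transverse momentum and minimise the larger of the two transverse
    masses. Grid range and resolution are illustrative choices."""
    best = float("inf")
    for i in range(steps + 1):
        for j in range(steps + 1):
            q1 = (-qmax + 2.0 * qmax * i / steps,
                  -qmax + 2.0 * qmax * j / steps)
            q2 = (met[0] - q1[0], met[1] - q1[1])
            best = min(best,
                       max(m_transverse(p1, q1), m_transverse(p2, q2)))
    return best
```

For a back-to-back pair of top-mass visible systems with vanishing missing momentum, the scan correctly returns the top mass itself, since each transverse mass is bounded from below by the visible mass.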
To extract stop pairs we select events with

m_T2 > 350 GeV . (4.1)

After this cut and for a stop mass of 600 GeV we arrive at S/B ∼ 1 and more than three sigma significance at the 8 TeV LHC with the currently available integrated luminosity of 25 fb−1. In addition, the endpoint of the m_T2 distribution with fully reconstructed hadronic tops should allow us to precisely measure the stop mass [18,19]. All intermediate steps as well as results for other stop masses are shown in table 6. Note that some numbers differ from those shown in table 5 due to the leptonic decays. Of all events with two reconstructed tops, about 10% involve τ leptons, both for the signal and the background. After the missing momentum cut a significant fraction (∼ 60%) of the top background comes from these events. In contrast, only 10% of the signal events include a top decay to a τ. Therefore, a τ rejection would improve our results significantly, as shown in table 6.

Conclusion
In this paper we have presented a new method to identify and reconstruct hadronically decaying top quarks. It is based on assigning regular jets to buckets, one for each top decay and one for initial state radiation. The buckets corresponding to tops are each seeded with one of the two b-jets we require in every event. If a top bucket includes all three top decay products it has to fulfill W and top mass constraints. However, frequently the softer W decay jet is missing, so we have to rely on the two leading jets to reconstruct a defined fraction of the top mass. After an appropriate re-ordering of the buckets missing the softest decay jet, both kinds of buckets can be used to reconstruct the top four-momentum.
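The construction described above can be sketched as a brute-force assignment of the non-b jets to the two b-seeded top buckets and the ISR bucket. The metric below, the summed distance of the two top-bucket masses from an assumed top mass, is a simplified stand-in for the full metric of our algorithm, and the four-vector helpers are illustrative:

```python
from itertools import product

M_TOP = 173.1  # GeV, assumed top mass for the toy metric

def add4(moms):
    # Sum four-momenta given as (E, px, py, pz) tuples.
    return tuple(sum(c) for c in zip(*moms))

def mass(p):
    e, px, py, pz = p
    return max(e * e - px * px - py * py - pz * pz, 0.0) ** 0.5

def bucket_assignment(b1, b2, jets):
    """Assign every non-b jet to top bucket 1, top bucket 2, or the ISR
    bucket, seeding the two top buckets with the b-jets, and keep the
    assignment whose top-bucket masses lie closest to M_TOP
    (simplified metric, not the full algorithm of the paper)."""
    best, best_buckets = float("inf"), None
    for assignment in product((0, 1, 2), repeat=len(jets)):
        buckets = [[b1], [b2], []]
        for jet, k in zip(jets, assignment):
            buckets[k].append(jet)
        metric = sum(abs(mass(add4(bk)) - M_TOP) for bk in buckets[:2])
        if metric < best:
            best, best_buckets = metric, buckets
    return best_buckets
```

With O(10) light jets the 3^n assignments are still cheap to enumerate, which is why an exhaustive scan is viable at these multiplicities.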
To suppress tops which for one reason or another cannot be matched to a generated top quark we apply a self-consistency condition (QMM) to each bucket. This condition defines the lower bound of the typical transverse momentum range 100 GeV < p_T,t < 350 GeV to which the method is sensitive. For higher boosts the buckets will eventually fail due to the size of the jets they are constructed from. For top quarks with this moderate boost we achieve a maximum efficiency around 60-70% for the reconstruction of two top quarks.

JHEP08(2013)086
In particular, for p_T,t < 250 GeV our method gives a significant improvement over subjet-based top taggers, which have low efficiencies in this regime. Our algorithm is in fact more appropriate for two hadronically decaying tops than for semi-leptonic decays, as fully hadronic events provide two mass conditions to satisfy in constructing the two top buckets, rather than just one.
To illustrate our approach in a new physics framework we have applied it to supersymmetric stop searches, relying on stop decays to tops and missing energy. Because we reconstruct the top four-momenta we can apply a simple m T 2 analysis, including a measurement of the stop mass. This makes stop search strategies as simple as sbottom or slepton searches.
It should be noted that the simulations we have used in this paper are proof-of-concept only, and must be buttressed by more detailed work. In particular, we have not attempted to fully characterize the effects of QCD uncertainties. Such effects are less critical for the usual top-tagging methods, which are typically benchmarked against dijet QCD backgrounds that are less sensitive to variations in the QCD scale. Fixed-order NLO calculations indicate O(1) theoretical uncertainties on general 5-jet QCD background observables, so our QCD background estimates are not guaranteed to conservatively estimate the measured LHC background rates. The precise structure of the QCD backgrounds will have to be addressed in future theoretical and experimental analyses.
Nevertheless, the numerical results of this study can be directly compared with the published HEPTopTagger studies, based on the same simulated event samples. There should exist a wide range of possible applications for top buckets in ATLAS and CMS. As a first step, hadronic top pair production with and without contributions from beyond the Standard Model might serve as a useful testing ground [20-23].