Multi-tops at the LHC

The experiments at the LHC are searching for many different final states that can hint to the presence of new physics beyond the Standard Model. One of the most interesting and promising sectors for these searches is that of the top quark, for both theoretical and phenomenological reasons linked to its large mass and to its possible special role in the electroweak symmetry breaking sector. We suggest that multi-top events, beyond the standard $t$-$\bar t$ and four top searches, can bring further insight in constraining and discovering physics beyond the Standard Model, taking advantage of experimental techniques similar to those used in present top-quark analyses. This is relevant both for the next data taking runs at the LHC and even more at higher luminosity and higher energy collider options, which are discussed for future LHC upgrades and future accelerators. In particular we consider six top and eight top final states, discussing the generic colour representations for beyond the Standard Model particles giving rise to those final state. We also discuss the limits which can be extracted by using the present analyses sensitive to four top final states, as well as the potential bounds from new searches we propose to experimental collaborations as an alternative.


Contents
1 Introduction: closing the window on top multiplicity The top quark holds a special place in the Standard Model (SM) and in its extensions. Indeed it is one of the central points of interest for the experimental collaborations at the LHC due to its mass and its peculiar properties compared to the other quarks on the phenomenological side, and due to its link to the electroweak sector and to physics beyond the SM on the theoretical side. One very interesting question is whether present searches like those for four top final states can be extended to multi-top final states with more tops, and in particular what can be learned in this exercise. It turns out that even if one can think naively that a very large number of top particles can be produced at the LHC, these multi-top final states in practice have many constraints both regarding the number of particles and the type of production processes. In the following we only perform a preliminary study and we consider simple generic models that yield large multiplicities of top quarks in their final states. A detailed study is not possible without considering all the experimental details, therefore this note is meant only as a suggestion for further analyses by the LHC experimental collaborations. At present only few pioneering experimental analyses have started to focus on related final states (see for example [1] which considers an 8-jet final state), but one could extend these analyses to top quarks. As the number of particles grows, the size of the available phase space shrinks and one can guess that there is a limit on the number of top quarks which can be produced in a single observable event. In the Standard Model, this number however lies well below the naive estimate N max = √ s m t ≈ 80 for √ s = 14 TeV, and as Figure 1 shows, there is little hope to see more than 4 tops at the LHC. Even beyond the Standard Model the maximal top multiplicity observable at the LHC stays much smaller than N max . In this note we attempt to give an idea of this limit using a set of simple and generic effective models which yield six or more top quarks in the final state.

Toy models for multi-top physics
If several different models can give rise to the final states we are interested in, we limit ourselves to a set of models based on the assumption that new physics couples only to top quarks (a kind of "top portal"). The topologies we consider, consist of the decay chains of pair-produced coloured particles, which will either be "coloron"-like bosons (see for example [2]) or t -like vector-like fermions (see for example [3][4][5]).
It is important to note that no matter how complicated the decay chain, there is only one parameter that strongly influences the event yield: the mass of the original pair-produced particle, as its production mechanism is fixed by QCD and we assume a unique decay channel. Hence limits and observation windows estimated using cuts and simple event counting with these topologies are fairly generic and can be generalised to many other models with the same signatures.

Six tops
As the four top final state is already extensively studied in the literature [6][7][8][9], and searched in detail by the LHC experimental collaborations [10,11], we start considering the six top quark final state, by analysing the production and decay of such a state together with simple analyses which can be used to bound this process.

Model and production process
Reaching a six top final state requires two particles beyond the Standard Model: a top partner T with mass M T and a bosonic particle which we will call Z , with mass M Z . The six top final state is produced in the process shown in Figure 2. The most natural gauge group representation is (3, 1, 2/3) for T and (1, 1, 0) for Z . More exotic colour embeddings are possible, as listed in Table 1, but having an additional coloured particle obviously leads to a richer -and hence more constrained -model. In the following subsections, we will set limits on this model using searches for new physics at the LHC, which could be sensitive to the final state of Figure 2.

Kinematic Distributions
As usual with models involving a high-mass new particle, the process described above always involves a high total transverse energy H T . The bulk of the distribution is shifted to higher Table 1. All possible colour embeddings for T and Z in the topology of Figure 2 energies as M T grows (Figure 3(a)), and so with √ s, in a milder fashion (Figure 3(b)). But even at the kinematic limit and 8 TeV, it is largely sufficient for a large majority of events to pass the most selective QCD-background-reducing cuts used in CMS lepton analyses (see [12]). Since we focus on events where two tops decay leptonically, there is also a large amount of E T coming from the undetected neutrinos, which again is typically sufficient to be distinguished from QCD events. The shape of the E T distribution has very low dependence in the centre-of-mass energy of the collisions (Figure 3(d)), which is also true for the transverse momentum of observed particles, such as leptons and jets ( Figure 3(f)). The shapes of these variable do however depend on M T (Figures 3(c) and 3(e)). Angular distance variables show the same lack of √ s-dependence (Figure 3(h)) and also do not depend on M T (Figure 3(g)). This shows that due to the high multiplicity of the event, little correlation is kept between the leptons and the other decay products and one can consider the events as almost spherical.

Existing limits
In the following section we examine the present bounds on the six-top final state coming from different analyses, mainly those including two and three same sign leptons, which have a very reduced background. We also discuss simple improvements in these search strategies.

Two same-sign leptons with b-jets
Among the large possible number of detector-level signatures of the six-top final state, those involving several same-sign leptons seem the most promising. These leptons will be accompanied by a large number of b-jets and jets. The CMS search for same-sign dileptons production associated with b-jets (2SSL+b) presented in [12] is sensitive to such final states, so we reproduced their analysis on simulated events from our model to set limits on our parameter space. We specialised to the signal region 7 of the search (SR7), whose selection criteria are detailed in Table 2, due to its low background expectation value.
Other analyses could be sensitive to these multi-lepton, multi-jet, E T final states, in particular ATLAS supersymmetry searches in the squark-gluino searches [14] and [13]. However they both have drawbacks that make [12] seem most relevant: the first only focuses on opposite sign dileptons and hence will have a larger background and put weaker limits while in the latter, only signal regions SR3b and SR3L would be sensitive due to the chargino invariant  mass cut of the others. However, SR3b does not use E T in its selection which allows for more QCD background and SR3L does not use b-tagging information.
The events were generated by MadGraph 5 [17] and further hadronised in Pythia 6 [18]. The jets were reconstructed using FastJet [20] with the anti-k T algorithm (R = 0.5 as used in the original search on real data) and we applied the cuts in Table 2 to the reconstructedlevel variables with MadAnalysis 5 [19]. This provides an event yield expectation value for our signal, for which we can compute a confidence level for the signal to be compatible with the observed number of events using the CL s method [21]. The expected background and measured number of event are taken from the CMS search and both signal and background are modelled using Poisson statistics. Of course, we apply our selection at the reconstructed level without proper detector simulation, which means that our confidence levels should be taken for what they are worth: gross estimates for a preliminary exploration of multitop final states.
We performed a parameter scan in the plane M T , M Z in the minimal colour embedding model and wehave been able to exclude the region below M T < 710 GeV, as shown in Figure 4. As expected the limit has very mild dependance on M Z so we will keep it at 400 GeV and only scan over M T for further discussions.
We did not account for K-factors in our limits, which means that we underestimate the parton-level cross-sections and thus the limit on M T . This is justified by comparing tt production in Madgraph5 aMC@NLO at LO and NLO with m t = 600 − 1000 GeV which yields a < 20% increase in the overall cross-section. Such a factor would change unsignificantly the limit we set on M T . This limit could be extrapolated to the more unusual colour structures of Table 1 by multiplication of the signal yield by adapting the color factor in the approximation where we neglect colour correlation effects.

Possible improvements in the event selection
The analysis we rely on to set bounds on our model, uses very stringent cuts, which reduce the background tremendously but also eliminate a significant part of the signal. Among the most restrictive cuts, the requirement for all jets to have a p T above 40 GeV seems to be rather difficult to pass so we considered an alternative selection where only the two leading b-jets are asked to fulfil this condition, while other jets need only have a p T bigger than 20 GeV. As shown in Figure 5, this increases the event yield of our signal for the lower M T   Figure 5. Comparison between the signal event yield in the CMS analysis (black) and with reduced jet p T cuts (purple), which could allow to extend the limit by 50 GeV the background estimates are data-driven, it is hard to extrapolate it to selections that have not been studied by CMS. One could however hope to balance the increase due to this less constraining condition by imposing harder H T cuts, which, as shown in Figure 3(a) could be as high as 500 GeV without reducing the signal in a noticeable way. For want of a more precise description in the expected background in this case, we predict that such a change in the cuts would account for an increase of around 40 GeV in the hypothesis that the balance between the change in p T and H T cuts is exact. This should not, of course, be seen as an accurate prediction but rather as a rough estimate to illustrate this discussion.

Adding an extra lepton
Among the main issues one might encounter for such an analysis is the technical difficulty of identifying and tagging many jets in the very intense hadronic activity that this kind of events produce. To anticipate this possible limitation, we propose a different analysis (3SSL) where the hadronic activity is only controlled with a transverse energy cut, without requiring jet identification. This of course is much less constraining than the previous selection relying on the presence of three b-tagged jets so we propose to balance the background increase by increasing the requested isolated same-sign lepton count by one, which will drastically reduce any Standard Model contribution. This is however also harmful for the signal as we now pay the price of another leptonic top decay. Our selection relies then on the following criteria: • 3 leptons of the same sign with p T > 20 GeV • the sum of the hadronic p T within a cone ∆R < 0.3 of each lepton should not be larger than 10% of the lepton's p T In order to investigate the possible reach of such a search, we model the background for having three same-sign leptons at the LHC using MadGraph and Pythia. This background can be sorted in three categories, depending on whether the leptons come from a physical process, lepton charge misidentification or fakes from heavy flavour decays. Real three samesign lepton processes in the Standard Model are really suppressed since each lepton will come from a separate W , Z or t decay. Table 3 summarises the dominating channels and the associated cross-sections at both 8 TeV and 14 TeV, and shows how small their contribution is.
The contribution of lepton charge misidentification was estimated at 8 TeV using an CMS search for trileptons [22]. We take their measured number of events with no hadronic τ decay with H T > 200 GeV and E T > 50 GeV, which is 218 events with 19.5 fb − 1 at 8 TeV and multiply it by the highest estimation for the lepton charge misidentification rate: 10 −3 , which results in a cross section contribution of 1.1 × 10 −2 fb.
Finally, the background from fake leptons stemming from heavy-flavour decay is modelled using the rule-of-thumb presented in [15], that 1 in 200 b-quarks typically gives an isolated electron or muon. We show in Table 4 the main channels where a b-quark decay is at the origin of one of the three leptons. Given that a rather high cross-section of ≈ 10 −4 pb is reached in the first three channels under consideration, we studied them in a more precise analysis in MG5+Pythia.
3.67 × 10 −8 Table 4. Processes with heavy flavour decay contributing to 3SSL. The only kinematic cut imposed was p T,l > 10 GeV. The branching b → isolated lepton is taken as 2% as advised in [15] and for the most computer resource intensive channels, the production cross-section is multiplied by appropriate factors of Z → l branching ratios (≈ 7%) and W → l branching ratios (≈ 20%). The first three channels are intense enough to deserve a complete analysis as described below.
The analysis included the full production of the final state with decayed vector bosons and top quarks at the MG5 level, with further decay, ISR, FSR and hadronisation in Pythia.
Applying the full 3SSL analysis shows that the real background is much lower than the estimate, so low in fact that we were unable to generate events passing all the cuts in two channels, for which we give only upper bounds for the cross-section. This can be probably explained by the fact that among the two leptons from B decay which are isolated from their mother jet, only a small fraction are isolated from the surrounding intense hadronic activity.
4.98 × 10 −7 Table 5. Event yield expected with the restricted cuts. Now that we have some handle on the background we can turn to the signal at 8 TeV. The number of signal events in a scan over M T is shown in Figure 6, where it is clear that this channel is less intense than the 2SSL+b channel. If the latter should fail due to the reasons we listed above, an analysis using 3SSL might put a weaker limit on our toy model. At 8 TeV, the signal is however so low that with our background estimate that sums up to ≈ 1 × 10 −1 fb, the expected limit is below the kinematic limit put on M T , as shown in Figure 7 (left panel). Our background estimation is however very conservative so that we display in Figure 7 (right panel) the expected confidence level on the exclusion of our model where we add a 10% acceptance to all simulated channels that were not processed through the whole 3SSL analysis. We keep the data-based charged misidentification contribution unchanged because the CMS analysis already performs cuts very similar to ours. This 10% acceptance factor seems reasonable given the < 10 −2 factor between the naive and realistic simulations of the heavy flavour decay contribution. In that case we can set a 1σ limit at 625 GeV, which is a little above the kinematic limit. We shall however see in the next section that there is hope to increase this limit by a large value with the LHC at 14 TeV.

Four top processes
While this model was constructed to study six top final states, it can yield four tops as well in all color embedding. Indeed, a highly off-shell top in a pair-production might radiate a Z whose decay gives two additional quarks. This channel, however, is negligible, as we checked by comparing its contribution to 2SSL+b to that of the 6 top process, which is indeed largely dominant by two orders of magnitudes. A significant production rate can however be achieved when the Z is a coloured (coloron-like) vector. The QCD coloured Z pair-production is by far more important compared to T pair-production as we only consider the region of phase space where M T > M Z + m t . The limit set by this process will be much more stringent than  those set by the six-top channel, which means that large-multiplicity final states are not of great interest when the Z is colored, and we will set it aside in the rest of this work.

Perspectives at 14 TeV
The analysis used to constrain our model has a background which is very hard to properly reproduce in simulations, which is why experiments do not rely on Monte Carlo samples but use data-driven techniques to estimate it. This, however makes it impossible to scale their results up to higher energies to precisely establish the expected limits for the next run of the LHC. We can however rely on the smallness of the background to make gross estimate of the observation window for a given amount of data, knowing that several events would probably be sufficient for having an observable signal. Figure 9 shows the event yield expected for a 2SSL+b analysis with 10.5 fb, which shows that the reach could be enhanced rather strongly if the background does not increase too much. One can also project the expected yield for the three same-sign lepton analysis at 14 TeV. As for the previous case, it is impossible to make a sensible extrapolation of the background because part of it is data-based. We can hence only present the event yields as a function of M T . As was shown already at 8 TeV, the cross-section is significantly decreased compared to the 2SSL+b case but gains a very large factor compared to the 8 TeV case, which means that one could hope for this channel to start being competitive, since its background will most likely not scale as fast with energy.

Model and production process
It is easy to build upon our first toy model to have a decay chain producing eight top quarks in the final state by adding an extra bosonic coloured particle ρ decaying to tT . This resonance can be pair produced in pp collisions by QCD and it can give rise to the expected signature.
In this model, the six-top final state can obviously also arise and all the above discussion applies, but we want to investigate the phenomenology of the eight top final state. The colour assignments for T and Z are constrained exactly as before, which imposes ρ to be a colour-octet if we stick to the minimal and phenomenologically more reasonable choice for T and Z . For the sake of the discussion, we will also set it as a real vector field.

Phenomenology of the eight-top final state
The most promising ways to detect a multi-top final is through a leptonic channel. As before, exploiting the possibility of having same-sign leptons is probably the best way to reduce Standard Model backgrounds. It is therefore useful for understanding how the dynamics differs from the six-top case, as presented in Figure 11. One can see that as expected, the H T spectrum is harder for eight-top events. The E T and leptonic p T are slightly lower than in the six-top case, which is understandable since individual tops have lower momentum. The extra hadronic activity is also visible in the distribution of the leptonic isolation variable (sum of the hadronic p T in the vicinity of a lepton divided by the lepton's p T ), which means that the leptonic cuts have a lower acceptance in the eight-top final state. These factors combine to eliminate the possibility to put any bound on M ρ with the 2SSL+b analysis using current data, as for a 800 GeV, the results of the CMS analysis is compatible with the background only hypothesis within less than 1σ. The 3SSL analysis is no better off: even though combinatorics help reducing the signal loss due to the third branching ratio to leptons, compared to the six-top case, no signal event is expected at the LHC with the current amount of data at 800 GeV.

Conclusions
This article tackles in a preliminary way the question of the maximal top multiplicity that could be detected in order to motivate searches for those final states that could be accessible. In a very broad analysis based on toy models, we could establish that current LHC data could allow to place a limit on new physics giving rise to six-top final states, but that eight-top final states are beyond our reach. We could also show that there is hope to increase the window in top multiplicity once the LHC runs at nominal energy, as much stronger bounds should be placed on six-top processes, and eight top processes should begin to become visible. This work is however more of a proof-of-principle than a precise analysis, which should be conducted in a more realistic setup by the experimental collaborations, in order to be able to place precise bounds.