LHC Top Partner Searches Beyond the 2 TeV Mass Region

We propose a new search strategy for heavy top partners at the early stages of the LHC run-II, based on lepton-jet final states. Our results show that final states containing a boosted massive jet and a hard lepton, in addition to a top quark and possibly a forward jet, offer a new window to both detecting and measuring top partners of mass $\sim 2$ TeV. Our resulting signal significance is comparable or superior to the same sign di-lepton channels for top partner masses heavier than roughly 1 TeV. Unlike the di-lepton channel, the selection criteria we propose are sensitive both to $5/3$ and $1/3$ charge top partners and allow for full reconstruction of the resonance mass peak. Our search strategy utilizes a simplified $b$-tagging procedure and the Template Overlap Method to tag the massive boosted objects and reject the corresponding backgrounds. In addition, we propose a new, pileup insensitive method, to tag forward jets which characterize our signal events. We consider full effects of pileup contamination at 50 interactions per bunch crossing. We demonstrate that even in the most pessimistic pileup scenarios, the significance we obtain is sufficient to claim a discovery over a wide range of top partner parameters. While we focus on the minimal natural composite Higgs model, the results of this paper can be easily translated into bounds on any heavy partner with a $t\bar{t}Wj$ final state topology.


I. INTRODUCTION
The discovery of the Higgs boson at the Large Hadron Collider (LHC) is a great victory for the Standard Model (SM) of particle physics. With its minimal scalar sector of electroweak symmetry breaking, the SM at short distances is a complete weakly coupled theory up to very large energy scales. Furthermore, the SM admits a set of accidental symmetries that eliminate proton decay and suppress custodial, flavor and CP violating processes. Even though the SM cannot explain several experimental observations, such as the neutrino masses, the baryon asymmetry of the universe and the origin of dark matter, one cannot deduce with any certainty the energy scale at which extensions of the SM would become relevant, with the exceptions of the Planck scale and the scale of the Landau pole of the hypercharge interactions. The only fuzzy scale potentially accessible to the LHC is related to the recently discovered Higgs boson. As the Higgs is a fundamental scalar, its mass is ultraviolet (UV) sensitive. Hence, we expect that at the quantum level the Higgs mass will pick up large contributions from high energy scales, resulting in a very large mass of the Higgs boson. This, of course, is in direct contradiction with our direct and indirect knowledge of the Higgs boson dynamics.
A simple possibility to stabilize the Higgs mass and the electroweak scale in a controlled manner is to add new fields to the SM, with the same gauge quantum numbers as the SM fields, such that the contributions of the new fields to the Higgs mass eliminate the UV sensitivity. In the absence of interactions the Higgs would lose its quantum sensitivity (setting quantum gravity aside), and hence the most severe known sensitivity of the Higgs to quantum corrections arises as a result of its large coupling to the top quark. To ensure the stabilization of the electroweak scale, the virtual contributions of some of the new particles to the Higgs mass should cancel the contributions coming from the SM top quarks. These new states are collectively denoted as top partners. In known examples the partners might be scalars, as in the case of supersymmetry, or fermions, as in the case of composite Higgs models (CHMs). Current bounds on the top partner masses are roughly 700 GeV for supersymmetric scalar states and 800 GeV for composite-Higgs fermionic states (see e.g. Refs. [1,2] for recent results).
While the bounds on the top partner masses are fairly strong, they are not bulletproof, and they result in only moderate pressure on naturalness (here we are not concerned with various definitions of fine tuning). Probably the most relevant question amidst the "LHC battle for naturalness" is how we are going to discover top partners (if any exist) or improve the bounds on the top partners, both in terms of mass reach and in terms of robustness. These two criteria can be used to guide the focus of theoretical, phenomenological and experimental effort.
One can define two "mini-frontiers" for the battle for naturalness at the LHC [3]:
1. The mini energy frontier, where the effort is directed towards searching for ultra massive top partners. The experimental focus of the energy frontier searches is defined by the highest center-of-mass energies that can be reached by the LHC.
2. The mini intensity frontier, where the effort is focused on searching for partners with mass below or near the current bounds. The mini intensity frontier focuses the searches on the possibility that the partners are elusive (i.e. that for some reason the current searches are not sensitive enough to their presence).
The physics describing the above frontiers is qualitatively different, both in terms of the phenomenology and in terms of the necessary experimental effort. It is important to notice that prior to the start of the LHC the starting points of supersymmetry and pseudo-Nambu-Goldstone boson (pNGB) composite Higgs models were different in the context of naturalness. If we were to remove our LHC-based knowledge (the results of the ATLAS and CMS direct searches), supersymmetric models would not be subject to any substantial pressure from naturalness. For instance, stop (as well as most of the other superpartner) masses close to that of the top quark would not be in conflict with existing data. This is not the case for pNGB composite Higgs models, as the combination of LEP and Tevatron data already constrains the model's decay constant to $f > \mathcal{O}(800~\mathrm{GeV})$ [4,5]. Beyond the mere fact that this rather strong constraint on the value of $f$ forces some amount of fine tuning, it also suggests that we should have expected the composite fermion resonances to be somewhat heavy, with masses probably larger than $f$. Even at the centre of mass energy of 8 TeV, the typical fermionic top partner production cross sections and the collected luminosity were simply not enough to produce the heavy partners. Thus, there is very little surprise that the first run of the LHC, which was limited in centre of mass energy, did not observe them. In order to make experimental progress on fermionic top partner searches at the LHC, it is hence necessary to focus on the region of parameter space where the top partner masses are larger than $f$. So far, the parameter space region of heavy fermionic top partners has not been explored, providing the main motivation for our current study of heavy top partners at the mini energy frontier.
The main focus of this work is to study the reach of the LHC in the next run to discover and measure (or exclude) the presence of top partners in regions of model parameter space which result in large top partner masses. When searching for top partners one needs to distinguish between event topologies of pair produced and singly produced top partners [6,7]. While the former is more robust, as the partners are produced via SM QCD processes, it suffers from a severe "large $x$ suppression" from the parton distribution function (PDF) for large top partner masses. As two heavy particles are produced, the quarks and gluons in the proton have to carry a high momentum fraction $x$ in order to achieve a heavy final state. The expected reach of searches for pair produced top partners is rather limited, even when considering high luminosities [8]. Single production processes, on the other hand, are model dependent but are subject to a much lower level of PDF suppression and thus can potentially lead to a much better experimental reach.
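The large $x$ argument can be made concrete with a short kinematic estimate. The sketch below is only an illustration, not part of the analysis: it computes the minimum product of parton momentum fractions, $\tau_{min} = \hat{s}_{min}/s$, needed to reach threshold for pair production and for single production in association with a top. The function names and the top mass value are assumptions introduced here, and the jet mass is neglected.

```python
import math

SQRT_S = 14.0   # pp centre of mass energy in TeV
M_TOP = 0.173   # top quark mass in TeV (assumed value for illustration)

def min_tau(threshold_mass, sqrt_s=SQRT_S):
    """Smallest x1*x2 for which the partonic energy reaches the threshold."""
    return (threshold_mass / sqrt_s) ** 2

def threshold_fractions(m_partner):
    """Return (tau_min for pair production, tau_min for single production)."""
    pair = min_tau(2.0 * m_partner)        # two heavy partners in the final state
    single = min_tau(m_partner + M_TOP)    # one partner plus a top, jet mass neglected
    return pair, single

pair, single = threshold_fractions(2.0)    # 2 TeV top partner
print(f"pair:   tau_min = {pair:.3f}, typical x = {math.sqrt(pair):.2f}")
print(f"single: tau_min = {single:.3f}, typical x = {math.sqrt(single):.2f}")
```

Since parton densities fall steeply with $x$, the smaller threshold value for single production is what drives the weaker PDF suppression described above; the actual rates also depend on the couplings and on the partons involved.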
Following the original papers that emphasized the importance of the same sign lepton signal [9][10][11][12], most of the fermionic top partner studies so far have focused on final states with two same sign leptons. Standard Model processes are highly unlikely to produce final states with two same sign leptons, making such signals a "clean" signature of BSM physics. However, the same sign lepton searches are only applicable to the exotic 5/3 charged top partners. Furthermore, the di-lepton final states suffer from low branching ratios and from the fact that the resonance masses are smeared, because the missing energy has at least two hard neutrino components.
In this paper we consider the case where the heavy partners decay to hadronic-leptonic final states. For other studies involving hadronic final states see [6,7,13,14]. We provide a strategy and a detailed phenomenological study which shows that in preferred regions of pNGB composite Higgs models one can discover top partners (at $5\sigma$ CL) with mass as high as 2 TeV at the 14 TeV LHC run, with an integrated luminosity of roughly 35 fb$^{-1}$. Furthermore, in the absence of a signal one can exclude the presence of 2 TeV partners (at $2\sigma$ CL) with as little as 10 fb$^{-1}$.
Our study adopts the Template Overlap Method (TOM) [15][16][17][18] to tag the highly boosted decay products of the partners and in part reject the corresponding SM backgrounds. The final state of our signal events is characterized by multiple $b$-jets, which we exploit through a semi-realistic $b$-tagging procedure. We take into account the contamination from pileup, assuming an average of 50 interactions per bunch crossing, and show where the effects of pileup on our selection criteria can be mitigated and where additional improvement might be necessary. Finally, our study of singly produced top partners exploits the presence of a high energy forward jet in the signal events, which is in principle susceptible to contamination from pileup. We propose a modification of forward-jet tagging, whereby we cluster the jets in the forward region using a small cone (e.g. $r = 0.2$). We show that the signal distributions are hardly affected by the reduction in forward jet cone size, while the background is significantly suppressed upon requiring a forward jet tag. As this is the first time that such a technique is proposed, we present the results with and without the use of this new forward-jet tagger.
At large top partner masses the mass splitting between the partners due to electroweak symmetry breaking is subdominant. Hence, as we are not confined to the same sign di-lepton final states, our event selection strategy is adequate for searches for all partners that decay to tops and $W$s, and not only the 5/3 charged ones. For the sake of concreteness and simplicity our current study focuses only on the relatively simple final state of $t\bar{t}Wj$. Note, however, that it is straightforward to generalize our study to include other final states as well.
In Section II we provide a brief introduction to our benchmark composite Higgs model. We include only the bare minimum of information directly relevant for the phenomenology of top partners and postpone a detailed discussion of the composite Higgs models and derivations of the equations until the Appendix. Section II also contains a discussion of the dominant production and decay modes of fermionic top partners. The main results of the paper are discussed in detail in Section III. We include a detailed overview of our forward jet tagging proposal in Section III C, and discuss our simplified $b$-tagging algorithm and the boosted jet tagger in Sections III B and III D. We present the results on the sensitivity of Run-II LHC searches at 14 TeV to heavy top partner masses in Section III F. Finally, in Section III H we comment on the use of various final states to extract additional information about top partners, if a signal is ever observed. A highly detailed description of our benchmark composite Higgs model, top partner production mechanisms and decays can be found in the Appendix.

A. Brief Description of the Benchmark Model
In this article, we use the Minimal Composite Higgs Model (MCHM) [19] as a benchmark for illustrating the performance of our event selection in searches for top partners. Here we give a brief overview of the model features important for our phenomenology study. For a detailed description of the model see Appendix B 5.
The Higgs doublet in MCHM is a Goldstone boson multiplet which arises from the breaking of a global $SO(5) \times U(1)_X$ down to $SO(4) \times U(1)_X \simeq SU(2)_R \times SU(2)_L \times U(1)_X$ of a strongly coupled theory. The $SU(2)_L$ and a $U(1)$ subgroup of $SU(2)_R \times U(1)_X$ are gauged in order to provide the electroweak gauge bosons. The low energy description of the strongly coupled sector with "weakly coupled" deformations is expected to contain additional scalar, fermionic and vector resonances, typically at a scale $g_* f$, where $f$ is the scale of compositeness and $g_*$ is a strong coupling, $\mathcal{O}(1) \leq g_* \leq 4\pi$. Electroweak precision measurements tend to push the mass bounds on scalar and vector resonances towards the multi-TeV range, while light ($\sim$ TeV) fermionic resonances are required in order to accommodate an effective potential for the Higgs which induces Electroweak Symmetry Breaking (EWSB) and the Higgs mass.
We use a bottom-up approach and only include a minimal set of light fermionic resonances: a top partner multiplet in the $\mathbf{5}$ of $SO(5)$. The partner multiplet contains a partner with electric charge 5/3 ($X_{5/3}$), a partner with charge $-1/3$ ($B$), and three partners with charge 2/3 ($T_{f1,2}$ and $T_s$), where $T_s$ is a singlet while the other four states form a $\mathbf{4}$ under the $SO(4)$. A generic feature of composite Higgs models is that the 5/3 charge partner ($X_{5/3}$) is the lightest state amongst the partners in the $\mathbf{4}$. Furthermore, if one neglects the electric charge sign of the decay products, the phenomenological signatures of the $X_{5/3}$ and the $B$ are identical. We will hence focus our effort on searches for $X_{5/3}/B$ states and postpone the searches for other top partners until future studies.
Figure 1: Dominant single-production channels for the top partners $X_{5/3}$, $B$, $T_{f1}$ and $T_{f2}$ (from left to right) at a proton-proton collider.
Upon the diagonalization of the mass matrix (see the Appendix for more detail) the masses of the top and the partners are given by Eq. (1), where $M_1$ and $M_4$ are the singlet and fourplet mass scales, $\phi$ is a relative phase between them (see [20] for a detailed discussion of the model's flavor parameters), $f$ is the compositeness scale, $y_{L,R}$ are the left-handed/right-handed pre-Yukawa couplings, and $\epsilon \equiv v/f$. Eq. (1) reveals an important point which we will employ in the following sections. The mass splitting between the $X_{5/3}$ and the $B$ goes as $f/M_4$, implying that the heavier the $X_{5/3}$ partner is, the more mass degenerate it becomes with the $B$ state, provided $y_L$ is not too big.
Our current study will focus only on the $tW$ decays of the top partners, since this is the only mode the $X_{5/3}$ can decay to due to charge conservation. The dominant couplings of the $X_{5/3}$ and $B$ states are of the strength given in Eq. (2), where $c_R$ is a right-handed strong sector coupling between the partners in the $\mathbf{1}$ and $\mathbf{4}$.

B. Production of Top Partners
The top partners are colored and can therefore be pair-produced via QCD interactions, where the production cross section only depends on the mass of the respective top partner. The top partners can also be single-produced via the interactions of Eq. (2). For low top partner masses pair production dominates, but for higher top partner masses single production becomes kinematically favorable, as can be seen in Fig. 2.$^4$ Since here we are interested in heavy top partners, we will focus our attention on single production only. Fig. 1 shows the dominant production channels for the respective top partners. The $X_{5/3}$ partner is produced together with a jet and an anti-top, where the dominant effective coupling is right-handed. Due to the larger up quark PDF in the proton, $X_{5/3}$ production is preferred as compared to $\bar{X}_{5/3}$ production, which requires a $d$ or a $\bar{u}$ in the initial state. The $\bar{B}$ is produced together with a jet and a top via a right-handed coupling, with a preference of $\bar{B}$ over $B$ production, again due to the larger up quark PDF. The fourplet top partners $T_{f1}$ and $T_{f2}$ are produced together with a jet and a top via a right-handed coupling. Analogously, their anti-particles are produced together with a jet and an anti-top. As their production arises from a $Z$ which is radiated off an initial state $u$, the production rates for them and their antiparticles are comparable. Finally, the singlet top partner $T_s$ dominantly couples to $Wb$ via a left-handed coupling. It can thus be produced together with only a jet, but requires a (PDF suppressed) $b$ quark in the initial state. Due to the larger up-quark PDF, $T_s$ production is preferred over $\bar{T}_s$ production at a proton-proton collider.
Figure 2: Left: Pair-production cross section for $X_{5/3}$ and single-production cross section for $X_{5/3}$ or $\bar{X}_{5/3}$ as a function of $M_4$ for different values of $c_R$. Other parameters are fixed to $f = 800$ GeV, $M_1 = 1.5$ TeV, $\phi = \pi$, $y_L = 1$. Right: Single-production cross section for $X_{5/3}$ or $\bar{X}_{5/3}$, as compared to the single-production cross section of other top partners and their antiparticles, as a function of $M_4$. Other parameters are fixed to $f = 800$ GeV, $M_1 = 1.5$ TeV, $\phi = \pi$, $c_L = c_R = 3$, $y_L = 1$.
The effective couplings of Eq. (2) relevant for single production$^5$ depend not only on the mass of the top partners but also directly on the other model parameters, in particular on $c_R$ (for $X_{5/3}$, $T_{f1}$, $T_{f2}$ and $B$ single production) and $c_L$ (for $T_s$ single production), as well as on the pre-Yukawa couplings and the relative phase $\phi$. In the large $c_{L,R}$ limit, the production cross sections of the fourplet (singlet) states scale with $|c_R|^2$ ($|c_L|^2$). For $c_R \sim 1$, the first ($c$-independent) and second ($c$-dependent) term contributions to the effective couplings in Eq. (2) become comparable in magnitude, and can cancel or enhance each other depending on the phase of $c_R$ and $\phi$. As an illustration, the left panel of Fig. 2 shows the single production cross section of $X_{5/3}$ and $\bar{X}_{5/3}$ for different values of $c_R$ as a function of $M_4$, where we fixed the other model parameters to $f = 800$ GeV, $M_1 = 1.5$ TeV, $\phi = \pi$, $y_L = 1$, and $y_R \sim \mathcal{O}(1)$, while making sure to reproduce the top mass. For comparison we also show the pair production cross section for $X_{5/3} + \bar{X}_{5/3}$.
In the limit of large $c_R$, the production cross section of the $\bar{B}$ is marginally lower than the one for $X_{5/3}$ because the $B$ is slightly heavier. The $T_{f1,2}$ production cross sections are lower because the dominant production channel involves two couplings to the $Z$ rather than to the $W$, which yields a suppression of $\left(\frac{g/2c_w}{g/\sqrt{2}}\right)^4 \sim 0.4$. As an illustration, the right panel of Fig. 2 shows the single production cross section of $X_{5/3}$, $B$, $T_{f1}$, $T_{f2}$, $T_s$ and their antiparticles as a function of $M_4$, where we fixed the other model parameters to $f = 800$ GeV, $M_1 = 1.5$ TeV, $\phi = \pi$, $c_L = c_R = 3$, $y_L = 1$, and $y_R$ by the requirement to reproduce the top mass.
$^4$ The single production cross section depends on the model parameters beyond the mass of the top partners, as can be seen already from the couplings in Eq. (2). Hence, the top partner mass scale at which single production becomes dominant depends on the model parameters. We will return to this point momentarily.
$^5$ The analogous couplings for the charge 2/3 partners are given in Eq. (B37).
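The suppression factor of roughly 0.4 for $Z$-mediated relative to $W$-mediated production can be checked numerically. This is a minimal sketch that assumes an approximate value of the weak mixing angle, $\sin^2\theta_w \approx 0.231$, which is not quoted in the text; the function name is introduced here for illustration only.

```python
import math

SIN2_THETA_W = 0.231  # assumed approximate value of sin^2(theta_w)

def z_over_w_suppression(sin2_theta_w=SIN2_THETA_W):
    """Return ((g / 2 c_w) / (g / sqrt(2)))**4; the gauge coupling g cancels."""
    cos_w = math.sqrt(1.0 - sin2_theta_w)
    ratio = (1.0 / (2.0 * cos_w)) / (1.0 / math.sqrt(2.0))
    return ratio ** 4

print(f"suppression factor = {z_over_w_suppression():.2f}")
```

The result is equivalent to $1/(4 c_w^4)$, so it depends only mildly on the exact value chosen for the mixing angle.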

C. Top Partner Decays
For all top partners, the dominant couplings to $W$, $Z$, $h$ and an SM quark are chiral (either the left- or the right-handed coupling dominates). In this case, the partial widths for a decay of a fermion $F$ into a fermion $f$ and a gauge boson or Higgs are [equation missing in source]. For the $X_{5/3}$ partner one obtains [equation missing in source]. There are several interesting features of the $X_{5/3}$ decay width to $tW$. First, note that although the effective coupling is $\mathcal{O}(\epsilon)$, the partial width is not suppressed. For large $c_R$ (and $M_1$ and $M_4$ of similar size), it is proportional to $|y_R^2 c_R^2|$. For $|y_R^2 c_R^2| \lesssim 5$, this still yields a narrow resonance ($\Gamma/M \lesssim 25\%$), but for larger values of $y_R c_R$ the resonances become broad. Resonances of ultra-large widths are difficult to measure since they tend to "blend" into the continuum spectra of differential cross sections. Hence, the regions of parameter space which can be probed by the future LHC runs are limited by the width/mass resolution.
Since $X_{5/3}$ is the lightest partner state in the $\mathbf{4}$ (fourplet), decays into $B$, $T_{f1}$, $T_{f2}$ and SM particles are kinematically forbidden, and hence $X_{5/3}$ always decays into $Wt$. For the $B$ decay width and its branching ratios, the analogous discussion applies. The total $B$ width is of similar size as the $X_{5/3}$ width (c.f. Appendix B 5 for the explicit expression). The decay $B \to Wt$ dominates over $B \to Zb$ and $B \to hb$ because the effective couplings for the latter decays are of higher order in $\epsilon$. "Cascade" decays $B \to W T_{f1,2}$ are kinematically suppressed (if not forbidden) due to the small mass splitting between $B$ and $T_{f1,2}$.
For more details on top partner decays see Appendix B 5. In addition to a very interesting event topology, single $X_{5/3}/B$ production is also interesting because at high enough $M_{X_{5/3}/B}$ it becomes the dominant production mode. The kinematics of singly produced $X_{5/3}/B$ events are mostly determined by two parameters, $M_{X_{5/3}/B}$ and $\Gamma_{X_{5/3}/B}$ (modulo effects of spin correlations), while the production cross section is subject to many other model parameters. Here we are not interested in details of models but in general features of $t\bar{t}Wj$ event topologies, and will hence leave the production cross section as a free parameter. We consider a range of $M_{X_{5/3}/B}$, while keeping the width $\Gamma_{X_{5/3}/B} \sim 15$-$20\%$ of $M_{X_{5/3}/B}$. Keeping the cross section a free parameter has the additional benefit of presenting the analysis in a model independent fashion and being able to apply our results to other new physics searches in the $t\bar{t}Wj$ channel.
In order to determine the "reasonable range" of cross sections, we consider several combinations of model parameters in a general partially composite model. We do not make any assumptions about the mass hierarchy in the model (e.g. we do not only consider the decoupling limit of $M_1 \gg M_4$), while we make sure that each model parameter point reproduces the correct $m_t$.
The current limits on $X_{5/3}/B$ partners place $M_{X_{5/3}/B} \gtrsim 1$ TeV. Hence, if $X_{5/3}/B$ is to be found during the future runs of the LHC, it will be found almost exclusively in events containing at least one boosted top quark and one boosted $W$. Previous searches for $X_{5/3}/B$ partners focused mostly on the same sign di-lepton searches, due to the extremely clean signal, but at a cost in signal rate. Compared to the inclusive single $X_{5/3}/B$ production, the signal rate is diminished by the branching ratio of $W$ decays to leptons, resulting in [equation missing in source], where $\sigma_{tot}$ is the inclusive $X_{5/3}/B$ single production cross section. In addition, we checked that the geometric acceptance (i.e. $|\eta_l| < 2.5$) for two leptons in a same sign di-lepton final state is 50%, implying that the total same sign di-lepton cross section is at least a factor of 2 smaller after the event selections. Instead, here we propose to search for top partners in channels which contain at least one lepton and a fat jet. Fig. 3 shows an example diagram of singly produced $X_{5/3}/B$, including the decay modes, where we take the initial state radiated top to decay inclusively. Compared to the same sign di-lepton searches, the starting signal cross section in our search strategy is [equation missing in source] if we consider both the top and the $W$ decaying hadronically (but not simultaneously). Note that the signal cross section is increased roughly by an additional factor of two for high $M_{X_{5/3}/B}$, where we expect $X_{5/3}$ and $B$ to be nearly mass degenerate. The same sign di-lepton cross section, however, remains the same at high $M_{X_{5/3}/B}$, as the top and the $W$ from the $B$ decay are of the opposite charge.
Figure 3: Single production of top partners with decay channels. We consider events characterised by a boosted $tW$ system in the case of $X_{5/3}/B$, as denoted by the ovals, in addition to a high energy forward jet and a top. Notice that the only difference between the $X_{5/3}$ production and the $B$ production is the sign of the decay products' charges. We consider inclusive decays of the initial state radiated top.
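The relative size of the two starting rates can be estimated with simple branching-ratio counting. The sketch below is an illustration under stated assumptions, not a reproduction of the expressions in the text: it takes the leptonic $W$ branching ratio of 2/9 (for $l = e, \mu$) and a hadronic branching ratio of 2/3, counts only the two $W$ bosons from the partner decay chain, and ignores acceptance and the decay of the associated top. The function names are introduced here for illustration.

```python
from fractions import Fraction

BR_W_LEP = Fraction(2, 9)  # W -> l nu with l = e or mu (assumed value)
BR_W_HAD = Fraction(2, 3)  # W -> hadrons (assumed value)

def same_sign_dilepton_fraction():
    """Both W bosons from the partner decay chain decay leptonically."""
    return BR_W_LEP ** 2

def lepton_plus_fat_jet_fraction():
    """One W decays leptonically and the other hadronically, in either order."""
    return 2 * BR_W_LEP * BR_W_HAD

ssdl = same_sign_dilepton_fraction()
lj = lepton_plus_fat_jet_fraction()
print(f"same sign di-lepton fraction: {float(ssdl):.3f}")
print(f"lepton + fat jet fraction:    {float(lj):.3f}")
print(f"ratio: {lj / ssdl}")
```

Under these assumptions the lepton plus fat jet channel starts from a several times larger fraction of $\sigma_{tot}$ than the same sign di-lepton channel, before any acceptance effects or the additional $B$ contribution are taken into account.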

III. RESULTS
We proceed to discuss the main results of the paper. The signal events at a $\sqrt{s} = 14$ TeV $pp$ collider are characterised by four distinctive features:
1. A single, high energy forward jet.
2. One boosted $t$ or one boosted $W$ ($M_{X_{5/3}/B} \gtrsim 1$ TeV), as can be seen in Fig. 4.
3. One hard ($p_T > 100$ GeV) lepton, resulting from a top or $W$ decay.
4. Two $b$ jets, one of which can be a part of a top fat jet.
Fig. 4 shows the features of the signal and background fat jet $p_T$ spectrum. The $p_T$ distribution of background events is characterised by a steep decline as a function of transverse momentum. Conversely, the signal distributions tend to peak at roughly $M_{X_{5/3}/B}/2$, with the PDF broadening effects becoming significant at high $M_{X_{5/3}/B}$, as the partner becomes more likely to be produced off-shell.
As we will demonstrate in the following sections, our event selection based on the unique single $X_{5/3}/B$ event topology, combined with boosted jet techniques, $b$-tagging and forward jet tagging, can achieve sensitivity to $X_{5/3}/B$ top partners over a wide range of model parameters at the 14 TeV run of the LHC. We further argue that our results are comparable and in some cases superior to the same sign di-lepton searches, with the additional advantage that our method allows for the reconstruction of the resonance. In Section II A we pointed out that at large $M_{X_{5/3}/B}$ we expect the $X_{5/3}$ top partner and the $B$ to be nearly mass degenerate if the left-handed Yukawa coupling is not too large, a fact which has significant implications for the phenomenology of the heavy top partners and highlights a key advantage of our method over the same sign di-lepton searches. Since we do not consider the charge of the leptons as a part of the selection, the fact that the mass splitting between $X_{5/3}$ and $B$ is small means that our search is sensitive to both channels, effectively doubling the signal cross section. Conversely, requiring the presence of two same sign leptons would essentially veto the $B$ production, as the $B$ partner decays to a top and a $W$ of the opposite charge. In the following sections we will consider the production of top partners both individually and, where relevant, under the assumption that they are mass degenerate.
Table I: Background channels to single $X_{5/3}/B$ production, before Basic Cuts. We will only consider $t\bar{t}$ and $W$+jets in our analysis. The branching ratio of 2/9 for leptonic decays of the $W$ in $W$+jets is included in the cross section, as well as the branching ratio of $(2/9) \times (2/3)$ for the semi-leptonic $t\bar{t}$ decays. For improved statistics at high $M_{X_{5/3}/B}$, we consider $t\bar{t}$ samples with two $H_T$ cuts, while we only take the $W$+jets sample with $H_T > 600$ GeV since at the end of the analysis it is a sub-leading background.

A. Data Simulation and Event Pre-selection
We generate all our simulated events for a $\sqrt{s} = 14$ TeV $pp$ collider, using leading order MadGraph 5 [23], and shower them with Pythia 6 [24], with a fixed renormalisation and factorisation scale and assuming the CTEQ6L [25] parton distribution functions. In order to improve the statistics in the background channels, we impose a generation level cut of $H_T > 600$ GeV on the background events, where $H_T$ is the sum of all hard parton $p_T$ values in the event. We require that all final state hard level partons are generated with $p_T > 15$ GeV, and impose a rapidity acceptance of $|\eta_j| < 5.0$ for the quarks and $|\eta_l| < 2.5$ for the leptons. The background $t\bar{t}$ events are matched to one extra jet, while $W$+jets is matched up to four extra jets, using the MLM matching scheme [26] with the matching parameters $Q_{min} = 20$ and xqcut = 30.
For the purpose of pileup studies, we generate a large sample of minimum bias events using Pythia 6 with default tunes. We simulate the effects of pileup contamination on signal/background events by adding to each event a random number of pileup events drawn from a Poisson distribution centered around $N_{vtx} = 50$.
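The overlay procedure can be sketched as follows. This is a toy illustration rather than the simulation chain used in the analysis: events are represented as plain lists of ($p_T$, $\eta$, $\phi$) tuples, the minimum bias pool is a small stand-in for a generated sample, and the helper names `sample_poisson` and `overlay_pileup` are hypothetical.

```python
import math
import random

def sample_poisson(mean, rng):
    """Draw one Poisson-distributed integer (Knuth's multiplication method)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def overlay_pileup(hard_event, min_bias_pool, rng, mean_vertices=50):
    """Append the particles of n ~ Poisson(mean_vertices) random minimum bias events."""
    n_pileup = sample_poisson(mean_vertices, rng)
    particles = list(hard_event)
    for _ in range(n_pileup):
        particles.extend(rng.choice(min_bias_pool))
    return particles, n_pileup

# Toy usage: particles are (pT [GeV], eta, phi) tuples.
rng = random.Random(7)
pool = [[(0.6, 1.2, 0.3), (0.4, -2.1, 2.0)], [(0.9, 0.1, -1.0), (0.5, 3.0, 1.5)]]
hard = [(520.0, 0.2, 0.1), (130.0, -0.4, 3.0)]
mixed, n_vertices = overlay_pileup(hard, pool, rng)
print(f"{n_vertices} pileup events added, {len(mixed)} particles in total")
```

The mixed particle list would then be passed to the jet clustering step, so that pileup affects the jets in the same way as the hard-process particles.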
Next, we cluster the showered events using the FastJet [27] implementation of the anti-$k_T$ algorithm [28], where we use $R = 1.0$ for the fat jets and $r = 0.4$ for the light and $b$-jets. For the purpose of pileup mitigation, it is useful to consider a smaller $R$ for higher $p_T$ fat jets. The pileup contamination scales as the jet area, hence an $R = 0.6$ cone will experience only $\sim 40\%$ of the pileup effects on an $R = 1.0$ cone. However, decreasing the fat jet cone involves an elaborate procedure of calibrating jet energy scales and other systematics which is beyond the scope of our current work. For simplicity, here we will cluster all fat jets with $R = 1.0$, with the caveat that pileup effects can be further mitigated by reducing the fat jet cone.
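The area scaling argument amounts to comparing $\pi R^2$ for two cone sizes. A minimal sketch, assuming the jet area is well approximated by that of a circle of radius $R$ (the function name is introduced here for illustration):

```python
def pileup_fraction(r_small, r_large=1.0):
    """Ratio of circular jet areas pi*R^2, used as a proxy for relative pileup contamination."""
    return (r_small / r_large) ** 2

# Fat jet cone reduced from R = 1.0 to R = 0.6
print(f"R = 0.6 vs R = 1.0: {pileup_fraction(0.6):.2f}")
# Forward jet cone reduced from r = 0.4 to r = 0.2
print(f"r = 0.2 vs r = 0.4: {pileup_fraction(0.2, 0.4):.2f}")
```

The same scaling motivates the small-cone forward jet clustering discussed in the introduction, where halving the cone size cuts the area to a quarter.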
We consider signal events in which both the top and the $W$ daughters of $X_{5/3}/B$ decay leptonically (but not simultaneously), while we take the other, non-boosted top to decay inclusively. Table I shows a list of possible backgrounds and the corresponding cross sections. The main background channels in our search strategy are SM $t\bar{t}$ production and $W$+jets, while even at generation level the other SM backgrounds are subleading. Since we require at least one hard lepton in our analysis, we will only consider the background channels in which one of the tops or $W$ bosons decays leptonically. We normalize the $t\bar{t}$ cross section to the NNLO result from Ref. [29], while the NLO corrections to $W$+jets are not expected to be large. Here we will consider a conservative estimate of $K_{W+\mathrm{jets}} = 2.0$ for the NLO K-factor. In the following sections, we will show that our results are not strongly affected by the $W$+jets K-factor.
All events are subject to Basic Cuts: [equation missing in source], where $l$ represents the hardest lepton with mini-ISO $> 0.8$ [30], "fj" stands for the fat jet, "j" stands for light jets, and $N_j(\Delta R_{fj,j} > r + R)$ is the multiplicity of $r = 0.4$ jets with $p_T > 25$ GeV and $|\eta_j| < 2.5$ which are isolated from the $R = 1.0$ fat jet by $\Delta R > r + R$.
In addition to Basic Cuts, we consider a series of additional selections designed to further suppress the background channels while maintaining as much of the signal as possible. In order to suppress the $t\bar{t}$ background further we require [equation missing in source], where $j$ is the hardest jet isolated from the fat jet by $\Delta R = R + r = 1.4$, and $l$ is the hardest mini-isolated lepton in the event. The rest of the cuts we employ in this analysis deserve more attention and are described in detail in the following sections.
The TOM approach to jet substructure aims to match the energy distribution of a jet to a parton-like configuration of heavy particle decays. The output of the method is the overlap score $Ov$, a measure of the likelihood that a jet is, say, a top quark, a Higgs or a $W$ boson, as well as the partonic configuration (i.e. peak template) which maximizes the $Ov$ score. The latter is of much importance, as one can in principle approximate the fat jet with the peak template. We will utilize this possibility in the following sections when considering effects of pileup on the measurements of the top partner mass.
Our analysis of jet substructure follows the prescription of Ref. [18], with the main difference that we divide events into hadronic top and hadronic $W$ candidates before analyzing the fat jets. Note that our work in this paper represents the first use of TOM as a boosted $W$ tagger. We begin by selecting the hardest mini-isolated lepton in the event and determining whether it originated from a top quark or a $W$. If there is a $p_T > 25$ GeV, $r = 0.4$ jet within $\Delta R = 1.0$ of the hardest lepton, such that $m_{l_1,\, \not E_T,\, lj} > 90$ GeV, we declare that the lepton is a part of a leptonically decaying top, and hence the hardest fat jet in the event is a $W$ candidate. Otherwise, we declare that the $W$ decayed leptonically, and that the hardest fat jet is a top candidate. An alternative method of determining the "candidacy" of a fat jet would be to simply use a fat jet invariant mass cut, but such a choice requires techniques to subtract or correct for pileup contamination of the jet mass. Here, instead, we aim for pileup insensitive criteria for both jet substructure and event selection, whenever possible.
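The candidate assignment described above can be sketched in a few lines. This is a hedged illustration, not the authors' implementation: objects are ($p_T$, $\eta$, $\phi$) tuples treated as massless, the missing energy is given as ($p_T$, $\phi$) with its longitudinal component set to zero (an assumption made here, since the text does not specify how the invariant mass is built), and all function names are hypothetical.

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance with the azimuthal difference wrapped into [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)

def four_vector(pt, eta, phi):
    """Massless four-vector (E, px, py, pz)."""
    px, py, pz = pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(eta)
    return (math.sqrt(px * px + py * py + pz * pz), px, py, pz)

def invariant_mass(vectors):
    e, px, py, pz = (sum(c) for c in zip(*vectors))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

def classify_fat_jet(lepton, met, jets, dr_max=1.0, jet_pt_min=25.0, mass_min=90.0):
    """Return 'W' if the lepton looks like part of a leptonic top, otherwise 'top'."""
    lep_p4 = four_vector(*lepton)
    met_p4 = four_vector(met[0], 0.0, met[1])  # assumption: neutrino pz = 0
    for jet in jets:
        near = delta_r(lepton[1], lepton[2], jet[1], jet[2]) < dr_max
        if jet[0] > jet_pt_min and near:
            if invariant_mass([lep_p4, met_p4, four_vector(*jet)]) > mass_min:
                return "W"   # leptonic top found, so the fat jet is a W candidate
    return "top"             # leptonic W, so the fat jet is a top candidate

# Toy usage
lepton, met = (200.0, 0.0, 0.0), (150.0, 0.5)
print(classify_fat_jet(lepton, met, [(100.0, 0.3, -0.4)]))
print(classify_fat_jet(lepton, met, [(100.0, 2.0, 3.0)]))
```

Because the decision uses only the lepton, the missing energy and a small-cone jet, it avoids the fat jet mass and is therefore less exposed to pileup, in line with the motivation given in the text.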
The leptonically decaying $t/W$ also serves as a pileup insensitive estimator of the fat jet $p_T$ in the Template Overlap analysis, as the fat jet and the leptonic object recoil against each other. We find that the scalar sum of the $p_T$ of the constituents of the leptonic object (i.e. the lepton, the missing energy and, if the leptonic object is a top, a light jet) is a good estimator of the fat jet transverse momentum [18].
In order to speed up the numerical calculations, we generate template states at fixed $p_T$, and use 10 bins of width $\delta p_T = 100$ GeV, starting from $p_T = 450$ GeV. The templates are produced assuming 50 steps in $\eta$, $\phi$, while we scale the template sub-cones using the $p_T$ scaling rule of Ref. [17]. We produce two separate sets of templates, the three body template sets for top states and two body template sets for the $W$ states, where we use the appropriate set based on whether the fat jet is a top candidate or a $W$ candidate. Note that the use of the leptonic top TemplateTagger does not add much to the analysis, as the background objects already contain a leptonically decaying top (in case the $W$ is the fat jet), and the leptonic $W$ is too simple an object to require a substructure analysis (in case the $t$ is the fat jet).
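As an illustration of how the leptonic $p_T$ estimator could be used to pick a template bin, consider the following minimal sketch. The function names are hypothetical, and two conventions are assumptions of the sketch rather than statements from the text: events below 450 GeV are assigned no bin, and events above the last bin are clamped into it.

```python
def leptonic_pt_estimate(lepton_pt, met, jet_pt=0.0):
    """Scalar sum of the leptonic object's constituents (GeV).

    jet_pt is the light jet near the lepton when the leptonic object is a top,
    and 0 when the leptonic object is a W.
    """
    return lepton_pt + met + jet_pt

def template_bin(pt_est, pt_start=450.0, width=100.0, n_bins=10):
    """Index of the template pT bin for the estimated fat jet pT.

    Returns None below pt_start; overflow is clamped to the last bin
    (an assumption of this sketch).
    """
    if pt_est < pt_start:
        return None
    return min(int((pt_est - pt_start) // width), n_bins - 1)
```

For example, a 300 GeV lepton, 250 GeV of missing energy and a 120 GeV jet give an estimate of 670 GeV, which falls in the third bin (index 2).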
Finally, for an event to pass our boosted object selection, we require that the fat jet has an overlap score $Ov > 0.5$ for both the hadronic top and hadronic $W$ candidates. Figure 5 shows an example distribution of Template Overlap for signal and background events, after the Basic Cuts. The left panel shows only the events which were categorized as hadronic top candidates, while the right panel shows the corresponding plot for hadronic $W$ candidates. In both cases the $W$+jets events are rejected very well by TOM, as our lepton requirement deems that the $W$ decays leptonically and the fat jet is hence either a light jet or a combination of light jets which get clustered together. Semi-leptonic $t\bar{t}$ events are more challenging to reject via Template Overlap, since the final state content in terms of jet substructure is more similar to signal events. If a $t\bar{t}$ event is categorized as a hadronic top candidate, TOM will likely tag the event with a high overlap score, since the fat jet is indeed a hadronically decaying top. If the event is categorized as a hadronic $W$ candidate, the expected peak overlap score will likely be lower since TOM will try to match the substructure of a top to a decay of a $W$ boson.
It is important to note that for both the $t\bar{t}$ and $W$+jets backgrounds, higher order effects on the shape of the kinematic distributions become significant at high energies. Extra hard gluons are likely to appear in a highly energetic $t\bar{t}$ final state, causing the top-antitop system not to appear back to back in the transverse plane. Such "asymmetric" events offer an additional handle to reject Standard Model di-top events. Proper treatment of the effect requires a full NLO event simulation, which is beyond the scope of our current study. It is important to note that since here we only consider a leading order $t\bar{t}$ sample matched to one extra jet, the ability of Template Overlap to reject Standard Model $t\bar{t}$ events is likely underestimated.
One of the most attractive features of TOM is its weak susceptibility to pileup contamination. Refs. [17,18] showed that the effects of pileup on TOM are not significant (up to 50 pileup events). The low susceptibility to pileup is a manifestation of the fact that, by construction, TOM is sensitive mostly to the hard energy depositions within the fat jet and less so to the incoherent soft radiation. Here we find similar results both in the case of the top as well as the $W$, as shown in Figure 5. The signal distributions maintain a very similar shape upon the addition of pileup contamination, with the signal efficiency of the $Ov > 0.5$ cut remaining at $\sim 65\%$ for both hadronic top and hadronic $W$ candidate events. The shape of the background distributions is affected more drastically in the presence of pileup. Notice, however, that the region of $Ov > 0.5$ remains weakly affected, resulting in a small effect on the background fake rate upon the overlap selection cut.

C. Forward Jet Tagging
The event topology in Fig. 3 offers another interesting handle on background mitigation: a high energy forward jet [6]. The question of how well forward jet tagging (FJT) will perform in the high pileup environment of the future LHC runs remains open [52,53]. Yet, there is much interesting physics one can do with forward jets. Single top production, tagging Higgs events which originate from vector boson fusion and understanding of the proton structure at high $x$ are just some of the examples. Here we are interested in forward jets only as event tags. The problem of forward jet tagging hence becomes simpler, as we are not concerned with precise measurements of forward jet energy and transverse momentum.
We propose a novel approach to forward jet tagging, which addresses the effects of pileup contamination (at 50 interactions per bunch crossing). The pileup contribution to jet $p_T$ goes as $\delta p_T \sim R^2$, where $R$ is the jet cone radius, resulting in a shift of the jet kinematic observables to higher values and a broadening of the kinematic distributions. In addition, larger jet cones are more likely to produce fake pileup jets, thus increasing the overall forward jet multiplicity. In order to limit the pileup contamination in the forward region, here we propose to cluster the jets in the forward region with a cone smaller than the standard $r = 0.4$ (i.e. $r = 0.1, 0.2$). Notice that this approach does not require an elaborate re-calibration of jet observables as we do not propose to measure the forward jet, just tag it.
We define forward jets by clustering the entire event using a cone of radius $r_{\rm fwd}$ and then selecting the jets in the event which satisfy the criteria of Eq. (10), where $p_T^{\rm fwd}$ and $\eta^{\rm fwd}$ are the transverse momentum and rapidity of the forward jet. We then define forward jet tagging by requiring the number of forward jets in the event $N_{\rm fwd} \geq 1$.
How is the forward jet multiplicity affected by pileup? Figure 6 provides the answer. Clustering the event with a standard ATLAS r fwd = 0.4 cone results in a dramatic shift in the forward jet multiplicity distribution, with as many as 10 forward jets easily appearing in an event at 50 interactions per bunch crossing. Reducing the cone size to r fwd = 0.1 almost extinguishes the effects of pileup, but at a cost to signal efficiency as only about 50% of the signal events pass the forward jet tagging requirement. We find that r fwd = 0.2 gives a good compromise between effects of pileup and signal efficiency, and throughout the rest of this paper we will adopt the term "forward jet" to mean a jet of radius r fwd = 0.2 which passes the forward jet criteria of Eq. (10).
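The two ingredients of the forward jet tag, the $R^2$ scaling of the pileup contribution and the $N_{\rm fwd} \geq 1$ requirement, can be illustrated with the short sketch below. The function names are hypothetical. The thresholds of Eq. (10) are deliberately left as required arguments, since any numbers supplied by the caller are placeholders and not the values used in the analysis.

```python
def pileup_pt_scaling(r, r_ref=0.4):
    """Pileup contribution to jet pT relative to a cone of radius r_ref,
    using the delta pT ~ R^2 scaling quoted in the text."""
    return (r / r_ref) ** 2

def count_forward_jets(jets, pt_min, abs_eta_min, abs_eta_max):
    """Count jets (pt, eta) passing a forward selection.

    The thresholds stand in for the criteria of Eq. (10) and must be
    supplied by the caller; no default values are assumed here.
    """
    return sum(
        1 for pt, eta in jets
        if pt > pt_min and abs_eta_min < abs(eta) < abs_eta_max
    )

def has_forward_tag(jets, pt_min, abs_eta_min, abs_eta_max):
    """Forward jet tag: at least one forward jet in the event."""
    return count_forward_jets(jets, pt_min, abs_eta_min, abs_eta_max) >= 1
```

With this scaling, an $r = 0.2$ cone collects roughly a quarter of the pileup $p_T$ of an $r = 0.4$ cone, and an $r = 0.1$ cone roughly one sixteenth, which is the motivation for the small forward cones.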

D. b-tagging
Our analysis utilizes the presence of multiple b-jets in the signal, whereby we use information from the hard process to simulate the b-tagging procedure. We define every r = 0.4 jet to be b-tagged if there is a hard process b or c quark within ∆R = 0.4 from the jet axis. We consider the benchmark efficiency of 75% for every b jet to be tagged as a b, with the fake rate of 18% and 1% for c and light jets respectively. We further consider a fat jet to be b-tagged if there is a b-tagged r = 0.4 jet within ∆R = 1.0 from the fat jet axis.
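A minimal sketch of such a truth-matched b-tagging emulation is given below, using the benchmark efficiencies quoted above. The data layout and function names are hypothetical, and giving a matched b quark precedence over a matched c quark is an assumption of the sketch.

```python
import math
import random

# Benchmark tag probabilities quoted in the text, keyed by true jet flavour.
TAG_PROB = {"b": 0.75, "c": 0.18, "light": 0.01}

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance in the (eta, phi) plane, with the phi difference wrapped."""
    dphi = math.atan2(math.sin(phi1 - phi2), math.cos(phi1 - phi2))
    return math.hypot(eta1 - eta2, dphi)

def jet_flavour(jet, partons, dr_match=0.4):
    """Label an r = 0.4 jet by the hard-process heavy quarks near its axis.

    jet: (eta, phi); partons: list of (flavour, eta, phi), flavour 'b' or 'c'.
    A matched b takes precedence over a matched c (assumption of this sketch).
    """
    matched = {
        flavour for flavour, eta, phi in partons
        if delta_r(jet[0], jet[1], eta, phi) < dr_match
    }
    if "b" in matched:
        return "b"
    if "c" in matched:
        return "c"
    return "light"

def is_b_tagged(jet, partons, rng):
    """Randomly tag the jet with the probability for its true flavour."""
    return rng.random() < TAG_PROB[jet_flavour(jet, partons)]
```

Passing a seeded `random.Random` instance as `rng` keeps the emulation reproducible from run to run.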
We apply different b-tagging criteria based on whether the fat jet is a hadronic top or hadronic $W$ candidate. Namely, we require:
• One b-tagged fat jet (i.e. $\Delta R({\rm fj}, b) < 1.0$), and at least one b-tagged $r = 0.4$ jet outside the fat jet (i.e. $\Delta R({\rm fj}, b) > 1.4$), if the fat jet is a hadronic top candidate. Note that the criteria for an event to be a hadronic top candidate also require the $r = 0.4$ jet to be isolated from the hardest lepton (i.e. $\Delta R(l, b) > 1.0$).
• One fat jet without a b-tagged $r = 0.4$ jet within $\Delta R = 1.4$ from the fat jet axis (i.e. anti-b-tagged) and at least one b-tagged $r = 0.4$ jet outside the fat jet, if the fat jet is a hadronic $W$ candidate.
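The two sets of criteria can be summarized in a short sketch. It assumes that the b-tagged $r = 0.4$ jets are already known and that all objects are given as `(eta, phi)` pairs; the function names and data layout are hypothetical.

```python
import math

def delta_r(a, b):
    """Angular distance between two (eta, phi) points, with phi wrapped."""
    dphi = math.atan2(math.sin(a[1] - b[1]), math.cos(a[1] - b[1]))
    return math.hypot(a[0] - b[0], dphi)

def passes_b_selection(candidate, fat_jet, b_jets, lepton,
                       r_in=1.0, r_out=1.4, r_lep=1.0):
    """Apply the b-tagging criteria for a hadronic 'top' or 'W' candidate.

    fat_jet, lepton: (eta, phi); b_jets: list of (eta, phi) of b-tagged r = 0.4 jets.
    """
    outside = [b for b in b_jets if delta_r(fat_jet, b) > r_out]
    if candidate == "top":
        # b-tagged fat jet plus a b-tag outside it, isolated from the lepton.
        inside = any(delta_r(fat_jet, b) < r_in for b in b_jets)
        isolated = any(delta_r(lepton, b) > r_lep for b in outside)
        return inside and isolated
    # W candidate: no b-tag within r_out of the fat jet, at least one outside.
    return len(outside) >= 1 and len(outside) == len(b_jets)
```

For a $W$ candidate the condition `len(outside) == len(b_jets)` expresses the anti-b-tag: every b-tagged jet must lie further than 1.4 from the fat jet axis.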
How large of a b-tagging efficiency should we expect for the signal events? Naively, we would assume that the fraction of events which contain two true b-jets is ∼ 1.0. When folded into the above mentioned b-tagging efficiencies, we would hence expect the overall signal b-tagging efficiency to be ∼ 0.5. Figure 7 shows more precise and complete information on the b-tagging of signal events (for the purpose of illustration, here we show only hadronic top candidate events). From the left panel, we can see that the geometrical acceptance for events which contain two proper b-jets is ∼ 80%, as represented by dashed, red histogram area with a b-tag score ≥ bb. The probability that the highest p T fat jet of a signal event will contain a proper b-tag is ∼ 90%, due to the large degree of collimation of the top decay products and the large fat jet clustering cone R = 1.0.
In addition, we find that the isolation criteria on the b-jet outside the fat jet reduce the signal efficiency by an additional 20-30%, as seen in the right panel of Fig. 7. The effect can be understood almost entirely from a simple geometrical argument. Consider for instance the b-tagging criteria for hadronic top candidate events. Because anti-$k_T$ jets are roughly circular in $\eta$, $\phi$, the fraction of the available detector area in which a b-jet will be isolated both from the fat jet and the hardest lepton is given by:
$$\epsilon(b\text{-tag isolation}) \approx 1 - \frac{\pi (R + r)^2 + \pi R^2}{2\pi\, \Delta\eta_a},$$
where $\Delta\eta_a$ is the detector acceptance in rapidity for the $r = 0.4$ jets (i.e. $-2.5$ to $2.5$), $r$ is the radius of the b-tagged jets, and $R$ is the radius of the fat jet. The $(R + r)^2$ term serves to isolate the b-jet from the fat jet while the term proportional to $R^2$ isolates the jet from the lepton. Jet rapidity acceptance is roughly $\Delta y \approx 5$, although this is an under-estimate since tracks with $|y| < 5$ are all taken into account during jet reconstruction. Next, for b-jets clustered with $r = 0.4$ and fat jets with $R = 1.0$ one obtains $\epsilon(b\text{-tag isolation}) \sim 0.7$, roughly the fraction of isolated b-tag events with a b-tag score greater than $b$ in the left panel of Fig. 7. We conclude that the expected b-tagging efficiency for the hadronic top candidate events (including the 75% efficiency of b-tagging) will be of order
A full study of pileup effects on b-tagging requires detailed detector information, an endeavor which is beyond the scope of our current analysis. However, we would like to point out that the experimental studies of Ref. [54] suggest that b-tagging will perform well at the LHC at 50 interactions per bunch crossing.
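The numbers quoted in this geometrical argument can be checked with a few lines of arithmetic. The sketch below assumes that the isolated fraction takes the form $1 - [\pi(R+r)^2 + \pi R^2]/(2\pi\,\Delta\eta_a)$, i.e. the two excluded discs divided by the available $\eta$-$\phi$ area. This form is inferred from the description of the two terms and from the quoted result of $\sim 0.7$, and should be read as an assumption.

```python
def isolated_fraction(R=1.0, r=0.4, delta_eta=5.0):
    """Fraction of the eta-phi area in which a b-jet is isolated from both
    the fat jet (disc of radius R + r) and the lepton (disc of radius R).

    The available area is taken as 2*pi*delta_eta (assumed form, see lead-in);
    the factors of pi cancel between numerator and denominator.
    """
    return 1.0 - ((R + r) ** 2 + R ** 2) / (2.0 * delta_eta)

def naive_double_tag_efficiency(eff_b=0.75):
    """Probability to tag two true b-jets, each with efficiency eff_b."""
    return eff_b ** 2
```

With $R = 1.0$, $r = 0.4$ and $\Delta\eta_a = 5$ the isolated fraction evaluates to about 0.70, while the naive two-tag efficiency is $0.75^2 \approx 0.56$, consistent with the rough estimates in the text.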

E. Resonance Mass Reconstruction
Reconstructing the mass of the top partner in our analysis involves one fat jet, a hard lepton, missing $E_T$ and possibly an $r = 0.4$ jet (if the event topology is such that the top decays leptonically). In such a situation, there are three issues which might arise:
• Combinatorics: The signal final state is characterized by no less than 5 small cone (i.e. $r = 0.4$) jets. Determining which jets resulted from a decay of a top or a $W$ is hence a challenge.
• Pileup contamination: Pileup contamination not only shifts mass distributions to higher values and broadens them, but also creates "pileup jets" which could additionally complicate the combinatorial issues.
• Missing energy reconstruction: Reconstructing the resonance mass in our case involves reconstructing the z component of the neutrino coming from the resonance decay as well as a contribution from a possible additional neutrino from an initial state radiated top.
The resonance mass reconstruction method we propose bypasses all of the above-mentioned issues. Since the resonance is much heavier than the top quark and the $W$, the resonance decay products are boosted, with average $p_T \sim M_{X_{5/3}/B}/2$. As we will show shortly, simply selecting the hardest fat jet in the event, a lepton (with a possible $r = 0.4$ jet in its vicinity) and missing energy suffices to reconstruct $M_{X_{5/3}/B}$, and hence eliminates many combinatorial issues. Boosted final states also allow us to make the approximation $\eta_\nu = \eta_l$, where $\nu$ is the total missing transverse momentum in the event and $l$ is the hardest lepton. For the purpose of $M_{X_{5/3}/B}$ reconstruction, we find this approximation to be adequate without loss of generality. We avoid significant effects of pileup contamination by reconstructing $M_{X_{5/3}/B}$ using the pileup-insensitive peak template momenta instead of the fat jet momentum (as in Ref. [18]), while the effects of pileup on the lepton, missing energy and a possible $r = 0.4$ jet in the vicinity of the lepton are manageably weak at $N_{\rm vtx} = 50$ pileup events. Finally, we have checked that the possible additional neutrino coming from the initial state radiated top does not significantly contribute to the missing transverse energy.
Figure 8 shows two examples of mass reconstruction for signal events only, where we denote the true resonance mass as $M_{X_{5/3}/B}$ and the reconstructed mass as the lowercase $m_{X_{5/3}/B}$. The solid blue lines represent the $m_{X_{5/3}/B}$ distribution if no pileup was present, while the red lines show the corresponding distribution at $N_{\rm vtx} = 50$ pileup events. In both cases, our mass reconstruction method is able to resolve the resonance peak to a very good degree, while effects of pileup on the mass peak resolution remain weak at an average of 50 pileup events.
The main question we would like to answer in this paper is: how sensitive will the future LHC runs be to different $M_{X_{5/3}/B}$ in our analysis framework?
For this purpose we analyzed several signal event samples with M X 5/3 /B = (1.0, 1.25, 1.5, 1.75, 2.0) TeV. Varying other model parameters will change the value of the single production cross section but will not significantly affect the event kinematics. Hence, we fix all other couplings and scales and leave the inclusive production cross section σ X 5/3 /B a free parameter. An additional benefit of considering σ X 5/3 /B as a free variable is that our results in this section can be applied to other searches for BSM physics in the same final state channel. In this section we assume no pileup contamination and postpone the discussion of 50 interactions per bunch crossing until the next section.
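Since the reconstructed mass $m_{X_{5/3}/B}$ underlies all of the results that follow, it may help to see the $\eta_\nu = \eta_l$ approximation of Section III E written out. The sketch below is illustrative only: the tuple layout `(pt, eta, phi, mass)` and the function names are hypothetical, and the fat jet four-momentum is taken as an input, whereas the analysis uses the peak template momenta in its place.

```python
import math

def four_vector(pt, eta, phi, mass=0.0):
    """Return (E, px, py, pz) from collider coordinates (GeV)."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return (energy, px, py, pz)

def reconstructed_mass(fat_jet, lepton, met_pt, met_phi, extra_jet=None):
    """Invariant mass of fat jet + lepton + neutrino (+ optional r = 0.4 jet).

    fat_jet, lepton, extra_jet: (pt, eta, phi, mass) tuples.
    The neutrino is built from the missing transverse momentum, with its
    pseudorapidity set equal to that of the lepton (eta_nu = eta_l).
    """
    parts = [four_vector(*fat_jet), four_vector(*lepton),
             four_vector(met_pt, lepton[1], met_phi)]
    if extra_jet is not None:
        parts.append(four_vector(*extra_jet))
    energy, px, py, pz = (sum(component) for component in zip(*parts))
    m2 = energy * energy - px * px - py * py - pz * pz
    return math.sqrt(max(m2, 0.0))
```

The `max(m2, 0.0)` guard only protects against small negative values from floating point rounding.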
To illustrate the ability of our proposal to reject SM backgrounds, we begin with the example cutflow results in Table II. We chose the values for the inclusive cross section in each table to be roughly in the mid-range of the cross section values for model parameters which address the hierarchy problem and result in a correct mass of the top quark. Perhaps the most exciting result of our analysis is that the future LHC runs will be sensitive to $M_{X_{5/3}/B} \sim 2.0$ TeV top partners, where we find that $5\sigma$ sensitivity should be achievable with 35 fb$^{-1}$ of data (assuming b-tagging and no forward jet tagging), while requiring a signal cross section large enough to give $\sim 10$ events, as shown in Table II. Note that because our event selection is sensitive to both $X_{5/3}$ and $B$ production, the final signal cross section we achieve for hadronic top candidate events alone is higher than the naive estimate of the same sign di-lepton cross section (assuming a 50% geometric acceptance for the two leptons).
We present detailed information for masses lower than 2 TeV in Table III. We find that the LHC run at 14 TeV can achieve $S/B > 1$ for $M_{X_{5/3}/B} > 1$ TeV, with $\sim 5\sigma$ significance using the b-tagging proposal of Section III D alone, while the addition of a forward jet tag from Section III C results in an almost background free signal and a significant improvement in significance at an additional 25-30% signal loss. Forward jet tagging alone is not sufficient to produce desirable sensitivity to any of the $M_{X_{5/3}/B}$ we considered, except for very large signal cross sections and integrated luminosities. However, complemented by b-tagging, we find that forward jet tagging can significantly improve the $M_{X_{5/3}/B}$ sensitivity. The lower $M_{X_{5/3}/B}$ (e.g. $M_{X_{5/3}/B} \sim 1$ TeV) cases benefit more from forward jet tagging, as the signal cross section is larger and hence allows for lower final signal efficiency. We find that an additional factor of $\sim 4-6$ in $S/B$ improvement is typically achieved by adding a forward jet tag. We show a detailed comparison of results with and without forward jet tagging in Fig. 10, where the left panels assume the b-tagging criteria without the forward jet tag, while the right panels assume both b-tagging and a forward jet tag.
[Figure caption fragment: the star marks the point presented in Table II.]
There are several interesting features of our analysis. First, we find that for $M_{X_{5/3}/B} > 1$ TeV, the $S/\sqrt{B}$ we achieve in the hadronic top channel is significantly better than for hadronic $W$ candidate events, even though it results in a 50% lower signal efficiency, while the significance is comparable for $M_{X_{5/3}/B} \sim 1$ TeV. The effect can be attributed to the asymmetry in the proportion of hadronic top vs. hadronic $W$ candidate events in the background as defined in Section III B. For instance, the signal events contain hadronic top and hadronic $W$ candidate events in equal proportion, while the background $t\bar{t}$ events we consider always contain a leptonic top. Hence, the amount of SM $t\bar{t}$ events which will be categorized as hadronic top events is smaller and will amount only to the events in which the b quark from the leptonic top decay happens to land far enough from the lepton. Notice that the probability that a b quark will land far from the fat jet axis increases with the decrease in the fat jet $p_T$, hence the comparable $S/\sqrt{B}$ at lower $M_{X_{5/3}/B}$.
The second interesting feature of our results is that the sensitivity to signal events increases with $M_{X_{5/3}/B}$. One of the reasons for higher efficiency at higher $M_{X_{5/3}/B}$ is that the TOM reconstruction and tagging of boosted objects becomes more efficient at higher $p_T$. The hard parts of a boosted jet, which TOM is designed to tag, become more prominent features of a fat jet at high $p_T$, while a higher degree of collimation of signal fat jets at high $p_T$ makes it less likely that radiation will "leak" out of the $R = 1.0$ cone. In addition, the fact that the high $p_T$ tails of background distributions fall off as several powers in $p_T$, and faster than the signal distributions, implies that at high $M_{X_{5/3}/B}$ we expect less background contamination.
The final signal efficiencies for $M_{X_{5/3}/B} < 2.0$ TeV, where we do not expect a large degree of mass degeneracy between the $X_{5/3}$ and $B$, are roughly at the same level as the naive estimate of a background free same sign di-lepton analysis (assuming a detector acceptance of 50%), with the additional benefit that our method allows for good reconstruction of the resonance in a pileup-insensitive manner.
We show a more complete representation of our main results with no pileup contamination in Fig. 9 for $M_{X_{5/3}/B} = 2.0$ TeV, where we assume that the $X_{5/3}$ and $B$ states are mass degenerate, while Fig. 10 shows the results for $M_{X_{5/3}/B} = 1.0 - 1.75$ TeV. The plots show contours of constant $S/\sqrt{B}$ (solid lines) for various $M_{X_{5/3}/B}$ as a function of the inclusive signal cross section and integrated luminosity. For completeness, we give $S/B$ as dashed lines. The left panels assume the b-tagging requirement, but no forward jet tag, while the right panels include both b-tagging and the forward jet tag. We find that in all considered cases, the future LHC runs have excellent potential for discovery of singly produced top partners, even in the early stages of the experiment and with as low as 20 fb$^{-1}$ of data. In addition, the 14 TeV run of the LHC should be able to achieve a $2\sigma$ sensitivity, enough to rule out major parts of the parameter space even with 10 fb$^{-1}$.
Table II: Example cutflow for signal and background events for $M_{X_{5/3}/B} = 2.0$ TeV and inclusive cross sections $\sigma_{X_{5/3}+B}$. $\sigma_{s, t\bar{t}, W+{\rm jets}}$ are the signal/background cross sections including all branching ratios, whereas the efficiencies of the cuts are quoted relative to the generator level cross sections. The results assume no pileup contamination. The signal cross section assumes both $X_{5/3}$ and $B$ production.
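The contours of constant $S/\sqrt{B}$ in the cross section versus luminosity plane follow from simple counting. The sketch below shows the arithmetic, with $S = \sigma_s \epsilon_s L$ and $B = \sigma_b \epsilon_b L$. The function names are hypothetical, and any numbers passed in are placeholders rather than values taken from the cutflow tables.

```python
import math

def expected_events(sigma_fb, efficiency, lumi_fb):
    """Expected event count for a cross section (fb), cut efficiency and luminosity (fb^-1)."""
    return sigma_fb * efficiency * lumi_fb

def significance(n_signal, n_background):
    """Simple counting significance S / sqrt(B)."""
    return n_signal / math.sqrt(n_background)

def lumi_for_significance(target, sig_after_cuts_fb, bkg_after_cuts_fb):
    """Luminosity (fb^-1) at which S/sqrt(B) reaches `target`.

    Uses S/sqrt(B) = sig * sqrt(L / bkg), with both cross sections
    taken after all cuts.
    """
    return target ** 2 * bkg_after_cuts_fb / sig_after_cuts_fb ** 2
```

Because $S/\sqrt{B}$ grows as $\sqrt{L}$ at fixed cross section, halving the signal cross section requires four times the luminosity for the same significance, which sets the shape of the contours.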

G. Effect of Pileup on M X 5/3 /B Sensitivity
As we pointed out in the previous sections, our event selection criteria contain several observables which are weakly affected by pileup (i.e. $Ov$, $M_{X_{5/3}/B}$, forward jet tag). However, some of the other selection criteria (i.e. $m_{jl}$, $p_T^{\rm fj}$) are somewhat pileup sensitive. The lower $p_T$ cut on the fat jet allows for some low fat jet $p_T$ events to migrate into the sample which passes the Basic Cuts due to the fact that we use a large $R = 1.0$ cone for fat jet clustering$^{9}$. Furthermore, the effects of pileup on any observable constructed out of the $r = 0.4$ jets are limited (compared to the fat jet) by the small jet cone size, but can still be non-negligible at 50 average pileup events.
Overall effects of pileup on our results are fairly mild and can be mitigated by slight modifications of the cuts on pileup sensitive observables. For illustration, we analyzed three samples of signal events with masses $M_{X_{5/3}/B} = 1.0, 1.75, 2.0$ TeV and the relevant backgrounds in the presence of an average of $N_{\rm vtx} = 50$ interactions per bunch crossing. In order to reduce the effects of "pileup induced migration", we increase the transverse momentum threshold on the fat jet to $p_T > 600$ GeV for $M_{X_{5/3}/B} = 1.75$ TeV and $p_T > 500$ GeV for $M_{X_{5/3}/B} = 1.0$ TeV, as well as shift the cut on $m_{jl}$ to $m_{jl} > 300$ GeV in both cases. We do not modify the cuts on pileup insensitive observables. The increase in the lower $p_T$ and $m_{jl}$ cuts is most certainly dependent on the amount of pileup contamination and requires further consideration at $N_{\rm vtx} > 50$ pileup events.
In addition to shifting and broadening kinematic distributions, high pileup is likely to produce uniformly distributed, soft "pileup jets" which could mimic leptonic top decays in case they land close enough to the hardest lepton. In order to reduce the effect of fake pileup jets on the event categorization criteria from Section III B (i.e. whether the event is a hadronic top or hadronic $W$ candidate), we consider only $r = 0.4$ jets with $p_T > 50$ GeV which are in the vicinity of the hard lepton. While it is in principle possible to design alternative criteria for categorizing events into hadronic top and hadronic $W$ candidates, here we choose to postpone this detail until future studies.
Table IV and Fig. 11 show the effects of pileup on our results in more detail. On a cut-by-cut basis, we find that the signal cross section remains weakly affected by pileup at $N_{\rm vtx} = 50$ interactions per bunch crossing, with the efficiency of each cut remaining within a few percent of its value in our study with no pileup. The background events are somewhat more pileup sensitive, especially $W$+jets, as multi-jet events are characterized by more soft components and are hence more pileup susceptible.
We find that without any pileup correction or subtraction, we can achieve the same signal cross sections as in our studies without pileup, while the amount of background events which survive the event selections is increased by roughly a factor of 2-3. Still, we find that for our benchmark data points, a $\sim 6\sigma$ sensitivity is achievable for $M_{X_{5/3}/B} = 1.75$ TeV and $\sim 4\sigma$ for $M_{X_{5/3}/B} = 1.0$ TeV with 20 fb$^{-1}$, assuming both b-tagging and forward jet tagging. Our results on effects of pileup on signal significance can be interpreted as the most pessimistic scenario and a lower limit on how well the experiments can perform as a function of pileup mitigation efficiency.
$^{9}$ Note that, in principle, the effects of pileup can further be suppressed by lowering the size of the fat jet cone without increasing the lower $p_T$ cut on the fat jet.
[Figure panel: $M_{X_{5/3}/B} = 1.75$ TeV, $\sigma_{X_{5/3}/B} = 50$ fb, $L = 20$ fb$^{-1}$; the star marks the point presented in Table IV.]
Future LHC experiments are likely to employ advanced pileup subtraction techniques using track information and vertexing, which could only further improve the performance of event selection in a high pileup environment. However, it is important to note that since we employ a number of already pileup insensitive observables, it is likely that no aggressive pileup subtraction technique will be necessary to recover the full power of our event selection.

H. A Few Remarks on the Complementarity of Top Partner Searches
In case a top partner is discovered at the LHC, combining results from different channels could greatly improve the significance of the signal. Yet, there is additional information one can obtain from measurements of both same sign di-lepton and other decay channels.
For instance, a possible mass degeneracy between the $X_{5/3}$ and $B$ states could be difficult to untangle with the current mass resolution of the LHC experiments. In case a signal is observed, considering only the invariant mass distribution of a $tW$ system or the $H_T$ distribution would likely not be sufficient to determine whether there are one or more resonances observed in the signal events. Complementary information from the same sign di-lepton channel could aid in resolving the mass degeneracy. As noted before, same sign di-lepton searches are sensitive only to the production of the $X_{5/3}$ partner and not the $B$ state. A simple cross section measurement (upon unfolding) of both the same sign di-lepton and lepton-jet channels should thus show a difference $\Delta\sigma = \sigma^{l+{\rm fj}}_{X_{5/3}+B} - \sigma^{2l}$, where $l + {\rm fj}$ refers to the lepton-jet channel and $2l$ represents the same sign di-lepton channels. Note that normalising $\Delta\sigma$ with, say, the sum of the same sign di-lepton and lepton-jet cross sections can further reduce the systematic uncertainties. Furthermore, indirectly deducing the presence of a $B$ in the signal is also possible by considering charge asymmetries. As outlined in Section II B, $X_{5/3}$ production dominates over $\bar{X}_{5/3}$ because the former is produced from $g$ and an up-type quark in the initial state while the latter is produced from $g$ and a down-type quark. In the same sign di-lepton search one should thus observe an excess of the $l^+ l^+$ signal over the $l^- l^-$ events. Analogously, if the lepton charge were measured in the lepton-jet events we are investigating here (e.g. top partner decays into $Wt \to l\nu_l jjb$), one should observe an excess of $l^+$ events in the final state over $l^-$ if the decay results from $X_{5/3}$ or $\bar{X}_{5/3}$.
[Figure panel: $M_{X_{5/3}/B} = 2.0$ TeV, $\sigma_{X_{5/3}+B} = 15$ fb, $L = 35$ fb$^{-1}$, $N_{\rm vtx} = 50$.]
If however the decay results from a $B$ or $\bar{B}$, there is no charge asymmetry. $\bar{B}$ production dominates over $B$ production, again because of the larger $u$ quark PDF in the initial state, but $B$ and $\bar{B}$ decay into $W^+ W^-$ and a b-jet, and for both, the final state lepton can arise from either the $W^+$ or the $W^-$ decay with equal probabilities.
In conclusion, if the $X \to Wt \to l\nu_l jj$ signal arises from both $X_{5/3}$ and $B$ (and their antiparticles), the charge asymmetry is partially washed out and does not match the charge asymmetry of the di-lepton signal, hence indirectly pointing towards the presence of the $B$ state. A possible advantage of the lepton asymmetry measurement would also be a reduced sensitivity to experimental systematics, although it will be susceptible to the effects of charge symmetric backgrounds. However, given our results from previous sections and an $S/B > 1$, it is likely that the background effects on the charge asymmetry will be manageably low.
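The two observables discussed in this section, the normalised cross section difference and the lepton charge asymmetry, can be written down explicitly. The sketch below is illustrative only: the function names are hypothetical, the asymmetry is defined as $(N_+ - N_-)/(N_+ + N_-)$, and the event counts used in any example are placeholders, not predictions.

```python
def normalised_delta_sigma(sigma_lepton_jet, sigma_dilepton):
    """Difference of the lepton-jet and same sign di-lepton cross sections,
    normalised to their sum."""
    return (sigma_lepton_jet - sigma_dilepton) / (sigma_lepton_jet + sigma_dilepton)

def charge_asymmetry(n_plus, n_minus):
    """Lepton charge asymmetry (N+ - N-) / (N+ + N-)."""
    return (n_plus - n_minus) / (n_plus + n_minus)

def combined_asymmetry(n_plus_x, n_minus_x, n_b):
    """Asymmetry of an X_5/3 sample diluted by n_b charge symmetric B events,
    split equally between the two lepton charges."""
    return charge_asymmetry(n_plus_x + 0.5 * n_b, n_minus_x + 0.5 * n_b)
```

For instance, a sample with 60 positive and 40 negative leptons has an asymmetry of 0.2, and adding 100 charge symmetric events halves it to 0.1, which is the washing out effect described above.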

IV. CONCLUSIONS
In this paper we study the potential of the early run-II of the LHC to discover and measure heavy fermionic top partners. So far, most experimental studies have focused on pair production, relying on same-sign di-lepton signals as the main feature for distinguishing the top partner signals from the SM background. However, as pointed out in Ref. [7], single production has the advantage of utilizing an efficient boosted tagging strategy without losing signal efficiency from requiring two leptonic decays. In addition, the single production cross section becomes larger than that of pair production in the higher mass region (e.g. somewhere between 1 TeV and 1.5 TeV depending on the model), which makes the single production process more relevant for the upcoming run of the LHC. In conjunction with our usage of jet substructure physics and b-tagging, we also propose a new method to tag forward jets that characterise our signal events. We demonstrate that both our substructure and forward jet handles are robust against contamination from pileup.
For the purpose of illustration, we focused on partial composite scenarios for the top sector, where both top quark chiralities consist of an elementary fermion field which has a sizable mixing with the strong dynamics sector. We use the Minimally Composite Higgs Model, based on the coset space SO(5)/SO(4) as the benchmark model for signal events, where we kept the signal cross section a free parameter in order to reduce the model dependence of our results. Our analysis considered the most significant signal which comes from the singly produced charge 5/3 and −1/3 partners (X 5/3 and B), and their conjugates. X 5/3 is typically the lightest top partner, with the mass splitting of B and X 5/3 becoming small at high M 4 . In addition, the decay topology of X 5/3 and B is effectively identical when the semi-leptonic final states are considered, such that the combined signal typically has the largest cross section.
The singly produced $X_{5/3}$ and $B$ partners appear in a final state with an additional top and a light jet, so that the signal has a $t\bar{t}Wj$ event topology. For our search strategy, we require that exactly one of the decay products of the top partner (top or $W$) decays leptonically, but not both simultaneously. For jet substructure analysis we employ the TemplateTagger v.1.0 implementation of the Template Overlap Method, which is relatively robust against a large pileup contamination. The presence of two highly boosted objects allows for a straightforward reconstruction of the top partner mass, despite the missing energy component and high pileup.
Since our signal has an additional high energy forward jet, we propose a new approach to forward jet tagging in order to limit the pileup contamination in the forward region. We propose to cluster the jets in the forward region with a cone smaller than the standard r = 0.4 (i.e. r = 0.1, 0.2), which does not require an elaborate re-calibration of jet observables, since all we require is to tag the forward jet as opposed to measure it. In addition, we include a semi-realistic b-tagging algorithm into our analysis, as multiple b-jets appear in our signal events. As our forward jet tagging proposal is new, we presented the result of our analysis both with and without forward jet tagging, while we found that we can achieve the best result when both b-tagging and our forward jet tagging are employed.
The main results of our analysis can be summarized as follows:
• We showed that Run-II of the LHC at 14 TeV can detect and measure 2 TeV top partners in a lepton-jet final state, with almost $5\sigma$ signal significance and $S/B > 1$ at 35 fb$^{-1}$. The results assume a total production cross section of 15 fb, an average of 50 interactions per bunch crossing and no pileup subtraction. In a no-pileup environment, the significance is approximately twice as high.
• A sizeable part of the model parameter space which results in a 2 TeV top partner can be ruled out at $2\sigma$ with as little as 10 fb$^{-1}$.
• High levels of pileup (i.e. 50 interactions per bunch crossing) present a challenge for the lepton-jet final states. However, even with no pileup correction/subtraction, lepton-jet channels provide sufficient sensitivity to major parts of the fermionic top partner parameter space, whereby the use of several pileup-insensitive observables greatly reduces the effects of pileup contamination.
• The searches for singly produced fermionic top partners will greatly benefit from the introduction of a forward jet tag, with an additional factor of $\sim 2$ in the overall rejection power at 60% signal efficiency. We proposed a simple new procedure to mitigate the effect of high pileup levels on forward jet multiplicity.
• We find that the sensitivity the experiments can achieve in the hadronic W -leptonic top channel is comparable to the hadronic top-leptonic W channel for M X 5/3 /B ∼ 1 TeV, while the sensitivity of hadronic top channel is superior for higher masses.
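To relate significances quoted at different integrated luminosities, one can use the familiar statistics-dominated scaling: if both signal and background yields grow linearly with luminosity, $S/\sqrt{B}$ grows as $\sqrt{L}$. The sketch below rests on that assumption only. It ignores systematic uncertainties, its function names are hypothetical, and it is not the statistical treatment used in the analysis.

```python
import math


def scale_significance(sig_ref, lumi_ref, lumi_new):
    """Rescale a significance from lumi_ref to lumi_new (same units),
    assuming a statistics-dominated S/sqrt(B) that grows as sqrt(L)."""
    return sig_ref * math.sqrt(lumi_new / lumi_ref)


def luminosity_for_significance(sig_ref, lumi_ref, sig_target):
    """Luminosity at which sig_target is reached under the same scaling."""
    return lumi_ref * (sig_target / sig_ref) ** 2


# Benchmark taken as 5 sigma at 35 fb^-1 for illustration
# (the text quotes "almost 5 sigma").
print(scale_significance(5.0, 35.0, 10.0))
print(luminosity_for_significance(5.0, 35.0, 2.0))
```

Under these assumptions the benchmark point would sit at roughly $2.7\sigma$ with 10 fb$^{-1}$. Points with smaller production cross sections scale down accordingly.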
Note that it will be straightforward to combine our current analysis with the conventional same-sign lepton searches in the single production of charge $5/3$ and $-1/3$ partners, as well as in pair production channels. Furthermore, our method can be easily adapted to other top partner searches, including charge $2/3$ partners, and to other models of top partners beyond the minimal composite Higgs models. We also want to emphasize that our analysis is independent of the underlying physics model, since the signal cross section is kept as a free parameter, such that any new physics search with a $t\bar{t}Wj$ event topology can use our results directly.
Finally, in case a signal is observed in future LHC runs, a combination of lepton-jet channels and same sign di-lepton channels offers valuable information beyond the simple improvement in signal significance. A possible mass degeneracy between the heavy partner states can be disentangled by comparing results of same sign di-lepton measurements with signals from lepton-jet events, as the former are sensitive only to charge $5/3$ states, while additional states might appear in the latter.
where $\vec{\Pi} \equiv (\Pi_1, \Pi_2, \Pi_3, \Pi_4)^T$ and $\Pi \equiv \sqrt{\vec{\Pi} \cdot \vec{\Pi}}$, and where the last equation holds in unitary gauge, where the Goldstone multiplet reduces to

The components of the CCWZ $d_\mu$ and $e_\mu \equiv e^a_\mu t^a$ symbols are

$\nabla_\mu \vec{\Pi}$ is the derivative of the Goldstone fields $\vec{\Pi}$ "covariant" under the EW gauge group,

where $A^a_\mu$ contains the elementary SM gauge fields written in an $SO(5)$ notation, that is

where $s_w$ and $c_w$ are respectively the sine and cosine of the weak mixing angle. Note that the $d_\mu$ and $e_\mu$ symbols transform under the unbroken $SO(4)$ symmetry as a fourplet and an adjoint, respectively. In unitary gauge, the $e_\mu$ symbol components reduce to

Appendix B: Details of Composite Higgs Models with Partially Composite Top
The model used in this article to illustrate the potential of boosted top searches in discovering composite quarks in composite Higgs models is the MCHM$_5$, which is based on the breaking $SO(5) \times U(1)_X \to SO(4) \times U(1)_X \simeq SU(2)_R \times SU(2)_L \times U(1)_X$ of a strongly coupled theory. The $SU(2)_L$ and a $U(1)$ subgroup of $SU(2)_R \times U(1)_X$ are gauged in order to provide the electroweak gauge bosons. The non-linearly realized Higgs is parameterized by the Goldstone boson matrix, which in unitary gauge is given in Eq. (A4).
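The local isomorphism between $SO(4)$ and $SU(2)_L \times SU(2)_R$ used above can be checked explicitly. The following Python sketch builds the six $SO(4)$ generators in a basis that is common in the composite Higgs literature, $(T^a_{L,R})_{ij} = -\frac{i}{2}\left[\frac{1}{2}\epsilon^{abc}(\delta^b_i \delta^c_j - \delta^b_j \delta^c_i) \pm (\delta^a_i \delta^4_j - \delta^a_j \delta^4_i)\right]$, and verifies that they form two mutually commuting $SU(2)$ algebras. The choice of basis and all helper names are assumptions made for this illustration and are not taken from the text.

```python
def unit(i, j):
    """4x4 matrix unit E_ij with 1-based indices."""
    return [[1.0 if (r, c) == (i - 1, j - 1) else 0.0 for c in range(4)]
            for r in range(4)]


def add(*ms):
    return [[sum(m[r][c] for m in ms) for c in range(4)] for r in range(4)]


def scale(k, m):
    return [[k * x for x in row] for row in m]


def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]


def comm(a, b):
    """Commutator [a, b]."""
    return add(matmul(a, b), scale(-1, matmul(b, a)))


def antisym(i, j):
    """E_ij - E_ji."""
    return add(unit(i, j), scale(-1, unit(j, i)))


def generators(sign):
    """sign=+1 returns [T_L^1, T_L^2, T_L^3]; sign=-1 returns the T_R^a."""
    pairs = [(2, 3), (3, 1), (1, 2)]  # (b, c) with eps^{abc} = +1 for a = 1, 2, 3
    return [scale(-0.5j, add(antisym(b, c), scale(sign, antisym(a, 4))))
            for a, (b, c) in enumerate(pairs, start=1)]


def close(a, b, tol=1e-12):
    return all(abs(a[r][c] - b[r][c]) < tol
               for r in range(4) for c in range(4))


TL, TR = generators(+1), generators(-1)
ZERO = scale(0.0, unit(1, 1))

# [T^1, T^2] = i T^3 within each factor, and the two factors commute.
print(close(comm(TL[0], TL[1]), scale(1j, TL[2])),
      close(comm(TR[0], TR[1]), scale(1j, TR[2])),
      all(close(comm(l, r), ZERO) for l in TL for r in TR))
```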
Beyond the (pseudo-)Goldstone boson Higgs, the low energy description of the strongly coupled sector is expected to contain scalar, fermionic and vector resonances, typically at or below a scale $4\pi f$. Here, we use a bottom-up approach and only include a minimal set of light fermionic resonances. The symmetry structure of the strong dynamics does not fix the embedding of the fermionic resonances. For simplicity we assume that the top partners live in a single $\mathbf{5}$ multiplet (transforming non-linearly under $SO(5)$) with a $U(1)_X$ charge of $2/3$, while the elementary third generation quarks are embedded as incomplete $\mathbf{5}$ multiplets (transforming linearly under $SO(5)$)

The third family (partner) particle content along with its quantum numbers is summarized in Table V. The states given above are the gauge eigenstates of the model, which mix due to EWSB as discussed below. The resulting mass eigenstates are two states $b$, $B$ with (electromagnetic) charge $-1/3$, four states $t$, $T_{f1}$, $T_{f2}$, $T_s$ with charge $2/3$, and the state $X_{5/3}$ with charge $5/3$. In what follows, we adopt the Callan-Coleman-Wess-Zumino (CCWZ) prescription in order to write down the effective Lagrangian in a non-linearly invariant way under $SO(5)$. The Lagrangian of the model is

The first line denotes the kinetic terms for the elementary fermions, with $q_L = (t_L, b_L)$ and the Standard Model covariant derivatives. The second line contains the composite quark mass terms with a fourplet mass $M_4$ and a singlet mass $M_1$ as well as the kinetic terms, where the covariant derivatives for the singlet and fourplet are given by

Partial compositeness: masses and mixing
Entering the Goldstone matrix into the effective Lagrangian and expanding around the vacuum expectation value, we obtain the quark mass terms

The mass matrices depend on the fourplet and singlet mass scales $M_4$ and $M_1$ and the left- and right-handed pre-Yukawa couplings $y_{L,R}$. A priori all these parameters are complex. However, all but one phase can be absorbed by field redefinitions of the quarks and quark partners. We choose the remaining phase $\phi$ to be on the singlet mass term (as indicated in the Lagrangian Eq. (B2)) and $y_{L,R}$ and $M_{1,4}$ to be real in what follows, while $c_{L,R}$ are complex parameters. $X_{5/3}$ is the only state with electric charge $5/3$ and as such must be a mass eigenstate with mass $M_4$. The charge $-1/3$ mass eigenstates are

where

with masses $m_b = 0$ and $M_B = \sqrt{M_4^2 + y_L^2 f^2}$, where $b$ is identified with the SM-like bottom quark, while $B$ is a heavy partner state. In the charge $2/3$ quark sector, the elementary top mixes with the two fourplet states $T$, $X_{2/3}$ as well as with the singlet state $\tilde{T}$. For our phenomenological studies, we perform the diagonalization numerically. To provide a qualitative discussion, we give here some approximate results obtained by expanding the mass matrix in $\epsilon \equiv v/f$. The charge $2/3$ mass eigenstates are

where U t,φ 1, 1, 1), with $\tilde{\phi}$ being the phase of $1 - \frac{M_4}{M_1} e^{-i\phi}$,

of $SU(2)$ doublet and singlet states and therefore at least one insertion of $v/f$. Therefore, $m_t$ as well as all matrix elements of $U_{L/R}$ between $SU(2)$ doublet and singlet components are (at most) of $\mathcal{O}(\epsilon)$.
For our later phenomenological studies, let us discuss typical parameter ranges and mass scales. In order to avoid excessive fine-tuning, the compositeness scale $f$ should be close to the electroweak scale. On the other hand, electroweak precision constraints imply $f \gtrsim 800$ GeV [4,5], so that we assume $f$ to lie at the TeV scale. The composite mass scales $M_1$ and $M_4$ arise from the condensation of the strongly coupled theory and therefore have a natural value between $f$ and $\sim 4\pi f$. Searches for top partners in the 8 TeV LHC run already impose a bound of $M_{4,1} \gtrsim 800$ GeV, and in this article we aim to explore the prospects for the LHC at 13 TeV to probe top partner masses around 2 TeV, i.e. above the scale $f$. Finally, requiring the top mass in Eq. (B8) to take its measured value requires $y_L$ and $y_R$ to be $\mathcal{O}(1)$. Therefore, the typical partner spectrum we consider contains the $SO(4)$ singlet partner $T_s$, whose mass scale is set by $M_1$, an almost degenerate $SU(2)$ doublet $(X_{5/3}, T_{f1})$ with mass $M_4$, and a second almost degenerate $SU(2)$ doublet $(T_{f2}, B)$, which for $f < M_4$ and $y_L \sim 1$ is also close to degenerate with the former $SU(2)$ doublet.
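To get a feeling for how close to degenerate the two doublets are, one can evaluate the relation $M_B = \sqrt{M_4^2 + y_L^2 f^2}$ for the bottom partner against $M_{X_{5/3}} = M_4$. The short Python sketch below does this for one illustrative parameter point. The values $M_4 = 2$ TeV, $y_L = 1$ and $f = 800$ GeV are assumptions chosen for this example, not a benchmark of the analysis, and the function names are hypothetical.

```python
import math


def bottom_partner_mass(m4, y_l, f):
    """M_B = sqrt(M_4^2 + y_L^2 f^2); all masses in GeV."""
    return math.sqrt(m4 ** 2 + (y_l * f) ** 2)


def relative_splitting(m4, y_l, f):
    """(M_B - M_X) / M_X with M_X = M_4 for the charge 5/3 state."""
    return (bottom_partner_mass(m4, y_l, f) - m4) / m4


# Illustrative parameter point (assumed values, in GeV).
m4, y_l, f = 2000.0, 1.0, 800.0
print(bottom_partner_mass(m4, y_l, f))
print(relative_splitting(m4, y_l, f))
print(4.0 * math.pi * f)  # upper end of the natural range for M_1 and M_4
```

For this point $M_B \approx 2154$ GeV, i.e. a splitting of about 8% relative to $M_4$, which shrinks as $y_L f / M_4$ decreases.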

Interactions of quarks with quark partners in the gauge eigenbasis
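The interaction terms of the model are derived by writing out the Goldstone matrix, the $d_\mu$ and the $e_\mu$ symbols in the effective Lagrangian Eq. (B2) and expanding in $\epsilon \equiv v/f$. We first calculate the couplings in the gauge eigenbasis $\psi^t_{L,R} \equiv (t, T, X_{2/3}, \tilde{T})_{L,R}$, $\psi^b_{L,R} \equiv (b, B)_{L,R}$, $X_{5/3\,L,R}$. The pre-Yukawa terms yield a contribution to Higgs-quark couplings

where

The kinetic terms include an $e$-term contribution which yields

$$\mathcal{L}_{q,\mathrm{gauge}} = \sum_{\alpha = L,R} \left[ \bar{\psi}^b_\alpha \slashed{W}^- G^{B,g}_\alpha \psi^t_\alpha + \bar{X}_{5/3\,\alpha} \slashed{W}^+ G^{X,g}_\alpha \psi^t_\alpha + \bar{\psi}^t_\alpha \slashed{Z} G^{Zt,g}_\alpha \psi^t_\alpha + \bar{\psi}^b_\alpha \slashed{Z} G^{Zb,g}_\alpha \psi^b_\alpha \right] + \mathrm{h.c.}$$

where $\delta^L_\alpha$ is 1 for $\alpha = L$ and 0 for $\alpha = R$.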
The $d_\mu$ term interactions in Eq. (B2) yield further contributions to the quark interactions with gauge bosons and the Higgs which read

To rewrite the first term of Eq. (B19) we integrate it by parts and make use of the quark equations of motion to obtain

where, using Eq. (B4)

Collecting all interaction terms then yields the interaction Lagrangian in the gauge eigenbasis

Again, the coupling structure is easily understood in terms of $SU(2)$ multiplets in the $\epsilon$ expansion. Concerning the gauge couplings, at leading order, the elementary states couple SM-like, and the fourplet and singlet composite quarks have canonical couplings determined by their charge. At $\mathcal{O}(\epsilon)$, the $d$-terms lead to interactions between EW gauge bosons, fourplet and singlet states. Furthermore, there are no gauge interactions with one elementary and one composite quark; these are solely induced by the mixing of the mass eigenstates. The Higgs-quark interactions obtain contributions from the pre-Yukawa terms, which were also responsible for the mass mixing. In addition, the $d$-terms contain derivative interactions of the Higgs with singlet and fourplet quarks, which can be rewritten as Yukawa couplings via the quark equations of motion.

L/R given in Eqs. (B6,B8,B8). For our simulations we implemented the full set of interactions and diagonalized the mass matrix numerically, but the main phenomenological features can be readily understood from the dominant couplings of the lightest quark partner states to SM gauge bosons and SM-like quarks, which are relevant for the single production of the quark partner as well as for its decay channels.

$X_{5/3}$:
The exotically charged $X_{5/3}$ has mass $M_4$ and is thus the lightest fourplet quark partner. Its couplings to only SM particles are

Other couplings to two SM particles are forbidden due to (electric) charge conservation. The structure of the dominant right-handed coupling can be understood from the mass insertion picture as shown in Fig. 12.

B:
The $B$ has charge $-1/3$ and can a priori couple to $Wt$, $Zb$, or $hb$. However, the $b$ does not have any pre-Yukawa couplings within this model, so that a $\bar{B}hb$ coupling term is absent. A $\bar{B}Zb$ coupling is absent as well. In the gauge eigenbasis, no $\bar{B}Zb$ couplings are present. In the right-handed sector, $b_R$ and $B_R$ are already mass eigenstates. The left-handed coupling in Eq. (B34) is universal for $b_L$ and $B_L$, and rotation into the mass eigenbasis does not induce a "mixed" $\bar{B}Zb$ interaction. $\bar{B}Wt$ couplings are present and given by

Figure 12: Contributions to $g^R_{XWt}$ at $\mathcal{O}(\epsilon)$ from the mass insertion picture. At $\mathcal{O}(1)$, the gauge eigenstate $X_{5/3\,R}$ only couples to $W^+$ and $X_{2/3\,R}$ via the $e$-term. The $X_{2/3\,R}$ then mixes via mass and VEV insertions with the gauge eigenstates $t_R$ and $\tilde{T}_R$, which make up the $\mathcal{O}(1)$ components of the mass eigenstate $t_R$. From the mass matrix in Eq. (B4) it can be seen that the only mass insertion combination at $\mathcal{O}(\epsilon)$ goes from $X_{2/3\,R}$ through $X_{2/3\,L}$ to $t_R$. Combining the couplings and mass insertions and taking into account that the gauge eigenstate $t_R$ component of the mass eigenstate $t_R$ has a coefficient $M_1/M_{T_s}$ yields the first contribution to the coupling $g^R_{XWt}$ in Eq. (B37). At $\mathcal{O}(\epsilon)$, the gauge eigenstate $X_{5/3\,R}$ couples to $W^+$ and $\tilde{T}_R$ via the $d$-term. $\tilde{T}_R$ mixes via $\tilde{T}_L$ to $t_R$ at $\mathcal{O}(1)$. Projecting the gauge eigenstate $t_R$ on the mass eigenstate $t_R$ and assembling the couplings and insertions yields the second term of $g^R_{XWt}$ in Eq. (B37). The analogous analysis for $g^L_{XWt}$ results in couplings of $\mathcal{O}(\epsilon^2)$ because the mixing of $X_{2/3\,L}$ to $t_L$ is of $\mathcal{O}(\epsilon)$ while the mixing of $\tilde{T}_L$ to $t_L$ is of $\mathcal{O}(\epsilon^2)$.

Couplings of other heavy quark partners to SM quarks and EW gauge bosons or the Higgs can be understood analogously.