Search for resonant $\mathrm{t}\overline{\mathrm{t}}$ production in proton-proton collisions at $\sqrt{s} =$ 13 TeV

A search for a heavy resonance decaying into a top quark and antiquark ($\mathrm{t\bar{t}}$) pair is performed using proton-proton collisions at $\sqrt{s} =$ 13 TeV. The search uses the data set collected with the CMS detector in 2016, which corresponds to an integrated luminosity of 35.9 fb$^{-1}$. The analysis considers three exclusive final states and uses reconstruction techniques that are optimized for top quarks with high Lorentz boosts, which requires the use of nonisolated leptons and jet substructure techniques. No significant excess of events relative to the expected yield from standard model processes is observed. Upper limits on the production cross section of heavy resonances decaying to a $\mathrm{t\bar{t}}$ pair are calculated. Limits are derived for a leptophobic topcolor Z' resonance with widths of 1, 10, and 30%, relative to the mass of the resonance, and exclude masses up to 3.80, 5.25, and 6.65 TeV, respectively. Kaluza-Klein excitations of the gluon in the Randall-Sundrum model are excluded up to 4.55 TeV. To date, these are the most stringent limits on $\mathrm{t\bar{t}}$ resonances.


Introduction
The top quark (t) is the most massive known fundamental particle [1,2] in the standard model. It has a Yukawa coupling to the Higgs field that is near unity. It is also closely connected to the hierarchy problem, where the largest corrections to the Higgs mass arise from top quark loops. Furthermore, studies of the top quark may provide insight into the mechanism of electroweak (EW) symmetry breaking.
Many theories beyond the standard model (SM) predict heavy resonances at the TeV scale that would decay to top quark and antiquark ($\mathrm{t\bar{t}}$) pairs. These resonances can present themselves as peaks on top of the falling $\mathrm{t\bar{t}}$ invariant mass spectrum or as a distortion of the $\mathrm{t\bar{t}}$ spectrum if the resonance has a large width and a mass above the center-of-mass energy of the colliding partons. Resonances decaying to $\mathrm{t\bar{t}}$ pairs can be found in models that contain TeV-scale color-singlet Z' bosons [3][4][5], a pseudoscalar Higgs boson that may couple strongly to $\mathrm{t\bar{t}}$ pairs [6], axigluons [7][8][9], or colorons [10][11][12][13], and especially models that contain a leptophobic topcolor Z' [14]. Additionally, extensions of the Randall-Sundrum model [15,16] with extra dimensions predict Kaluza-Klein (KK) excitations of the gluons $\mathrm{g}_\mathrm{KK}$ [17] or gravitons $\mathrm{G}_\mathrm{KK}$ [18], which can have large branching fractions to $\mathrm{t\bar{t}}$ pairs. This analysis searches for spin-1 resonances that do not interfere with SM $\mathrm{t\bar{t}}$ production. Previous searches at the Fermilab Tevatron have excluded a leptophobic Z' boson up to 900 GeV [19][20][21][22][23][24] at 95% confidence level (CL). Experiments at the CERN LHC have excluded various Z' and $\mathrm{g}_\mathrm{KK}$ models at 95% CL in the 1-4 TeV mass range [25][26][27][28][29][30][31][32]. The results presented here represent a significant improvement on the previous searches for $\mathrm{t\bar{t}}$ resonances.
This paper presents a model-independent search for $\mathrm{t\bar{t}}$ resonances. Since no excess is seen, limits are calculated on several spin-1 resonance models of varying widths. The $\mathrm{t\bar{t}}$ system and all its daughter particles are assumed to decay as described by the SM. The top quark decays predominantly to a W boson and a bottom quark (b). Each of the two W bosons in the event can decay either to a lepton and its corresponding neutrino or to hadrons. The analysis comprises three subanalyses based on the decay modes of the two W bosons: the dilepton, single-lepton, and fully hadronic decay modes of the $\mathrm{t\bar{t}}$ system. In the fully hadronic channel, both W bosons decay to hadrons. In the single-lepton channel, one W boson decays to an electron (e) or muon (µ) and its neutrino (ν) counterpart, while the other W boson decays to hadrons. In the dilepton channel, both W bosons decay to an e or µ and a ν. The leptonic selections are not optimized to identify electrons or muons originating from leptonically decaying tau leptons; however, such particles are not excluded by the event selections. The search is based on $\sqrt{s} = 13$ TeV proton-proton (pp) collision data collected in 2016 by the CMS experiment at the LHC, corresponding to an integrated luminosity of 35.9 fb$^{-1}$.
The dilepton final state consists of two leptons (µµ, ee, or µe), two jets originating from bottom quarks (b jets) with high transverse momentum (p T ), and missing transverse momentum ( p miss T ). The large mass of the resonance causes the resulting top quarks to have a significant Lorentz boost, which leads to a collimated system consisting of a lepton and a b jet. To account for the overlap between the lepton and the b jet, special reconstruction and selection criteria are used to increase lepton selection efficiency and reduce the SM background. The dominant irreducible SM background arises from tt nonresonant production. Smaller contributions are due to a Z boson produced in association with jets (Z+jets), single top quark, and diboson processes. Events that have a large separation between the lepton and b jet are allocated to control regions (CR), which are used to validate the modeling of the SM backgrounds.
The single-lepton final state consists of one lepton (µ or e), at least two high-$p_\mathrm{T}$ jets, and $p_\mathrm{T}^\mathrm{miss}$. In this channel as well, the final-state particles from the decay of the $\mathrm{t\bar{t}}$ pair have a large Lorentz boost because of the mass of the resonance. Leptons from the decay of the W boson are found close to the b jet from the top quark decay. The same lepton reconstruction and selection criteria used in the dilepton channel are used in the single-lepton channel. In addition, a dedicated trigger is used to select events with a single nonisolated lepton and an additional jet. A t tagging algorithm is used to identify top quarks where the daughter W boson decays hadronically ($\mathrm{t \to Wb \to q\bar{q}'b}$). Events with a jet that passes the t tagging criteria are classified into a category with higher sensitivity. The largest irreducible background is the $\mathrm{t\bar{t}}$ continuum production, while the largest reducible background is from W bosons produced in association with jets (W+jets). The latter background is separated from the signal using a multivariate analysis technique.
The fully hadronic channel contains events with a dijet topology, where both large radius jets are required to pass t tagging criteria that select Lorentz-boosted hadronically decaying top quarks. Because of the dijet topology of the search region, the largest reducible background arises from dijet events produced from quantum chromodynamic (QCD) interactions between the colliding protons. This background, referred to as QCD multijet production, can be reduced considerably by requiring one of the subjets in each of the two large radius jets, which are selected by the t tagging algorithm, to be consistent with the fragmentation of a bottom quark [33]. A subjet is defined as a smaller radius jet reconstructed within a larger radius jet. The use of subjet b tagging for categorization nearly eliminates the QCD multijet background leaving only the tt continuum in the highest sensitivity category.
Except for the QCD multijet background in the fully hadronic channel, the shapes of all SM backgrounds are estimated from simulation. The total normalization of each simulated sample is obtained from a simultaneous binned maximum likelihood fit to the reconstructed $\mathrm{t\bar{t}}$ invariant mass ($m_\mathrm{t\bar{t}}$) distribution for the single-lepton and fully hadronic analyses and to $S_\mathrm{T}$ for the dilepton analysis, where $S_\mathrm{T}$ is defined as

$$S_\mathrm{T} = \sum_{i=1}^{N_\mathrm{jet}} p_{\mathrm{T},i}^{\mathrm{jet}} + \sum_{i=1}^{2} p_{\mathrm{T},i}^{\ell} + p_\mathrm{T}^\mathrm{miss}.$$

The variable $S_\mathrm{T}$ is used because it has a greater sensitivity to signal than $m_\mathrm{t\bar{t}}$ in the dilepton final state. A limit on the production cross section of heavy resonances is extracted by performing a template-based statistical evaluation of the $m_\mathrm{t\bar{t}}$ (single-lepton and fully hadronic) and $S_\mathrm{T}$ (dilepton) distributions simultaneously in all of the channels.
Jets are clustered using the anti-$k_\mathrm{T}$ jet clustering algorithm [38] with a distance parameter of 0.4 (AK4 jets). If a lepton is found within $\Delta R < 0.4$ of an AK4 jet, its four-momentum is subtracted from that of the jet. The single-lepton and fully hadronic analyses also use anti-$k_\mathrm{T}$ jets clustered with a distance parameter of 0.8 (AK8 jets). These larger-radius jets are used to tag the hadronic decay of top quarks: a high-mass resonance decay creates daughter particles with a significant Lorentz boost, so the three jets from the top quark decay merge into a single, larger AK8 jet. Jets in all three channels are contaminated with neutral particles that are generated from additional pp collisions within the same or a neighboring bunch crossing (pileup). The extra energy in each jet is corrected based on the average expectation of the pileup within the jet footprint [43]. The expected energy offset due to pileup is modeled as a function of the number of primary vertices in the event [40]. Jets that originate from charm and bottom quarks are identified using the combined secondary vertex (CSV) algorithm [44]. Loose, medium, and tight operating points are used in this analysis. They have a probability of 10, 1, and 0.1%, respectively, of misidentifying a light-parton jet as heavy flavor, where the misidentification probability is determined for light-flavor jets with $p_\mathrm{T} > 30$ GeV in a simulated multijet sample with a center-of-mass energy between 80 and 120 GeV [33]. They correspond to a b tagging efficiency of 81, 63, and 41%, respectively, for b jets ($p_\mathrm{T} > 20$ GeV) in simulated $\mathrm{t\bar{t}}$ events. All jets are required to pass a minimal set of criteria to separate them from calorimeter noise and other sources of jets that do not originate from the primary vertex (PV) [45]. Events are also required to pass a set of selections that remove spurious $p_\mathrm{T}^\mathrm{miss}$ generated by calorimeter noise [46].
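The lepton-jet overlap removal described above can be sketched in a few lines. This is only an illustration under stated assumptions: four-momenta are stored as plain `(px, py, pz, E)` tuples, and the helper names `eta_phi`, `delta_r`, and `clean_jet` are hypothetical rather than taken from the CMS software.

```python
import math

def eta_phi(p4):
    """Pseudorapidity and azimuth of a four-momentum (px, py, pz, E) with pT > 0."""
    px, py, pz, _ = p4
    return math.asinh(pz / math.hypot(px, py)), math.atan2(py, px)

def delta_r(a, b):
    """Angular distance sqrt(d_eta^2 + d_phi^2), with d_phi wrapped into [-pi, pi]."""
    eta_a, phi_a = eta_phi(a)
    eta_b, phi_b = eta_phi(b)
    dphi = math.remainder(phi_a - phi_b, 2.0 * math.pi)
    return math.hypot(eta_a - eta_b, dphi)

def clean_jet(jet, lepton, cone=0.4):
    """Subtract the lepton four-momentum from the jet if the lepton lies inside the cone."""
    if delta_r(jet, lepton) < cone:
        return tuple(j - l for j, l in zip(jet, lepton))
    return jet

# Illustrative values in GeV (not taken from the analysis).
jet = (100.0, 0.0, 0.0, 100.0)
near_lepton = (30.0, 3.0, 0.0, 30.15)   # Delta R of about 0.1 from the jet axis
far_lepton = (0.0, 30.0, 0.0, 30.0)     # Delta R of about 1.6 from the jet axis

cleaned = clean_jet(jet, near_lepton)
untouched = clean_jet(jet, far_lepton)
```

In this sketch the subtraction acts on the jet four-momentum as given; any subsequent reapplication of jet energy corrections is outside its scope.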
The t tagging algorithm [47,48], which is based on the algorithm described in Ref. [49], is applied to AK8 jets that use pileup per particle identification (PUPPI) corrections [50], referred to as PUPPI jets, in order to separate hadronically decaying top quarks from light quark or gluon jets. While charged-hadron subtraction (CHS) only removes charged particles originating from pileup, PUPPI corrects for both charged and neutral pileup particles. PUPPI jets, as opposed to CHS jets, are therefore used for t tagging because of their better performance as a function of pileup. The CMS t tagging algorithm only considers jets with $p_\mathrm{T} > 400$ GeV, as lower-momentum top quarks frequently decay into resolved jets. The algorithm iteratively reverses the jet clustering procedure in order to remove soft radiation. First, it reclusters the AK8 PUPPI jet with the Cambridge-Aachen jet clustering algorithm [51]. It then separates the jet (j) into two subjets, $j_1$ and $j_2$, which must satisfy the "soft drop" (SD) criterion

$$\frac{\min(p_{\mathrm{T}1}, p_{\mathrm{T}2})}{p_{\mathrm{T}1} + p_{\mathrm{T}2}} > z_\mathrm{cut} \left(\frac{\Delta R_{12}}{R_0}\right)^{\beta},$$

where $p_{\mathrm{T}1}$ and $p_{\mathrm{T}2}$ are the transverse momenta of the two subjets and $\Delta R_{12}$ is the distance between them. The implementation of the SD algorithm used in this analysis has an angular exponent $\beta = 0$, making it equivalent to the "modified mass drop tagger" algorithm [52]. Additionally, a soft cutoff threshold of $z_\mathrm{cut} = 0.1$ and a characteristic radius $R_0 = 0.8$ [53] are used. If the SD criterion is met, the procedure ends with j as the resulting jet. If not, the lower-$p_\mathrm{T}$ subjet is discarded and the declustering procedure continues with the higher-$p_\mathrm{T}$ subjet. The SD mass ($m_\mathrm{SD}$) of the resulting jet is required to be near the mass of the top quark ($105 < m_\mathrm{SD} < 210$ GeV). The CMS t tagging algorithm also requires that the N-subjettiness [54,55] ratio ($\tau_{32} \equiv \tau_3/\tau_2$) be less than 0.65.
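The N-subjettiness ($\tau_N$) is a measure of the consistency of an AK8 PUPPI jet with N or fewer subjets, and is defined as

$$\tau_N = \frac{1}{d_0} \sum_i p_{\mathrm{T},i} \min\left[\Delta R_{1,i}, \Delta R_{2,i}, \ldots, \Delta R_{N,i}\right],$$

where the sum runs over all jet constituents $i$, $d_0$ is a normalization constant, and $\Delta R_{k,i}$ is the distance between jet constituent $i$ and candidate subjet axis $k$ ($k = 1, \ldots, N$).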
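The t tagging steps described above can be sketched as follows. This is a toy illustration, not the CMS implementation: the clustering history is represented by nested dictionaries (a hypothetical data layout), the subjet axes for $\tau_N$ are supplied by hand rather than found by minimization, and the normalization $d_0 = R_0 \sum_i p_{\mathrm{T},i}$ is an assumption, since the text only calls $d_0$ a normalization constant.

```python
import math

def delta_r(a, b):
    """Distance in the (eta, phi) plane between two (eta, phi) pairs."""
    dphi = math.remainder(a[1] - b[1], 2.0 * math.pi)
    return math.hypot(a[0] - b[0], dphi)

def passes_soft_drop(pt1, pt2, dr12, z_cut=0.1, beta=0.0, r0=0.8):
    """Soft drop criterion: min(pT1, pT2) / (pT1 + pT2) > z_cut * (dR12 / R0)^beta."""
    return min(pt1, pt2) / (pt1 + pt2) > z_cut * (dr12 / r0) ** beta

def soft_drop(jet):
    """Decluster until the criterion is met, always following the harder subjet.

    A jet is a dict with 'pt' and, if it can be declustered further,
    'children' (two sub-dicts) and 'dr' (the distance between them).
    """
    while jet.get("children"):
        j1, j2 = jet["children"]
        if passes_soft_drop(j1["pt"], j2["pt"], jet["dr"]):
            return jet
        jet = j1 if j1["pt"] >= j2["pt"] else j2
    return jet

def n_subjettiness(constituents, axes, r0=0.8):
    """tau_N for constituents [(pt, eta, phi), ...] and N axes [(eta, phi), ...]."""
    d0 = r0 * sum(pt for pt, _, _ in constituents)  # assumed normalization
    total = sum(pt * min(delta_r((eta, phi), axis) for axis in axes)
                for pt, eta, phi in constituents)
    return total / d0

def is_top_tagged(pt, m_sd, tau3, tau2):
    """Thresholds quoted in the text; assumes tau2 > 0."""
    return pt > 400.0 and 105.0 < m_sd < 210.0 and tau3 / tau2 < 0.65

# Toy clustering history: a soft 20 GeV branch is dropped at the first step.
toy_jet = {"pt": 500.0, "dr": 0.6, "children": (
    {"pt": 480.0, "dr": 0.3, "children": ({"pt": 300.0}, {"pt": 180.0})},
    {"pt": 20.0},
)}
groomed = soft_drop(toy_jet)

# Two equal-pT constituents: poorly described by one axis, exactly by two.
toy_constituents = [(100.0, 0.0, 0.0), (100.0, 0.0, 0.4)]
tau1 = n_subjettiness(toy_constituents, [(0.0, 0.2)])
tau2 = n_subjettiness(toy_constituents, [(0.0, 0.0), (0.0, 0.4)])
```

With $\beta = 0$ the angular factor is always 1, so the criterion reduces to a simple cut on the momentum fraction, which is why this configuration coincides with the modified mass drop tagger.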

Triggers and data set
The events in the dilepton channel are triggered by single-lepton and dilepton triggers without isolation requirements. The triggers for µµ and eµ events require one muon with p T > 50 GeV and with |η| < 2.4 that is seeded by hits in either the muon chambers or the inner tracker. The ee events are selected using a dielectron trigger that requires the presence of two electrons with p T > 33 GeV and |η| < 2.5.
Events used in the single-lepton channel pass either a single-electron or a single-muon trigger. The single-lepton muon channel uses the same triggers as the dilepton µµ and eµ channels. The triggers for the electron channel require one electron with $p_\mathrm{T} > 115$ GeV, or an electron with $p_\mathrm{T} > 55$ GeV and a PF jet with $p_\mathrm{T} > 165$ GeV. Both triggers require electrons within |η| < 2.5, and the electron-jet combination trigger requires the jet to be within |η| < 2.4. In the combination trigger, if the electron lies within the jet footprint, the four-vector of the electron is subtracted from the uncorrected four-vector of the jet, and then the jet energy corrections are reapplied. Neither the muon nor the electron triggers have isolation requirements.
The fully hadronic analysis uses events that are selected by a logical 'OR' of five different triggers. The first trigger requires a single AK8 jet with $p_\mathrm{T} > 450$ GeV. A second trigger requires an AK4 jet with $p_\mathrm{T} > 360$ GeV and mass $m_\mathrm{jet} > 30$ GeV. A third trigger requires $H_\mathrm{T} > 800$ GeV, where $H_\mathrm{T}$ is the scalar sum of the $p_\mathrm{T}$ of every AK4 PF jet above 30 GeV in the event. A fourth trigger requires $H_\mathrm{T} > 900$ GeV and remained unprescaled throughout data taking. The final trigger requires $H_\mathrm{T} > 700$ GeV together with a jet with $m_\mathrm{jet} > 50$ GeV.
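The logical OR of the five hadronic triggers can be written compactly. The sketch below is illustrative only: jets are hypothetical `(pt, mass)` tuples in GeV, prescales are ignored, and because the text does not say which jet collection the fifth trigger's mass requirement applies to, the code assumes either collection qualifies.

```python
def scalar_ht(ak4_pts, threshold=30.0):
    """Scalar sum of the pT of all AK4 jets above the threshold (GeV)."""
    return sum(pt for pt in ak4_pts if pt > threshold)

def passes_hadronic_trigger(ak8_jets, ak4_jets):
    """Logical OR of the five trigger conditions; jets are (pt, mass) tuples."""
    ht = scalar_ht([pt for pt, _ in ak4_jets])
    any_jet = ak8_jets + ak4_jets  # assumption: either collection may satisfy the mass cut
    return any([
        any(pt > 450.0 for pt, _ in ak8_jets),              # single AK8 jet
        any(pt > 360.0 and m > 30.0 for pt, m in ak4_jets), # AK4 jet with mass
        ht > 800.0,                                         # HT only
        ht > 900.0,                                         # HT only, unprescaled
        ht > 700.0 and any(m > 50.0 for _, m in any_jet),   # HT plus jet mass
    ])

# Illustrative events (values are not taken from the analysis).
fires = passes_hadronic_trigger([(460.0, 10.0)], [])
quiet = passes_hadronic_trigger([(300.0, 20.0)], [(200.0, 10.0), (200.0, 10.0), (25.0, 5.0)])
```

Without prescales the $H_\mathrm{T} > 900$ GeV condition is implied by the 800 GeV one; it is kept here only to mirror the list of triggers in the text.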
Small differences in trigger efficiency between data and simulation in the dilepton and single-lepton channels are accounted for with corrections determined from events selected by triggers with different conditions.

Simulated events
The $\mathrm{Z' \to t\bar{t}}$ process is simulated using the MADGRAPH5_aMC@NLO v5.2.2.2 [56] event generator, which produces a resonance with the same spin and left- and right-handed couplings to fermions as the SM Z boson. Matrix element calculations are done at tree level and include up to three additional partons for the $\mathrm{g}_\mathrm{KK}$ and most Z' models; Z' bosons above 5 TeV are simulated with only up to two additional partons in the final state. The $\mathrm{Z' \to t\bar{t}}$ process is simulated at masses between 500 GeV and 7 TeV for resonances with a relative decay width (Γ/m) of 1% (narrow), 10% (wide), and 30% (extra-wide). Matching between the hard matrix element interactions and the lower energy parton showers is done using the MLM algorithm [57]. The KK gluon excitation is simulated using PYTHIA 8.212 [58] with the couplings described in Ref. [59]. The Γ/m of the $\mathrm{g}_\mathrm{KK}$ resonance lies between those of the wide and extra-wide Z' resonances, depending on its coupling to the top quark. The expected Z' production cross section is calculated at NLO accuracy, and the $\mathrm{g}_\mathrm{KK}$ production cross section is calculated at LO. A multiplicative factor of 1.3 is applied to the $\mathrm{g}_\mathrm{KK}$ cross section as an NLO K factor [60]. Both the Z' and $\mathrm{g}_\mathrm{KK}$ processes are simulated without interference from SM $\mathrm{t\bar{t}}$ production.
The invariant mass distribution of the $\mathrm{t\bar{t}}$ system at the parton level for Z' resonances with three different widths and a $\mathrm{g}_\mathrm{KK}$ resonance can be seen in Fig. 1. The plots are normalized such that the total integral of each signal model is 1. A resonant structure is manifest at 3 TeV, but at 5 TeV the off-shell component of the signal is strongly enhanced by the available parton luminosity at lower masses. This effect is not noticeable for the narrow Z' signal, but becomes more apparent for the wider Z' resonances. Such behavior is expected for resonant $\mathrm{t\bar{t}}$ production in general.
The tt pair production background is simulated at next-to-leading order (NLO) with the POWHEG v2 generator [61][62][63][64]. The POWHEG generator is also used to simulate single top quark production via EW interactions at NLO [65,66]. The W+jets background is simulated with the MADGRAPH5 aMC@NLO generator with the FxFx matching prescription between matrix element calculations and parton shower simulations [67]. The Drell-Yan (DY) process with an invariant mass between 10 and 50 GeV is simulated at NLO with the same generator, while for an invariant mass above 50 GeV, leading order (LO) precision is used. Diboson and QCD multijet production are simulated at LO with PYTHIA. It should be noted that simulated multijet events are only used for the background estimate when QCD multijet production is a secondary background. In the case of the fully hadronic analysis, the multijet background is estimated from a CR in data, as described in Section 7.3. For all simulated events, PYTHIA with the CUETP8M1 tune [68] is used to describe the fragmentation and hadronization. All the samples are generated with the NNPDF 3.0 parton distribution functions (PDFs) [69]. All sample cross sections are normalized to the latest theoretical calculations, usually at next-to-NLO precision [70][71][72][73].
All samples are processed through a GEANT4-based simulation [74], which models the propagation of the particles through the CMS apparatus and the corresponding detector response. For all samples, the pileup distributions are weighted to have an average of 23 pileup interactions per event, as measured in data. The same event reconstruction software is used for data and simulated events. Differences of a few percent in the resolution and reconstruction efficiency are corrected to match those measured in data using dedicated samples from data [75].

Dilepton channel
To suppress contributions from low-mass resonances and Z/γ*(→ ℓℓ)+jets production in events with same-flavor lepton pairs, the dilepton invariant mass is required to be above 20 GeV and outside of the Z boson mass window of 76 to 106 GeV. Contamination from QCD multijet background is reduced by applying a two-dimensional (2D) selection for both leptons: $\Delta R_\mathrm{min}(\ell, j) > 0.4$ or $p_\mathrm{T,rel}(\ell, j) > 15$ GeV, where $\Delta R_\mathrm{min}(\ell, j)$ is the minimum $\Delta R$ distance between the lepton candidate and any AK4 jet with $p_\mathrm{T} > 15$ GeV and |η| < 3, and $p_\mathrm{T,rel}(\ell, j)$ is the $p_\mathrm{T}$ of the lepton with respect to the axis of the $\Delta R$-nearest AK4 jet. The 2D selection reduces the QCD multijet background by a factor of ≈100. Events are further required to contain at least two AK4 jets with |η| < 2.4 and $p_\mathrm{T} > 100$ and 50 GeV for the leading and subleading jets, respectively. At least one of the two leading jets must be b tagged according to the loose CSV operating point. Finally, $p_\mathrm{T}^\mathrm{miss}$ is required to be larger than 30 GeV. The resulting sample is dominated by the irreducible $\mathrm{t\bar{t}}$ background, which amounts to >90% of the total background. Figure 2 shows the distributions of $\Delta R_\mathrm{sum} = \Delta R(\ell_1, j) + \Delta R(\ell_2, j)$ in the µµ, ee, and eµ subchannels, where $\Delta R(\ell_1, j)$ and $\Delta R(\ell_2, j)$ are the $\Delta R$ variables between the leading and subleading lepton, respectively, and the nearest jet.
The lepton-jet pairs from Z' boson decays are expected to be collimated and to populate the low-$\Delta R_\mathrm{sum}$ region. The $\Delta R_\mathrm{sum}$ variable is used to separate events into signal- and background-enriched samples: $\Delta R_\mathrm{sum} < 1$ and $1 < \Delta R_\mathrm{sum} < 2$ define the boosted and nonboosted signal regions (SRs), respectively, whereas $\Delta R_\mathrm{sum} > 2$ defines the background-enriched region. The shape and normalization are in agreement between data and simulation at low $\Delta R_\mathrm{sum}$, which is the region of interest for separating boosted and resolved events.
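The lepton 2D selection and the $\Delta R_\mathrm{sum}$ categorization can be illustrated with a short sketch. The function names are hypothetical, momenta are plain three-vectors in GeV, and the treatment of events lying exactly on a boundary (1 or 2) is an assumption, since the text only gives strict inequalities.

```python
import math

def pt_rel(lepton_p3, jet_p3):
    """Lepton momentum transverse to the jet axis: |p_lep x p_jet| / |p_jet|."""
    lx, ly, lz = lepton_p3
    jx, jy, jz = jet_p3
    cross = (ly * jz - lz * jy, lz * jx - lx * jz, lx * jy - ly * jx)
    return math.sqrt(sum(c * c for c in cross)) / math.sqrt(jx * jx + jy * jy + jz * jz)

def passes_2d(delta_r_min, pt_rel_value, dr_cut=0.4, ptrel_cut=15.0):
    """2D selection: the lepton is either well separated or has large pT,rel."""
    return delta_r_min > dr_cut or pt_rel_value > ptrel_cut

def dilepton_region(delta_r_sum):
    """Region assignment from Delta R_sum (boundary handling is an assumption)."""
    if delta_r_sum < 1.0:
        return "boosted SR"
    if delta_r_sum < 2.0:
        return "nonboosted SR"
    return "background-enriched CR"

# A lepton perpendicular to the jet axis keeps its full momentum as pT,rel.
example_pt_rel = pt_rel((0.0, 10.0, 0.0), (50.0, 0.0, 0.0))
```

The default `ptrel_cut` of 15 GeV is the dilepton value; the single-lepton channel uses the same logic with a threshold of 25 GeV.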

Single-lepton channel
The selection for events used in the single-lepton analysis requires the presence of a muon with $p_\mathrm{T} > 55$ GeV and |η| < 2.4 or an electron with $p_\mathrm{T} > 80$ GeV and |η| < 2.5. Neither lepton has an isolation requirement other than passing the lepton 2D selection, which requires $\Delta R_\mathrm{min}(\ell, j) > 0.4$ or $p_\mathrm{T,rel}(\ell, j) > 25$ GeV, where both quantities are calculated with respect to all AK4 jets with $p_\mathrm{T} > 15$ GeV. Events with a second lepton are removed from the sample to avoid any overlap with the dilepton channel. Events are also required to contain at least two AK4 jets with |η| < 2.4 and a minimum $p_\mathrm{T}$ of 150 (185) GeV for the leading jet in the muon (electron) channel, and 50 GeV for the subleading jet. To reduce the contributions to the sample from QCD multijet events, additional requirements are imposed. In the muon channel, $p_\mathrm{T}^\mathrm{miss}$ and $H_\mathrm{T}^\mathrm{lep}$ are required to be greater than 50 and 150 GeV, respectively, where $H_\mathrm{T}^\mathrm{lep} \equiv p_\mathrm{T}^\mathrm{miss} + p_\mathrm{T}^{\ell}$. In the electron channel, it is required that $p_\mathrm{T}^\mathrm{miss} > 120$ GeV. The electron channel has a higher $p_\mathrm{T}^\mathrm{miss}$ requirement because of the larger QCD multijet background. As a result of this requirement, an additional selection on $H_\mathrm{T}^\mathrm{lep}$ would not increase performance. In order to suppress the contamination from W+jets events, a boosted decision tree [76] (W+jets BDT) was trained using the TMVA software package [77] on the jet-related variables listed below, in order of importance.
1. $\Delta R_\mathrm{min}(\ell, j)$, i.e., the separation between the lepton and its closest jet.
2. The CSV score of the subleading and leading AK4 jets.
3. The number of jets.
4. $p_\mathrm{T,rel}(\ell, j)$, i.e., the relative momentum between the jet and nearby lepton.
5. The reconstructed mass of the leading AK4 jet.
6. $\Delta R_\mathrm{min}(\ell, j)\, p_\mathrm{T}(j)$, i.e., the $\Delta R$ separation between the jet and nearby lepton scaled by the $p_\mathrm{T}$ of the jet.
7. The reconstructed mass of the subleading AK4 jet.
8. The shape variable $S^{33}$ of the sphericity tensor $S^{\alpha\beta} = \sum_i p_i^{\alpha} p_i^{\beta} / \sum_i |\vec{p}_i|^2$, where α, β correspond to the x, y, and z components of the momentum vectors of the jets [78,79].
9. $H_\mathrm{T} + H_\mathrm{T}^\mathrm{lep}$, i.e., the sum of the hadronic and leptonic transverse momenta and $p_\mathrm{T}^\mathrm{miss}$ in the event.

Figure 3 shows the W+jets BDT distribution in the muon and electron channels. The requirement W+jets BDT ≥ 0.5 is applied to the events in the SR, which is further separated into two regions, depending on the presence of a t-tagged AK8 jet with $p_\mathrm{T} > 400$ GeV and rapidity |y| < 2.4. Events with no t-tagged AK8 jet and W+jets BDT < −0.75 or 0 < W+jets BDT < 0.5 are dominated by W+jets and $\mathrm{t\bar{t}}$ events, respectively, and constitute the background-enriched CRs.

In Fig. 3, the SR is defined as events with W+jets BDT ≥ 0.5. The contribution expected from a 4 TeV Z' boson with a relative width of 1% is shown normalized to a cross section of 10 pb. The hatched band on the simulation represents the statistical and systematic uncertainties. The lower panel in each plot shows the ratio of data to the SM background prediction, and the light (dark) gray band represents the statistical (systematic) uncertainty. The error bars on the data points indicate the Poisson statistical uncertainty.
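The region assignment based on the W+jets BDT score and the t tag can be summarized in a small function. This is a sketch only: the region labels are descriptive placeholders rather than the paper's names, and events that fall into none of the regions described in the text return `None`.

```python
def single_lepton_region(bdt, has_top_tag):
    """Assign a single-lepton event to a signal or control region.

    bdt          -- W+jets BDT score
    has_top_tag  -- True if the event contains a t-tagged AK8 jet
    """
    if bdt >= 0.5:
        return "SR with t tag" if has_top_tag else "SR without t tag"
    if not has_top_tag:
        if bdt < -0.75:
            return "CR, W+jets enriched"
        if 0.0 < bdt < 0.5:
            return "CR, tt enriched"
    return None  # not used in any region described in the text

# Illustrative scores (not taken from the analysis).
examples = [single_lepton_region(0.8, True), single_lepton_region(-0.9, False)]
```

The $\chi^2 < 30$ requirement applied to all regions is deliberately left out of this function so that the BDT logic stays visible on its own.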
The $\mathrm{t\bar{t}}$ system is reconstructed by assigning the four-vectors of the reconstructed final-state objects (charged lepton, $p_\mathrm{T}^\mathrm{miss}$, and jets) to the leptonic or hadronic legs of the $\mathrm{t\bar{t}}$ decay. For events without an AK8 jet, several hypotheses are built based on possible assignments of each AK4 jet to either the leptonic t decay, the hadronic t decay, or neither. For events with an AK8 jet, that jet is associated with the hadronic t decay, and the leptonic t decay hypotheses only consider AK4 jets that are separated from the AK8 jet by $\Delta R > 1.2$. In both cases, the combination chosen is the one that minimizes the $\chi^2$ discriminator, where

$$\chi^2 = \chi^2_\mathrm{lep} + \chi^2_\mathrm{had} = \left[\frac{m_\mathrm{lep} - \overline{m}_\mathrm{lep}}{\sigma_{m_\mathrm{lep}}}\right]^2 + \left[\frac{m_\mathrm{had} - \overline{m}_\mathrm{had}}{\sigma_{m_\mathrm{had}}}\right]^2.$$

In this equation, $m_\mathrm{lep}$ and $m_\mathrm{had}$ are the invariant masses of the reconstructed leptonically and hadronically decaying top quarks, respectively. The parameters $\overline{m}_\mathrm{lep}$, $\sigma_{m_\mathrm{lep}}$, $\overline{m}_\mathrm{had}$, and $\sigma_{m_\mathrm{had}}$ in the $\chi^2$ discriminator are determined from simulation by matching reconstructed final-state objects of the hypothesis to the corresponding generator-level particles from the $\mathrm{t\bar{t}}$ decay. Events in signal- and background-enriched regions are all required to have $\chi^2 < 30$. Events with two t-tagged AK8 jets are removed from the sample in order to avoid any overlap with the fully hadronic channel.
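Choosing the best hypothesis amounts to minimizing a two-term $\chi^2$ over the candidate jet assignments. In the sketch below, the mean and resolution values in `PARAMS` are placeholders chosen for illustration; the analysis derives them from simulation and the text does not quote them.

```python
def chi2(m_lep, m_had, params):
    """Sum of squared pulls of the leptonic and hadronic top quark masses."""
    pull_lep = (m_lep - params["mean_lep"]) / params["sigma_lep"]
    pull_had = (m_had - params["mean_had"]) / params["sigma_had"]
    return pull_lep ** 2 + pull_had ** 2

def best_hypothesis(hypotheses, params):
    """Return (hypothesis, chi2) for the assignment with the smallest chi2."""
    best = min(hypotheses, key=lambda h: chi2(h["m_lep"], h["m_had"], params))
    return best, chi2(best["m_lep"], best["m_had"], params)

# Placeholder parameters in GeV (hypothetical, NOT the values used in the analysis).
PARAMS = {"mean_lep": 175.0, "sigma_lep": 20.0, "mean_had": 180.0, "sigma_had": 15.0}

# Two toy jet assignments, summarized by their reconstructed top quark masses.
hypotheses = [
    {"m_lep": 172.0, "m_had": 185.0},
    {"m_lep": 250.0, "m_had": 120.0},
]
chosen, chosen_chi2 = best_hypothesis(hypotheses, PARAMS)
event_is_kept = chosen_chi2 < 30.0
```

Building the list of hypotheses from the jets (including the $\Delta R > 1.2$ separation from the AK8 jet) is not shown; only the selection step is sketched.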

Fully hadronic channel
All events used in the fully hadronic analysis are required to fulfill the following kinematic and t tagging criteria. In order to reach a trigger efficiency of ≈100%, each event must have H T > 950 GeV. Events are reconstructed using the two p T -leading AK8 jets, both of which are required to have p T > 400 GeV and |y| < 2.4. In order to ensure a back-to-back topology, the two jets must have an azimuthal separation |∆φ| > 2.1. These kinematic requirements are later referred to as the fully hadronic preselection. Both AK8 jets are required to be t tagged for events to enter the SR. These events are then separated into six SRs based on two criteria: the rapidity difference between the two jets (|∆y| < 1.0 or |∆y| > 1.0) and the number of jets with a b-tagged subjet (0, 1, or 2).
The categories with a greater number of jets with a b-tagged subjet are expected to provide higher sensitivity, while those with fewer b-tagged subjets are included to provide better constraints on the backgrounds and additional sensitivity to the analysis. The low-|∆y| region is expected to be more sensitive than the high-|∆y| region. At high values of m tt , QCD multijet events will have jets with greater y separation, as compared to those from a massive particle decay, in order to achieve such high invariant masses. This is illustrated in Fig. 4, which shows the dijet rapidity difference for events passing the fully hadronic event selection. The plot on the left is inclusive in m tt , while the plot on the right shows events with m tt > 2 TeV.
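The preselection and the split into six signal regions can be sketched as a single function. The event layout (a list of two jet dictionaries) is hypothetical, and assigning events with $|\Delta y|$ exactly equal to 1.0 to the high-$|\Delta y|$ region is an assumption, since the text leaves the boundary unspecified.

```python
import math

def hadronic_category(jets, ht):
    """Return (|dy| region, number of jets with a b-tagged subjet) or None.

    jets -- the two pT-leading AK8 jets, each a dict with keys
            'pt', 'y', 'phi', 'top_tagged', 'has_b_subjet'
    ht   -- scalar HT of the event in GeV
    """
    j1, j2 = jets
    if ht <= 950.0:
        return None
    if any(j["pt"] <= 400.0 or abs(j["y"]) >= 2.4 for j in jets):
        return None
    dphi = abs(math.remainder(j1["phi"] - j2["phi"], 2.0 * math.pi))
    if dphi <= 2.1:
        return None  # not back-to-back
    if not (j1["top_tagged"] and j2["top_tagged"]):
        return None  # both jets must be t tagged to enter a signal region
    dy_region = "low" if abs(j1["y"] - j2["y"]) < 1.0 else "high"
    n_b = sum(1 for j in jets if j["has_b_subjet"])
    return dy_region, n_b

# Toy event (values are illustrative only).
toy_jets = [
    {"pt": 500.0, "y": 0.2, "phi": 0.1, "top_tagged": True, "has_b_subjet": True},
    {"pt": 450.0, "y": -0.5, "phi": 3.0, "top_tagged": True, "has_b_subjet": True},
]
category = hadronic_category(toy_jets, ht=1000.0)
```

The two $|\Delta y|$ regions times three b-tag multiplicities give the six signal regions mentioned in the text.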

Estimation of the background
Dilepton channel
The dominant irreducible background in the dilepton channel is $\mathrm{t\bar{t}}$ production. Other secondary backgrounds arise from Z+jets, single top quark, and diboson processes. Simulated events are used to model the shape of the kinematic distributions for the background processes, including the $S_\mathrm{T}$ variable used in the statistical interpretation of the observations. The overall normalization of the background processes is based on the corresponding theoretical cross sections. The distributions are allowed to vary within prior bounds of rate and shape uncertainties during the statistical treatment, which employs six signal- and three background-enriched regions, defined in Section 6.1. Modeling of the background is separately checked in the background-enriched CR obtained with the requirement $\Delta R_\mathrm{sum} > 2$. Figure 5 shows the distributions of $S_\mathrm{T}$ in the CR for the µµ, ee, and eµ channels. The background simulation is in agreement with data within the statistical and systematic uncertainties. The quantity 'pull', shown in Fig. 5 and subsequent figures, is computed according to the following procedure. First, the total uncertainty per bin is determined by adding the statistical and all systematic uncertainties together in quadrature. Based on the expected number of events and the total uncertainty in each bin, pseudo-experiments are performed by sampling from a Gaussian distribution with the mean equal to the expected number of events and the standard deviation equal to the total uncertainty. For each pseudo-experiment, a distribution of the number of expected events is populated using Poisson statistics convolved with the Gaussian distribution describing the variation in the expected number of events in the bin. Finally, the number of events observed in data is used in conjunction with the distribution of pseudo-experiments to calculate a p-value, and the corresponding z-score is taken to be the pull.
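The pull procedure can be illustrated with a toy Monte Carlo for a single bin. This is a sketch under several assumptions that the text does not specify: the p-value is one-sided with ties split evenly (a mid-p convention), negative Gaussian draws are truncated at zero, and the p-value is clamped so that the z-score stays finite for a limited number of pseudo-experiments.

```python
import random
from statistics import NormalDist

def poisson(lam, rng):
    """Poisson draw: count unit-rate exponential arrivals within [0, lam]."""
    n, t = 0, rng.expovariate(1.0)
    while t < lam:
        n += 1
        t += rng.expovariate(1.0)
    return n

def pull(observed, expected, total_unc, n_toys=2000, seed=1):
    """z-score of the observed count against Gaussian-smeared Poisson toys."""
    rng = random.Random(seed)
    above = equal = 0
    for _ in range(n_toys):
        lam = max(0.0, rng.gauss(expected, total_unc))  # smear the expectation
        n = poisson(lam, rng)                           # fluctuate the count
        above += n > observed
        equal += n == observed
    p = (above + 0.5 * equal) / n_toys                  # mid-p convention (assumption)
    p = min(max(p, 0.5 / n_toys), 1.0 - 0.5 / n_toys)   # keep the z-score finite
    return NormalDist().inv_cdf(1.0 - p)

# Expected 100 +- 10 events in a bin (illustrative numbers).
pull_on_target = pull(100, 100.0, 10.0)
pull_excess = pull(150, 100.0, 10.0)
pull_deficit = pull(60, 100.0, 10.0)
```

With this sign convention an excess in data gives a positive pull and a deficit a negative one.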

Single-lepton channel
Standard model $\mathrm{t\bar{t}}$ production is the main irreducible background in the single-lepton channel. Other background processes include W+jets, single top quark, Z+jets, and diboson production. The QCD multijet background is a minor contribution in the single-muon channel (≈3%), and is suppressed to a negligible level in the single-electron channel because of the higher $p_\mathrm{T}$ and $p_\mathrm{T}^\mathrm{miss}$ requirements. All background processes in the single-lepton channel are modeled from simulated events, and the normalization of each background is based on its theoretical cross section. The rate and shape of the backgrounds are allowed to vary in the statistical analysis as described in Section 9. Events that pass the requirements in Section 6.2 are separated into two signal- and two background-enriched regions, defined as follows.
2. Signal Region (SR0T): $\chi^2 < 30$, W+jets BDT ≥ 0.5, no t-tagged AK8 jet.
[...] in eight exclusive categories used in the binned maximum likelihood fit. The rate at which light-flavor quarks and gluons are misidentified as originating from top quarks (t mistag) is measured in data and simulation using a W+jets mistag CR with $\chi^2_\mathrm{lep} > 30$ and W+jets BDT < −0.5. The $p_\mathrm{T}$ and $m_\mathrm{SD}$ distributions in the W+jets background can be seen in Fig. 6.

Fully hadronic channel
The two main sources of background in the fully hadronic channel are QCD multijet and tt production. For the latter background, simulated events are used to model the shape of the m tt distribution. This distribution is initially normalized to the theoretical cross section, but it is allowed to vary within the bounds of rate and shape uncertainties during the statistical treatment. The final normalization and shape are determined by fitting the distributions in the six SRs, defined in Section 6.3.
The QCD multijet background is estimated from data, using a method similar to the techniques described in Ref. [30]. The preselection described in Section 6.3 is enforced in order to select a back-to-back dijet event topology. In the first step of the background estimate, the t mistag rate in QCD multijet events is measured. A QCD multijet enriched region is selected by requiring one of the two jets to be "anti-tagged," meaning it has a PUPPI soft drop mass in the t-tag mass window $105 < m_\mathrm{SD} < 210$ GeV, but the N-subjettiness requirement is inverted to $\tau_{32} > 0.65$. The opposite "probe" jet is used to determine the t mistag rate. This rate is parametrized as a function of probe jet momentum (p) and is measured for each of the three subjet b-tag categories (Fig. 7). This "anti-tag and probe" procedure is repeated for the $\mathrm{t\bar{t}}$ simulation, indicating that there is a small (≈2%) contribution from SM $\mathrm{t\bar{t}}$ events. The observed $\mathrm{t\bar{t}}$ contamination is then subtracted from the anti-tag and probe data selection.

After the t mistag rate has been measured in the QCD multijet CR, it is used to estimate the $m_\mathrm{t\bar{t}}$ QCD multijet distribution in the SR. First, a "single-tagged" region is selected, in which at least one of the two jets is required to be t tagged, meaning it has a PUPPI $m_\mathrm{SD}$ in the t-tag mass window $105 < m_\mathrm{SD} < 210$ GeV and satisfies the N-subjettiness requirement $\tau_{32} < 0.65$. One of the two top quark jet candidates is randomly selected, in order to avoid bias. If the selected jet is t tagged, the event is included in the QCD multijet estimate. The event is weighted by the previously measured t mistag rate, based on the momentum of the opposite jet and the number of subjet b tags in the event. Again, the procedure is repeated for the $\mathrm{t\bar{t}}$ simulation, and the $\mathrm{t\bar{t}}$ contamination is subtracted from the QCD multijet background estimate. This eliminates double counting between the $\mathrm{t\bar{t}}$ and QCD multijet distributions.
Finally, a "mass-modified" procedure is employed in order to ensure that the jets used in the QCD multijet estimate mimic the relevant kinematics of the jets in the SR. If the mass of the second jet in the QCD multijet event is not in the top quark mass window, it is assigned a random value within that window. This modified mass is randomly selected from the distribution of simulated light-flavor jets with masses within the t-tag window, $105 < m_\mathrm{SD} < 210$ GeV. A closure test of the entire background estimation method, performed on simulated QCD multijet events, shows that the method is self-consistent.
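The weighting step of the data-driven QCD multijet estimate, including the mass modification, can be sketched as below. The event layout, the toy mistag-rate function, and the list of light-flavor jet masses are hypothetical stand-ins; the $\mathrm{t\bar{t}}$ subtraction and the computation of $m_\mathrm{t\bar{t}}$ from the modified jet are omitted.

```python
import random

T_MASS_WINDOW = (105.0, 210.0)  # GeV, t-tag soft drop mass window

def in_window(mass):
    return T_MASS_WINDOW[0] < mass < T_MASS_WINDOW[1]

def qcd_entries(events, mistag_rate, light_jet_masses, rng):
    """Return (weight, mass of the opposite jet) for each event entering the estimate.

    events           -- dicts with 'jets' (two dicts: 'p', 'm_sd', 'top_tagged') and 'n_btag'
    mistag_rate      -- function (momentum, n_btag) -> measured t mistag rate
    light_jet_masses -- simulated light-flavor jet masses inside the t-tag window
    """
    entries = []
    for event in events:
        i = rng.randrange(2)                   # pick one jet at random to avoid bias
        picked, other = event["jets"][i], event["jets"][1 - i]
        if not picked["top_tagged"]:
            continue                           # only events whose picked jet is tagged
        weight = mistag_rate(other["p"], event["n_btag"])
        mass = other["m_sd"]
        if not in_window(mass):
            mass = rng.choice(light_jet_masses)  # "mass-modified" procedure
        entries.append((weight, mass))
    return entries

# Toy inputs (all numbers are illustrative, not measured values).
toy_rate = lambda p, n_btag: 0.05 if n_btag == 0 else 0.02
toy_masses = [120.0, 150.0, 175.0, 190.0]
toy_events = [
    {"n_btag": 0, "jets": [{"p": 1000.0, "m_sd": 170.0, "top_tagged": True},
                           {"p": 1000.0, "m_sd": 180.0, "top_tagged": True}]},
    {"n_btag": 1, "jets": [{"p": 900.0, "m_sd": 165.0, "top_tagged": True},
                           {"p": 950.0, "m_sd": 60.0, "top_tagged": False}]},
]
entries = qcd_entries(toy_events, toy_rate, toy_masses, random.Random(7))
```

In the second toy event only one jet is tagged, so the event enters the estimate in roughly half of the random draws, and when it does the out-of-window mass of the opposite jet is replaced.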

Systematic uncertainties
Several sources of uncertainty that impact the final results of this search are considered. In all cases, the uncertainties in reconstruction efficiency and event interpretation are propagated to the distribution used for signal extraction. These uncertainties can be broadly grouped into two categories: those uncertainties that affect only the overall normalization of expected background events and those uncertainties that can result in a different reconstruction of the tt system, and therefore change the shape of the m tt distribution. Each source of systematic uncertainty is accounted for through unique nuisance parameters applied to the likelihood described in Section 10. For contributions that apply to multiple analysis channels, the nuisance parameters are fully correlated, allowing better constraints to be placed on sources of systematic uncertainties. The individual sources of uncertainty are described in detail below, and are summarized in Table 1.
Including all the systematic uncertainties degrades the final cross section limits by 10% for resonance masses above 2.5 TeV. Lower mass hypotheses are more sensitive to the systematic effects; the limit on the cross section degrades by up to 60% for the lowest mass Z$'$ resonance considered (500 GeV). The uncertainties in the jet energy corrections, pileup distribution, and $\mathrm{t\bar{t}}$ cross section are the most significant. They result in a reduction of the excluded mass by 1.1, 1.0, and 1.0%, respectively. All other systematic uncertainties have less than a 1% effect. Per channel, the most significant systematic uncertainties are the b tagging scale factor, the $\mathrm{t\bar{t}}$ renormalization and factorization scales, and the standard model $\mathrm{t\bar{t}}$ cross section for the dilepton, single-lepton, and fully hadronic channels, respectively. The most constrained nuisance parameters are those associated with the $\mathrm{t\bar{t}}$ renormalization and factorization scales and with the top tagging efficiency, which are constrained to 8.5 and 9.2% of their prior uncertainty, respectively. The average nuisance parameter has a post-fit uncertainty that is 75% lower than its prior estimate.

1. Standard model cross sections: Uncertainties in the cross sections used to normalize simulated background processes are obtained using the fitting procedure described in Section 1. For the $\mathrm{t\bar{t}}$, W+jets, and Z+jets backgrounds, a priori uncertainties of 20, 25, and 50% are assigned, respectively. A cross section uncertainty of 50% is used for the subdominant diboson and single top quark backgrounds. The values chosen reflect the relatively large uncertainties associated with modeling these backgrounds in the Lorentz-boosted phase space where the analysis is performed.

2. Integrated luminosity: The uncertainty in the measurement of the integrated luminosity is 2.5% [80], and is applied to all simulated signal and background samples.
4. Lepton reconstruction and triggers: Simulated events are corrected by scale factors to account for differences between data and simulation in the efficiencies of the identification criteria for muons and electrons. By applying the scale factors shifted up or down by their uncertainties, new templates are obtained that correspond to these uncertainties. These templates define the corresponding nuisance parameters, which are correlated between channels because identical identification criteria are used. The scale factors are parametrized as functions of lepton $p_\mathrm{T}$ and $\eta$ to account for differences in detector response. Uncertainties in the efficiencies of the muon and electron trigger selections used in this analysis are accounted for in the same way.
5. Jet energy scale and resolution: Uncertainties in the energy corrections applied to jets are propagated to the final discriminating distributions by reconstructing events with the jet level corrections shifted within their corresponding uncertainties, which depend on the jet $p_\mathrm{T}$ and $\eta$.
6. Jet b tagging: Simulated events are corrected with scale factors to account for differences in the efficiency for identifying a b jet between data and simulation. There are two components to this process, each with an independent, uncorrelated nuisance parameter: one that accounts for the scale factor applied to the rate of identifying b-tagged jets (efficiency) and one that accounts for the scale factor applied to the rate of mistakenly identifying light-flavor jets as b jets (b mistag rate). In each case, the uncertainty is obtained by shifting these $p_\mathrm{T}$-dependent scale factors within their uncertainties. The b tagging uncertainties are fully correlated between the dilepton and fully hadronic analyses, as they use the same b tagging criteria.
7. CSV discriminant shape: The CSV tagger provides a continuous variable that can be used to identify b jets. This continuous variable is used as an input to the W+jets BDT described above. The W+jets BDT is only used in the single-lepton analysis, therefore the CSV shape systematic uncertainty only applies to that analysis. Several sources of systematic uncertainties are evaluated, including jet energy scale, flavor effects, and statistical effects. Each of these effects contributes an additional uncertainty in the CSV value that is propagated to the final signal discrimination process.
8. Jet t tagging: It is not possible to define a CR that is capable of measuring the t tagging scale factor without overlapping the $\mathrm{t\bar{t}}$ SR. The t tagging efficiency scale factor is therefore determined during the statistical analysis. This is done by including a nuisance parameter with a flat prior distribution that is unconstrained and correlated between the fully hadronic and single-lepton channels. Sources of misidentified t-tagged jets are different in the single-lepton channel, where they originate from W+jets processes, and in the fully hadronic channel, where they originate from QCD multijet processes. Therefore, the nuisance parameters corresponding to the uncertainty in the t mistag rate are treated as uncorrelated between the channels, and are also uncorrelated with the nuisance parameter assigned to the t tagging efficiency.

9. Parton distribution functions: For the $\mathrm{t\bar{t}}$ simulated sample, the PDFs from the NNPDF3.0 set [69] are used to evaluate the systematic uncertainty in the choice of PDF, according to the procedure described in Ref. [83].
10. Scale uncertainties: For the $\mathrm{t\bar{t}}$ sample, the matrix element renormalization and factorization scales were varied up and down independently by a factor of 2 to account for uncertainties in the choice of $Q^2$ used to generate the simulated sample.
11. Top quark $p_\mathrm{T}$ reweighting: The simulated SM $\mathrm{t\bar{t}}$ process was corrected at parton-level using a function derived from the ratio of top quark $p_\mathrm{T}$ measured in data and next-to-NLO predictions from POWHEG and PYTHIA [84]. The uncertainty in this process is estimated by taking the difference between the unweighted and weighted results applied symmetrically to the nominal value as a function of $p_\mathrm{T}$. The top quark $p_\mathrm{T}$ reweighting does not significantly impact the $m_{\mathrm{t\bar{t}}}$ and $S_\mathrm{T}$ distributions, and would not obscure a resonance signal.
12. QCD multijet background estimation: The "mass-modified" procedure described above, which predicts the shape of the background in the fully hadronic channel, carries an uncertainty in the resulting distribution equal to half of the difference between the uncorrected and "mass-modified" background shapes. This difference affects both the shape and normalization of the final distributions, and the corresponding nuisance parameter is independent of all other effects. The uncertainties in the t mistag rates are propagated to the final distributions, and the corresponding uncertainty is handled via the t mistag rate nuisance parameter described above. A closure test is performed with simulated QCD multijet events to test the accuracy of the method. An additional systematic uncertainty is included, equal to the magnitude of the discrepancy observed in the closure test results, evaluated and applied on a bin-by-bin basis to the fully hadronic signal categories. This uncertainty has the largest effect on the two b-tag, high-$|\Delta y|$ category, for which the method only closes within 20%. For the other categories, the method closes within ≈4%.
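As an illustration of how the two QCD multijet uncertainties in item 12 could be turned into per-bin quantities, the following Python sketch builds shifted templates from half of the difference between two histograms and computes a bin-by-bin relative non-closure. The function names and the clipping of negative bin contents at zero are assumptions made for this example and are not taken from the analysis.

```python
def half_difference_templates(nominal, uncorrected):
    """Up/down templates shifted by half of (uncorrected - nominal) in each bin.

    nominal: "mass-modified" QCD prediction per bin.
    uncorrected: prediction per bin without the mass modification.
    Negative bin contents are clipped at zero (an assumption of this sketch).
    """
    up, down = [], []
    for n, u in zip(nominal, uncorrected):
        shift = 0.5 * (u - n)
        up.append(max(n + shift, 0.0))
        down.append(max(n - shift, 0.0))
    return up, down

def nonclosure(predicted, truth):
    """Relative discrepancy |predicted - truth| / truth in each bin.

    predicted: data-driven method applied to simulated QCD multijet events.
    truth: the same simulation selected directly in the signal region.
    Bins with no true events are assigned zero.
    """
    return [abs(p - t) / t if t > 0 else 0.0 for p, t in zip(predicted, truth)]
```

With this definition, a category in which the method "closes within 20%" is one where the values returned by `nonclosure` do not exceed 0.20.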

Statistical analysis
Before extracting the final results of the analysis, a background-only binned maximum likelihood fit is performed on the signal and control regions to determine the preferred values of the background process normalizations and shapes, using constraints from the sources of systematic uncertainty described above. Each source of systematic uncertainty is included through a unique nuisance parameter that is allowed to vary within the rate and shape constraints described above, using a log-normal prior distribution. The post-fit values of the nuisance parameters are used to correct the normalization and shape of each background process. The $m_{\mathrm{t\bar{t}}}$ and $S_\mathrm{T}$ distributions after the fitting procedure are shown in Figs. 8, 9-10, and 11, for the dilepton, single-lepton, and fully hadronic channels, respectively. The mild deficits at low $m_{\mathrm{t\bar{t}}}$ in the two plots on the left in Fig. 10 do not significantly impact the limit, because this region is used to evaluate the $\mathrm{t\bar{t}}$ and W+jets cross sections and is not sensitive to the resonance signal. The t tagging efficiency is measured simultaneously in signal and control regions during the maximum likelihood fit, as it is not possible to select a CR that is free of contamination from a potential signal. The t tagging efficiency scale factor is modeled as a free nuisance parameter, with an unconstrained prior, in the binned likelihood fit. The t tagging efficiency scale factor measured by the fit is 1.001 ± 0.012.
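The structure of such a fit can be sketched with a toy negative log-likelihood in Python. This is a simplified illustration, not the fit used in the analysis: it treats only rate (normalization) nuisance parameters, models each log-normal prior as a unit Gaussian on a parameter $\theta$ that scales a process by $\kappa^{\theta}$, and uses hypothetical process names and data structures. Shape uncertainties, which require interpolation between templates, are not included.

```python
import math

def expected_yields(theta, templates, kappas):
    """Total expected background per bin for nuisance values theta.

    templates: {process: [yield per bin]}
    kappas: {process: [kappa per nuisance]}, where kappa = 1 + relative
            uncertainty and kappa = 1.0 means the nuisance does not act
            on that process.
    """
    n_bins = len(next(iter(templates.values())))
    total = [0.0] * n_bins
    for proc, hist in templates.items():
        scale = 1.0
        for kappa, t in zip(kappas[proc], theta):
            scale *= kappa ** t  # log-normal rate variation
        for i, y in enumerate(hist):
            total[i] += scale * y
    return total

def nll(theta, data, templates, kappas):
    """Binned Poisson negative log-likelihood plus Gaussian constraints on theta."""
    expected = expected_yields(theta, templates, kappas)
    # Poisson term, dropping the data-only constant log(n!).
    poisson = sum(e - n * math.log(e) for e, n in zip(expected, data))
    constraint = 0.5 * sum(t * t for t in theta)
    return poisson + constraint
```

A background-only fit would minimize `nll` over `theta` with any numerical minimizer. The unconstrained t tagging scale factor described above would enter as an extra multiplicative parameter with no constraint term.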
Data are found to be in agreement with expectations in each of the categories considered in this analysis. Limits are calculated on the product of the production cross section and branching fraction, $\sigma(\mathrm{pp} \to X)\,\mathcal{B}(X \to \mathrm{t\bar{t}})$, for heavy resonances decaying to a pair of top quarks. A shape-based analysis is performed using both the signal and control regions from the three exclusive analysis channels. The THETA software package [85] simultaneously fits the $m_{\mathrm{t\bar{t}}}$ distributions from the single-lepton and fully hadronic channels and the $S_\mathrm{T}$ distributions from the dilepton channel. For the limit calculation, a Bayesian likelihood-based method is used [86,87] with each bin of the distributions combined statistically, along with the implementation of unique nuisance parameters that correspond to the systematic uncertainties described in Section 8. The signal normalization is allowed to vary with a distinct unconstrained nuisance parameter having a uniform prior, while the other nuisance parameters have log-normal prior distributions. Finally, to account for the limited number of simulated events, an additional statistical uncertainty is included for each process relying on simulated events through the "Barlow-Beeston lite" method [88]. Prior to the statistical analysis, the $m_{\mathrm{t\bar{t}}}$ distributions are rebinned. For the fully hadronic and dilepton channels, the total statistical uncertainty in the background is required to be below 30% in any given bin. In the single-lepton channel, the total statistical uncertainty in the background expectation for the sum of small backgrounds (single top quark, multijet, Z+jets, W+b or c jets) is required to be below 10% in each bin. The tighter statistical uncertainty requirement is needed for these backgrounds because the events are rejected at a high rate, resulting in significantly fewer simulated events that pass the final selection.
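The rebinning requirement can be illustrated with a short Python sketch. The text does not specify the merging algorithm, so the greedy scheme below (merge adjacent bins from low to high mass until the relative statistical uncertainty falls below the threshold, then fold any leftover bins into the last merged bin) is an assumption, as are the function name and inputs.

```python
import math

def rebin(yields, variances, edges, max_rel_unc=0.30):
    """Merge adjacent bins until each has sqrt(variance)/yield < max_rel_unc.

    yields: background yield per bin.
    variances: statistical variance per bin (sum of squared event weights
               for weighted simulation).
    edges: bin edges, with len(edges) == len(yields) + 1.
    Returns (new_yields, new_variances, new_edges).
    """
    new_y, new_v, new_edges = [], [], [edges[0]]
    acc_y = acc_v = 0.0
    pending = False  # True while bins have been accumulated but not yet emitted
    for i, (y, v) in enumerate(zip(yields, variances)):
        acc_y += y
        acc_v += v
        pending = True
        if acc_y > 0.0 and math.sqrt(acc_v) / acc_y < max_rel_unc:
            new_y.append(acc_y)
            new_v.append(acc_v)
            new_edges.append(edges[i + 1])
            acc_y = acc_v = 0.0
            pending = False
    if pending:
        if new_y:
            # Fold the leftover high-mass bins into the last accepted bin.
            new_y[-1] += acc_y
            new_v[-1] += acc_v
            new_edges[-1] = edges[-1]
        else:
            # No bin ever passed the requirement: return one merged bin.
            new_y.append(acc_y)
            new_v.append(acc_v)
            new_edges.append(edges[-1])
    return new_y, new_v, new_edges
```

Calling `rebin(..., max_rel_unc=0.10)` on the summed small backgrounds would correspond to the tighter single-lepton requirement.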
Figure 12 shows a comparison of the expected sensitivities in each of the three analysis channels in terms of the expected limits for the $\mathrm{g_{KK}}$ signal model. The contributions from the single-lepton and fully hadronic channels dominate the sensitivity over most of the mass range, apart from the region of lowest masses, where the dilepton channel makes a significant contribution.

Results
The statistical analysis is performed for each of the signal models considered in this analysis: three variations of a Z$'$ boson having a width-to-mass ratio of 1, 10, and 30%, as well as a $\mathrm{g_{KK}}$. In each case, a 95% CL limit is obtained on the product of the resonance production cross section and branching fraction. The observed and expected limits and the 1 and 2 standard deviation (s.d.) bands are calculated for resonance masses ranging from 0.5 to 5.0 TeV and are listed in Tables 2-5. New exclusion limits on the mass of resonances decaying to $\mathrm{t\bar{t}}$ are set by comparing the observed limit to the theoretical cross section, where the branching fraction $\mathcal{B}(X \to \mathrm{t\bar{t}})$ is assumed to be 1. As shown in Fig. 13, the analysis excludes Z$'$ bosons with a 1% width with masses up to 3.80 TeV, Z$'$ bosons with a 10% width with masses up to 5.25 TeV, and Z$'$ bosons with a 30% width with masses up to 6.65 TeV (6.40 TeV expected). For the $\mathrm{g_{KK}}$ resonance hypothesis, the analysis excludes masses up to 4.55 TeV (4.45 TeV expected). These results represent a significant improvement on the previous results in this channel from the 2015 data taking period, not only because of the increase in integrated luminosity, but also because of the reduction in the uncertainty in the multijet background estimate in the fully hadronic channel, the improved W+jets rejection via the W+jets BDT in the single-lepton channel, and the inclusion of dilepton event categories in the combination.

Figure 12: Comparison of the sensitivities for each analysis channel contributing to the combination. The expected limits at 95% CL are shown for each channel with the narrow colored lines, while the combination result is shown with the thick black line. These results are shown specifically for the $\mathrm{g_{KK}}$ signal hypothesis, as this model has characteristics that are common to many $\mathrm{t\bar{t}}$ resonance searches. The multiplicative factor of 1.3 for the $\mathrm{g_{KK}}$ is the NLO K factor.
The absolute cross section limits are 10-40% better, for $m_{\mathrm{t\bar{t}}}$ above 2 TeV, than the previous result released by CMS [31] scaled to an integrated luminosity of 35.9 fb$^{-1}$. These results are the most stringent exclusion limits on a $\mathrm{t\bar{t}}$ resonance to date.
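The conversion of a cross section limit into a mass exclusion, by comparing the observed limit to the theoretical cross section, can be sketched in Python as follows. The interpolation scheme (linear in mass for the logarithm of the limit-to-theory ratio) and the function name are assumptions of this example, and any numbers passed to it here are illustrative rather than values from Tables 2-5.

```python
import math

def mass_exclusion(masses, obs_limit, theory_xsec):
    """Highest mass at which the observed limit rises above the theory curve.

    masses: resonance mass points in increasing order.
    obs_limit: observed 95% CL upper limit on cross section times branching
               fraction at each mass point.
    theory_xsec: predicted cross section at each mass point (branching
                 fraction to top quark pairs assumed to be 1).
    Returns the interpolated crossing mass, or None if no mass point is
    excluded or the limit never crosses above the prediction.
    """
    # A mass point is excluded when limit < theory, i.e. log(limit/theory) < 0.
    log_ratio = [math.log(o / t) for o, t in zip(obs_limit, theory_xsec)]
    crossing = None
    for i in range(len(masses) - 1):
        if log_ratio[i] < 0.0 <= log_ratio[i + 1]:
            frac = -log_ratio[i] / (log_ratio[i + 1] - log_ratio[i])
            crossing = masses[i] + frac * (masses[i + 1] - masses[i])
    return crossing
```

The same function applied to the expected limit instead of the observed one would give the expected mass exclusion.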

Summary
A search for a generic massive top quark and antiquark ($\mathrm{t\bar{t}}$) resonance has been presented. The analysis was performed using data collected by the CMS experiment in 2016 at the LHC at $\sqrt{s} = 13$ TeV, corresponding to an integrated luminosity of 35.9 fb$^{-1}$. The analysis is focused on searching for a $\mathrm{t\bar{t}}$ resonance above 2 TeV, where the decay products of the top quark become collimated because of its large Lorentz boost. The analysis performed a simultaneous measurement of the backgrounds and the t tagging efficiency from data. The data are consistent with the background-only hypothesis, and no evidence for a massive $\mathrm{t\bar{t}}$ resonance has been found.
Limits at 95% confidence level are calculated for the production cross section for a spin-1 resonance decaying to $\mathrm{t\bar{t}}$ pairs with a variety of decay widths.
Limits were calculated for two benchmark signal processes that decay to $\mathrm{t\bar{t}}$ pairs. A topcolor Z$'$ boson with relative widths of 1, 10, or 30% is excluded in the mass ranges 0.50-3.80, 0.50-5.25, and 0.50-6.65 TeV, respectively. The first Kaluza-Klein excitation of the gluon in the Randall-Sundrum scenario ($\mathrm{g_{KK}}$) is excluded in the range 0.50-4.55 TeV. This is the first search by any experiment at $\sqrt{s} = 13$ TeV for $\mathrm{t\bar{t}}$ resonances that combines all three decay topologies of the $\mathrm{t\bar{t}}$ system: dilepton, single-lepton, and fully hadronic.
The sensitivity of the analysis exceeds previous searches at $\sqrt{s} = 8$ and 13 TeV, particularly at high $\mathrm{t\bar{t}}$ invariant mass. Previous measurements have excluded a topcolor Z$'$ up to 3.0, 3.9, and 4.0 TeV, for relative widths of 1, 10, and 30%, and $\mathrm{g_{KK}}$ from 3.3 to 3.8 TeV, depending on model [31,32]. The presented analysis improves upon those limits, extending the Z$'$ exclusions to 3.80, 5.25, and 6.65 TeV and the $\mathrm{g_{KK}}$ exclusion to 4.55 TeV. These are the most stringent limits on the topcolor Z$'$ and $\mathrm{g_{KK}}$ models to date.

Acknowledgments
We congratulate our colleagues in the CERN accelerator departments for the excellent performance of the LHC and thank the technical and administrative staffs at CERN and at other CMS institutes for their contributions to the success of the CMS effort. In addition, we gratefully acknowledge the computing centers and personnel of the Worldwide LHC Computing Grid for delivering so effectively the computing infrastructure essential to our analyses.

References
[16] L. Randall
[31] CMS Collaboration, "Search for $\mathrm{t\bar{t}}$ resonances in highly boosted lepton+jets and fully hadronic final states in proton-proton collisions at $\sqrt{s} = 13$ TeV", JHEP 07 (2017) 001, doi:10.1007/JHEP07(2017)001, arXiv:1704.03366.