Search for pair production of excited top quarks in the lepton + jets final state

A search is performed for pair-produced spin-3/2 excited top quarks (t* t*-bar), each decaying to a top quark and a gluon. The search uses data collected with the CMS detector from pp collisions at a center-of-mass energy of sqrt(s) = 13 TeV, corresponding to an integrated luminosity of 35.9 inverse femtobarns. Events are selected that have a single isolated muon or electron, an imbalance in transverse momentum, and at least six jets, exactly two of which must be compatible with originating from the fragmentation of a b quark. The data show no significant excess over standard model predictions, and provide a lower limit of 1.2 TeV at 95% confidence level on the mass of the spin-3/2 t* quark in an extension of the Randall-Sundrum model, assuming a 100% branching fraction of its decay into a top quark and a gluon. These are the best limits to date on the mass of spin-3/2 excited top quarks and the first at 13 TeV.


Introduction
The standard model (SM) of particle physics provides a successful description of the properties of the elementary particles and their interactions. Despite its success, the SM is assumed to be an effective model of a more complete theory. Many extensions of the SM predict that the top quark is a composite particle and not a fundamental object [1][2][3][4]. A direct confirmation of this hypothesis could be achieved by the discovery of an excited top quark (t*).
In models that describe the proposed excited top quark [5,6], weak isodoublets are used to represent both left- and right-handed components of the t* quark, allowing for a description of finite masses prior to the onset of electroweak symmetry breaking. Thus, in contrast to the heavy top quark from a sequential fourth-generation model, in these models the existence of t* quarks is not strongly constrained by the discovery of a SM-like Higgs boson [7][8][9]. In string realizations of the Randall-Sundrum (RS) model [10,11], the right-handed t* quark is expected to be the lightest spin-3/2 excited state [12].
A spin-3/2 t* quark is described by the Rarita-Schwinger [13] vector spinor Lagrangian. At the energy of the LHC, the production cross section of spin-3/2 quarks is proportional to ŝ^3, where ŝ is the square of the energy in the parton-parton collision rest frame, rather than ŝ^−1, as it is for spin-1/2 quarks [14]. Therefore, when integrating over the parton momentum fractions (x) in proton-proton collisions, spin-3/2 quarks receive a contribution at large x values that is greater than that from spin-1/2 quarks. In the RS model, the spin-3/2 t* quark is expected to have a pair production cross section of the order of a few picobarns at sqrt(s) = 13 TeV, for a t* of mass m_t* = 1 TeV [1,14,15], which dominates over single t* production for most of the parameter space in the model [12]. The t* quark decays predominantly to a top quark through the emission of a gluon [1,12,15,16].
In this Letter, we present a search for pair-produced t* quarks, where each t* quark decays exclusively to a top quark (t) and a gluon (g). We use data recorded in 2016 with the CMS detector in proton-proton (pp) collisions at

The CMS detector and simulated samples
The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter (HCAL), each composed of a barrel and two endcap sections. Forward calorimeters extend the pseudorapidity (η) coverage provided by the barrel and endcap detectors. Muons are measured in gas-ionization detectors embedded in the steel flux-return yoke outside the solenoid. A more detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in Ref. [18].
Simulated t*t* signal events are generated in 100 GeV steps with m_t* in the range 700-1600 GeV, using the MadGraph5_amc@nlo [19] event generator and NNPDF3.0 [20] for the parton distribution functions (PDFs). The t*t* production cross section ranges from ≈5 pb at m_t* = 700 GeV, down to ≈4 fb at m_t* = 1600 GeV. This cross section is calculated at leading order in perturbation theory, with the factorization and renormalization scales set to m_t*; the calculation is cut off at 7 m_t* to prevent unitarity violation. The Rarita-Schwinger Lagrangian, included in the MadGraph 5 generator, is used for simulating spin-3/2 t*t* events. This implementation and the corresponding physics parameters are provided by the authors of Ref. [14]. The width of the t* quark is assumed to be 10 GeV, which is much narrower than the detector resolution. Parton shower and hadronization processes are modeled using pythia 8.212 [21]. The generated events are processed through a simulation of the CMS detector based on Geant4 [22], and are reconstructed using the same algorithms as used for data.
We estimate SM backgrounds using a data-derived approach. Simulated samples for SM processes are used to study the modeling of the background and to provide a cross-check of the analysis procedures. The simulated SM samples relevant to this analysis are: tt production; single top quark production via the s-channel, t-channel, and tW processes; W and Z boson production in association with jets; the tt+W, tt+H, and tt+Z processes. The tt and tt+H processes are simulated using powheg 2.0 [23][24][25][26][27], while the other SM processes are simulated using MadGraph5_amc@nlo up to next-to-leading order [19,28,29]. All simulated samples include the additional contributions from overlapping pp collisions within the same and nearby bunch crossings ("pileup") at large instantaneous luminosity. Simulated events are given individual weights to match the distribution of the average number of pileup interactions in data.
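The pileup reweighting described above can be illustrated with a minimal sketch. It assumes the pileup distributions in data and simulation are available as binned histograms; the function name `pileup_weights` and the toy bin contents are hypothetical and not taken from the analysis.

```python
def pileup_weights(data_hist, mc_hist):
    """Per-bin weights w[n] = P_data(n) / P_MC(n) for n pileup interactions.

    Both inputs are lists of bin contents with identical binning (an assumption
    of this sketch); each is normalized to unit area before taking the ratio.
    """
    data_total = float(sum(data_hist))
    mc_total = float(sum(mc_hist))
    weights = []
    for d, m in zip(data_hist, mc_hist):
        if m > 0:
            weights.append((d / data_total) / (m / mc_total))
        else:
            weights.append(0.0)  # no simulated events in this bin to reweight
    return weights


# Toy histograms (invented numbers, for illustration only).
data_hist = [10, 30, 40, 20]
mc_hist = [25, 25, 25, 25]
weights = pileup_weights(data_hist, mc_hist)
```

Each simulated event would then carry the weight of the bin matching its number of pileup interactions, so that the weighted simulated distribution follows the shape seen in data.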

Event reconstruction
Event reconstruction is based on the CMS particle-flow (PF) algorithm [30], which takes into account information from all subdetectors, including measurements from the tracking system, energy deposits in the ECAL and HCAL, and tracks reconstructed in the muon detectors. Given this information, all particles in the event are reconstructed as electrons, muons, photons, and charged or neutral hadrons. Photons are identified as ECAL energy clusters not linked to the extrapolation of any charged-particle trajectory to the ECAL. Muons are identified as a track in the central tracker consistent with either a track or several hits in the muon system, and not associated with energy clusters in the calorimeters. Electrons are identified as a primary charged-particle track that extrapolates to at least one ECAL energy cluster. The track may be associated with bremsstrahlung photons emitted along the way through the tracker material. Charged hadrons are identified as charged-particle tracks identified neither as electrons nor as muons. Finally, neutral hadrons are identified as HCAL energy clusters not linked to any charged-hadron trajectory, or as ECAL and HCAL energy excesses with respect to the expected charged-hadron energy deposits.
For each event, jets from these reconstructed particles are clustered with the infrared and collinear safe anti-kT algorithm [31], using a distance parameter R = 0.4. Charged hadrons associated with pileup vertices are excluded from jet reconstruction. The jet momentum is the vectorial sum of the momenta of all particles contained in the jet. The reconstructed jet momentum is found in simulation to be within 5 to 10% of the true momentum over the whole pT spectrum and detector acceptance. Jet energy corrections are derived from the simulation and measurements in collision data [32]. The jet energy resolution amounts typically to 15% at 10 GeV, 8% at 100 GeV, and 4% at 1 TeV [32]. The jet energy resolution in simulation is degraded to match that observed in data.
Jets are identified as originating from a bottom quark through a combined secondary vertex algorithm CSVv2 [33,34]. The algorithm uses a multivariate discriminator to combine information on the significance of the impact parameter, the jet kinematics, and the location of the secondary vertex. A working point of the discriminator with ≈70% b quark identification efficiency and ≈1% mistag efficiency for light quarks and gluons is used in this analysis. Small differences in b tagging efficiencies and mistag rates between data and simulated events are accounted for by applying additional corrections to simulation.
The missing transverse momentum vector is defined as the negative vector sum of the momenta of all reconstructed PF candidates in an event projected onto the plane perpendicular to the beams. Its magnitude is referred to as pT^miss.
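As a minimal sketch of this definition, the missing transverse momentum can be computed from a list of candidates given as (pT, φ) pairs. The function name `missing_pt` and the input layout are assumptions made for illustration.

```python
import math


def missing_pt(candidates):
    """Return (magnitude, phi) of the negative vector sum of transverse momenta.

    `candidates` is an iterable of (pt, phi) pairs, one per reconstructed
    particle; this input format is a simplification chosen for the sketch.
    """
    candidates = list(candidates)
    px = -sum(pt * math.cos(phi) for pt, phi in candidates)
    py = -sum(pt * math.sin(phi) for pt, phi in candidates)
    return math.hypot(px, py), math.atan2(py, px)


# Two back-to-back candidates with unequal pT leave a 20 GeV imbalance.
magnitude, phi = missing_pt([(50.0, 0.0), (30.0, math.pi)])
```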

Event selection
This analysis searches for t*t* production, with each t* decaying to t+g and the tt pair in the event reconstructed in the lepton + jets final state. Events are required to contain exactly one isolated lepton, pT^miss, and at least six jets, exactly two of which must be b tagged.
Events containing a muon are selected with a single-muon trigger that requires the presence of an isolated muon with transverse momentum pT > 27 GeV. Events containing an electron are selected with a single-electron trigger that requires the presence of an isolated electron with pT > 32 GeV. The background rate for the single-electron trigger was much higher than for the single-muon trigger, requiring more stringent selection criteria for the electron channel. A deterministic annealing algorithm is used to reconstruct the candidate primary vertices [35]; the vertex with the highest track multiplicity is selected as the primary event vertex. Selected events are required to have this primary vertex within 2 cm of the center of the detector in the x-y plane, and within 24 cm along the z-direction.
The track associated with a muon is required to have hits in the pixel and muon detectors, a good quality fit, and transverse and longitudinal impact parameters with respect to the primary vertex smaller than 2 and 5 mm, respectively. An isolation factor I is defined as the scalar sum, divided by the muon pT, of the pT of all photons, charged hadrons, and neutral hadrons within an angular cone of ΔR ≡ sqrt((Δη)^2 + (Δφ)^2) < 0.4 (where φ is the azimuthal angle) around the track, corrected for the effects of pileup [36]. An isolation selection I < 0.15, corresponding to an efficiency of ≈95%, is used.
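The isolation factor can be sketched as follows. The pileup correction of Ref. [36] is not specified here, so this sketch stands in a hypothetical `pileup_pt` argument that is subtracted from the cone sum; the dictionary layout and function names are likewise assumptions.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Angular separation sqrt(d_eta^2 + d_phi^2), with d_phi wrapped to [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def relative_isolation(muon, particles, cone=0.4, pileup_pt=0.0):
    """Scalar pT sum of particles inside the cone, divided by the muon pT.

    `pileup_pt` is a placeholder for the pileup contribution to the cone sum
    (an assumption of this sketch, not the correction used in the analysis).
    """
    cone_sum = sum(
        p["pt"] for p in particles
        if delta_r(muon["eta"], muon["phi"], p["eta"], p["phi"]) < cone
    )
    return max(cone_sum - pileup_pt, 0.0) / muon["pt"]


muon = {"pt": 40.0, "eta": 0.0, "phi": 3.1}
particles = [
    {"pt": 2.0, "eta": 0.1, "phi": -3.1},   # inside the cone after phi wrap-around
    {"pt": 10.0, "eta": 1.0, "phi": 3.1},   # outside the cone
    {"pt": 2.0, "eta": 0.2, "phi": 3.0},    # inside the cone
]
iso = relative_isolation(muon, particles)
is_isolated = iso < 0.15
```

The wrap-around in `delta_r` matters for particles near φ = ±π, which would otherwise appear far from the muon.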
Electrons are required to have pT > 35 GeV and to be within the region |η| < 2.1. Electrons within 1.44 < |η| < 1.56, corresponding to the ECAL barrel-endcap transition region, are rejected to avoid poor reconstruction performance. Electrons are selected using a cutoff-based selection method [37] based on the shower shape, the track quality, the spatial match between the track and the electromagnetic cluster, the fraction of total cluster energy in the HCAL, and the resulting level of activity in the surrounding tracker and calorimeter regions. The criteria imposed in these electron selection algorithms have a combined efficiency of ≈70%.
Table 1. Expected numbers of selected events for the simulated signal process as a function of m_t*. Also shown are the expected numbers of events predicted by the SM, together with the systematic uncertainties discussed in Section 7 and the uncertainties in the cross sections of the various processes, as well as the numbers of selected events observed in data.
In addition to the selections above, the leptons are required to have an angular separation ΔR < 0.1 with respect to the lepton reconstructed by the trigger system. The lepton selection efficiencies for data and simulation are measured using the tag-and-probe method [37]. Additional corrections are applied to simulation to account for observed differences in the efficiencies between data and simulation. The pT^miss is required to be greater than 20 GeV, while the jets are required to have pT > 30 GeV, |η| < 2.4, and angular separation ΔR > 0.4 with respect to well-identified electrons or muons. In order to reject misreconstructed, poorly reconstructed, and noisy jets, the fractional energy contribution from both ECAL and HCAL must be non-zero and non-unity. Exactly two jets are required to pass the b tagging criteria. The expected yields after event selection are summarized in Table 1. Simulated signal events pass the selection criteria with acceptance times efficiency of 1.4-2.2%, depending on the channel and on the signal mass. After the application of all selections, 44 573 events are observed in the μ + jets channel and 28 942 events in the e + jets channel. The yields predicted from the simulated SM background processes are 46 600 events in the μ + jets channel and 30 700 events in the e + jets channel.
Small differences between data and the SM predictions are within the estimated uncertainties of the simulation, with the dominant uncertainty being the choice of the renormalization and factorization scales used in the generator of the tt events. Details of the uncertainties are given in Section 7. Furthermore, the differential distributions of kinematic variables of simulated SM processes are also in agreement with data, as shown in Fig. 1. In particular, the distribution of the invariant mass of a t + jet system (m_t+jet, see Section 5 for details) in data is in agreement with the background estimation.
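The kinematic part of the event selection can be summarized in a short sketch. It assumes that the lepton list already contains only leptons passing the identification, isolation, and trigger-matching criteria, and that each jet carries a boolean `btag` flag; the dictionary layout and function names are hypothetical.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Angular separation sqrt(d_eta^2 + d_phi^2), with d_phi wrapped to [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def select_jets(jets, lepton):
    """Keep jets with pT > 30 GeV, |eta| < 2.4, and Delta R > 0.4 from the lepton."""
    return [
        j for j in jets
        if j["pt"] > 30.0
        and abs(j["eta"]) < 2.4
        and delta_r(j["eta"], j["phi"], lepton["eta"], lepton["phi"]) > 0.4
    ]


def passes_selection(event):
    """Exactly one selected lepton, pT^miss > 20 GeV, >= 6 jets, exactly 2 b tags."""
    if len(event["leptons"]) != 1:
        return False
    if event["pmiss"] <= 20.0:
        return False
    jets = select_jets(event["jets"], event["leptons"][0])
    n_btag = sum(1 for j in jets if j["btag"])
    return len(jets) >= 6 and n_btag == 2


# Toy event (invented numbers): six well-separated jets, two of them b tagged.
toy_jets = [
    {"pt": 50.0, "eta": 0.5 * i - 1.0, "phi": 0.9 * i - 2.0, "btag": i < 2}
    for i in range(6)
]
toy_event = {
    "leptons": [{"pt": 40.0, "eta": 2.0, "phi": 3.0}],
    "pmiss": 35.0,
    "jets": toy_jets,
}
selected = passes_selection(toy_event)
```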

Mass reconstruction
Since the dominant background is SM tt production with extra jets, the reconstructed invariant mass spectrum of the t + jet systems is used to distinguish between t*t* signal and tt background. The pT^miss is assumed to be carried away entirely by the neutrino from the leptonically decaying W boson (W_lep). We assume that the parent W boson is on shell and the neutrino is massless in order to determine the longitudinal momentum of the neutrino.
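The on-shell W constraint leads to a quadratic equation for the neutrino longitudinal momentum, which can be sketched as follows. The lepton is treated as massless, the W mass value of 80.4 GeV is an assumed input, and the handling of a negative discriminant (keeping only the real part) is a common convention adopted here as an assumption; the text does not state which choice the analysis makes.

```python
import math

M_W = 80.4  # GeV; assumed value of the W boson mass used for the constraint


def neutrino_pz(lepton, met_x, met_y, m_w=M_W):
    """Solve m_W^2 = (p_lepton + p_neutrino)^2 for the neutrino pz.

    `lepton` is (px, py, pz, E) of a massless lepton; (met_x, met_y) is taken
    as the neutrino transverse momentum. Returns one or two pz solutions.
    """
    px, py, pz, energy = lepton
    pt2 = px * px + py * py
    a = 0.5 * m_w * m_w + px * met_x + py * met_y
    disc = a * a - pt2 * (met_x * met_x + met_y * met_y)
    if disc < 0.0:
        # No real solution: keep the real part only (assumed convention).
        return [a * pz / pt2]
    root = energy * math.sqrt(disc)
    return [(a * pz - root) / pt2, (a * pz + root) / pt2]


# Toy lepton with pT = 40 GeV, pz = 30 GeV, E = 50 GeV (massless).
solutions = neutrino_pz((40.0, 0.0, 30.0, 50.0), -20.0, 25.0)
```

When two real solutions exist, a further criterion is needed to pick one; that choice is left open in this sketch.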
Given the high jet multiplicity of the event selection, a measure was designed for evaluating different associations of the reconstructed jets with the parton objects in the final state. For the jets, the six jets with the highest pT values are taken into consideration. The b tagged jets are assigned to one of the b quark partons, and the other jets are associated with the decay daughters of the hadronically decaying W (W_had) or with the gluons from t* decay. The quality of the jet-parton assignment for a single event is evaluated with an S value based on how well the intermediate physical objects are reconstructed:

$$ S = \left(\frac{m_{qq} - m_{W}}{\sigma_{W}}\right)^{2} + \left(\frac{m_{qqb} - m_{t}}{\sigma_{t}}\right)^{2} + \left(\frac{m_{\ell\nu b} - m_{t}}{\sigma_{t}}\right)^{2} + \left(\frac{m_{qqbg} - m_{\ell\nu bg}}{\sigma_{t^{*}}}\right)^{2}, $$

where m_qq is the invariant mass of the jets assigned to W_had daughters. Invariant masses of the physical objects assigned to hadronically and leptonically decaying t (t*) quarks are denoted m_qqb (m_qqbg) and m_lνb (m_lνbg), respectively. The central mass values m_W and m_t and the resolutions σ_W, σ_t, and σ_t* are estimates obtained by reconstructing the t*t*, tt, and W_had in the decay topology using the truth information from simulated signal samples. Additional studies have shown that the mass reconstruction is insensitive to changes in the detector resolution values. The jet-parton assignment with the smallest S value is taken to represent the decay topology of a single event, under the t* hypothesis. The average value of m_qqbg and m_lνbg computed for this assignment is taken to represent the reconstructed t* mass of an event, denoted m_t+jet. The rate at which all six jets are correctly assigned is around 11%, with the main difficulty being the correct assignment of the jets from the hadronically decaying W.
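The search over jet-parton assignments can be sketched as a loop over permutations. This sketch assumes a chi-square-like S built from four terms (the W_had mass, the two top quark masses, and the difference between the two t* candidate masses), which is an assumed form for illustration. The central values and resolutions below are hypothetical placeholders, not the values used in the analysis, and four-vectors are plain (px, py, pz, E) tuples.

```python
import itertools
import math

# Hypothetical placeholder values in GeV; the analysis derives its own from simulation.
M_W, M_T = 80.4, 172.5
SIG_W, SIG_T, SIG_TSTAR = 10.0, 20.0, 50.0


def inv_mass(*vecs):
    """Invariant mass of the summed (px, py, pz, E) four-vectors."""
    px, py, pz, e = (sum(v[i] for v in vecs) for i in range(4))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


def s_value(q1, q2, b_had, b_lep, g_had, g_lep, lep, nu):
    """Return (S, m_t+jet) for one jet-parton assignment (assumed form of S)."""
    m_qq = inv_mass(q1, q2)
    m_qqb = inv_mass(q1, q2, b_had)
    m_lnub = inv_mass(lep, nu, b_lep)
    m_qqbg = inv_mass(q1, q2, b_had, g_had)
    m_lnubg = inv_mass(lep, nu, b_lep, g_lep)
    s = (((m_qq - M_W) / SIG_W) ** 2
         + ((m_qqb - M_T) / SIG_T) ** 2
         + ((m_lnub - M_T) / SIG_T) ** 2
         + ((m_qqbg - m_lnubg) / SIG_TSTAR) ** 2)
    return s, 0.5 * (m_qqbg + m_lnubg)


def assignments(b_jets, other_jets):
    """Yield every (q1, q2, b_had, b_lep, g_had, g_lep) for 2 b jets and 4 other jets."""
    for b_had, b_lep in itertools.permutations(b_jets, 2):
        for i, j in itertools.combinations(range(4), 2):
            rest = [other_jets[k] for k in range(4) if k not in (i, j)]
            for g_had, g_lep in itertools.permutations(rest, 2):
                yield other_jets[i], other_jets[j], b_had, b_lep, g_had, g_lep


def best_assignment(b_jets, other_jets, lep, nu):
    """Return (S, m_t+jet) of the assignment with the smallest S."""
    return min(
        (s_value(*a, lep, nu) for a in assignments(b_jets, other_jets)),
        key=lambda result: result[0],
    )
```

With two b tagged jets and four other jets there are 2 × 6 × 2 = 24 distinct assignments, since swapping the two W_had daughters leaves every mass unchanged.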

Background modeling
To determine the presence of signal events in data, an unbinned extended maximum likelihood fit of a signal-plus-background model is performed on the m_t+jet spectrum for m_t+jet > 400 GeV.
The mass template of the t*t* signal is constructed by smoothing the mass distribution from simulations, using an adaptive kernel estimation [39] with a Gaussian kernel and with no restriction on the boundary. The smoothness parameter ρ introduced in Ref. [39] is determined by the square root of the standard deviation of the signal distribution over the subset with ≥4 correctly assigned partons.
The background distribution is modeled using a log-normal function (up to a normalization factor):

$$ f(m) \propto \frac{1}{m} \exp\left[-\frac{\ln^{2}(m/m_{0})}{2 a_{2}^{2}}\right], $$

where m is the mass, and a_2 and m_0 are the parameters that determine the shape of the background. During the fit to the observed data, the number of background events, as well as the shape parameters of the background function, are free parameters.
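A background-only version of the unbinned extended likelihood can be sketched as below. The sketch assumes the standard log-normal shape (1/m) exp[−ln²(m/m_0)/(2a_2²)], normalized analytically over m > 400 GeV; this parametrization, the function names, and the toy masses are assumptions made for illustration, and the signal component is omitted.

```python
import math

M_MIN = 400.0  # GeV; lower edge of the fitted m_t+jet range


def lognormal_shape(m, a2, m0):
    """Unnormalized log-normal shape (assumed parametrization)."""
    u = math.log(m / m0)
    return math.exp(-u * u / (2.0 * a2 * a2)) / m


def lognormal_norm(a2, m0, m_min=M_MIN):
    """Analytic integral of lognormal_shape from m_min to infinity."""
    u0 = math.log(m_min / m0)
    return a2 * math.sqrt(math.pi / 2.0) * math.erfc(u0 / (a2 * math.sqrt(2.0)))


def extended_nll(masses, n_bkg, a2, m0):
    """Extended unbinned negative log-likelihood: N - sum_i ln(N f(m_i))."""
    norm = lognormal_norm(a2, m0)
    log_sum = sum(
        math.log(n_bkg * lognormal_shape(m, a2, m0) / norm) for m in masses
    )
    return n_bkg - log_sum


# Toy masses in GeV (invented numbers, for illustration only).
toy_masses = [450.0, 520.0, 610.0, 700.0, 900.0]
nll = extended_nll(toy_masses, n_bkg=5.0, a2=0.5, m0=500.0)
```

In a full fit, `n_bkg`, `a2`, and `m0` would be floated by a numerical minimizer together with the signal yield.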
To verify whether the fit is sensitive to the presence of t*t* signal, a pseudo-data set is generated with the m_t+jet spectrum of the simulated backgrounds and then injected with the expected m_t+jet signal spectrum for various hypotheses of the signal cross section. Performing the same fit over multiple sets of pseudo-data with varying signal cross sections showed no evidence of bias.
To ensure that the log-normal function is sufficient to model the background, a likelihood ratio test is conducted by comparing the result of fitting the spectrum of the simulated SM background with the log-normal function to the results obtained with extended log-normal functions that have additional shape parameters. Increasing the number of parameters does not improve the description of the background.
The results of the fit performed on data with the 800 GeV signal spectrum are shown in Fig. 2. The distribution of events in data is in agreement with the background-only (null) hypothesis. Based on the results of the Kolmogorov-Smirnov tests, the signal + background model and the background-only model both yield good fits to the data.

Systematic uncertainties
The impact of experimental and theoretical sources of uncertainties is considered and summarized in Table 2. Because of uncertainties in the total inelastic pp cross section, alternative pileup corrections are derived with the inelastic cross section scaled by ±1 standard deviation when calculating the pileup distribution in data. Variations in the pileup corrections have an average impact on the signal acceptance of 0.7%. The number of signal events is also affected by the uncertainty on the integrated luminosity, which is known to a precision of 2.5% [40].
The theoretical uncertainties considered are those associated with the choice of the PDF, and the renormalization and factorization scales used by the event generator. The effects of the theoretical uncertainties are obtained by changing the various generator parameters within their estimated uncertainties and generating new m_t+jet fit templates that are used to calculate new sensitivities.
In addition to the statistical uncertainty originating from the signal + background fit, systematic uncertainties are introduced to cover the choice of modeling. Alternative signal templates are generated with different choices of ρ by changing the subset to require ≥3 and ≥5 correctly assigned partons. The background shape is determined from data. Simulated events with different configurations, as well as several alternative models have been tested. The chosen model, with the parameters floated in the limit computation, has proven to describe the data and cover the associated systematic uncertainties sufficiently well.

Statistical analysis and extraction of limits
No excess above SM background is observed. We set an upper bound on the t*t* production cross section using the asymptotic modified frequentist CLs criterion [42][43][44][45]. The null hypothesis likelihood function is taken from the background component of the signal + background fit described in Section 6. For the uncertainties described in Section 7, a joint template is used, where the nominal template is linearly interpolated to the templates generated with the relevant parameters shifted by ±1 standard deviation. Each of the interpolation variables is taken as a nuisance parameter with a standard Gaussian prior.
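The linear interpolation between the nominal and shifted templates can be sketched as follows. The sketch assumes binned templates stored as lists (the analysis uses smoothed unbinned templates) and a single nuisance parameter θ per uncertainty; the function names and toy bin contents are hypothetical.

```python
def morph_template(nominal, up, down, theta):
    """Linearly interpolate bin by bin between nominal and +/-1 s.d. templates.

    theta = 0 returns the nominal template, theta = +1 the `up` template,
    and theta = -1 the `down` template; values beyond +/-1 extrapolate linearly.
    """
    shifted = up if theta >= 0.0 else down
    frac = abs(theta)
    return [n + frac * (s - n) for n, s in zip(nominal, shifted)]


def gaussian_penalty(thetas):
    """Negative log of the standard Gaussian priors, up to a constant."""
    return 0.5 * sum(t * t for t in thetas)


# Toy templates (invented numbers, for illustration only).
nominal = [10.0, 20.0, 30.0]
up = [12.0, 21.0, 27.0]
down = [9.0, 18.0, 33.0]
halfway = morph_template(nominal, up, down, 0.5)
```

The penalty term would be added to the negative log-likelihood so that each θ is constrained around zero with unit width.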
The fit is performed separately in the muon and electron channels, and the results of both are used to obtain combined limits. Fig. 3 shows the observed and expected upper limits at 95% confidence level for the product of the t*t* production cross section and the square of the branching fraction, as a function of the t* mass. The lower limit for m_t* is given by the value at which the upper limit intersects with the theoretical cross section from Ref. [14]. Both the observed and expected lower limits of m_t* for the combined muon and electron data are 1.2 TeV, within uncertainties.
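Finding the mass at which the upper limit crosses the theoretical cross section can be sketched as a one-dimensional interpolation. The sketch assumes both curves are sampled at the same mass points and interpolates the logarithm of their ratio linearly in mass; this interpolation choice, the function name, and the toy numbers are assumptions and do not reproduce the values of the analysis.

```python
import math


def mass_limit(masses, upper_limits, theory):
    """Return the mass where the theory curve drops below the upper limit.

    Scans adjacent mass points for a sign change of ln(theory / limit) and
    interpolates linearly in mass. Returns None if no crossing is found.
    """
    diffs = [math.log(t / u) for t, u in zip(theory, upper_limits)]
    for i in range(len(masses) - 1):
        d0, d1 = diffs[i], diffs[i + 1]
        if d0 > 0.0 >= d1:
            frac = d0 / (d0 - d1)
            return masses[i] + frac * (masses[i + 1] - masses[i])
    return None


# Toy curves in arbitrary units (invented numbers, for illustration only).
toy_masses = [1000.0, 1100.0, 1200.0, 1300.0]
toy_theory = [0.4, 0.2, 0.1, 0.05]
toy_limits = [0.07, 0.07, 0.07, 0.07]
crossing = mass_limit(toy_masses, toy_limits, toy_theory)
```

Masses below the crossing point, where the predicted cross section exceeds the upper limit, are the ones excluded.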

Summary
A search has been conducted for pair production of spin-3/2 excited top quarks t* in proton-proton interactions, with each t* decaying exclusively to a standard model top quark and a gluon. Events that have a single muon or electron and at least six jets, exactly two of which must be identified as originating from a bottom quark, are selected for the analysis. Assuming t*t* production, the final-state objects are associated with the t* candidates in each event. No significant deviations from standard model predictions are observed in the t + jet system, and an upper limit is set at 95% confidence level on the pair production cross section of t*t*, as a function of the t* mass. Interpreting the results in the framework of a spin-3/2 t* model, assuming a 100% branching fraction of its decay into a top quark and a gluon, t* masses below 1.2 TeV are excluded. These are the best limits to date on the mass of spin-3/2 excited top quarks and the first at 13 TeV.
Fig. 3. The expected and observed 95% confidence level upper limits for the product of the production cross section of t*t* and the square of the branching fraction, as a function of the t* mass, for the combined lepton + jets analysis. The theoretical production cross section assuming a 100% t* → tg branching fraction is shown along with its uncertainties, described in Section 7.

Acknowledgements
We congratulate our colleagues in the CERN accelerator departments for the excellent performance of the LHC and thank the technical and administrative staffs at CERN and at other CMS institutes for their contributions to the success of the CMS effort. In addition, we gratefully acknowledge the computing centres and personnel of the Worldwide LHC Computing Grid for delivering so effectively the computing infrastructure essential to our analyses.