Search for pair production of vector-like quarks in leptonic final states in proton-proton collisions at $\sqrt{s}$ = 13 TeV

A search is presented for vector-like T and B quark-antiquark pairs produced in proton-proton collisions at a center-of-mass energy of 13 TeV. Data were collected by the CMS experiment at the CERN LHC in 2016-2018, with an integrated luminosity of 138 fb$^{-1}$. Events are separated into single-lepton, same-sign charge dilepton, and multilepton channels. In the analysis of the single-lepton channel a multilayer neural network and jet identification techniques are employed to select signal events, while the same-sign dilepton and multilepton channels rely on the high-energy signature of the signal to distinguish it from standard model backgrounds. The data are consistent with standard model background predictions, and the production of vector-like quark pairs is excluded at 95% confidence level for T quark masses up to 1.54 TeV and B quark masses up to 1.56 TeV, depending on the branching fractions assumed, with maximal sensitivity to decay modes that include multiple top quarks. The limits obtained in this search are the strongest limits to date for $\mathrm{T\overline{T}}$ production, excluding masses below 1.48 TeV for all decays to third generation quarks, and are the strongest limits to date for $\mathrm{B\overline{B}}$ production with B quark decays to tW.


Introduction
A decade ago, the ATLAS and CMS Collaborations announced the discovery of a Higgs boson (H) with a mass near 125 GeV at the CERN LHC [1][2][3]. The discovery experimentally confirmed the last fundamental piece of the standard model (SM) of particle physics, and subsequent precision measurements have remained consistent with the SM description of the properties and interactions of elementary particles. However, the apparent fine-tuning of SM parameters and other unexplained phenomena indicate the incompleteness of the SM [4], and thus motivate searches for beyond-the-SM (BSM) physics. To restore "naturalness" to the SM, many BSM theories (Little Higgs [5,6], Composite Higgs [7,8], etc.) introduce new heavy fermions, such as "vector-like quarks" (VLQs). The presence of VLQs is an extension of the SM that is not currently excluded by experiments.
The VLQs are hypothetical fermions whose left- and right-handed components transform identically under the SM electroweak gauge group $\mathrm{SU(2)_L \otimes U(1)_Y}$, in contrast to the behavior of the SM chiral quarks. This chiral symmetry allows a mass term to be included in the Lagrangian, which means that the VLQ masses are not dependent on Higgs Yukawa couplings. The existence of such VLQs could cancel out the leading quantum loop corrections to the observed H mass from top quarks, thus stabilizing it [9,10].
In most of the relevant models, VLQs are assumed to mix primarily with the third-generation SM quarks, as is consistent with the expected large top quark coupling [11]. At the LHC, VLQs can be produced in pairs via the strong interaction or singly via an electroweak interaction. At high masses, VLQ pair production typically has a lower cross section than single production, but for the narrow-width VLQs assumed by this analysis the pair production cross section is independent of the electroweak couplings, simplifying interpretation. A vector-like top quark (T) with an electric charge of 2e/3, analogous to the SM top quark, can decay in three modes that produce characteristic high-momentum signatures: T → tH, tZ, and bW [9]. Similarly, a vector-like bottom quark (B) with an electric charge of −e/3 can decay through B → bH, bZ, and tW. Examples of tree-level Feynman diagrams are shown in Fig. 1.  In minimal models, these VLQs may only exist as electroweak singlets T and B, in a doublet (T, B), or in doublets and triplets with further VLQs that would have exotic charges. Each scenario results in different T and B branching fractions. For singlets, the branching fractions are 50% for T → bW and B → tW, and 25% for T → tH, T → tZ, B → bH, and B → bZ. In various doublet scenarios, the T decays only to tH and tZ with equal branching fractions of 50%, and similarly the B decays only to bH and bZ with equal branching fractions [10,12]. These singlet and doublet branching fraction scenarios are used as benchmarks.
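To illustrate how the benchmark branching fractions translate into final states for pair production, the following minimal sketch enumerates the decay-mode combinations of a T quark pair in the singlet and doublet scenarios quoted above. The function name `pair_fractions` and the dictionary layout are illustrative choices rather than part of the analysis, and the two quarks are assumed to decay independently.

```python
from itertools import product

# Benchmark T quark branching fractions quoted in the text.
SINGLET = {"bW": 0.50, "tH": 0.25, "tZ": 0.25}
DOUBLET = {"tH": 0.50, "tZ": 0.50}

def pair_fractions(branching):
    """Fraction of pair-produced events in each unordered decay-mode
    combination, assuming the two quarks decay independently."""
    fractions = {}
    for (mode1, br1), (mode2, br2) in product(branching.items(), repeat=2):
        key = tuple(sorted((mode1, mode2)))
        fractions[key] = fractions.get(key, 0.0) + br1 * br2
    return fractions

singlet = pair_fractions(SINGLET)
doublet = pair_fractions(DOUBLET)
for modes, fraction in sorted(singlet.items()):
    print(modes, round(fraction, 4))
```

In the singlet scenario only one quarter of the pairs decay to bW on both sides, so channels that target mixed final states retain most of the signal.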
In this paper, a search for pair production of T and B quarks at the LHC is presented using three final states containing charged leptons (electrons or muons): a single-lepton channel, a same-sign charge (SS) dilepton channel, and a "multilepton" channel with at least three leptons. The data were collected from proton-proton (pp) collisions at $\sqrt{s}$ = 13 TeV by the CMS experiment at the LHC in 2016-2018, with a total integrated luminosity of 138 fb$^{-1}$. It is assumed that only one flavor of VLQ is present, and results are independently derived for the production of T and B quarks. Previous searches for T and B quark pairs by the ATLAS and CMS Collaborations at $\sqrt{s}$ = 7 TeV [13-15], 8 TeV [16-19], and 13 TeV [20-25] have excluded T quark masses below 1.31 TeV (singlet) to 1.42 TeV (100% tH) and B quark masses below 1.22 TeV (singlet) to 1.58 TeV (100% bH) at 95% confidence level (CL). For decays to W bosons, pair production of either VLQ flavor is currently excluded for masses below ∼1.35 TeV [20,25].
In the following, Section 2 describes the CMS detector and reconstruction algorithms for common physics objects. Section 3 describes the simulations used in the search, and Section 4 presents physics object and event selection requirements shared by all channels. The overall analysis strategy is described in Section 5, with the event selection, background estimation, and event categorization presented in detail for each channel in Sections 6-8. Treatment of systematic uncertainties for the combined search and the search results are presented in Sections 9 and 10, respectively. The search is summarized in Section 11. Tabulated results are provided in the HEPData record for this search [26].

The CMS detector and event reconstruction
The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter (HCAL), each composed of a barrel and two endcap sections. Forward calorimeters extend the pseudorapidity (η) coverage provided by the barrel and endcap detectors. Muons are detected in gas-ionization chambers embedded in the steel flux-return yoke outside the solenoid. A more detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in Ref. [27].
The particle-flow (PF) algorithm [28] aims to reconstruct and identify each individual particle in an event, with an optimized combination of information from the various elements of the CMS detector. The energy of photons is obtained from the ECAL measurement. The energy of electrons is determined from a combination of the electron momentum as determined by the tracker, the energy of the corresponding ECAL cluster, and the energy sum of all bremsstrahlung photons spatially compatible with originating from the electron track. Muon reconstruction is based on extrapolating between hits in the inner tracker and the outermost muon chamber. The energy of muons is obtained from the curvature of the reconstructed track. The energy of charged hadrons is determined from a combination of their momentum measured in the tracker and the matching ECAL and HCAL energy deposits, corrected for the response function of the calorimeters to hadronic showers. Finally, the energy of neutral hadrons is obtained from the corresponding corrected ECAL and HCAL energies.
For each event, hadronic jets are clustered from these reconstructed particles using the infrared and collinear safe anti-$k_\mathrm{T}$ algorithm [29] implemented by the FASTJET package [30], with distance parameters of 0.4 ("small-radius") and 0.8 ("large-radius"). Jet momentum is determined as the vectorial sum of all particle momenta in the jet, and is found from simulation to be, on average, within 5-10% of the true momentum over the entire $p_\mathrm{T}$ spectrum and detector acceptance. Additional pp interactions within the same or nearby bunch crossings (pileup) can contribute additional tracks and calorimetric energy deposits, increasing the apparent jet momentum. To mitigate this effect in small-radius jets, tracks identified as originating from pileup vertices are discarded and an offset correction is applied to correct for remaining contributions. In large-radius jets, the pileup-per-particle identification algorithm [31,32] is used to mitigate the effect of pileup at the reconstructed-particle level, making use of local shape information, event pileup properties, and tracking information. Jet energy corrections are derived from simulation studies so that the average measured energy of jets becomes identical to that of particle-level jets. In situ measurements of the momentum balance in dijet, photon+jet, Z+jet, and multijet events are used to determine any residual differences between the jet energy scale in data and in simulation, and appropriate corrections are made [33]. The jet energy resolution amounts typically to 15-20% at 30 GeV, 10% at 100 GeV, and 5% at 1 TeV [33], and the energy of jets in simulated samples is corrected such that the energy resolution agrees with data.
The missing transverse momentum vector $\vec{p}_\mathrm{T}^\mathrm{\,miss}$ is computed as the negative vector $p_\mathrm{T}$ sum of all the PF candidates in an event. Its magnitude is denoted as $p_\mathrm{T}^\mathrm{miss}$. Anomalous high-$p_\mathrm{T}^\mathrm{miss}$ events can be due to a variety of reconstruction failures, detector malfunctions, or noncollision backgrounds. Such events are rejected by event filters that are designed to identify more than 85-90% of the spurious high-$p_\mathrm{T}^\mathrm{miss}$ events with a mistagging rate less than 0.1% [34]. The $\vec{p}_\mathrm{T}^\mathrm{\,miss}$ is modified to account for corrections to the energy scale of the reconstructed jets in the event.
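The definition above can be sketched in a few lines. This is only an illustration of the negative vector sum: the function name `missing_pt` and the representation of each PF candidate as a (pT, φ) pair are assumptions made for the example, and the jet energy scale propagation mentioned in the text is not included.

```python
import math

def missing_pt(candidates):
    """Negative vector sum of candidate transverse momenta.

    candidates: iterable of (pt, phi) pairs, pt in GeV and phi in radians
    (a hypothetical, simplified stand-in for PF candidates).
    Returns (px_miss, py_miss, magnitude).
    """
    px = -sum(pt * math.cos(phi) for pt, phi in candidates)
    py = -sum(pt * math.sin(phi) for pt, phi in candidates)
    return px, py, math.hypot(px, py)

# Two back-to-back candidates with unbalanced momenta.
px_miss, py_miss, met = missing_pt([(30.0, 0.0), (10.0, math.pi)])
```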
Events of interest are selected using a two-tiered trigger system. The first level (L1), composed of custom hardware processors, uses information from the calorimeters and muon detectors to select events at a rate of around 100 kHz within a fixed latency of about 4 µs [35]. The second level, known as the high-level trigger, consists of a farm of processors running a version of the full event reconstruction software optimized for fast processing, and reduces the event rate to around 1 kHz before data storage [36].

Simulated samples
Signal and background processes are simulated using the Monte Carlo (MC) method with different matrix element generators. Simulations with conditions appropriate for the 2016 data are generated using the NNPDF 3.0 parton distribution function (PDF) set at leading order (LO) or next-to-LO (NLO) [37]. Background simulations with 2017-2018 conditions are generated using the NNPDF 3.1 PDF set at next-to-NLO (NNLO) [38], and signal simulations for 2017-2018 conditions use the PDF4LHC15 PDF set at NLO [39].
The POWHEG v2 generator [40][41][42] is used to simulate tt [43], ttH, and most single top quark production [44,45], as well as WZ (2016 and 2018) and ZZ production [46,47]. The MADGRAPH5_aMC@NLO generators are used at LO [50] to simulate Drell-Yan and W boson production with leptonic decays and up to four additional partons, as well as production of multijet events from quantum chromodynamics (QCD) interactions. They are also used at NLO to simulate ttW, ttZ, tttt, W$^{+}$W$^{+}$, and triboson production, as well as single top quark s-channel production with 2016 and 2018 conditions, and WZ production with 2017 conditions. The FxFx [51] matching scheme is applied to the ttW and WZ samples, and the ttW samples are also interfaced with MADSPIN.
An alternate sample of tt production for neural network training is also simulated at LO with MADGRAPH5_aMC@NLO.
Pair production of vector-like T and B quarks is simulated at LO using MADGRAPH5_aMC@NLO 2.2.2 (2016) and 2.4.2 (2017-2018). Samples with masses in the range 0.9-1.8 TeV are generated with a narrow width, chosen to be 10 GeV. The theoretical cross sections of vector-like T and B quark pair production via the strong interaction, calculated at NNLO with the TOP++2.0 program [52], range from 90 ± 4 fb at 0.9 TeV to $0.39^{+0.04}_{-0.03}$ fb at 1.8 TeV. All MC generators are interfaced with PYTHIA 8 [53] to simulate the parton showering and underlying event kinematics, with versions 8.212, 8.226, and 8.230 used for 2016, 2017, and 2018 conditions, respectively. In the 2016 simulation, the CUETP8M1 underlying event tune [54] is applied to all processes except top quark production processes, which use the dedicated tune CUETP8M2T4 [55]. In the 2017-2018 simulation, the tune CP5 is applied to all processes [56]. PYTHIA 8 is also used as a generator to simulate diboson production at LO in the single-lepton channel. The GEANT4 [57] program is used to simulate the response of the CMS detector. The effects of pileup are included in the simulation, and for each year simulated events are weighted to ensure that the mean number of interactions per bunch crossing agrees with that observed in the data.
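For orientation, the quoted cross sections can be converted into the number of VLQ pairs expected in the data set before any branching fraction, acceptance, or selection efficiency is applied, using N = σ × L. The sketch below uses only the central cross section values and the integrated luminosity given in the text; the function name `expected_pairs` is illustrative.

```python
LUMINOSITY_FB_INV = 138.0  # integrated luminosity in fb^-1 (2016-2018)

def expected_pairs(cross_section_fb, luminosity_fb_inv=LUMINOSITY_FB_INV):
    """Expected number of produced pairs, N = sigma * L, before any
    branching fraction, acceptance, or efficiency is applied."""
    return cross_section_fb * luminosity_fb_inv

n_low_mass = expected_pairs(90.0)    # VLQ mass of 0.9 TeV
n_high_mass = expected_pairs(0.39)   # VLQ mass of 1.8 TeV
print(round(n_low_mass), round(n_high_mass, 1))
```

The steep fall from roughly twelve thousand produced pairs at 0.9 TeV to a few tens at 1.8 TeV is why the selection at high mass must keep signal efficiency as large as possible.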
For the single-lepton channel, background simulations are grouped into "QCD", "TOP", and "EW" categories. The QCD category consists of multijet production; the TOP category consists of tt, single t, ttH, and ttV processes, where V denotes W and Z; and the EW category consists of W+jets, Drell-Yan, and diboson processes. For the SS dilepton and multilepton channels, the "ttV" group consists of ttW and ttZ processes, and the "VV(V)" group consists of all vector diboson and triboson processes.

Physics object and event selection
The primary vertex (PV) is taken to be the pp collision vertex corresponding to the hardest scattering in the event, evaluated using tracking information alone, as described in Section 9.4.1 of Ref. [58]. Events selected for this search are required to have at least one reconstructed vertex with longitudinal position |z| < 24 cm and radial position r < 2 cm relative to the mean collision point and the nominal beam axis, respectively.
All selected events must have at least one electron or one muon candidate. For the single-lepton channel, events are required to have satisfied a trigger requiring the presence of an electron or muon. The primary trigger selected events with an electron or muon of $p_\mathrm{T}$ > 15 GeV that is very loosely isolated from surrounding energy deposits. This trigger also required transverse hadronic energy greater than 350 GeV (2016) or 450 GeV (2017-2018). Additional triggers selected events with a muon of $p_\mathrm{T}$ > 50 GeV, or an isolated electron with $p_\mathrm{T}$ > 32 (38) GeV in 2016 (2017-2018) data. In 2017-2018 data, the SS dilepton and multilepton channels required events to have passed dielectron, dimuon, or electron-muon triggers. For the dielectron trigger, the leading electron was required to have $p_\mathrm{T}$ > 23 GeV and the subleading electron $p_\mathrm{T}$ > 12 GeV. The dimuon triggers selected events with two muons of $p_\mathrm{T}$ > 17 and 8 GeV, respectively. For triggers selecting mixed-flavor events, the leading lepton was required to have $p_\mathrm{T}$ > 23 GeV and the subleading electron (muon) $p_\mathrm{T}$ > 12 (8) GeV.
During 2016 and 2017, a gradual shift in the timing of the inputs of the ECAL L1 trigger in the region |η| > 2.0 caused a specific trigger inefficiency. The effect is a function of $p_\mathrm{T}$, η, and time, and for events containing an electron (a jet) with $p_\mathrm{T}$ larger than ≈50 (≈100) GeV in the region 2.5 < |η| < 3.0, the efficiency loss is 10-20%. To model this effect in simulation, correction factors are computed from data.
Reconstructed electrons are required to satisfy several quality criteria on observables such as the shower shape and the ratio of energy deposition in the ECAL to that in the HCAL [59]. A multivariate discriminator is used to determine the quality of reconstructed electrons with two different degrees of stringency: "tight" and "loose", with average electron identification efficiencies of 90% and 98%, respectively. Misidentification rates are in the ranges 2-3% for tight identification and 5-15% for loose identification, depending on the electron's η [60]. The reconstructed electrons are required to lie within the range |η| < 2.5, excluding the barrel-endcap interface region of 1.44 < |η| < 1.57. During 2018 data taking, a detector failure in the HCAL caused jets to be misidentified as electrons, so electrons within the affected region (a range of 1.02 in η and 0.65 in azimuthal angle φ) are rejected in both the 2018 data and the corresponding simulation.
In the SS dilepton channel, different measurements of each electron's charge must be consistent. Three measurements are considered: two are based on electron track reconstruction, and a third is based on the φ angle difference between the pixel detector seed hits of the electron track and the linked ECAL cluster [61]. For each electron, the two charge measurements based on track reconstruction must agree, and for electrons with $p_\mathrm{T}$ < 100 GeV all three measurements must agree.
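The charge-consistency requirement can be written as a small predicate. This is a sketch of the logic as stated in the text only; the function and argument names are hypothetical, and each charge is assumed to be supplied as +1 or -1.

```python
def charge_is_consistent(q_track_a, q_track_b, q_pixel_ecal, pt):
    """Electron charge consistency for the SS dilepton channel.

    q_track_a, q_track_b: the two track-based charge measurements (+1 or -1).
    q_pixel_ecal: charge from the pixel-seed to ECAL-cluster phi difference.
    pt: electron transverse momentum in GeV.
    """
    if q_track_a != q_track_b:
        return False  # the two track-based measurements must always agree
    if pt < 100.0:
        return q_pixel_ecal == q_track_a  # below 100 GeV all three must agree
    return True

low_pt_ok = charge_is_consistent(+1, +1, +1, pt=60.0)
low_pt_bad = charge_is_consistent(+1, +1, -1, pt=60.0)
high_pt_ok = charge_is_consistent(+1, +1, -1, pt=150.0)
```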
Muons are selected at two quality levels, "tight" and "loose", based on the number of muon chamber hits, the inner track impact parameter with respect to the PV, and the track fit $\chi^2$. The tight level has an efficiency of 95-99% for identifying reconstructed muons and the loose level is over 99% efficient, with misidentification rates for hadrons below 0.5% in all cases [62]. Selected muons are required to be located within the detector acceptance region of |η| < 2.4.
To reduce background from leptons produced within jets, usually from semileptonic decays of hadrons, an isolation requirement is applied to leptons using an observable $I_\mathrm{mini}$. This quantity is defined as the scalar $p_\mathrm{T}$ sum of charged hadron, neutral hadron, and photon PF candidates within a cone in η-φ space around the lepton, corrected for pileup using an effective area method [63], and then divided by the lepton $p_\mathrm{T}$. The cone size depends on the lepton $p_\mathrm{T}$ and has radius R, defined as
$$R = \frac{10\ \mathrm{GeV}}{\min(\max(p_\mathrm{T},\ 50\ \mathrm{GeV}),\ 200\ \mathrm{GeV})}. \qquad (1)$$
Using a $p_\mathrm{T}$-dependent cone radius improves the efficiency of selecting isolated leptons from highly Lorentz-boosted particles, such as VLQ decay products. Tight (loose) leptons require $I_\mathrm{mini}$ < 0.1 (0.4), with efficiencies >95% (98%) for both lepton flavors. To account for observed small differences in reconstruction, identification, and isolation between data and simulation, the simulation is corrected by factors estimated from data using the "tag-and-probe" method [62].
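The cone radius of Eq. (1) and the isolation thresholds can be illustrated with a short sketch. The function names are hypothetical, and the pileup-corrected scalar pT sum inside the cone is taken as a ready-made input rather than computed from PF candidates.

```python
def mini_iso_cone_radius(lepton_pt):
    """Cone radius from Eq. (1): 10 GeV / min(max(pT, 50 GeV), 200 GeV)."""
    return 10.0 / min(max(lepton_pt, 50.0), 200.0)

def mini_isolation(lepton_pt, corrected_pt_sum):
    """I_mini: pileup-corrected scalar pT sum in the cone over the lepton pT.
    corrected_pt_sum is assumed to be precomputed (hypothetical input)."""
    return corrected_pt_sum / lepton_pt

def passes_isolation(i_mini, tight=True):
    """Tight leptons require I_mini < 0.1, loose leptons I_mini < 0.4."""
    return i_mini < (0.1 if tight else 0.4)

for pt in (20.0, 100.0, 500.0):
    print(pt, mini_iso_cone_radius(pt))
```

The radius is 0.2 for any lepton below 50 GeV, shrinks as 10 GeV/pT between 50 and 200 GeV, and stays at 0.05 above 200 GeV, so a lepton from a boosted decay is not vetoed by nearby decay products.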
Small-radius (large-radius) jets with $p_\mathrm{T}$ > 30 (200) GeV and |η| < 2.4 are selected if they pass selection criteria designed to remove jets dominated by instrumental effects or reconstruction failures [64]. Additionally, tight (single-lepton channel) or loose (SS dilepton and multilepton channels) leptons are removed from small-radius (large-radius) jets if their angular separation from the jet axis $\Delta R = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2}$ < 0.4 (0.8), i.e., if the leptons lie within the jet distance parameter of the jet axis. Leptons are removed by subtracting the four-momentum of the matched lepton candidate from the four-momentum of the jet. Jet energy corrections are applied after lepton removal.
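The lepton-removal step can be sketched as follows. The dictionary layout (keys `eta`, `phi`, `p4`), the (E, px, py, pz) ordering, and the function names are assumptions made for this example; the subsequent jet energy corrections are not modeled.

```python
import math

def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into [-pi, pi)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi

def delta_r(eta1, phi1, eta2, phi2):
    """Angular separation sqrt(d_eta^2 + d_phi^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))

def clean_jet(jet, lepton, radius):
    """Return the jet four-momentum (E, px, py, pz) after lepton removal.

    The lepton four-momentum is subtracted only if the lepton lies within
    the jet distance parameter (0.4 or 0.8) of the jet axis.
    """
    if delta_r(jet["eta"], jet["phi"], lepton["eta"], lepton["phi"]) < radius:
        return tuple(j - l for j, l in zip(jet["p4"], lepton["p4"]))
    return jet["p4"]

jet = {"eta": 0.5, "phi": 1.0, "p4": (100.0, 60.0, 0.0, 80.0)}
near = {"eta": 0.6, "phi": 1.1, "p4": (20.0, 12.0, 0.0, 16.0)}
far = {"eta": 0.6, "phi": 3.0, "p4": (20.0, 12.0, 0.0, 16.0)}
cleaned = clean_jet(jet, near, radius=0.4)
untouched = clean_jet(jet, far, radius=0.4)
```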
Small-radius jets are tagged as coming from a bottom quark using the DEEPJET algorithm [65]. The working point used here provides 60-85% efficiency for identifying bottom quark jets, varying with jet $p_\mathrm{T}$, while misidentifying only 15-25% of charm quark jets and 1-7% of light-quark or gluon jets as bottom quark jets. Corrections based on the jet $p_\mathrm{T}$ are applied to account for the differences between efficiencies in data and simulation [66]. Small-radius jets with $p_\mathrm{T}$ > 30 GeV and |η| < 2.4 are considered in this search.
A large-radius jet may contain all the products of the hadronic decay of a top quark, or H, W, or Z boson that has been produced in the decay of a heavy VLQ and is therefore highly boosted. The DEEPAK8 [67] algorithm is used to identify the most probable parent particle of each large-radius jet as either a top or bottom quark, H, Z, or W boson, or light quark/gluon. The identification is made by summing the DEEPAK8 scores for all individual decays of a specific massive particle, and determining which massive particle hypothesis has the largest score for each jet. In the 1.4 TeV VLQ signal simulation this tagging method has efficiencies of 54-74% for identifying SM bosons, 76% efficiency for identifying top quarks, 30% efficiency for identifying bottom quarks, and 60% efficiency for identifying light quarks or gluons. The most significant sources of misidentification cause ≈25% of Z bosons to be identified as W bosons, and 28% of bottom quarks to be identified as Higgs bosons. Variations in the identification efficiency for each massive particle based on jet $p_\mathrm{T}$ or simulated VLQ mass range from 5 to 25%, and are within the assigned uncertainties described in Section 9.2. Large-radius jets are also characterized using the ratio between the 2-subjettiness and 1-subjettiness variables [68], $\tau_{21} = \tau_2/\tau_1$, and the "groomed" jet mass. The groomed jet mass is calculated after applying a modified mass-drop algorithm [69,70], known as the "soft-drop" algorithm [71], to large-radius jets using parameters β = 0, $z_\mathrm{cut}$ = 0.1, and $R_0$ = 0.8. Large-radius jets with $p_\mathrm{T}$ > 200 GeV and |η| < 2.4 are considered in this search.
We define an observable $H_\mathrm{T}$ as the scalar $p_\mathrm{T}$ sum of all selected small-radius jets, $H_\mathrm{T}^\mathrm{lep}$ as the scalar sum of $H_\mathrm{T}$ and the $p_\mathrm{T}$ of all tight leptons, and $S_\mathrm{T}$ as the sum of $H_\mathrm{T}^\mathrm{lep}$ and $p_\mathrm{T}^\mathrm{miss}$. These quantities provide a measure of the total event energy, which is typically larger for VLQ signal events than SM background events.
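These three event-level sums follow directly from their definitions, as in this minimal sketch. The function name `event_sums` and the flat lists of transverse momenta are illustrative assumptions; the inputs are taken to be objects that already pass the selections described above.

```python
def event_sums(jet_pts, tight_lepton_pts, met):
    """Return (H_T, H_T^lep, S_T) in GeV.

    jet_pts: pT of the selected small-radius jets.
    tight_lepton_pts: pT of the tight leptons.
    met: magnitude of the missing transverse momentum.
    """
    ht = sum(jet_pts)
    ht_lep = ht + sum(tight_lepton_pts)
    st = ht_lep + met
    return ht, ht_lep, st

ht, ht_lep, st = event_sums([300.0, 200.0, 100.0], [60.0], 80.0)
```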

Analysis strategy
The present search features three leptonic channels with sensitivity to different potential VLQ decays.
The single-lepton channel, described in Section 6, provides broad sensitivity to all TT decays, as well as sensitivity to B quark decays to tW. The decay of VLQ pairs to third-generation quarks and SM bosons produces two bottom or top quarks, and two W, Z, or Higgs bosons. In the single-lepton final state, one of these top quarks or W bosons decays leptonically and produces the charged lepton and a neutrino, while the other three initial products decay hadronically and produce large-radius jets. The parent particles of the large-radius jets can be identified using the DEEPAK8 algorithm. In this channel, VLQ candidates are reconstructed from the lepton, $p_\mathrm{T}^\mathrm{miss}$, and large-radius jets, and a multilayer perceptron (MLP) neural network [72] is trained to identify events as tt background, W+jets background, or VLQ signal events. Events are categorized by lepton flavor, electron or muon, and then based on the particle identification of the VLQ candidates' decay products.
The SS dilepton channel, described in Section 7, is primarily sensitive to VLQ pair production with T → tH (with H → WW) and B → tW decays. With up to six W bosons produced (including those from the top quark decays), two same-sign W bosons can decay leptonically to produce two final-state leptons with the same electric charge. Events are categorized by lepton flavor combinations.
The multilepton channel, described in Section 8, is primarily sensitive to contributions from T → tZ and B → tW decays. Leptonic decays of these Z or W bosons, combined with possible leptonic decays of the W bosons from the decay of the top quarks, can produce three or more leptons, a rare final state in SM processes. The high-energy signature of the VLQ signal in the $H_\mathrm{T}^\mathrm{lep}$ and $S_\mathrm{T}$ distributions is used to discriminate the signal from the background in the SS dilepton and multilepton channels, respectively. Events are categorized by lepton flavor combinations.
Table 1 summarizes the main event selection criteria used to form control regions (CRs) and signal regions (SRs) for the three channels, beyond the common selection criteria presented in Section 4. In the single-lepton channel all 2016-2018 data are analyzed according to the methods presented in this paper. In the SS dilepton and multilepton channels, data from 2017-2018 are analyzed using the methods described here, while data from 2016 are reproduced from a previous search with the same analysis strategy [21]. Template histograms from a variety of kinematic variables are taken from the SRs of all three channels, as well as some CRs in order to constrain uncertainties in the background estimation. The template histograms are combined in a maximum likelihood fit, described in Section 10, to determine the presence of signal.
Table 1: Summary of event selection criteria for the primary CRs and SRs in the three search channels. The label "OSSF" refers to opposite-sign charge, same-flavor lepton pairs, and the phrase "max MLP" refers to the largest score from the single-lepton MLP network.
[Table 1 body not recoverable from the source. Column headers: Channel; Event selection (Overall, CR, SR). Surviving cell fragments: "2 small-radius jets", "≥ 3 small-radius jets".]

6 Single-lepton channel
Events selected in the single-lepton channel must have exactly one electron or muon with $p_\mathrm{T}$ > 55 GeV that passes the tight selection requirements described in Section 4. No additional charged leptons passing the loose selection requirements with $p_\mathrm{T}$ > 10 GeV are permitted. Since this channel targets signal events with a leptonic W boson decay, selected events must have $p_\mathrm{T}^\mathrm{miss}$ > 50 GeV, which rejects multijet background events. Additionally, at least three large-radius jets are required. Large-radius jets are discarded if they lie within ∆R < 0.8 from the charged lepton and the component of their momentum in the direction perpendicular to the lepton's momentum is less than 20 GeV. These selection criteria are summarized in Table 1.
In the single-lepton channel the dominant SM background processes are W+jets production and tt production. To improve the background estimate in this channel, discrepancies in the modeling of the $H_\mathrm{T}$ distribution in the W+jets samples are corrected by applying a scaling function described in Ref. [21]. Corrections to the modeling of the tt simulation are derived in the $H_\mathrm{T}$ distribution of the tt-enriched CR described in Section 6.2 below.
The VLQ candidates are reconstructed by first identifying a "leptonic particle" candidate. The W boson is reconstructed from the charged lepton and $p_\mathrm{T}^\mathrm{miss}$, applying a W boson mass constraint. The z-component of the missing momentum is taken to be the real part of the solution that results in a W boson mass nearest to 80.2 GeV. If the minimum mass that can be formed by pairing this W boson with any small-radius jet is less than 150 GeV (a value chosen from studies performed on signal events), it is likely that this W boson is the decay product of an SM top quark. In this case, the leptonic particle candidate is formed from the W boson and either a b-tagged small-radius jet within ∆R < 0.8 or the small-radius jet that yielded the minimum mass pairing, if no such b-tagged small-radius jet is found (likely because of its misidentification as a light quark or gluon jet). Otherwise, the leptonic particle candidate is the W boson itself.
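A common way to apply a W boson mass constraint, sketched below, is to solve the resulting quadratic equation for the neutrino longitudinal momentum and keep only the real part when the discriminant is negative. This is an illustration under stated assumptions and not necessarily the exact procedure used in the analysis: the charged lepton is treated as massless, the function names are hypothetical, and when two real solutions reproduce the mass equally well the one with smaller magnitude is chosen, which is a convention adopted only for this example.

```python
import math

W_MASS = 80.2  # GeV, the constraint value quoted in the text

def neutrino_pz_solutions(lep, met_x, met_y, m_w=W_MASS):
    """Solve the W mass constraint for the neutrino pz.

    lep = (px, py, pz) of the charged lepton, treated as massless (assumption).
    Returns both roots; for a negative discriminant the imaginary part is
    dropped, so both entries equal the real part.
    """
    lx, ly, lz = lep
    lep_e = math.sqrt(lx * lx + ly * ly + lz * lz)
    lep_pt2 = lx * lx + ly * ly
    a = 0.5 * m_w * m_w + lx * met_x + ly * met_y
    disc = a * a - lep_pt2 * (met_x * met_x + met_y * met_y)
    root = lep_e * math.sqrt(disc) if disc > 0.0 else 0.0
    return (a * lz + root) / lep_pt2, (a * lz - root) / lep_pt2

def w_mass(lep, met_x, met_y, nu_pz):
    """Invariant mass of the massless lepton plus neutrino system."""
    lx, ly, lz = lep
    lep_e = math.sqrt(lx * lx + ly * ly + lz * lz)
    nu_e = math.sqrt(met_x * met_x + met_y * met_y + nu_pz * nu_pz)
    m2 = ((lep_e + nu_e) ** 2 - (lx + met_x) ** 2
          - (ly + met_y) ** 2 - (lz + nu_pz) ** 2)
    return math.sqrt(max(m2, 0.0))

def choose_neutrino_pz(lep, met_x, met_y, m_w=W_MASS):
    """Pick the root giving a W mass nearest m_w; ties go to smaller |pz|
    (the tie-break is an assumption of this sketch)."""
    roots = neutrino_pz_solutions(lep, met_x, met_y, m_w)
    return min(roots, key=lambda pz: (
        round(abs(w_mass(lep, met_x, met_y, pz) - m_w), 6), abs(pz)))

pz_real = choose_neutrino_pz((40.0, 0.0, 0.0), -40.0, 0.0)
pz_complex = neutrino_pz_solutions((40.0, 0.0, 30.0), -60.0, 0.0)
```

In the second call the transverse mass of the lepton and missing momentum exceeds the constraint value, so no real solution exists and both returned entries are the real part of the complex roots.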
Two VLQ candidates can be formed in events that contain at least three large-radius jets that are separated from the leptonic particle candidate by ∆R > 0.8, to ensure that the leptonic particle candidate and each jet represent unique decay products from the VLQ pair. A parent particle hypothesis is provided for each jet from the DEEPAK8 tag. Wherever possible, VLQ candidates are formed from the leptonic particle candidate and three large-radius jets according to the expected VLQ decay modes: bW, tZ, tH, tW, bZ, or bH. Events with two VLQ candidates matching the expected decay modes are observed in simulation to have well-reconstructed VLQ mass peaks, particularly for hadronic VLQ candidates, and are used to form "high-purity" (HP) categories. Events with VLQ candidates that do not match the expected decay modes are used to form "low-purity" (LP) categories. In these events, the leptonic particle candidate and large-radius jets are paired so as to minimize the mass difference between the resulting VLQ candidates.
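For the LP categories, the pairing that minimizes the mass difference can be found by trying the three ways of assigning one large-radius jet to the leptonic particle candidate and the remaining two to the hadronic VLQ candidate. The sketch below illustrates that idea under the assumption that exactly three jets enter the pairing; the helper names and the (E, px, py, pz) tuple layout are choices made for this example.

```python
import math

def add_p4(*vectors):
    """Component-wise sum of four-momenta given as (E, px, py, pz)."""
    return tuple(sum(components) for components in zip(*vectors))

def mass(p4):
    """Invariant mass, with small negative m^2 from rounding clipped to 0."""
    e, px, py, pz = p4
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

def pair_min_mass_difference(leptonic, jets):
    """Return (jet index paired with the leptonic candidate,
    leptonic-side VLQ mass, hadronic-side VLQ mass)."""
    best = None
    for i in range(3):
        others = [jets[j] for j in range(3) if j != i]
        m_lep = mass(add_p4(leptonic, jets[i]))
        m_had = mass(add_p4(*others))
        diff = abs(m_lep - m_had)
        if best is None or diff < best[0]:
            best = (diff, i, m_lep, m_had)
    return best[1:]

# Toy massless four-momenta in GeV, chosen only to exercise the function.
leptonic = (100.0, 100.0, 0.0, 0.0)
jets = [(100.0, -100.0, 0.0, 0.0), (50.0, 30.0, 40.0, 0.0), (80.0, 0.0, 0.0, 80.0)]
index, m_lep_side, m_had_side = pair_min_mass_difference(leptonic, jets)
```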

Multilayer perceptron network
The HP event categories have very high background rejection rates, but also low signal acceptance rates. The majority of simulated signal events are reconstructed as LP events because of misidentification of the decay products. To maximize the potential sensitivity of the LP events, both the TT and BB analyses have an associated MLP, a type of fully connected neural network, with three output nodes, which is used to distinguish between either the TT or BB signal, the W+jets background, and the tt background. The MLPs are trained using events from W+jets and MADGRAPH5_aMC@NLO tt background simulation samples, and either the TT or BB signal sample with a VLQ mass of 1 TeV. They provide strong classification performance across the entire VLQ mass range considered in this search. The MADGRAPH5_aMC@NLO and POWHEG tt simulations were compared for all input distributions to the MLP and only negligible differences were observed.
The "training region" consists of single-lepton events with at least three large-radius jets that either do not have reconstructed VLQ candidates, or were categorized as LP events. Some of the LP events from each sample form a testing data set for evaluating network performance. For the signal and tt samples, which are not used elsewhere in this search, the training data sets contain as many of the remaining LP VLQ events as possible. The W+jets events are restricted to those without VLQ candidates, to maintain separation from the SRs described below in Section 6.2. Approximately equal-sized subsets of events from the three samples are provided for the training.
The MLP inputs include both event-level and jet-level observables that were chosen to maximize overall network accuracy and minimize the misidentification of tt events as signal events. Event-level observables include $H_\mathrm{T}$, $S_\mathrm{T}$, $p_\mathrm{T}^\mathrm{miss}$, the minimum angular separation between the highest-$p_\mathrm{T}$ large-radius jet and any other large-radius jet, and the numbers of small-radius jets, large-radius jets, and b-tagged small-radius jets. If a leptonically decaying top quark is reconstructed in the event, its $p_\mathrm{T}$, mass, and the angular separation between the W boson and the bottom quark are also included. Jet-level variables include the $p_\mathrm{T}$, the DEEPAK8 light quark or gluon score, and the N-subjettiness ratio $\tau_2/\tau_1$ of the three highest-$p_\mathrm{T}$ large-radius jets, as well as the soft-drop mass of the highest-$p_\mathrm{T}$ large-radius jet.
Figure 2: [...] The observed data are shown using black markers, predicted TT signal with mass of 1.2 (1.5) TeV in the singlet scenario using solid (dashed) lines, and backgrounds using filled histograms. Statistical and systematic uncertainties in the background prediction before performing the fit to data are shown by the hatched region. The lower panels show the difference between the data and the background estimate as a multiple of the total uncertainty in both sources. The signal predictions have been scaled for visibility by the factors indicated in the figures.
Modeling of the DEEPAK8 light quark or gluon score is studied in a validation region containing events with only two large-radius jets, in which the predicted signal is negligible compared to the background. The model is improved by a binned shape correction for the 2017 and 2018 simulations, taken from the data-to-background ratios in the light quark or gluon score distributions. This ratio ranges from 1.2 in jets with lower scores to 0.7 in jets with very high scores.
The correction is applied to the simulation by weighting events by the product of the correction for the three leading large-radius jets, normalized so that the cross section of each simulated sample remains unchanged.
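The normalization step can be made concrete with a small sketch: each event receives the product of the three per-jet correction factors, and a common factor then restores the original sum of weights of the sample. The function name and inputs are hypothetical, and the sketch assumes the normalization is computed over all events of the sample before any further selection.

```python
def corrected_weights(per_jet_factors, nominal_weights):
    """Apply per-jet shape corrections while preserving the sample yield.

    per_jet_factors: one (f1, f2, f3) tuple per event, the binned correction
        for each of the three leading large-radius jets.
    nominal_weights: nominal event weights of the simulated sample.
    """
    raw = [f1 * f2 * f3 for f1, f2, f3 in per_jet_factors]
    total_nominal = sum(nominal_weights)
    total_corrected = sum(w * r for w, r in zip(nominal_weights, raw))
    norm = total_nominal / total_corrected  # keeps the cross section fixed
    return [w * r * norm for w, r in zip(nominal_weights, raw)]

# Toy example using the extreme ratio values quoted in the text.
weights = corrected_weights([(1.2, 1.0, 1.0), (0.7, 1.0, 1.0)], [1.0, 1.0])
```

The correction therefore changes only the shape of the score distribution: events with low-score jets gain weight relative to events with very high-score jets, while the total predicted yield of the sample is unchanged.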
For jets produced by Higgs bosons, the DEEPAK8 light quark or gluon score distribution is calibrated using jets originating from g → bb fragmentation [73]. These jets are selected from samples of multijet data and background simulation using a boosted decision tree classifier so that their DEEPAK8 light quark or gluon score distribution resembles the distribution of H jets in the simulated 1.5 TeV TT sample. Corrections are derived by fitting the simulation to the data in several intervals of the DEEPAK8 light quark or gluon score. Correction factors range from 1.2-1.4 in jets with high scores to 0.8-0.9 in jets with very low scores, depending on the data-taking period. The corrections are applied only to events with jets originating from Higgs bosons. All MLP input observables and pairs of input observables are then tested using the full training region to ensure that the background simulations accurately model the observed data. Any bins expected to contain a significant signal component are removed from the test distributions. Figure 2 shows data and predictions in the training region for two example input variables, $S_\mathrm{T}$ and the DEEPAK8 light quark or gluon score for the highest-$p_\mathrm{T}$ large-radius jet, after all the corrections described here are applied. These observables show particularly strong separation between signal and background.
The MLP is implemented using the SCIKIT-LEARN platform [74] and has three fully connected hidden layers with 10 nodes each. It is trained using the "Adam" optimizer [75] to minimize a cross-entropy loss function, with rectified linear unit activation functions between hidden layers and a softmax activation function for the output layer. Training is halted when a validation sample, consisting of 10% of the training events, indicates that the prediction accuracy has reached a plateau. For the TT-(BB-)optimized MLP, the signal is identified with 92 (89)% efficiency, while 14 (11)% of tt events and 2% of W+jets events are misclassified as signal events. The three outputs of the MLP are labelled as the W+jets node score, the tt node score, and the VLQ node score.
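A network with the architecture described above could be configured in scikit-learn roughly as follows. This is a sketch under stated assumptions: the training data are synthetic, the class labels and the values of `n_iter_no_change`, `max_iter`, and `random_state` are illustrative choices, and the actual training configuration of the analysis is not given in the text. For multiclass problems `MLPClassifier` minimizes the cross-entropy (log-loss) and applies a softmax to the output layer.

```python
import random
from sklearn.neural_network import MLPClassifier

random.seed(0)
# Toy stand-in for the training sample: 3 classes, 4 made-up input features.
# Assumed label convention: 0 = W+jets, 1 = ttbar, 2 = VLQ signal.
X, y = [], []
for label, shift in enumerate([0.0, 1.5, 3.0]):
    for _ in range(200):
        X.append([random.gauss(shift, 1.0) for _ in range(4)])
        y.append(label)

clf = MLPClassifier(
    hidden_layer_sizes=(10, 10, 10),  # three hidden layers, 10 nodes each
    activation="relu",                # rectified linear units in hidden layers
    solver="adam",                    # Adam optimizer
    early_stopping=True,              # stop when the validation score plateaus
    validation_fraction=0.1,          # 10% of training events held out
    n_iter_no_change=10,              # plateau patience (assumed value)
    max_iter=500,
    random_state=0,
)
clf.fit(X, y)

# Each row holds the three node scores, which sum to one.
scores = clf.predict_proba(X[:5])
```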

Control and signal regions
The MLP predictions provide a powerful separation between signal and background in the single-lepton channel, and are used to define an SR and CRs for each signal hypothesis (TT or BB). The SR contains all HP VLQ events and LP VLQ events for which the VLQ node score was larger than either background score. The CRs contain LP events in which one of the background node scores was larger than the VLQ node score. Figure 3 shows the strong distinction between the shape of the signal and the background in the VLQ node score distribution in the SR, as well as the separation between the tt and W+jets background processes in the W+jets node score in the CRs.
Three mutually exclusive CRs are formed in this channel. A DEEPAK8 CR is constructed from all CR events in which the highest-$p_\mathrm{T}$ large-radius jet was tagged as a massive particle, along with a randomly chosen half of all other CR events. The remaining CR events are separated into either a tt-enriched CR or a W+jets-enriched CR, based on which of their MLP background node scores is larger. Distributions from the three CRs are included in the fit to data (described in Section 10) because they relate to regions in which the background modeling can be constrained. In the DEEPAK8 CR the observable is the distribution of DEEPAK8 jet tags, which provides information about the efficiencies and misidentification rates for DEEPAK8 massive particle tagging. In the tt and W+jets CRs the observable is the $H_\mathrm{T}$ distribution, which is very sensitive to the overall energy scale of the events. Figure 4 shows the data and simulation in the TT CR categories after the fit, and the corresponding event yields are listed in Table 2.
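The region definitions of this section can be summarized in a short sketch. The function signature, the dictionary keys, and the region labels are hypothetical; the random split is implemented here with a seeded generator, whereas the text does not say how the random half is chosen.

```python
import random

def assign_region(quality, scores, lead_jet_tagged, rng):
    """Assign one single-lepton event to the SR or to one of the three CRs.

    quality         : 'HP' or 'LP' (quality of the VLQ candidates)
    scores          : MLP node scores, keys 'vlq', 'ttbar', 'wjets'
    lead_jet_tagged : True if the leading large-radius jet carries a
                      massive-particle tag
    rng             : random.Random instance used for the 50% split
    """
    # SR: all HP events, plus LP events where the VLQ score is the largest.
    if quality == "HP" or scores["vlq"] > max(scores["ttbar"], scores["wjets"]):
        return "SR"
    # DeepAK8 CR: tagged leading jet, plus a random half of the other CR events.
    if lead_jet_tagged or rng.random() < 0.5:
        return "DeepAK8 CR"
    # Remaining CR events are split by the larger background node score.
    return "ttbar CR" if scores["ttbar"] > scores["wjets"] else "W+jets CR"

rng = random.Random(42)
r_hp = assign_region("HP", {"vlq": 0.1, "ttbar": 0.6, "wjets": 0.3}, False, rng)
r_lp_sig = assign_region("LP", {"vlq": 0.7, "ttbar": 0.2, "wjets": 0.1}, False, rng)
r_tagged = assign_region("LP", {"vlq": 0.2, "ttbar": 0.5, "wjets": 0.3}, True, rng)
r_other = assign_region("LP", {"vlq": 0.2, "ttbar": 0.5, "wjets": 0.3}, False, rng)
```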

Same-sign dilepton channel
Events with exactly two tight leptons with the same sign of electric charge are selected. They must have satisfied a dilepton trigger and the leading (subleading) lepton must have p T > 40 (30) GeV. Additionally, the events are required to contain at least four small-radius jets. Events are categorized based on the flavors of the two leptons: ee, eµ, or µµ.
To reject low mass dilepton resonances, it is required that the invariant mass of the SS lepton pair is greater than 20 GeV. To veto Z → ee decays with charge misidentification it is required that the invariant mass does not lie in the range 76-106 GeV, and further required that neither of the leptons forms a pair within this same mass range with any same-flavor loose lepton in the event. The SR consists of selected events with H lep T above 1000 GeV, and a CR is formed using events with H lep T below 1000 GeV. These selection criteria are summarized in Table 1.
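As an illustration, the selection and the SR/CR split can be written as a single function. The inputs are hypothetical precomputed quantities, the treatment of values exactly at the thresholds is an assumption, and the Z-window veto is applied to every SS pair following a literal reading of the text.

```python
def ss_dilepton_region(m_ll, loose_pair_masses, n_jets, h_lep_t):
    """Return 'SR', 'CR', or None (rejected) for a same-sign dilepton event.

    m_ll              : invariant mass of the SS tight-lepton pair (GeV)
    loose_pair_masses : invariant masses (GeV) of each tight lepton paired
                        with any same-flavor loose lepton in the event
    n_jets            : number of small-radius jets
    h_lep_t           : H_T^lep of the event (GeV)
    """
    def in_z_window(m):
        return 76.0 <= m <= 106.0

    if n_jets < 4:
        return None
    if m_ll <= 20.0:          # low-mass resonance veto
        return None
    if in_z_window(m_ll):     # Z -> ee with charge misidentification
        return None
    if any(in_z_window(m) for m in loose_pair_masses):
        return None
    return "SR" if h_lep_t > 1000.0 else "CR"
```

For example, `ss_dilepton_region(150.0, [], 4, 1200.0)` returns `"SR"`, while the same event with `h_lep_t=800.0` falls in the CR.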

Background modeling
Three categories of background are considered: prompt, nonprompt, and charge misidentification. Prompt background refers to SM processes with SS dilepton final states and is estimated using simulation. Such SM processes include VV, VVV, ttV, ttH, and tttt production. Nonprompt background refers to events with nonprompt leptons passing the tight lepton identification and/or jets misidentified as leptons.
Processes producing a pair of oppositely charged prompt leptons can also contribute to the background when the sign of the charge of one of the leptons has been misidentified. Because of the design of the CMS muon system, the charge misidentification rate is negligible for muons with $p_\mathrm{T}$ below the TeV scale [76], so this background is only significant in the ee and eµ categories. The charge misidentification rates derived vary between 0.005% and 5%, increasing with both $p_\mathrm{T}$ and $|\eta|$. The background from charge misidentification is then estimated by weighting opposite-sign dilepton events that pass all other event selection requirements by the charge misidentification rate per electron. For dielectron events, the cases in which either of the electrons is misidentified are considered.
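A possible form of the per-event weight is sketched below. The exact expression used in the analysis is not given in the text; here it is assumed that the weight is the probability for exactly one electron charge to be mismeasured, which for small rates reduces to the sum of the per-electron rates. The function name and inputs are hypothetical.

```python
def charge_misid_weight(electron_rates):
    """Weight for an opposite-sign event to be reconstructed as same-sign.

    electron_rates : charge misidentification rates of the electrons in the
                     event (one entry for e-mu events, two for ee events).
                     Muon charge misidentification is neglected.
    """
    if len(electron_rates) == 1:
        return electron_rates[0]
    p1, p2 = electron_rates
    # Either electron may flip; both flipping leaves the pair opposite-sign.
    return p1 * (1.0 - p2) + p2 * (1.0 - p1)

# Toy opposite-sign sample: the background estimate is the sum of the weights.
os_events = [[0.01], [0.01, 0.02], [0.0005, 0.003]]
estimate = sum(charge_misid_weight(rates) for rates in os_events)
```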
Nonprompt background is estimated from events with at least one loose lepton using "prompt rates" and "nonprompt rates", following the "matrix" method explained in Ref. [77]. The prompt rate, the probability of prompt loose leptons to pass the tight categorization, is measured using the tag-and-probe method in Drell-Yan events in data. The events used for this measurement are required to pass dilepton triggers and have a tight lepton with p T > 30 GeV and |η| < 2.4, as well as a second "probe" lepton that satisfies the loose requirements. The prompt rate is the fraction of probe leptons passing the tight requirements. The η-averaged prompt rate of muons is 0.928 ± 0.015 and 0.931 ± 0.011 in 2017 and 2018 data, respectively. For electrons, the prompt rate depends on p T and varies between 0.78 and 0.83 in both years.
The nonprompt rate is the probability of a nonprompt lepton or jet to pass the tight lepton identification. As a result of changes in trigger thresholds with respect to 2016, the method for evaluating nonprompt rates used in Ref. [21] has been superseded for 2017-2018 data by evaluating the lepton $p_\mathrm{T}$ distributions in the multilepton channel CR, as described in Section 8. Table 3.
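The role of the prompt and nonprompt rates can be illustrated with the one-lepton version of the matrix method. This is a deliberate simplification: the analysis applies the dilepton (and, in the multilepton channel, trilepton) form of the method from Ref. [77], and the function name and the numerical inputs below are made up for the example. With $p$ the prompt rate and $f$ the nonprompt rate, the loose and tight yields satisfy $N_\text{loose} = N_p + N_f$ and $N_\text{tight} = p N_p + f N_f$, which can be inverted for the nonprompt contribution.

```python
def nonprompt_tight_yield(n_loose, n_tight, p, f):
    """One-lepton matrix method.

    n_loose : number of leptons passing the loose selection (tight included)
    n_tight : number of those that also pass the tight selection
    p, f    : prompt and nonprompt rates (probability loose -> tight)

    Returns the estimated number of nonprompt leptons passing tight.
    """
    if p == f:
        raise ValueError("prompt and nonprompt rates must differ")
    n_nonprompt_loose = (p * n_loose - n_tight) / (p - f)
    return f * n_nonprompt_loose

# Closure check with made-up numbers: 900 prompt and 100 nonprompt loose
# leptons, p = 0.93 and f = 0.20.
n_tight = 0.93 * 900 + 0.20 * 100
estimate = nonprompt_tight_yield(1000, n_tight, p=0.93, f=0.20)
```

The closure check recovers the 20 nonprompt leptons that pass the tight selection in this toy sample.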

Multilepton channel
In the multilepton channel events with three or more tight leptons are selected for the SR. The leptons must have p T > 30 GeV, and the event must have passed a dilepton trigger. Events are categorized based on the flavors of the three leading leptons: eee, eeµ, eµµ, and µµµ. Events are required to have p miss T > 20 GeV, since at least one neutrino is expected in the final state, and at least three small-radius jets among which at least one is b tagged, as expected from top quark decays. To reduce background from low-mass resonance decays to leptons, it is required that the mass of any opposite-sign same-flavor lepton pairs in the event must be greater than 20 GeV. To optimize signal discrimination in the SR, it is further required that each lepton must have a p T component orthogonal to the closest jet that is greater than 8 GeV, and there must be at least one b-tagged jet with p T > 45 (50) GeV in 2017 (2018).
Events are also selected for a mutually exclusive CR. Events in this CR are required to have exactly three leptons that pass the loose requirements and exactly two small-radius jets. The CR is used to estimate the nonprompt background (as described in the next section), with the assumption that the nonprompt background contributes at the same level to the CR and the SR. This assumption is cross-checked in another mutually exclusive region with exactly

Background modeling
Two categories of background are considered: prompt and nonprompt. Prompt background, estimated from simulation, refers to SM processes that can produce multilepton final states, including VV, VVV, and ttV production. Nonprompt background is again estimated by the matrix method, extended to three leptons.
The prompt rates applied are identical to those used in the SS dilepton channel. Nonprompt rates are extracted by fitting the predicted lepton $p_\mathrm{T}$ distributions in the multilepton CR, shown in Fig. 5, through the minimization of the $\chi^2$ between the data and the total estimated background. The electron and muon nonprompt rates ($f_\mathrm{e}$ and $f_\mu$) are varied independently in steps of 0.01 from 0.0 to 0.5, producing a 2D $\chi^2$ distribution. The $\chi^2$ distribution is then converted to a 2D Gaussian probability distribution using the relation $P(f_\mu, f_\mathrm{e}) \propto \exp(-\chi^2/2)$. Uncertainties are determined by marginalizing the nonprompt rates individually, i.e., summing the distribution along one axis to produce a 1D Gaussian distribution for each nonprompt rate, with the width of the Gaussian assigned as the uncertainty. This ignores correlations between lepton flavors, which are considered by other systematic uncertainties.
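The scan and marginalization can be sketched as follows. The $\chi^2$ function here is a toy paraboloid with made-up minimum and widths, standing in for the comparison of data and predicted background in the CR, and the width of each marginal distribution is taken as its standard deviation rather than from a Gaussian fit; both are assumptions of this example.

```python
from math import exp, sqrt

def scan_nonprompt_rates(chi2, step=0.01, f_max=0.5):
    """Scan (f_mu, f_e) on a grid, convert chi2 to a probability, and
    return ((mean, width) for f_mu, (mean, width) for f_e)."""
    n = round(f_max / step) + 1
    grid = [i * step for i in range(n)]
    table = [[chi2(f_mu, f_e) for f_e in grid] for f_mu in grid]
    chi2_min = min(min(row) for row in table)
    # P(f_mu, f_e) proportional to exp(-chi2/2); subtracting the minimum
    # only changes the normalization and avoids numerical underflow.
    prob = [[exp(-0.5 * (c - chi2_min)) for c in row] for row in table]
    total = sum(sum(row) for row in prob)
    marg_mu = [sum(row) / total for row in prob]  # summed over f_e
    marg_e = [sum(prob[i][j] for i in range(n)) / total for j in range(n)]

    def mean_and_width(marginal):
        mean = sum(f * w for f, w in zip(grid, marginal))
        var = sum((f - mean) ** 2 * w for f, w in zip(grid, marginal))
        return mean, sqrt(var)

    return mean_and_width(marg_mu), mean_and_width(marg_e)

def toy_chi2(f_mu, f_e):
    """Made-up chi2 surface with minimum at f_mu = 0.10, f_e = 0.20."""
    return ((f_mu - 0.10) / 0.02) ** 2 + ((f_e - 0.20) / 0.03) ** 2

(mu_mean, mu_width), (e_mean, e_width) = scan_nonprompt_rates(toy_chi2)
```

For the toy surface the marginal widths reproduce the input values of 0.02 and 0.03, as expected when the two rates are uncorrelated.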
A cross-check of the nonprompt rate measurement is performed using a sample of simulated decays of top quark pairs in which both top quarks decay leptonically. Events are required to contain at least three leptons, of which two are prompt and one is nonprompt. The number of true nonprompt events in which the nonprompt lepton passes the tight identification is compared to the prediction made using the numbers of nonprompt leptons and the nonprompt rates. The discrepancies between predicted and observed events in this sample range from 1.3% to 7.5% across the 2017 and 2018 lepton flavor categories, and are incorporated in the nonprompt rate systematic uncertainties. The discrepancies are studied as a function of lepton $p_\mathrm{T}$ and no significant trends are observed.
In addition, to check that the nonprompt rates measured in the CR are applicable to the SR, a $\chi^2$ minimization is performed on the tt simulated samples, comparing the true nonprompt distributions to the predicted distributions in both the CR and the SR. The measured $f_\mathrm{e}$ is the same in both regions in 2017 and 2018, while $f_\mu$ deviates by 0.02 and 0.01 in the 2017 and 2018 samples, respectively.

Systematic uncertainties
Systematic uncertainties can affect both the normalization and the shape of the predicted background and signal distributions, and are summarized in Table 5.

Common uncertainties
Several uncertainties are found in common throughout the search and are correlated across the three analysis channels. The effects of all shape-based uncertainties are evaluated by varying inputs to the analysis by their respective uncertainties.
• Muon identification and isolation scale factors: uncertainties in these corrections are 2% per muon for identification and 1.5% per muon for isolation. In the single-lepton channel the uncertainties for the three data periods are combined in quadrature.
• Electron reconstruction and isolation: uncertainties in these corrections are 1% per electron for reconstruction and 1.5% per electron for isolation. In the single-lepton channel the uncertainties for the three data periods are combined in quadrature.
• Electron identification: the uncertainty in the correction is applied as a two-dimensional function of p T and η.
• Pileup correction: the uncertainty in the pileup weighting for simulation is evaluated by varying the total inelastic cross section ($\sigma_\mathrm{inel}$) of 69.2 mb by ±4.6% [81].
• L1 trigger timing: the uncertainty in this correction is applied as a two-dimensional function of $p_\mathrm{T}$ and $\eta$ to data from 2016 (in the single-lepton channel) and 2017.
• Jet energy scale and resolution: the uncertainties in these corrections affect both small-radius and large-radius jet momenta, and are propagated to the p miss T distribution and all observables calculated from jets. In the single-lepton channel, the LOWESS algorithm [82,83] is used to smooth the resulting shifted histograms.
• DEEPJET b tagging: uncertainties in these corrections are applied separately for bottom and charm quark tagging, and for light quark or gluon misidentification [66].
• Scale uncertainties: the uncertainty in the choice of the renormalization (µ R ) and factorization (µ F ) scales in the simulation is used to estimate the effect of not including higher-order matrix elements. Scale uncertainties are treated independently for each group of simulated physics processes described in Section 3. The uncertainty is computed in each bin of the final observable by varying µ R up and down by a factor of two, and also varying µ F up and down by a factor of two, including symmetric shifts of both scales, and forming an envelope from the seven resulting distributions. In Table 5 this envelope is summarized as "env(×2, ×0.5)". For the signal, the normalization impact of this uncertainty is scaled down to reflect only the effect of the analysis selection.
• PDFs: the PDF uncertainty is correlated across all samples that use the same PDF set. Specifically, uncertainties in the PDF sets applied in the simulation are treated separately for 2016 and 2017-2018 to account for the update from NNPDF 3.0 to NNPDF 3.1 PDF sets. Additionally, the 2017-2018 uncertainty is treated separately for signal, to which the PDF4LHC15 PDF set was applied. In the 2016 simulation, the NNPDF 3.0 uncertainty is computed bin-by-bin in the final observable from 100 PDF replicas using a quantile method to identify the RMS [39], summarized as "±RMS(replicas)" in Table 5. In the 2017-2018 simulation, Hessian uncertainties are summed in quadrature to form a total uncertainty in each bin ("±RMS(Hessian)"). For the signal, the normalization impact of the PDF uncertainty is also scaled down to reflect only the effect of the analysis selection.
The uncertainties in electron identification, jet energy scale, and jet energy resolution are uncorrelated across the 2016, 2017, and 2018 data periods. Other common uncertainties are correlated between run periods, unless described otherwise above.
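Two of the bin-by-bin constructions described in the list above, the scale envelope and the quadrature sum of Hessian variations, can be sketched as follows. The histograms are made-up two-bin examples, the function names are hypothetical, and it is assumed here that the seven distributions entering the envelope are the nominal template and its six scale variations.

```python
from math import sqrt

def envelope(nominal, variations):
    """Bin-wise upper and lower envelope of the nominal template and its
    variations (here: 1 nominal + 6 scale variations = 7 distributions)."""
    all_hists = [nominal] + variations
    upper = [max(vals) for vals in zip(*all_hists)]
    lower = [min(vals) for vals in zip(*all_hists)]
    return upper, lower

def hessian_quadrature(nominal, eigen_variations):
    """Per-bin total uncertainty from Hessian variations summed in quadrature."""
    return [sqrt(sum((var[i] - nominal[i]) ** 2 for var in eigen_variations))
            for i in range(len(nominal))]

nominal = [100.0, 50.0]
scale_variations = [            # made-up yields for the six (mu_R, mu_F) shifts
    [110.0, 52.0], [92.0, 47.0],    # mu_R x2, x0.5
    [104.0, 51.0], [97.0, 49.0],    # mu_F x2, x0.5
    [115.0, 54.0], [88.0, 45.0],    # both x2, both x0.5
]
upper, lower = envelope(nominal, scale_variations)

pdf_variations = [[101.0, 50.5], [99.5, 49.0], [100.0, 50.0]]
pdf_unc = hessian_quadrature(nominal, pdf_variations)
```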

Single-lepton uncertainties
Uncertainties affecting only the single-lepton channel are:
• Single-lepton triggers: the uncertainties in these corrections to the simulation are applied as 2D functions of $p_\mathrm{T}$ and $\eta$, independently for each lepton flavor.
• H T distribution correction for the W+jets background: the uncertainty in this correction is formed by repeating the fits with all points shifted up or down by their statistical uncertainties to form an envelope-like uncertainty. In Table 5 this procedure is summarized as "env(upper, lower fits)".
• H T distribution correction for the tt background: the uncertainty in this correction is calculated using the fit parameter covariance matrix and depends on the value of H T in each event.
• DEEPAK8 light quark or gluon score corrections: for jets not associated with a generator-level H, a single-sided uncertainty applied to 2017-2018 simulation is formed by turning the correction on and off. The uncertainty in the H jet correction is evaluated by shifting the correction factors by their uncertainty. For all jets the value of this uncertainty varies, based on the DEEPAK8 light quark or gluon score in each event.
• DEEPAK8 heavy-particle tagging and misidentification: independent uncertainties in the efficiency for the DEEPAK8 algorithm to identify correctly or incorrectly each massive particle within a jet are formed in order to perform an in situ correction of any differences in efficiency between data and simulation. To form a large input uncertainty for correct and incorrect identification of each flavor, the DEEPAK8 tag of each large-radius jet is compared to its "true" parent particle, which is determined by spatial matching between the jets and generated particles. For each jet, the relevant flavor's correct or incorrect identification uncertainty is incremented by a large value (25%). The misidentification uncertainties are constrained significantly in the fit for all jet flavors, and the correct identification uncertainties are constrained for all flavors except H and Z bosons, which are very rare in the background events.
Of these uncertainties, the trigger scale factor uncertainties, DEEPAK8 light quark or gluon score uncertainty for H jets, and DEEPAK8 heavy-particle misidentification uncertainties are treated independently for each data period. The dominant uncertainties in the single-lepton background predictions are the renormalization and factorization scale uncertainties, and the signal predictions are most sensitive to the DEEPAK8 heavy-particle misidentification uncertainties.

SS dilepton uncertainties
Uncertainties affecting the SS dilepton channel are:
• Dilepton triggers: uncertainties in the trigger scale factors for each lepton flavor combination are applied as 2D functions of lepton $p_\mathrm{T}$ and $\eta$.
• Charge misidentification: the uncertainty in the charge misidentification background has two contributions. The first is the statistical uncertainty in the minimization procedure. The second arises from systematic differences in event topology between the Drell-Yan process used to measure the misidentification rate and the tt-like SR. The latter is determined from simulation by counting the number of tight electrons with p T > 30 GeV and an incorrectly reconstructed charge. Comparing the rates measured in the Drell-Yan and tt samples, the uncertainties in the charge misidentification yields are 31% and 59% for 2017 and 2018 data, respectively.
• Prompt rates: uncertainties in the nonprompt background estimate due to the prompt rate measurement are evaluated by varying the prompt rates by their measurement uncertainty separately for each flavor.
• Nonprompt rates: uncertainties in the nonprompt background estimate due to the nonprompt rates are estimated by recalculating the nonprompt background in the H lep T < 1000 GeV region, with the differences between the best fitted values in H lep T < 1000 GeV and those in the multilepton CR taken as one-sided uncertainties affecting the shape of the nonprompt background estimate.
With the exception of the trigger scale factor uncertainties, all of the SS dilepton uncertainties are treated independently for each data period, including the uncertainties affecting 2016 data that are unchanged from the previously published result. The dominant background uncertainties are those affecting the nonprompt background estimation.

Multilepton uncertainties
Uncertainties affecting the multilepton channel are:
• Dilepton triggers: uncertainties in the trigger scale factors for each lepton flavor combination are applied as 2D functions of lepton $p_\mathrm{T}$ and $\eta$.
• Prompt rates: the prompt rates are measured in dilepton events so the corresponding uncertainties must cover the possible effect of different event topologies, as well as the measurement uncertainty itself. Since the measured values are not too far from unity and as using a prompt rate of unity does not significantly affect the nonprompt rates, uncertainties are conservatively estimated by comparing the nonprompt background event yields when applying the measured prompt rates and unity prompt rates, separately for each lepton flavor.
• Nonprompt rates (e/µ): the nonprompt rates for each lepton flavor are varied by the sum in quadrature of the measurement error from marginalization and the deviation in the nonprompt rates obtained from fitting simulated tt events in the SR and the CR. This variation of nonprompt rates from the best fit values presented in Section 8.1 results in shape uncertainties in the nonprompt background estimate.
• Nonprompt background (by flavor): a normalization uncertainty is assigned to the nonprompt background estimate in each of the four lepton flavor categories to account for remaining discrepancies observed during cross-checks. This uncertainty is the quadrature sum of two sources: the differences between the data and the predicted background in the CR, and the difference between the true nonprompt and the predicted number of events in the cross-check using simulated tt events.
• Nonprompt rate (η(µ)): the SS dilepton channel measurement in the 2016 analysis observed η-dependence in the muon nonprompt rate [21]: this effect is studied in 2017-2018 data and included as an uncertainty.
With the exception of the trigger scale factor uncertainties, all of these multilepton uncertainties are treated independently for each data period, including the uncertainties affecting 2016 data that are unchanged from the previously published result. As in the SS dilepton channel, the dominant background uncertainties are those affecting the nonprompt background estimation.

Results
The possible presence of a signal is determined by simultaneously fitting template histograms, shown below, from a variety of discriminating variables in all three channels. In the single-lepton channel, the $H_\mathrm{T}$ and jet tag CR distributions are included in the fit to constrain uncertainties in the background modeling. In the SR, the VLQ score from the MLP is used to form template histograms for both HP events, in which both VLQ candidates contain the expected particle labels, and for LP events, which have at least one VLQ candidate without the expected particle labels. The SR data are subdivided into 24 (TT) or 18 (BB) exclusive categories based both on the lepton flavor and the set of DEEPAK8 jet tags observed. The categorization according to DEEPAK8 jet tags, applied to electron and muon events separately, is described in Table 6. Figure 6 shows the template histograms for each single-lepton SR category listed in Table 6 for the TT analysis, with lepton flavor categories combined for illustration. The histograms are binned such that the total background in each bin has a statistical uncertainty smaller than 20%. Tables 7-8 list the numbers of events selected for each category in both the TT and BB analyses, the latter of which is not shown in the figures.
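A binning requirement of this kind can be met by merging adjacent bins, as in the sketch below. The merging strategy (accumulating from the high end of the distribution, where the background is sparsest) is an assumption made for illustration, since the text states only the resulting condition; the function name and the input yields are made up.

```python
from math import sqrt

def merge_bins(contents, sumw2, max_rel_unc=0.2):
    """Merge adjacent bins, scanning from the last bin downward, until each
    merged bin satisfies sqrt(sumw2) / content < max_rel_unc.

    contents : background yield per fine bin
    sumw2    : sum of squared event weights per fine bin
    Returns a list of (content, sumw2) for the merged bins, low to high.
    """
    merged = []
    acc_c = acc_w2 = 0.0
    for c, w2 in zip(reversed(contents), reversed(sumw2)):
        acc_c += c
        acc_w2 += w2
        if acc_c > 0 and sqrt(acc_w2) / acc_c < max_rel_unc:
            merged.append((acc_c, acc_w2))
            acc_c = acc_w2 = 0.0
    if acc_c > 0 or acc_w2 > 0:
        # Leftover low-end bins are folded into the last completed bin.
        if merged:
            last_c, last_w2 = merged.pop()
            merged.append((last_c + acc_c, last_w2 + acc_w2))
        else:
            merged.append((acc_c, acc_w2))  # threshold cannot be met at all
    merged.reverse()
    return merged

# Unweighted toy background: sumw2 equals the bin content.
fine = [100, 40, 9, 4, 2]
coarse = merge_bins(fine, fine)
```

In this toy case the four highest bins are combined into one bin with 55 events, giving a relative uncertainty of about 13%.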
In the SS dilepton channel, the $H_\mathrm{T}^\mathrm{lep}$ distribution is used to form template histograms in the three lepton flavor categories for 2017 and 2018 data, while results from 2016 data are included as a counting experiment. In the multilepton channel, the $S_\mathrm{T}$ distribution is fitted in the four lepton flavor categories for all data-taking periods. In both of these channels the template histograms from 2016 data are reproduced from Ref. [21]. Figure 7 shows the $H_\mathrm{T}^\mathrm{lep}$ templates for 2017 and 2018 data in the SS dilepton channel, in which the total background in each bin has a statistical uncertainty smaller than 30%. Figure 8 shows the $S_\mathrm{T}$ templates for 2017 and 2018 data in the multilepton channel, with histograms binned to match Ref. [21]. Tables 9-10 list the numbers of events selected for each category of the SS dilepton and multilepton channels. Event yields for 2016 data in these two channels are reproduced from Ref. [21]. No significant excess of data over the SM background estimate is observed in any channel.
Figure 6: Template histograms of the VLQ score in single-lepton SRs 1 and 2 combined, and SRs 3-12 (left-to-right, upper-to-lower). The observed data are shown using black markers, the predicted TT signal for a mass of 1.2 (1.5) TeV in the singlet scenario using solid (dashed) lines, and the post-fit background estimates using filled histograms. Statistical and systematic uncertainties in the background estimate after performing the fit to data are shown by the hatched region. The lower panels show the difference between the data and the background estimate as a multiple of the total uncertainty in both sources. Electron and muon categories have been combined for illustration with their uncertainties added in quadrature.
Figure 8: Template histograms of $S_\mathrm{T}$ in the multilepton signal region for eee, eeµ, eµµ, and µµµ categories (left-to-right, upper-to-lower). The observed data from 2017-2018 (combined for illustration) are shown using black markers, the predicted TT signal for a mass of 1.2 (1.5) TeV in the singlet scenario using solid (dashed) lines, and the post-fit background estimates using filled histograms. Statistical and systematic uncertainties in the background estimate after performing the fit to data are shown by the hatched region. The lower panels show the difference between the data and the background estimate as a multiple of the total uncertainty in both sources.
Table 6: Category labels and definitions for the SRs of the single-lepton channel TT analysis. Electron and muon events are analyzed separately in all categories. The VLQ candidate tag describes the pairings formed from the leptonic particle candidate and three large-radius jets. The hadronic VLQ candidate is reconstructed from two large-radius jets, and a VLQ candidate tag of "other" indicates that the hadronic VLQ candidate did not consist of bW-, tZ-, or tH-tagged jets. In the BB analysis, the VLQ candidate tags considered are tW, bZ, and bH. Categories 4, 6, and 7 are not included in the BB analysis.
Upper limits at 95% CL on the production cross sections of TT and BB pairs are set using a binned maximum likelihood fit to all categories for a total of 54 (48) templates in the TT (BB) analysis. The searches for each VLQ flavor are independent, with only one flavor considered in the signal templates. Uncertainties due to limited event counts in simulated samples are included as Poisson-distributed nuisance parameters using the Barlow-Beeston method [84,85]. Systematic uncertainties listed in Table 5 are included as lognormal-distributed nuisance parameters if they affect only the normalization of the background or signal predictions. Systematic uncertainties affecting the shapes of the template distributions are treated using template morphing with Gaussian probability distributions [85].
The fit model is validated by studying the results of fits to pseudo-data generated from the background estimate, with and without known rates of TT or BB signal injected. In the TT (BB) analysis the fit to data has 1361 (1219) parameters and a goodness-of-fit measure from a saturated χ 2 model of 1379 (1207). Expected limits at 95% CL are calculated using a profile likelihood test statistic in the asymptotic approximation [86], with the CL s method [87,88].
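The CL$_\mathrm{s}$ criterion itself can be illustrated with a much simpler setting than the one used in the analysis. The sketch below replaces the profile likelihood test statistic and its asymptotic approximation with a single-bin counting experiment without systematic uncertainties, using the observed event count as the test statistic. It is intended only to show how the ratio $\mathrm{CL_s} = \mathrm{CL_{s+b}}/\mathrm{CL_b}$ leads to an upper limit; all function names and numerical inputs are hypothetical.

```python
from math import exp

def poisson_cdf(n, mean):
    """P(N <= n) for a Poisson distribution with the given mean."""
    term = exp(-mean)
    total = term
    for k in range(1, n + 1):
        term *= mean / k
        total += term
    return total

def cls(n_obs, s, b):
    """CLs = CL_{s+b} / CL_b for a single counting experiment."""
    return poisson_cdf(n_obs, s + b) / poisson_cdf(n_obs, b)

def upper_limit(n_obs, b, cl=0.95, s_hi=100.0):
    """Bisect for the signal yield s at which CLs equals 1 - cl."""
    lo, hi = 0.0, s_hi
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if cls(n_obs, mid, b) > 1.0 - cl:
            lo = mid   # signal hypothesis not yet excluded
        else:
            hi = mid
    return 0.5 * (lo + hi)

limit_no_events = upper_limit(n_obs=0, b=3.0)
limit_as_expected = upper_limit(n_obs=3, b=3.0)
```

With zero observed events the limit is $-\ln(0.05) \approx 3.0$ signal events regardless of the expected background, which illustrates how the CL$_\mathrm{s}$ construction protects against excluding signals to which the experiment has no sensitivity.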
Limits on the TT and BB production cross sections for both benchmark branching fraction scenarios are shown in Figure 9, where the band around the theory prediction indicates the scale and PDF uncertainties expected at NNLO. The corresponding mass limits are listed in Table 11, along with 95% CL limits for scenarios with 100% branching fractions to H, W, or Z bosons. The maximum likelihood fit produces a slight deficit of data in the signal-enriched regions of single-lepton SRs 2, 3, 5, and 9, contributing to an observed limit that is stronger than the expected limit in branching fractions with significant W boson contributions.
Table 8: Numbers of predicted and observed events in 2016-2018 data (138 fb$^{-1}$) in the BB SR categories considered in the single-lepton channel, after a background-only fit to data. Electron and muon categories have been combined for illustration. Predicted numbers of signal events before the fit to data are included for comparison, using the singlet branching fraction scenario. Uncertainties include statistical and systematic components, with the uncertainties in the electron and muon categories added in quadrature.
Table 9: Numbers of predicted and observed SR events in 2017-2018 data (101 fb$^{-1}$) in the SS dilepton channel, after a background-only fit to data. Predicted numbers of signal events before the fit to data are included for comparison, using the singlet branching fraction scenario. Uncertainties include statistical and systematic components. Predictions for 2017 and 2018 are combined for illustration with their uncertainties added in quadrature.
These limits are the strongest to date for TT production and for BB production with B quark decays to W bosons.
In this analysis, the use of the neural network jet identification and event classification methods in the single-lepton channel contributed significantly to limits that reach beyond those expected simply from the increased size of the present data set over that used previously.

Summary
A search has been presented for vector-like T and B quark-antiquark pairs produced in proton-proton collisions at a center-of-mass energy of 13 TeV. Data collected by the CMS experiment at the LHC in 2016-2018 are analyzed in the single-lepton final state, and data from 2017-2018 are analyzed in the same-sign charge dilepton and multilepton final states. In the single-lepton channel, parent particles of large-radius jets are identified using the DEEPAK8 algorithm, and vector-like quark candidates are reconstructed. A multilayer perceptron network is trained to separate signal events from standard model backgrounds. In the same-sign charge dilepton and multilepton channels, low background rates and the large energy signature of the signal are exploited by studying jet and lepton momentum scalar sum distributions. Pair production is excluded at 95% confidence level for T quarks with masses up to 1.54 TeV and for B quarks with masses up to 1.56 TeV, depending on the branching fraction scenario, and T quarks with masses below 1.48 TeV are excluded in any scenario. The limits obtained in this search are the strongest limits to date for TT production with all T quark decay modes, and are the strongest limits to date for BB production with B quark decays to tW.