Search for vector-like T and B quark pairs in final states with leptons at $\sqrt{s} =$ 13 TeV

A search is presented for pair production of heavy vector-like T and B quarks in proton-proton collisions at $\sqrt{s} =$ 13 TeV. The data sample corresponds to an integrated luminosity of 35.9 fb$^{-1}$, collected with the CMS detector at the CERN LHC in 2016. Pair production of T quarks would result in a wide range of final states, since vector-like T quarks of charge 2$e$/3 are predicted to decay to bW, tZ, and tH. Likewise, vector-like B quarks are predicted to decay to tW, bZ, and bH. Three channels are considered, corresponding to final states with a single lepton, two leptons with the same sign of the electric charge, or at least three leptons. The results exclude T quarks with masses below 1140-1300 GeV and B quarks with masses below 910-1240 GeV for various branching fraction combinations, extending the reach of previous CMS searches by 200-600 GeV.


Introduction
The discovery of a Higgs boson [1][2][3] (H) has further encouraged searches for new physics at the CERN LHC. Potentially divergent loop corrections to the Higgs boson mass require either significant fine tuning of the standard model (SM) parameters or new particles at the TeV scale. The existence of heavy top quark partners is particularly well motivated to cancel the largest corrections from SM top quark loops. In supersymmetric theories bosonic partners of the top quark serve this purpose, but in several other theories, such as little Higgs [4,5] or composite Higgs [6][7][8][9] models, this role is filled by fermionic top quark partners. These heavy quark partners interact predominantly with the third generation of the SM quarks [10,11] and have vector-like transformation properties under the SM gauge group SU(2)$_L$ × U(1)$_Y$ × SU(3)$_C$, inspiring the name "vector-like quarks" (VLQs). A heavy fourth generation of chiral quarks has been excluded by precision electroweak measurements from electron-positron collisions [12,13] and by the measurement of Higgs-boson-mediated cross sections [14,15], but VLQs are not excluded by these experimental data.
We search for a vector-like T quark with charge 2e/3 that is produced in pairs with its antiquark via the strong interaction in proton-proton collisions at $\sqrt{s} =$ 13 TeV. Our search uses a data sample corresponding to an integrated luminosity of 35.9 fb$^{-1}$, collected with the CMS detector in 2016. Many models in which VLQs appear assume that T quarks may decay to three final states: bW, tZ, or tH [16], as illustrated by the diagrams in Fig. 1. The partial decay widths depend on the particular model [17], but for VLQ masses significantly larger than the W boson mass, as considered here, an electroweak singlet T quark is expected to have branching fractions (B) of 50% for T → bW, and 25% for each of T → tZ and T → tH [17,18]. A doublet T quark decays only to tZ and tH, each with 50% branching fraction. Although this search is optimized for TT production, vector-like bottom (B) quark decays can produce similar final state signatures, as illustrated in Fig. 1 (right), and are also considered. A B quark with charge −e/3 is expected to decay to tW, bH, or bZ with branching fractions equal to those of the corresponding T quark decays to the same SM boson. In the interpretation of this search we assume that only one type of new particle is present, either the T or the B quark. The singlet branching fraction scenario is used as a benchmark for both T and B quarks. [26][27][28][29][30]. Previous searches by CMS in single lepton final states have excluded T quark masses below 1295 GeV for B(bW) = 100% [29], and masses below 790 to 900 GeV for any possible choice of branching fractions to the three decay modes [28]. This search focuses on channels with exactly one lepton, a same-sign (SS) dilepton pair, and at least three leptons (trilepton). For background categorization, the latter two channels distinguish between leptons produced directly in decays of W, H, or Z bosons (prompt) and leptons produced from other sources (nonprompt), such as heavy flavor hadron decays.
This paper is organized as follows: Section 2 describes the CMS detector and how events are reconstructed, Section 3 describes the simulated background samples, and Section 4 describes the physics objects. In Sections 5-7 we describe strategies for the three channels of the search, and in Section 8 we describe the systematic uncertainties. Lastly, in Sections 9-10 we present our results and give a summary.

The CMS detector and event reconstruction
The central feature of the CMS detector is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter (HCAL), each composed of a barrel and two endcap sections. Forward calorimeters extend the pseudorapidity (η) coverage provided by the barrel and endcap detectors. Muons are measured in gas-ionization detectors embedded in the steel flux-return yoke outside the solenoid. A more detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in Ref. [31].
A particle-flow (PF) algorithm aims to reconstruct and identify each individual particle in an event with an optimized combination of information from the various elements of the CMS detector [32]. The energy of photons is directly obtained from the ECAL measurement. The energy of electrons is determined from a combination of the electron momentum at the primary interaction vertex as determined by the tracker, the energy of the corresponding ECAL cluster, and the energy sum of all bremsstrahlung photons compatible with originating from the electron track. The momentum of muons is obtained from the curvature of the corresponding track. The energies of charged hadrons are determined from a combination of their momenta measured in the tracker and the matching ECAL and HCAL energy deposits, corrected for zero-suppression effects and for the response function of the calorimeters to hadronic showers. Finally, the energies of neutral hadrons are obtained from the corresponding corrected ECAL and HCAL energies.
Jets are reconstructed from the individual particles produced by the PF event algorithm, clustered with the anti-k T algorithm [33,34] with distance parameters of 0.4 ("AK4 jets") and 0.8 ("AK8 jets"). Jet momentum is determined as the vector sum of all particle momenta in the jet, and is found from simulation to be within 5-15% of the true momentum over the whole transverse momentum (p T ) spectrum and detector acceptance. Additional proton-proton interactions within the same or nearby bunch crossings ("pileup") can contribute additional tracks and calorimetric energy depositions to the jet momentum. To mitigate this effect, tracks identified to be originating from pileup vertices are discarded, and an offset correction [35] is applied to correct for remaining contributions. Jet energy corrections are derived from simulation, and are confirmed with in situ measurements of the energy balance in dijet, multijet, and photon/Z (→ e⁺e⁻/µ⁺µ⁻) + jet events. A smearing of the jet energy is applied to simulated events to mimic detector resolution effects observed in data [36]. Additional selection criteria are applied to each event to remove spurious jet-like features originating from isolated noise patterns.
The primary interaction vertex is identified using physics objects: the jets, clustered using the jet finding algorithm [33,34] with the tracks assigned to the vertex as inputs, and the associated missing transverse momentum, taken as the negative vector sum of the p T of those jets. Each event must have at least one charged lepton (electron or muon) candidate that is reconstructed within the detector acceptance region of |η| < 2.5 (2.4) for electrons (muons), excluding the barrel-endcap transition region (1.44 < |η| < 1.57) for electrons.
Events containing leptons are initially selected using the high-level trigger (HLT). For the single-lepton channel, events must pass a set of triggers requiring one electron or muon with p T > 15 GeV and jets with p T that sums to at least 450 GeV. A secondary set of triggers selects events with one isolated electron (p T > 35 GeV) or one muon (p T > 50 GeV). For the SS dilepton channel, events must pass triggers based on double lepton combinations, with momentum thresholds that varied over time. The dielectron trigger requires two electrons with p T > 37 and 27 GeV. Triggers for electron-muon events have a variety of thresholds: both leptons with p T > 30 GeV, or one lepton with p T > 37 GeV and the other flavor lepton with p T > 27 GeV. The dimuon trigger requires one muon with p T > 30 GeV, and another muon with p T > 11 GeV. For the trilepton channel, dilepton triggers with lower momentum thresholds were used to select events with isolated leptons. The dielectron trigger requires an electron with p T > 23 GeV and another electron with p T > 12 GeV. Events with both lepton flavors are selected with triggers that require one lepton with p T > 23 GeV and a different flavor lepton with p T > 8 GeV. The dimuon trigger selects events featuring one muon with p T > 17 GeV and another muon with p T > 8 GeV.
Dedicated event filters remove events that are affected by: known noise patterns in the HCAL, accelerator-induced particles traveling along the beam direction at large radius (up to 5 m), anomalously high energy deposits in certain ECAL "superclusters" [59], ECAL cell triggers that are not performing optimally, and muon candidates with large track uncertainties matched to misreconstructed tracks or charged hadrons.
Electrons are reconstructed [59] taking into account track quality, association between the track and electromagnetic shower, shower shape, and the likelihood of the electron being produced in a photon conversion in the detector. A multivariate discriminant is used to identify well-reconstructed electrons at two quality levels: a tight level with ≈88% efficiency (≈4% misidentification efficiency) and a loose level with ≈95% efficiency (≈5% misidentification efficiency).
Muons are reconstructed using information from both the CMS silicon tracker and the muon spectrometer in a global fit, matching deposits in the silicon tracker with deposits in the muon detector [60]. Identification algorithms consider the global fit χ 2 value, the number or fraction of deposits in the trackers and muon detectors, track kinks, and the distance between the track from the silicon tracker and the primary interaction vertex. We consider two quality levels: a tight level with ≈97% efficiency, and a loose level with ≈100% efficiency, in the barrel region of the detector. Both levels have a hadronic misidentification efficiency of <1%.
The large Lorentz boost of the decay products of the T quarks can produce final-state leptons that are in close proximity to hadronic activity, and are similar to background events with jets that contain a lepton from semileptonic hadron decays. The isolation of a lepton from surrounding particles is evaluated using a variable I mini , defined as the p T sum of PF candidates within a p T -dependent cone around the lepton, corrected for the effects of pileup using the effective area of the cone [35] and divided by the lepton p T . Using a p T -dependent cone size allows for greater efficiency at high energies, when jets and leptons are more likely to overlap. The reconstructed electrons and muons must have I mini < 0.1 to be labeled tight, and I mini < 0.4 to be labeled loose. Scale factors to describe efficiency differences between data and MC simulation for the lepton reconstruction, identification, and isolation algorithms are calculated using the "tag-and-probe" method [60], and are applied to simulated events.
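The isolation variable described above can be illustrated with a minimal Python sketch. The cone-radius formula is not reproduced in this text, so `cone_radius` uses an assumed form, R = 10 GeV / min(max(p T , 50 GeV), 200 GeV), that is common for this kind of variable. The `pileup_pt` argument is a hypothetical stand-in for the effective-area pileup correction, and the dictionary layout of leptons and PF candidates is also an illustration only.

```python
import math

def cone_radius(lepton_pt):
    # ASSUMPTION: this functional form is not given in the text above.
    return 10.0 / min(max(lepton_pt, 50.0), 200.0)

def delta_r(eta1, phi1, eta2, phi2):
    # Wrap the azimuthal difference into [-pi, pi] before combining.
    dphi = math.remainder(phi1 - phi2, 2.0 * math.pi)
    return math.hypot(eta1 - eta2, dphi)

def mini_isolation(lepton, pf_candidates, pileup_pt=0.0):
    # Sum the pT of PF candidates inside the pT-dependent cone,
    # subtract the (hypothetical) pileup estimate, divide by lepton pT.
    radius = cone_radius(lepton["pt"])
    cone_pt = sum(
        c["pt"] for c in pf_candidates
        if delta_r(lepton["eta"], lepton["phi"], c["eta"], c["phi"]) < radius
    )
    return max(cone_pt - pileup_pt, 0.0) / lepton["pt"]

def isolation_label(i_mini):
    # Thresholds quoted in the text: < 0.1 tight, < 0.4 loose.
    if i_mini < 0.1:
        return "tight"
    if i_mini < 0.4:
        return "loose"
    return None
```

With this assumed radius the cone shrinks from 0.2 for leptons below 50 GeV to 0.05 above 200 GeV, so a nearby jet is less likely to spoil the isolation of a high-p T lepton.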
All AK4 jets with p T > 30 GeV that lie within the tracker acceptance of |η| < 2.4 are considered in this search (unless otherwise noted, "jets" refers to AK4 jets). Additional selection criteria are applied to reject events containing noise and mismeasured jets. Leptons that pass tight identification and isolation requirements in the single-lepton channel, or loose requirements in the SS dilepton and the trilepton channels, are removed from jets that have an angular separation of ∆R = √((∆η)² + (∆φ)²) < 0.4 with the leptons (where φ is the azimuthal angle in radians), before jet energy corrections are applied. This is done by matching PF candidates in the lepton and jet collections and subtracting the four-momentum of a matched lepton candidate from the jet four-momentum. In the SS dilepton and the trilepton channels, loose leptons, as well as tight leptons, are removed from jets because these leptons are used to estimate nonprompt lepton backgrounds.
The missing transverse momentum vector p miss T is defined as the projection onto the plane perpendicular to the beam axis of the negative vector sum of the momenta of all reconstructed PF objects in an event. Its magnitude is referred to as p miss T . The energy scale corrections applied to jets are propagated to p miss T . We define H T as the scalar p T sum of all reconstructed jets in the event that have p T > 30 GeV and |η| < 2.4. In addition, we define S T as the scalar sum of p miss T , the p T of leptons, and the H T in the event.
This search relies on techniques to analyze the internal structure of jets and to identify the parton that created the jet. Jets are tagged as b quark jets using a multivariate discriminant, specifically the combined secondary vertex (CSVv2) algorithm [61], which uses information about secondary vertices within the jet. For simulated tt events, our requirement on this discriminant has an efficiency for tagging true b quark jets of ≈65%, averaged over jets with p T > 30 GeV.
The efficiency for falsely tagging light-quark or gluon jets, measured in multijet event data, is ≈1%. Efficiency differences in data and simulation are corrected by applying scale factors, which are functions of jet p T [61].
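As a small illustration of the H T and S T definitions given earlier in this section, the following Python sketch computes both quantities. The dictionary-based event layout and the function names are assumptions made for the example.

```python
def scalar_ht(jets, min_pt=30.0, max_abs_eta=2.4):
    # H_T: scalar pT sum of jets with pT > 30 GeV and |eta| < 2.4.
    return sum(
        j["pt"] for j in jets
        if j["pt"] > min_pt and abs(j["eta"]) < max_abs_eta
    )

def scalar_st(jets, leptons, met):
    # S_T: missing transverse momentum + lepton pT + H_T.
    return met + sum(l["pt"] for l in leptons) + scalar_ht(jets)
```

For example, jets with p T of 300 and 150 GeV inside the acceptance, one 60 GeV lepton, and p miss T of 75 GeV give H T = 450 GeV and S T = 585 GeV.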
Heavy VLQ decays can produce top quarks and W, Z, or Higgs bosons with high momenta, causing their decay products to merge into a single AK8 jet. The "N-subjettiness" algorithm [62] creates jet shape variables, τ N , that quantify the consistency of the jet's internal structure with an N-prong hypothesis. Ratios of τ N /τ N−1 are powerful discriminants between jets predicted to have N internal energy clusters and jets predicted to have fewer clusters. Techniques called "pruning" or "softdrop" [63][64][65] remove soft and wide-angle radiation from the jet so that the mass of its primary constituents can be measured more accurately. The softdrop algorithm identifies two smaller subjets within the AK8 jet, and these can be identified as b quark subjets using the same algorithm as that applied to AK4 jets. The AK8 jets are reconstructed independently of AK4 jets, so they will frequently overlap. Unless otherwise stated, jet multiplicity criteria assume that AK4 and AK8 jets are clustered independently and may share constituents.
An AK8 jet is labeled as W tagged if it has p T > 200 GeV, |η| < 2.4, pruned jet mass between 65 and 105 GeV, and the ratio of N-subjettiness variables τ 2 /τ 1 < 0.6. These requirements yield a W tag efficiency of 60-70%, depending on AK8 jet momentum. The pruned mass distribution in simulation is smeared such that the resolution of the W mass peak matches the resolution observed in data [37]. Scale factors describing efficiency differences between data and simulation for the τ 2 /τ 1 selection are applied to the AK8 jets matched to true boosted hadronic W boson decays [37]. An AK8 jet is labeled as H tagged if it has p T > 300 GeV, |η| < 2.4, pruned mass between 60 and 160 GeV, and at least one b-tagged subjet. Having a larger mass than the W boson mass, the Higgs boson requires more momentum for the b quarks to merge into one AK8 jet. This algorithm exploits the large branching fraction of the Higgs boson to bb pairs and has an efficiency of ≈65%. If an AK8 jet is both H and W tagged, the H tag is given precedence.
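The W and H tagging requirements above can be summarized in a short Python sketch. The field names (`pruned_mass`, `tau21`, `n_btag_subjets`) are hypothetical, and treating the mass windows as exclusive at their edges is an assumption, since the text does not say how boundary values are handled.

```python
def tag_ak8_jet(jet):
    """Return 'H', 'W', or None for an AK8 jet given as a dict."""
    if abs(jet["eta"]) >= 2.4:
        return None
    is_w = (
        jet["pt"] > 200.0
        and 65.0 < jet["pruned_mass"] < 105.0
        and jet["tau21"] < 0.6
    )
    is_h = (
        jet["pt"] > 300.0
        and 60.0 < jet["pruned_mass"] < 160.0
        and jet["n_btag_subjets"] >= 1
    )
    # The H tag takes precedence when both sets of criteria are met.
    if is_h:
        return "H"
    if is_w:
        return "W"
    return None
```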

Single-lepton channel
The single-lepton channel includes events with exactly one charged lepton. Boosted hadronic decay products of W and Higgs bosons are identified in AK8 jets and used to categorize events. This final state is highly sensitive to TT production with at least one T → bW or T → tH decay, as well as BB production with at least one B → tW or B → bH decay.

Event selection and categorization
Each event must have one electron or muon that passes the tight selection requirements described previously. The tight lepton must have p T > 60 GeV, and events with extra leptons passing the loose quality requirements with p T > 10 GeV and |η| < 2.5 (2.4) for electrons (muons) are rejected. We also require p miss T of at least 75 GeV to account for the presence of a neutrino from a W boson decay and to reduce multijet background events.
Each selected event must have at least three jets with p T > 300, 150, and 100 GeV. Events must also have at least two AK8 jets with p T > 200 GeV and |η| < 2.4, which are permitted to overlap with the AK4 jets. The requirement of at least two AK8 jets is highly efficient for signal in all decay modes (>98%) and reduces the background contribution.
Events are divided into 16 categories based on lepton flavor and the presence of H-, W-, and b-tagged jets:
• H2b: events with one or more H-tagged jets with two b-tagged subjets;
• H1b: events failing the H2b criterion, but having one or more H-tagged jets with only one b-tagged subjet;
• W1: events with zero H-tagged jets but at least one W-tagged jet;
• W0: events with zero H-tagged jets and zero W-tagged jets.
In both the H1b and H2b categories, we require an extra b-tagged jet that does not overlap with the H-tagged jet, since signal events with a Higgs boson always contain at least one top quark decay as well. In the W0 category we require a fourth jet with p T > 30 GeV and |η| < 2.4. Events in the W0 and W1 categories are subcategorized by the number of b-tagged jets (1, 2, ≥3).
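The categorization can be sketched in Python as follows. This is an illustration under assumptions: the function signature and label strings are invented for the example, the H2b category is taken to mean at least one H-tagged jet with two b-tagged subjets, events with no b-tagged jet are assumed to fall outside the W0 and W1 search categories, and the fourth-jet requirement for W0 is omitted.

```python
def categorize(flavor, h_subjet_btags, n_w_tags, n_btags):
    """Return a category label, or None if the event is not categorized.

    flavor: "e" or "mu".
    h_subjet_btags: number of b-tagged subjets for each H-tagged jet.
    n_w_tags: number of W-tagged jets.
    n_btags: number of b-tagged AK4 jets (for H categories, only those
             not overlapping the H-tagged jet).
    """
    if n_btags < 1:
        return None
    if h_subjet_btags:
        label = "H2b" if max(h_subjet_btags) >= 2 else "H1b"
    else:
        w_label = "W1" if n_w_tags >= 1 else "W0"
        nb = "ge3" if n_btags >= 3 else str(n_btags)
        label = f"{w_label}_{nb}b"
    return f"{flavor}_{label}"
```

Counting the possible labels gives 2 lepton flavors × (H2b, H1b, three W1 and three W0 subcategories) = 16, matching the number of categories stated above.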
Discrepancies in the modeling of top quark momentum are corrected by applying a weight that depends on the generated top quark p T [66] to simulated tt events. Discrepancies observed in H T -binned MADGRAPH samples are corrected by applying a scaling function that describes the observed difference in the H T spectrum between binned and inclusive simulations [28, 67,68].
To maximize signal efficiency in the search regions, and to create signal-depleted control regions, we calculate the minimum angular separation between the highest p T AK8 jet and any other AK8 jet in the event. In background processes there are often only two AK8 jets, usually emitted back to back from each other. In signal processes there are typically more than two AK8 jets and the minimum separation will be significantly smaller. The search region is therefore defined by requiring 0.8 < ∆R min (leading AK8, other AK8) < 3.0, and the control region by requiring ∆R min (leading AK8, other AK8) > 3.0. Signal efficiencies in the search region for the singlet decay mode are 9-15%, increasing with VLQ mass. Figure 2 shows distributions of W tagging input variables after all selection requirements: pruned mass in AK8 jets with τ 2 /τ 1 < 0.6, showing a clear W boson contribution in signal events, and τ 2 /τ 1 in AK8 jets with pruned mass inside the mass window of 65-105 GeV. The distribution of τ 2 /τ 1 shows that background processes with primarily one-prong jets, such as W+jets or multijet events, are concentrated at higher values, while signal events and top quark decays tend toward lower values. Figure 2 also shows the pruned mass in AK8 jets with two b-tagged subjets, and the number of b-tagged subjets in AK8 jets with a pruned mass within the range 60-160 GeV. The H tag algorithm is efficient for both H → bb and Z → bb decays. The systematic difference between data and background (bkg) is due to known issues in simulation of tt that are only partially corrected using the techniques mentioned previously. The residual difference is described by the uncertainty in the renormalization and factorization energy scales, discussed further in Section 8.
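A minimal Python sketch of the ∆R min variable and the resulting region assignment is shown below. The jet dictionaries and function names are assumptions for illustration.

```python
import math

def delta_r(jet_a, jet_b):
    # Angular separation, with the azimuthal difference wrapped to [-pi, pi].
    dphi = math.remainder(jet_a["phi"] - jet_b["phi"], 2.0 * math.pi)
    return math.hypot(jet_a["eta"] - jet_b["eta"], dphi)

def assign_region(ak8_jets):
    """Return "search", "control", or None based on Delta R min."""
    if len(ak8_jets) < 2:
        return None
    ordered = sorted(ak8_jets, key=lambda j: j["pt"], reverse=True)
    leading, others = ordered[0], ordered[1:]
    dr_min = min(delta_r(leading, j) for j in others)
    if 0.8 < dr_min < 3.0:
        return "search"
    if dr_min > 3.0:
        return "control"
    return None
```

Two back-to-back AK8 jets give ∆R min ≈ π and land in the control region, while a third AK8 jet closer to the leading one moves the event into the search region.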
To search for VLQ events in the W0 and W1 categories, we analyze the minimum mass constructed from the lepton (ℓ) and a b-tagged jet, labeled min[M(ℓ, b)]. This distribution provides strong discrimination between tt events and signal events with a T → bW decay. Reconstructing the mass of two out of three leptonic SM top quark decay products, namely the lepton and b quark jet, produces a sharp edge below the top quark mass, while T → bW decays will produce a similar edge near the T mass. Since the H tagged categories have relatively few T → bW decays, the S T distribution is used as the search variable in these categories. Compared to other possibilities, such as using S T as the search variable in all categories, this combination of discriminating variables provides the best sensitivity to T quark production in the 1 TeV mass range in the singlet branching fraction scenario. Distributions of min[M(ℓ, b)] and S T in the search regions are shown in Section 9.

Figure 2: Distributions of W and H tagging input variables after all selection requirements: pruned mass in AK8 jets with τ 2 /τ 1 < 0.6 (upper left), N-subjettiness τ 2 /τ 1 ratio in AK8 jets with pruned mass between 65-105 GeV (upper right), pruned mass in AK8 jets with two b-tagged subjets (lower left), and number of b-tagged subjets in AK8 jets with pruned mass in the range 60-160 GeV (lower right). Vertical dashed lines mark the selection windows for each distribution. The black points are the data and the filled histograms show the simulated background distributions, grouped into categories as described in Section 3. The expected signal is shown by solid and dotted lines for T quark masses of 1.0 and 1.2 TeV. The final bin includes overflow events. Uncertainties, indicated by the hatched area, include both statistical and systematic components. The lower panel shows the difference between data and background divided by the total uncertainty.
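The minimum lepton plus b-jet mass used as the search variable in the W0 and W1 categories can be computed as in the following Python sketch. The four-vector helper and the (p T , η, φ, mass) inputs are assumptions for illustration; an analysis framework would normally supply its own four-vector type.

```python
import math

def four_vector(pt, eta, phi, mass=0.0):
    # Convert (pT, eta, phi, m) into (E, px, py, pz).
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return (energy, px, py, pz)

def invariant_mass(v1, v2):
    energy, px, py, pz = (a + b for a, b in zip(v1, v2))
    m2 = energy * energy - px * px - py * py - pz * pz
    return math.sqrt(max(m2, 0.0))  # guard against rounding below zero

def min_mlb(lepton, b_jets):
    # Smallest invariant mass over all lepton + b-tagged-jet pairings.
    return min(invariant_mass(lepton, b) for b in b_jets)
```

For a lepton and b jet from the same top quark decay, this mass cannot exceed roughly the top quark mass, which is the origin of the edge described above.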

Background modeling
Backgrounds are modeled from simulation in this channel and we perform a closure test in a control region, categorizing events as done in the search regions. The control region is defined by requiring ∆R min (leading AK8, other AK8) > 3. Further selection criteria are applied to form regions with significant amounts of H-tagged jets, W+jets events, or tt events. To form the tt control region, events from the W1 and W0 categories are split according to lepton flavor and b tag content: 1, 2 or ≥3 b-tagged jets. In the W+jets control region, events from the W1 and W0 categories without b-tagged jets are categorized based on W tag content: zero or at least one W-tagged jets. The H-tagged jet control region includes events from the H1b or H2b categories, split according to lepton flavor and number of b-tagged jets (0 or ≥1) that do not overlap any H-tagged jet. Signal efficiencies in the control regions are negligibly small (<1%) for both TT and BB production.
The comparison between data and simulation in these regions is used to evaluate the level of remaining differences after the event selection, efficiency corrections, and generator-level corrections, such as the differences in the rate of misidentified W- or H-tagged jets. In all control regions the data agree with simulation, within the systematic uncertainties described in Section 8.
To provide background-dominated regions in the statistical interpretation of the results, the control regions are aggregated into fewer categories. These aggregate regions target the tt + jets background in events with zero H-tagged jets and at least one b-tagged jet, the W+jets background in events without any H- or b-tagged jets, and misidentified H-tagged jets in events with at least one H-tagged jet and any number of b-tagged jets (including zero). To provide sensitivity to the energy scale of the background events, the H T distribution is used in these categories. Predicted and observed event yields in the control regions are listed in Table 2.

Same-sign dilepton channel
The SS dilepton channel attempts to make use of a unique feature of VLQ signals, namely the presence of prompt SS dilepton pairs. In TT production SS lepton pairs are most common in events having at least one T → tH decay, where the Higgs boson decays to a pair of W bosons. Since at least one W boson is produced in the decay of the other T quark, at least four W bosons are present in the final state, two of each charge. In BB production SS lepton pairs are more frequent, arising from events with at least one B → tW decay, since at least one other W boson is produced in the decay of the other B quark.

Event selection and categorization
We require events to have exactly two leptons with the same electric charge that are within the detector acceptance (|η| < 2.4). Different triggers were used during early and late 2016 data taking, with different p T requirements for the leptons. We require the leading (subleading) lepton to have p T greater than 40 (35) GeV for the early data set and greater than 40 (30) GeV for the later data set. The two leptons must pass the tight identification and isolation requirements described in Section 4 and the events are divided into three categories based on the flavors: ee, eµ, and µµ.
After requiring two tight SS leptons, we apply additional selection criteria to reduce the background rate. To remove quarkonia decays we require M(ℓ, ℓ) > 20 GeV. To remove Z boson decays we reject dielectron events with invariant lepton pair mass 76.1 < M(ℓ, ℓ) < 106.1 GeV. This cut is not applied to dimuon events because muons have a negligibly small rate of charge misidentification. We require the number of jets to be ≥4 and the scalar sum of selected jet and lepton transverse momenta, H lep T , to exceed 1200 GeV. To search for VLQ events in data in the SS dilepton channel, we perform a counting experiment using the yield of events passing the selections. Signal efficiencies for this channel after applying all selection criteria are 0.42 (0.5)% for a singlet T quark of mass 1.0 (1.2) TeV.
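The SS dilepton selection described in this subsection can be collected into one Python function, shown here as a sketch. The function signature and event layout are hypothetical; the leptons are assumed to have already passed the tight identification, isolation, and acceptance requirements.

```python
def passes_ss_dilepton(leptons, n_jets, h_lep_t, m_ll, early_data=True):
    """Apply the same-sign dilepton selection to one event."""
    if len(leptons) != 2:
        return False
    lead, sub = sorted(leptons, key=lambda l: l["pt"], reverse=True)
    if lead["charge"] != sub["charge"]:
        return False
    # Subleading threshold: 35 GeV (early data set), 30 GeV (later data set).
    sub_threshold = 35.0 if early_data else 30.0
    if lead["pt"] <= 40.0 or sub["pt"] <= sub_threshold:
        return False
    if m_ll <= 20.0:  # quarkonia veto
        return False
    # Z boson veto, applied to dielectron events only.
    if lead["flavor"] == "e" and sub["flavor"] == "e" and 76.1 < m_ll < 106.1:
        return False
    return n_jets >= 4 and h_lep_t > 1200.0
```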

Background modeling
We consider three categories of backgrounds associated with this channel: SM processes with SS dilepton signatures; opposite-sign (OS) prompt leptons misreconstructed as SS leptons; and nonprompt leptons from heavy flavor hadron decays, jets misidentified as leptons, or photons converting to electrons. Leptons from tau decays are likely to be interpreted as prompt electrons or muons, whereas hadronic tau decays are likely to be considered to be nonprompt leptons. The background contribution from prompt SS dilepton processes is obtained from simulated samples in the VV(V) and tt + X groups.
Prompt OS dileptons can contribute background events when one lepton is assigned the wrong charge, leading to an SS dilepton final state. Muon reconstruction in CMS provides very reliable charge identification, leading to a very small rate of charge misidentification that is considered negligible for this search. The rate of charge misidentification for electrons is derived from a data sample dominated by Z → ee decays, by computing the ratio of SS dilepton events to all events. Misidentification efficiencies are derived as a function of |η| for electrons with p T < 100 GeV, 100 < p T < 200 GeV, and p T > 200 GeV. The values are about 1% in the barrel region and about 5% in the endcap region. The number of SS dilepton events arising from charge misidentification is estimated by weighting the number of observed OS dilepton events that pass all other selection criteria, by the misidentification efficiency per electron.
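One way to turn per-electron misidentification efficiencies into an event weight is sketched below in Python. This is an illustration under assumptions, not necessarily the weighting used in the analysis: the weight is taken as the probability that exactly one electron charge is flipped, muons are treated as never flipped, and `charge_misid_prob` is a hypothetical lookup that uses only the approximate barrel and endcap values quoted above and ignores the p T dependence.

```python
def charge_misid_prob(abs_eta):
    # Illustrative values only: ~1% (barrel), ~5% (endcap), pT dependence ignored.
    return 0.01 if abs_eta < 1.44 else 0.05

def ss_weight(electron_abs_etas):
    """Weight for an observed OS event with 0, 1, or 2 electrons."""
    probs = [charge_misid_prob(x) for x in electron_abs_etas]
    if not probs:
        return 0.0  # dimuon events: charge misidentification neglected
    if len(probs) == 1:
        return probs[0]
    p1, p2 = probs
    # Exactly one of the two electron charges is mismeasured.
    return p1 * (1.0 - p2) + p2 * (1.0 - p1)
```

Summing these weights over the observed OS events that pass all other selection criteria gives the estimated SS yield from charge misidentification under these assumptions.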
Same-sign dilepton events arising from the presence of one or more nonprompt leptons are the primary reducible background. Two components of this background are jets misidentified as leptons and nonprompt leptons that pass tight isolation criteria. This contribution is estimated using the "tight-to-loose" method [69], in which events with one or more loose leptons are weighted by the tight-loose ratios expected for prompt and nonprompt leptons. The efficiency for prompt leptons to pass the tight selection criteria, or "prompt lepton efficiency," is determined using events with a lepton pair invariant mass within 10 GeV of the Z boson mass.
For muons the average prompt efficiency, found to be generally constant over p T and η, is 0.943 ± 0.001. For electrons the prompt efficiency depends on p T and ranges from 0.80 to 0.95.
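The idea behind the tight-to-loose method can be illustrated with a simplified single-lepton Python sketch; the analysis itself applies the method to dilepton events, so this is a reduced example and not the actual implementation. Given the number of loose leptons (including tight ones), the number of tight leptons, the prompt efficiency p, and the tight-to-loose ratio f for nonprompt leptons, the relations N loose = N p + N f and N tight = p N p + f N f are solved for the nonprompt yield. All names are hypothetical.

```python
def nonprompt_in_tight(n_loose, n_tight, prompt_eff, nonprompt_eff):
    """Estimate the nonprompt yield in the tight selection (one-lepton case)."""
    if prompt_eff <= nonprompt_eff:
        raise ValueError("prompt efficiency must exceed nonprompt efficiency")
    # Solve the two linear relations for the nonprompt loose yield.
    n_nonprompt_loose = (prompt_eff * n_loose - n_tight) / (prompt_eff - nonprompt_eff)
    # Scale by the tight-to-loose ratio to get the tight-selection yield.
    return nonprompt_eff * n_nonprompt_loose
```

For instance, 900 prompt and 100 nonprompt loose muons with p = 0.943 and an illustrative f = 0.2 give 868.7 tight muons, from which the function recovers 20 nonprompt muons in the tight selection.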
The "misidentified lepton efficiency," or efficiency for nonprompt leptons to pass the tight selection criteria, is determined using a data sample enriched in nonprompt leptons. The fraction of prompt leptons from W and Z boson decays is reduced by requiring exactly one loose lepton per event, low p miss T , and that the lepton and p miss T be inconsistent with a W boson decay (transverse mass <25 GeV). At least one jet is required with large angular separation from the lepton (∆R > 1.

Trilepton channel
The trilepton final state is highly sensitive to VLQ pair production with at least one T → tZ, B → bZ, or B → tW decay, all of which can produce two or more prompt leptons. When combined with the decay of the other T or B quark, three or more prompt leptons can exist in the final state, a signature that is rare in SM processes.

Event selection and categorization
We select events with at least three leptons, each with p T > 30 GeV, that pass the tight identification and isolation requirements described in Section 4. The background from nonprompt leptons is estimated in a control sample with less restrictive selection criteria, including events with three leptons that pass the loose identification and isolation requirements. Leptons are sorted first based on tight or loose quality, and then based on p T in descending order. The events are divided into the four categories of the flavors (e or µ) of the first three leptons: eee, eeµ, eµµ, and µµµ.
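The lepton ordering and flavor categorization can be written compactly in Python, as in this sketch. The lepton dictionaries with `tight`, `pt`, and `flavor` keys, and the category strings, are assumptions for illustration.

```python
def trilepton_category(leptons):
    """Return the flavor category of the first three leptons, or None."""
    if len(leptons) < 3:
        return None
    # Tight leptons first, then descending pT within each quality level.
    ordered = sorted(leptons, key=lambda l: (not l["tight"], -l["pt"]))
    n_mu = sum(1 for l in ordered[:3] if l["flavor"] == "mu")
    return ["eee", "eemu", "emumu", "mumumu"][n_mu]
```

Because quality takes priority over p T , a high-p T loose lepton is ranked behind any tight lepton and only enters the first three when fewer than three tight leptons are present.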
Additionally, to reject background events with leptons originating from low-mass resonances, no OS same flavor lepton pair with invariant mass M(ℓ, ℓ) OS < 20 GeV is allowed. We also require the events to have at least three jets with p T > 30 GeV and |η| < 2.4, at least one of which is b-tagged, since top quark decays and/or b quarks are expected in signal events. Lastly, we require p miss T > 20 GeV. These requirements create a sample with many leptons from Z and W boson decays, together with several jets to account for hadronic decay products of the T quark or antiquark. To search for VLQ events in data in the trilepton channel, we use the S T distribution to discriminate the signal from the background. With respect to the expected number of events before any selections, the signal efficiencies for this channel after all selections are 0.65 (0.66)% for a singlet T quark of mass 1.0 (1.2) TeV.
We define a signal-depleted control region for the purpose of calculating misidentified lepton efficiencies. This control region is defined using the initial selection requirements above, except that we require exactly two jets instead of at least three. Processes containing nonprompt leptons contribute almost equally to this region and to the signal region.

Background modeling
Backgrounds are divided into two categories, prompt and nonprompt. The prompt category contains events originating from SM processes capable of producing three or more prompt leptons in the final state. These include the WZ, ZZ, and triboson processes in the "VV(V)" group, and the ttZ and ttW processes in the "tt+V" group. We use simulation to predict the yields of these background processes. The nonprompt category contains events with nonprompt leptons that pass the tight lepton identification and isolation criteria, and jets misidentified as leptons, such as trilepton events coming from tt or Z +jets processes. We use a three-lepton extended version of the tight-to-loose technique to estimate the rate of nonprompt background events.

Prompt and misidentified lepton efficiencies
Prompt lepton efficiencies are the same in the same-sign dilepton and trilepton channels. Misidentified lepton efficiencies are obtained from measurements in the control region using events with exactly three leptons. The misidentified lepton efficiencies are obtained by calculating the minimum of a $\chi^2$ statistic from fits of the predicted background to data. The predicted background is the sum of the nonprompt background estimate and the prompt MC background. Specifically, we use the bins ($i$) of the lepton $p_{\mathrm{T}}$ distribution to calculate $\chi^2$:
$$\chi^2(r) = \sum_i \frac{\left[N^i_{\text{data}} - \left(N^i_{\text{NP}}(r) + N^i_{\text{MC}}\right)\right]^2}{N^i_{\text{NP}}(r) + N^i_{\text{MC}}}, \tag{2}$$
where $r$ represents the prompt and misidentified lepton efficiencies, $N_{\text{data}}$ is the number of events observed in data, $N_{\text{NP}}(r)$ is the number of nonprompt background events (as a function of $r$) estimated from data, and $N_{\text{MC}}$ is the number of prompt background events estimated from MC simulation.
Using Eq. (2), we calculate $\chi^2$ for each of the four flavor categories, while varying both the misidentified electron and muon efficiencies from 0.01 to 0.5, and sum the individual terms. The minimum of the $\chi^2$ per degree of freedom is found to be 1.58, which corresponds to misidentified electron and muon efficiencies of 0.20 ± 0.02 and 0.14 ± 0.01, respectively. The uncertainties are the standard deviations of a Gaussian probability distribution constructed from the $\chi^2$ values. Figure 4 shows distributions of lepton $p_{\mathrm{T}}$ and $S_{\mathrm{T}}$ in the control regions, where the nonprompt background is estimated using the misidentified lepton efficiencies that correspond to the minimum of the $\chi^2$.
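The scan over misidentified lepton efficiencies can be sketched as a simple grid search. The example below is a toy under stated assumptions: it uses a Pearson-style $\chi^2$ with the predicted yield in the denominator, a hypothetical linear model for the nonprompt yield (`toy_nonprompt`), invented bin contents, and one possible reading of the "Gaussian probability distribution constructed from the $\chi^2$ values", namely weights proportional to exp(-$\chi^2$/2). None of these details are taken from the analysis itself.

```python
import math


def chi2(n_data, n_nonprompt, n_mc):
    """Pearson-style chi2 summed over bins (assumed form)."""
    total = 0.0
    for d, nonprompt_i, mc_i in zip(n_data, n_nonprompt, n_mc):
        pred = nonprompt_i + mc_i
        if pred > 0:
            total += (d - pred) ** 2 / pred
    return total


def scan(n_data, n_mc, nonprompt_model, grid):
    """Evaluate chi2 on a 2D grid of (electron, muon) misidentified efficiencies."""
    results = {}
    for r_e in grid:
        for r_mu in grid:
            results[(r_e, r_mu)] = chi2(n_data, nonprompt_model(r_e, r_mu), n_mc)
    best = min(results, key=results.get)
    return best, results


def gaussian_width(results, index):
    """Standard deviation of one efficiency using weights exp(-chi2/2) (assumption)."""
    chi2_min = min(results.values())
    weights = {r: math.exp(-0.5 * (c - chi2_min)) for r, c in results.items()}
    norm = sum(weights.values())
    mean = sum(r[index] * w for r, w in weights.items()) / norm
    var = sum((r[index] - mean) ** 2 * w for r, w in weights.items()) / norm
    return math.sqrt(var)


# Toy inputs (invented numbers, for illustration only).
GRID = [round(0.01 * k, 2) for k in range(1, 51)]  # 0.01 ... 0.50
ELE_LOOSE = [400.0, 250.0, 120.0, 50.0]
MU_LOOSE = [300.0, 200.0, 90.0, 40.0]
N_MC = [30.0, 25.0, 15.0, 8.0]


def toy_nonprompt(r_e, r_mu):
    """Hypothetical linear dependence of the nonprompt yield on the efficiencies."""
    return [r_e * a + r_mu * b for a, b in zip(ELE_LOOSE, MU_LOOSE)]


N_DATA = [n + m for n, m in zip(toy_nonprompt(0.20, 0.14), N_MC)]
best, results = scan(N_DATA, N_MC, toy_nonprompt, GRID)
```

Because the toy data are generated from the model at (0.20, 0.14), the scan returns that grid point as `best`; with real data the minimum $\chi^2$ would be nonzero.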
We perform a closure test for the nonprompt background estimation by measuring the misidentified lepton efficiencies in a $\mathrm{t\overline{t}}$ MC sample. This measurement is used to predict the number of events with three tight leptons, two of which are prompt and one nonprompt. The following discrepancies are observed between the numbers of observed and predicted events: 28% in the eee channel, 31% in the eeµ channel, 17% in the eµµ channel, and 20% in the µµµ channel.
In addition, we perform misidentification efficiency measurements using the $\chi^2$ minimization method described above in the $\mathrm{t\overline{t}}$ MC sample in both the control region and signal region selections. We observe a change of 0.04 in the misidentified electron efficiency between regions and a negligible change in the misidentified muon efficiency. The change in the misidentified electron efficiency is assigned as a systematic uncertainty. The misidentified lepton efficiencies are taken to be $p_{\mathrm{T}}$-independent, and any dependence on η is included as a systematic uncertainty.

Systematic uncertainties
We consider sources of systematic uncertainties that can affect the normalization and/or the shape of expected background distributions. A summary of the systematic uncertainties and how they are applied to signal and background samples can be found in Table 3.
The uncertainty in the integrated luminosity is 2.5% [70] and is applied to all samples. Lepton reconstruction, identification, and isolation efficiency scale factor uncertainties are applied based on the number of leptons in each channel. Trigger efficiency uncertainties are applied as a function of lepton flavor, $p_{\mathrm{T}}$, and η in the single-lepton channel, and as flat percentages in the SS dilepton and trilepton channels. In the single-lepton channel a 15% uncertainty is applied to the cross section of diboson samples [71][72][73], and a 16% uncertainty is applied to single tW production.
In the SS dilepton channel, closure tests are performed in the $\mathrm{t\overline{t}}$ MC simulation by comparing the nonprompt background predicted with the tight-to-loose method to the nonprompt background observed from truth information; based on these tests, an uncertainty of 50% is applied to the nonprompt background yield. An uncertainty of 30% is applied to the OS prompt background to account for possible $p_{\mathrm{T}}$ variations in the rate of charge misidentification within the $p_{\mathrm{T}}$ bins, and for differences in rates of charge misidentification calculated in Drell-Yan versus $\mathrm{t\overline{t}}$ MC.
In the trilepton channel, an uncertainty in the nonprompt background yield is calculated by varying the misidentified lepton efficiencies by their uncertainties of 0.04 for electrons and 0.01 for muons. These are obtained by summing in quadrature the statistical uncertainties and the systematic uncertainties due to the possible discrepancies between misidentification efficiencies measured in the control region and in the signal region. This results in an uncertainty of 12-30% (4-12%) in the nonprompt background yield. From the closure test described in Section 7, we also apply an uncertainty of 17-31% in the nonprompt background yield based on discrepancies between the tight-to-loose method prediction and the observed yields in simulation. As an additional source of systematic uncertainty, we evaluate the remaining difference in yield between the background estimate and data in the control region, using the misidentification efficiencies measured in that region. These differences range from 2% in the µµµ channel to 35% in the eee channel. In the SS dilepton channel the muon fake rate can be modeled by a quadratic dependence on η, while the trilepton channel uses an η-independent value. The change in trilepton nonprompt background yield if an η-dependent muon fake rate is adopted is 12-33%, and an additional uncertainty is applied to account for this.
Table 3: Summary of values for normalization uncertainties and dependencies for shape uncertainties. The symbol σ denotes one standard deviation of the uncertainty and "env" denotes an envelope of values. Background from opposite-sign dilepton events is denoted "OS", background from nonprompt leptons is denoted "NP", while other backgrounds modeled from simulation are denoted "MC". For signals, theoretical uncertainties are labeled as "Shape" for shape-based searches, and "Accept." for counting experiments. Additionally, "CR" denotes control region and "RMS" denotes root mean square.
Finally, the prompt lepton efficiencies were calculated in a control sample selected using a trigger with less stringent lepton isolation requirements than those in the triggers used to select the trilepton channel events. Because the true prompt efficiency in the trilepton channel is expected to be slightly higher than the values used for the SS dilepton channel, an uncertainty is assigned by comparing the trilepton nonprompt background yields with yields obtained when using prompt lepton efficiencies of unity. These uncertainties range from 2 to 9% (1 to 7%) in the nonprompt background yield for electrons (muons), with the smallest values in the categories with only one lepton of a given flavor and the largest values in the same-flavor channels.
Uncertainties affecting both the shape and normalization of the distributions in multiple channels include uncertainties related to the jet energy scale, jet energy resolution, and b tagging and light-parton mistag rates [61]. The uncertainty due to the pileup simulation is evaluated by adjusting the total inelastic cross section ($\sigma_{\text{inel}}$) used to calculate the correction by ±4.6% [74].
The uncertainties in the PDFs used in MC simulations are evaluated from the set of NNPDF3.0 MC replicas [39]. Renormalization and factorization energy scale uncertainties are calculated by varying the corresponding scales up and down (both independently and simultaneously) by a factor of two and taking the envelope, or largest spread, of all observed variations as the uncertainty. These theoretical uncertainties are applied to the signal simulations primarily as shape uncertainties. The normalization uncertainty is small and associated with changes in acceptance. For backgrounds, the full theoretical uncertainties are applied. All common uncertainties are treated as correlated across the three analysis channels.
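The envelope of renormalization and factorization scale variations can be sketched as a per-bin maximum and minimum over the varied templates. The sketch below assumes that each variation is available as a list of bin contents keyed by the pair of scale factors, and that the nominal template is included in the envelope; the function name and input layout are hypothetical.

```python
def scale_envelope(nominal, variations):
    """Per-bin envelope (up, down) over scale-varied templates.

    `variations` maps (mu_R factor, mu_F factor) to a list of bin contents.
    The nominal template is included so that up >= nominal >= down (assumption).
    """
    up, down = [], []
    for i, nom in enumerate(nominal):
        values = [hist[i] for hist in variations.values()] + [nom]
        up.append(max(values))
        down.append(min(values))
    return up, down


# Six variations: each scale up/down independently, then both simultaneously.
# Bin contents are invented for illustration.
nominal = [100.0, 50.0]
variations = {
    (2.0, 1.0): [105.0, 52.0],
    (0.5, 1.0): [95.0, 47.0],
    (1.0, 2.0): [102.0, 51.0],
    (1.0, 0.5): [98.0, 49.0],
    (2.0, 2.0): [110.0, 56.0],
    (0.5, 0.5): [92.0, 45.0],
}
up, down = scale_envelope(nominal, variations)
```

In this toy example the simultaneous variations set the envelope in both bins, giving `up = [110.0, 56.0]` and `down = [92.0, 45.0]`.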
In the single-lepton channel we also associate shape uncertainties with the W tagging scale factors for the pruned mass scale and smearing, the $\tau_2/\tau_1$ selection efficiency, and its $p_{\mathrm{T}}$ dependence [37]. An uncertainty of 5% is applied to account for the effects of propagating corrections derived from the W mass peak to the Higgs mass peak. These corrections are anticorrelated between categories with and without H tags. The uncertainty in the generator-level top quark $p_{\mathrm{T}}$ reweighting is estimated as the difference between weighted and unweighted distributions. This uncertainty is excluded from fits because of strong correlations with the renormalization and factorization energy scale uncertainties. The uncertainty in the $H_{\mathrm{T}}$ scaling procedure is the difference between scaling functions obtained by fitting the inclusive-to-binned $H_{\mathrm{T}}$ ratio after shifting values up or down by their statistical uncertainties.

Results
The strongest overall sensitivity to $\mathrm{T\overline{T}}$ and $\mathrm{B\overline{B}}$ production is achieved by combining the three leptonic channels, since each channel is sensitive to different VLQ decay modes. Table 4 shows the selection efficiency for all three channels in each $\mathrm{T\overline{T}}$ or $\mathrm{B\overline{B}}$ decay mode, with respect to the total number of expected events for a given decay mode (e.g., tHtH). The most sensitive decay modes for each channel are noted in bold. Comparing efficiencies across decay modes, the single-lepton channel has the highest efficiency for decay modes with at least one T → bW decay, the SS dilepton channel is sensitive to B → tW decays, and the trilepton channel has high efficiency for decay modes with at least one T → tZ decay.
Table 4: Signal efficiencies in the single-lepton, same-sign dilepton, and trilepton channels, split into the six possible final states of both $\mathrm{T\overline{T}}$ and $\mathrm{B\overline{B}}$ production, for three mass points. Efficiencies, stated in percent, are calculated with respect to the expected number of events in the corresponding decay mode, before any selection. The most sensitive decay modes for each channel are noted in bold. The efficiency for bWbW events in the same-sign dilepton and trilepton channels is negligible, as is the efficiency for bZbZ events in the same-sign dilepton channel.
Distributions for the single-lepton channel categories are shown in Figs. 5 and 6. The distributions are binned such that the simulated background has a statistical uncertainty of <30% in each bin. Figure 7 shows the $S_{\mathrm{T}}$ distribution in each category of the trilepton channel. The slight excess of data in the low-$S_{\mathrm{T}}$ region is within the systematic uncertainty in the misidentified lepton efficiencies that describes the rate difference between the control and signal regions. Predicted and observed event yields for the single-lepton, SS dilepton, and trilepton channels are listed in Tables 5-7. The T quark distributions and event yields are for the singlet branching fraction benchmark.
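A binning that keeps the statistical uncertainty of the simulated background below 30% in each bin can be obtained by merging adjacent fine bins. The sketch below is one possible procedure, not necessarily the one used in the analysis: it merges from the high end of the distribution, takes the sum of weights and the sum of squared weights per fine bin as inputs, and folds any leftover low-end bins into the lowest merged bin. The function name and return format are hypothetical.

```python
import math


def merge_bins(sumw, sumw2, max_rel_unc=0.30):
    """Merge adjacent bins until sqrt(sumw2)/sumw < max_rel_unc in each merged bin.

    Returns a list of (start, stop, content, variance), with `stop` exclusive.
    """
    merged = []
    acc_w = acc_w2 = 0.0
    stop = len(sumw)
    # Walk from the high end, where yields are usually smallest.
    for i in range(len(sumw) - 1, -1, -1):
        acc_w += sumw[i]
        acc_w2 += sumw2[i]
        if acc_w > 0 and math.sqrt(acc_w2) / acc_w < max_rel_unc:
            merged.append((i, stop, acc_w, acc_w2))
            stop = i
            acc_w = acc_w2 = 0.0
    if stop > 0:
        # Leftover low-end bins did not reach the threshold on their own.
        if merged:
            _, last_stop, w, w2 = merged.pop()
            merged.append((0, last_stop, w + acc_w, w2 + acc_w2))
        else:
            merged.append((0, stop, acc_w, acc_w2))
    merged.reverse()
    return merged
```

For unweighted counts `[100, 50, 20, 8, 3, 1]` the three highest bins are merged into a single bin with 12 events, whose relative uncertainty of about 29% satisfies the requirement. With weighted events, folding leftover bins into a neighbor does not guarantee that the threshold is still met, so the result should be checked.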
No significant excess of data above the background prediction is observed.
[...] the SS dilepton channel, and $S_{\mathrm{T}}$ distributions for the four trilepton categories. Statistical uncertainties in the background estimates are treated using the Barlow-Beeston light method [77,78]. Other systematic uncertainties are treated as nuisance parameters, as listed in Table 3. Normalization uncertainties are given log-normal priors, and shape uncertainties with shifted templates are given Gaussian priors with a mean of zero and width of one. The signal cross section is assigned a flat prior distribution. Figure 8 shows 95% CL upper limits on the production of T and B quarks in the benchmark branching fraction scenarios. We exclude singlet T quark masses below 1200 GeV (1160 GeV expected), doublet T quark masses below 1280 GeV (1240 GeV expected), singlet B quark masses below 1170 GeV (1130 GeV expected), and doublet B quark masses below 940 GeV (920 GeV expected). Masses below 800 GeV were excluded in previous searches. For T and B quark masses in the range 800-1800 GeV, cross sections larger than 30.4-9.4 fb (21.2-6.1 fb) and 40.6-9.4 fb (101-49.0 fb) are excluded for the singlet (doublet) scenario. Figure 9 shows the expected and observed limits for scans over many possible T and B quark branching fraction scenarios. Depending on the branching fractions, lower limits on T and B quark masses range from 1140 to 1300 GeV, and from 910 to 1240 GeV, respectively.
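The role of the flat signal prior can be illustrated with a drastically simplified example: a single-bin counting experiment with a known background and no nuisance parameters. This is not the template fit used for the results above. The sketch integrates the Poisson likelihood over the signal yield numerically, and the function names, integration range, and step count are all hypothetical choices.

```python
import math


def poisson_likelihood(n_obs, mean):
    """Poisson probability of n_obs given a positive mean."""
    return math.exp(n_obs * math.log(mean) - mean - math.lgamma(n_obs + 1))


def bayesian_upper_limit(n_obs, background, cl=0.95, s_max=None, steps=20000):
    """Upper limit on the signal yield s >= 0 with a flat prior on s.

    The posterior is proportional to Poisson(n_obs | s + background).
    """
    if s_max is None:
        # Heuristic integration range, wide enough for small counts (assumption).
        s_max = 10.0 * (math.sqrt(n_obs + 1.0) + 1.0) + n_obs
    ds = s_max / steps
    # Midpoint rule on a uniform grid in s.
    posterior = [
        poisson_likelihood(n_obs, background + (k + 0.5) * ds) for k in range(steps)
    ]
    total = sum(posterior)
    running = 0.0
    for k, value in enumerate(posterior):
        running += value
        if running >= cl * total:
            return (k + 1) * ds
    return s_max
```

For zero observed events the posterior is a falling exponential in s, so the 95% upper limit is -ln(0.05), approximately 3.0 events, independent of the background. Dividing such a yield limit by the signal efficiency and the integrated luminosity would convert it into a cross section limit.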

Summary
A search has been presented for pair-produced vector-like T and B quarks in a data sample of proton-proton collisions recorded during 2016 by the CMS experiment, and corresponding to an integrated luminosity of 35.9 fb$^{-1}$. The search is performed in channels with one lepton, two same-sign leptons, or at least three leptons in the final state and makes use of techniques to identify Lorentz-boosted hadronically decaying W and Higgs bosons. Combining these channels, we exclude T (B) quarks at 95% confidence level with masses below 1200 (1170) GeV in the singlet branching fraction scenario and 1280 (940) GeV in the doublet branching fraction scenario. For other branching fraction scenarios this search excludes T (B) quark masses below 1140-1300 GeV (910-1240 GeV). This represents an improvement in sensitivity of typically 200-600 GeV, compared to previous CMS results. These results are the strongest exclusion limits to date for T quarks with B(tZ) greater than ≈0.5 and for B quarks with B(tW) less than ≈0.6.

Acknowledgments
We congratulate our colleagues in the CERN accelerator departments for the excellent performance of the LHC and thank the technical and administrative staffs at CERN and at other CMS institutes for their contributions to the success of the CMS effort. In addition, we gratefully acknowledge the computing centres and personnel of the Worldwide LHC Computing Grid for delivering so effectively the computing infrastructure essential to our analyses. Finally, we acknowledge the enduring support for the construction and operation of the LHC and the CMS detector provided by the following funding agencies: BMWFW and FWF (Aus