Search for pair production of vector-like T and B quarks in single-lepton final states using boosted jet substructure in proton-proton collisions at sqrt(s) = 13 TeV

A search for pair production of massive vector-like T and B quarks in proton-proton collisions at sqrt(s) = 13 TeV is presented. The data set was collected in 2015 by the CMS experiment at the LHC and corresponds to an integrated luminosity of up to 2.6 inverse femtobarns. The T and B quarks are assumed to decay through three possible channels into a heavy boson (either a W, Z or Higgs boson) and a third generation quark. This search is performed in final states with one charged lepton and several jets, exploiting techniques to identify W or Higgs bosons decaying hadronically with large transverse momenta. No excess over the predicted standard model background is observed. Upper limits at 95% confidence level on the T quark pair production cross section are set that exclude T quark masses below 860 GeV in the singlet, and below 830 GeV in the doublet branching fraction scenario. For other branching fraction combinations with B(tH) + B(bW) ≥ 0.4, lower limits on the T quark mass range from 790 to 940 GeV. Limits are also set on pair production of singlet vector-like B quarks, which can be excluded up to a mass of 730 GeV. These limits are among the most stringent to date for vector-like T quarks. The techniques showcased here for understanding highly boosted final states are important as the sensitivity to new particles is extended to higher masses.


Introduction
The discovery of a light mass Higgs boson (H) [1][2][3] motivates searches for new interactions and particles at the LHC [4]. Cancellation of the loop corrections to the Higgs boson mass without precise fine tuning of parameters requires new particles at the TeV scale. Such new particles are the bosonic partners of the top quark, in supersymmetric models, or the fermionic top quark partners predicted by many other theories, such as little Higgs [5,6] and composite Higgs [7][8][9][10] models. These heavy quark partners predominantly mix with the third-generation quarks of the standard model (SM) [11,12] and have vector-like transformation properties under the SM gauge group SU(2)_L × U(1)_Y × SU(3)_C, hence the term "vector-like quarks" (VLQ). While a chiral extension of the SM quark family has been strongly disfavored by precision electroweak studies at electron-positron colliders [13,14] and by observed production cross sections and branching fractions of the Higgs boson [15], models with VLQs are not excluded by present data.
We search for a vector-like T quark with charge 2/3 (in units of the electron charge) that is produced via the strong interaction in proton-proton collisions along with its antiquark, T̄. Many models in which VLQs appear assume that T quarks decay to three final states: bW, tZ, or tH [16]. Leading-order Feynman diagrams of these three processes are shown in Fig. 1, created with the tools of Ref. [17]. The partial decay widths depend on the particular model [18], so that the branching fractions of these decay modes can take on various possible values, with the sum of all three branching fractions equal to unity. An electroweak isospin singlet T quark is expected to have a branching fraction of approximately 50% for T → bW, and 25% for each of T → tZ and tH, and is used as a benchmark for figures and tables. A T quark in a weak isospin doublet has no decays to bW and equal branching fractions for tZ and tH decays [18][19][20]. As these are, however, not the only possible representations of T quarks, the final results are interpreted for many allowed branching fraction combinations.
Though this search is optimized for TT production, decays of vector-like bottom quark partners (B quarks) can produce similar topologies and BB production is also considered. The B quark with charge −1/3 is expected to decay to tW, bH, or bZ and can also transform either as a singlet or doublet under the electroweak symmetry group. The respective branching fractions are equal to those of the corresponding T quark decays to the same SM bosons. For this search we assume that only one new particle is present, either the T or B quark.
Most recently, searches for pair-produced T and B quarks were performed by both the ATLAS and CMS collaborations at √s = 8 TeV [21][22][23][24][25][26]. Depending on the assumed combination of branching fractions to the three decay modes, the CMS collaboration observed lower limits on the T quark mass with values ranging from 720 to 920 GeV and on the B quark mass with values ranging from 740 to 900 GeV at 95% confidence level (CL) [21,25]. The ATLAS collaboration found similar lower mass limits, so that vector-like T and B quarks with masses below 720 GeV are already excluded for all possible branching fraction combinations. We therefore only consider VLQ masses above 700 GeV in this search. The ATLAS collaboration has also searched for pair production of T and B quarks at √s = 13 TeV [27, 28].
We require one electron or one muon in the final state, along with several jets.All decay modes of the T and B quarks produce t quarks and/or W bosons, which are the dominant sources of leptons.In the high mass region that we consider, the decay products can have a large Lorentz boost and result in highly collinear final state particles.This search makes use of techniques to identify b quark jets and reconstruct hadronic decays of massive particles that are highly Lorentz-boosted in the reference frame of the TT system.The data are analyzed in two channels that are optimized for sensitivity to either boosted W or Higgs bosons, referred to as the "boosted W" and "boosted H" channels.The boosted W channel is most sensitive to scenarios where the T quark has a large branching fraction for bW decays (such as the electroweak singlet benchmark) while the boosted H channel has the highest sensitivity to scenarios with a large branching fraction to tH (such as the electroweak doublet benchmark).The T → tZ decay mode is not a particular target of this search, but Lorentz-boosted Z bosons decaying hadronically can be selected in either channel since the signatures are similar to those of boosted hadronic W or Higgs boson decays, thus providing some sensitivity to the tZ decay mode.

The CMS detector and event reconstruction
The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter (HCAL), each composed of a barrel and two endcap sections.Forward calorimeters extend the pseudorapidity (η) [29] coverage provided by the barrel and endcap detectors.Muons are measured in gas-ionization detectors embedded in the steel flux-return yoke outside the solenoid.
A particle-flow (PF) algorithm [30] is used to reconstruct and identify each individual particle in an event with an optimized combination of information from the various elements of the CMS detector.The energy of photons is directly obtained from the ECAL measurement, corrected for zero-suppression effects.The energy of electrons is determined from a combination of the electron momentum at the primary interaction vertex as determined by the tracker, the energy of the corresponding ECAL cluster, and the energy sum of all bremsstrahlung photons spatially compatible with originating from the electron track.The momentum resolution for electrons with transverse momentum p T ≈ 45 GeV from Z → e + e − decays ranges from 1.7% for low-bremsstrahlung electrons in the barrel region to 4.5% for showering electrons in the endcaps [31].The energy of muons is obtained from the curvature of the corresponding track.
Matching muons to tracks measured in the silicon tracker results in a relative transverse momentum resolution for muons with 20 < p T < 100 GeV of 1.3-2.0% in the barrel and better than 6% in the endcaps.The p T resolution in the barrel is better than 10% for muons with p T up to 1 TeV [32].The energy of charged hadrons is determined from a combination of their momenta measured in the tracker and the matching ECAL and HCAL energy deposits, corrected for zero-suppression effects and for the response function of the calorimeters to hadronic showers.Finally, the energy of neutral hadrons is obtained from the corresponding corrected ECAL and HCAL energy.
Jets are reconstructed from the individual particles produced by the PF event algorithm, clustered using the anti-k T algorithm [33,34] with distance parameters of 0.4 ("AK4 jets") or 0.8 ("AK8 jets").Jet momentum is defined as the vectorial sum of all particle momenta in the jet, and is found from simulation to be within 5 to 10% of the true momentum over the whole p T spectrum and detector acceptance.All jets are required to have |η| < 2.5 and AK4 (AK8) jets must have p T > 30 (200) GeV.An offset correction is applied to jet energies to take into account the contribution from additional proton-proton interactions within the same or nearby bunch crossings (pileup) [35].Jet energy corrections are derived from simulation, and are confirmed with in situ measurements of the energy balance in dijet and photon/Z(→ ee/µµ) + jet events [36].A smearing of the jet energy is applied to simulated events to mimic the energy resolution observed in data, typically 15% at 10 GeV, 8% at 100 GeV, and 4% at 1 TeV.Additional selection criteria are applied to each event to remove spurious jet-like features originating from isolated noise patterns in the HCAL [37], anomalously high energy deposits in certain regions of the ECAL, and cosmic ray and beam halo particles that are detected in the muon chambers.
The missing transverse momentum vector is defined as the projection on the plane perpendicular to the beams of the negative vector sum of the momenta of all reconstructed particles in an event. Its magnitude is referred to as E miss T. The energy scale corrections applied to jets are propagated to E miss T.
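The definition above can be illustrated with a minimal Python sketch. The function name and the `(px, py)` input format are assumptions made for this example; they are not part of the CMS reconstruction software, and the jet energy scale propagation is not included.

```python
import math


def missing_transverse_momentum(particles):
    """Return (mex, mey, met) in GeV from a sequence of (px, py) pairs.

    Hypothetical helper for illustration: each pair holds the transverse
    momentum components of one reconstructed particle.
    """
    particles = list(particles)
    sum_px = sum(px for px, _ in particles)
    sum_py = sum(py for _, py in particles)
    # The vector is the negative of the summed transverse momenta.
    mex, mey = -sum_px, -sum_py
    return mex, mey, math.hypot(mex, mey)
```

For two particles with transverse momenta (30, 0) and (0, 40) GeV, the sketch gives a magnitude of 50 GeV pointing opposite to their vector sum.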
A more detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in Ref. [29].

Data and simulated samples
The data used in this analysis were collected during 2015 when the LHC collided protons at √s = 13 TeV with a bunch spacing of 25 ns. The data set for the boosted W channel corresponds to an integrated luminosity of 2.3 fb⁻¹. The data set for the boosted H channel in the electron (muon) channel corresponds to 2.5 (2.6) fb⁻¹ and includes additional data collected with poor forward calorimeter performance where the E miss T has been re-computed excluding the affected region of the detector.
To compare the SM expectation with the experimental data, samples of events for all relevant SM background processes and the TT signal are produced using Monte Carlo (MC) simulation. Background processes are simulated using several matrix element generators. The POWHEG v2 generator [38][39][40][41] is used to simulate tt events, as well as single top quark events in the tW channel at next-to-leading order (NLO). The MADGRAPH5_aMC@NLO 2.2.2 generator [42] is used for generation at NLO of Drell-Yan + jets and tt + W events, as well as tt + Z events, and s- and t-channel production of single top quarks. The FxFx scheme [43] for merging matrix element generation to the parton shower is used. The MADGRAPH v5.2.2.2 generator is used with the MLM scheme [44] to generate W + jets, Drell-Yan + jets, and multijet events at leading order. PYTHIA 8.212 [45,46] is used for the simulation of multijet and diboson events.
The boosted W channel uses the NLO Drell-Yan + jets simulation and the MADGRAPH multijet simulation. The boosted H channel uses the MADGRAPH Drell-Yan + jets simulation, and the PYTHIA multijet simulation which is filtered for processes likely to pass the lepton selection in this channel. Background samples are grouped into three categories for presentation: "TOP", dominated by tt and including single top quark and tt + W/Z samples; "EW", dominated by W + jets and including Drell-Yan + jets and diboson samples; and "QCD", including multijet samples.
Signal samples for both TT and BB production are simulated using MADGRAPH for mass points between 700 and 1800 GeV in steps of 100 GeV. A narrow width of 10 GeV is assumed for the vector-like quarks. Predicted cross sections, which depend only on the vector-like quark mass, are computed at next-to-next-to-leading order (NNLO) with the TOP++ 2.0 program [47][48][49][50][51][52] and are listed in Table 1.
Parton showering and the underlying event for all simulated samples are obtained with PYTHIA using the CUETP8M1 tune [53,54]. To simulate the momentum spectrum of partons inside the colliding protons, the NNPDF3.0 [55] parton distribution functions (PDFs) are used. Detector simulation for all MC samples is performed with GEANT4 [56] and includes the effect of pileup.

Reconstruction methods
We perform a search for T quarks that decay to final states with an electron or a muon, and jets. Selected events must have one or more pp interaction vertices within the luminous region (longitudinal position |z| < 24 cm and radial position ρ < 2 cm), reconstructed using a deterministic annealing filter algorithm [57]. The primary interaction vertex is the vertex with the largest ∑ p 2 T from its associated jets, leptons, and E miss T. The number of pileup interactions differs between data and simulation, so simulated events are weighted to reflect the pileup distribution expected in data given a total inelastic cross section of 69 mb [58].
Two observables are useful in discriminating signal from background events, exploiting the fact that the decays of T quarks to single-lepton final states produce a large number of hadronic objects. The first is H T, defined as the scalar p T sum of all reconstructed AK4 jets with p T > 30 GeV and |η| < 2.4. The second is S T, defined as the scalar sum of E miss T, the p T of the lepton, and H T.
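A minimal Python sketch of these two observables follows. The function names and the `(pt, eta)` jet representation are assumptions for illustration only; the thresholds are the ones quoted in the text.

```python
def scalar_ht(jets, pt_min=30.0, abs_eta_max=2.4):
    """H_T: scalar pT sum of AK4 jets passing the pT and |eta| thresholds.

    `jets` is a sequence of (pt, eta) pairs in GeV (hypothetical format).
    """
    return sum(pt for pt, eta in jets if pt > pt_min and abs(eta) < abs_eta_max)


def scalar_st(jets, lepton_pt, met):
    """S_T: scalar sum of missing transverse momentum, lepton pT, and H_T."""
    return met + lepton_pt + scalar_ht(jets)
```

For example, jets with (pT, η) of (350, 0.5), (120, −1.2), (25, 0.1), and (80, 2.7) give H T = 470 GeV, because the last two fail the p T and |η| requirements, respectively. With a 60 GeV lepton and E miss T = 90 GeV, S T = 620 GeV.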

Lepton reconstruction and selection
This search requires one charged lepton, either an electron or a muon, to be reconstructed within the acceptance region of |η| < 2.4. The event must satisfy a single-electron or single-muon trigger. The choice of triggers is adapted to the particular final state targeted in each channel. In T → bW decays, the W boson is generally well separated from the associated bottom quark since the T quark has low p T compared to its mass, leading to a low level of hadronic activity in close proximity to the lepton. In contrast, a lepton originating from a top quark decay (e.g., from a T → tH decay) becomes increasingly collinear with the associated bottom quark as the T quark mass increases and the Lorentz boost of the top quark rises.
As a consequence of the above, the boosted W channel uses triggers selecting leptons that are isolated with respect to nearby PF candidates, either electron candidates with p T > 27 GeV and |η| < 2.1, or muon candidates with p T > 20 GeV.The triggers used in the boosted H channel do not require that the leptons are isolated.In the electron channel, events with at least one electron candidate with p T > 45 GeV, one AK4 jet with p T > 200 GeV, and another AK4 jet with p T > 50 GeV are selected by the trigger.The muon channel trigger selects events with a muon candidate with p T > 45 GeV and |η| < 2.1.Methods to evaluate lepton isolation efficiency after trigger selection are described below.
Additional lepton identification quality criteria are required to reduce the contribution from background events containing other particles misidentified as leptons.For electrons these quality requirements [31] combine variables measuring track quality, the association between the track and electromagnetic shower, shower shape, and the likelihood of the electron to originate from a photon.Electrons are identified in the boosted H channel using a set of selection criteria with an efficiency of ≈88% and misidentification rate of ≈7%.In the boosted W channel, two working points are defined based on a multivariate identification algorithm: a tight level with ≈88% efficiency (≈4% misidentification rate) and a loose level with ≈95% efficiency (≈5% misidentification rate).
Muons are reconstructed by fitting hits in the silicon tracker together with hits in the muon detectors [32].Identification algorithms consider the quality of this fit, the number or fraction of valid hits in the trackers and muon detectors, track kinks, and the minimum distance between the extrapolated track from the silicon tracker and the primary interaction vertex.Several working points are defined: the boosted W channel uses so-called "tight" ("loose") muons with ≈97% (100%) efficiency in the barrel region, and the boosted H channel uses "medium" muons with ≈99% efficiency in the barrel region.All muon identification working points have hadron misidentification rates of <1%.
Leptons that pass the requirements in the two channels are removed from jets that have an angular separation of ∆R < 0.4 from the lepton.This is done by matching PF candidates identified as leptons to the ones identified as jets and subtracting the four-momentum of a matched lepton candidate from the jet four-momentum.
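The subtraction step can be sketched in Python as follows. This is a simplified illustration: the function name, the `(E, px, py, pz)` tuple layout, and the use of integer candidate identifiers to express the PF-candidate matching are all assumptions, not the actual CMS implementation.

```python
def remove_lepton_from_jet(jet_p4, jet_candidate_ids, lepton_p4, lepton_id):
    """Subtract a matched lepton from a jet four-momentum.

    Four-momenta are (E, px, py, pz) tuples in GeV. `jet_candidate_ids` is
    the set of (hypothetical) PF candidate identifiers clustered in the jet.
    The jet is returned unchanged if the lepton is not among its candidates.
    """
    if lepton_id not in jet_candidate_ids:
        return jet_p4
    # Component-wise four-momentum subtraction.
    return tuple(j - l for j, l in zip(jet_p4, lepton_p4))
```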
In order to reduce the rate of background events that contain a soft lepton (e.g., from semileptonic bottom quark decays in multijet events), several metrics can be used to evaluate the isolation of a lepton from surrounding particles. In the boosted H channel, either an angular separation of ∆R(ℓ, j) > 0.4, or p rel T (ℓ, j) > 40 GeV is required. Here, ℓ denotes the highest p T lepton, j is the jet closest to that lepton in angular separation, and p rel T (ℓ, j) is the projection of the lepton momentum on the direction perpendicular to the jet momentum in the ℓ-j plane. These criteria, also referred to as "2D isolation", ensure a high signal efficiency for decays such as T → tH, with leptons produced close to jets, while rejecting a large fraction of the multijet background.
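A short Python sketch of the 2D isolation logic is given below. The helper names and the three-momentum tuple format are assumptions for illustration; the relative transverse momentum is computed here as the magnitude of the lepton momentum component perpendicular to the jet axis, |p(ℓ) × p(j)| / |p(j)|.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Angular separation, with the azimuthal difference wrapped to [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def pt_rel(lepton_p3, jet_p3):
    """Lepton momentum component perpendicular to the jet direction (GeV)."""
    lx, ly, lz = lepton_p3
    jx, jy, jz = jet_p3
    # Cross product of lepton and jet momenta.
    cx = ly * jz - lz * jy
    cy = lz * jx - lx * jz
    cz = lx * jy - ly * jx
    return math.sqrt(cx * cx + cy * cy + cz * cz) / math.sqrt(jx * jx + jy * jy + jz * jz)


def passes_2d_isolation(dr, ptrel, dr_min=0.4, ptrel_min=40.0):
    """Either requirement is sufficient, as in the boosted H channel."""
    return dr > dr_min or ptrel > ptrel_min
```

A lepton that lies inside a jet (∆R = 0.2) is still accepted if it carries at least 40 GeV of momentum transverse to the jet axis, which is what retains leptons from boosted top quark decays.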
In the boosted W channel, where fewer leptons with nearby b quarks are expected, isolation is evaluated using mini-isolation (I mini ), defined as the sum of the transverse momenta of PF candidates within a p T -dependent cone around the lepton, corrected for the effects of pileup and divided by the lepton p T. The radius of the isolation cone, R I, is defined as:
$$R_I = \frac{10\ \mathrm{GeV}}{\min(\max(p_T,\ 50\ \mathrm{GeV}),\ 200\ \mathrm{GeV})}. \qquad (1)$$
Using a p T -dependent cone size allows for greater efficiency at high energies where jets and leptons are more likely to overlap. "Tight" electrons (muons) must have I mini < 0.1 (0.2) while "loose" electrons and muons satisfy I mini < 0.4. In addition, the 2D isolation requirement is applied to remove any residual overlap between mini-isolated leptons and jets.
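The cone radius of Eq. (1) and the resulting isolation variable can be sketched in Python as below. The function names and the `(pt, dr)` candidate format are assumptions for this example, and the pileup correction mentioned in the text is deliberately omitted.

```python
def mini_iso_radius(lepton_pt):
    """Cone radius R_I = 10 GeV / min(max(pT, 50 GeV), 200 GeV)."""
    return 10.0 / min(max(lepton_pt, 50.0), 200.0)


def mini_isolation(lepton_pt, candidates):
    """Uncorrected mini-isolation.

    `candidates` is a sequence of (pt, dr) pairs, where dr is the angular
    separation of each PF candidate from the lepton (hypothetical format).
    No pileup correction is applied in this sketch.
    """
    radius = mini_iso_radius(lepton_pt)
    return sum(pt for pt, dr in candidates if dr < radius) / lepton_pt
```

The radius is 0.2 for leptons with p T ≤ 50 GeV, shrinks as 10 GeV/p T in between (0.1 at 100 GeV), and is fixed at 0.05 for p T ≥ 200 GeV.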
Scale factors that account for selection efficiency differences between data and simulation are calculated as a function of lepton p T and η using a "tag-and-probe" method [31,32,59].These were calculated in separate measurements for the single-lepton trigger, lepton identification, and I mini requirements.
These scale factors are applied to simulated events for both lepton flavors.For the 2D isolation requirement, no significant difference is found between the selection efficiencies in data and simulation and hence no scale factor is applied.

Hadronic W and H tagging
In the decay of a heavy T quark, particles are produced with high momentum and large Lorentz boost. The decay products of top quarks and W, Z, or Higgs bosons are therefore often collimated. This can be seen in Fig. 2, in which the angular separation ∆R between the products of simulated W → qq and H → bb decays is shown for several T quark masses. Even for the lightest considered mass point this separation often has values of ∆R < 0.8, where the decay products of heavy bosons can merge into a single AK8 jet.
A jet shape variable called "N-subjettiness" [60], denoted as τ N, is defined as the sum of the transverse momenta of k constituent particles weighted by their minimum angular separation from one of N subjet candidates (∆R N,k ), which are in a jet of characteristic radius R 0 :
$$\tau_N = \frac{1}{d_0} \sum_k p_{T,k}\, \min(\Delta R_{1,k}, \Delta R_{2,k}, \ldots, \Delta R_{N,k}), \qquad d_0 = \sum_k p_{T,k}\, R_0.$$
This variable quantifies the consistency of a jet with originating from an N-prong particle decay. The ratio τ 2 /τ 1 provides high sensitivity to two-prong decays such as W → qq. Jet grooming techniques ("pruning" and "soft drop") are used to remove soft and wide-angle radiation so that the mass of the hard constituents can be measured more precisely [61,62]. The pruning procedure reclusters the jet, removing soft or large-angle particles, while the soft drop algorithm recursively declusters the jet, removing sub-clusters until two subjets are identified within the AK8 jet. AK8 jets are reconstructed independently of AK4 jets, so they will frequently overlap. Unless otherwise stated, such overlapping jets are not removed when applying selections based on jet multiplicity.
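To make the N-subjettiness definition concrete, the following Python sketch evaluates τ N for a given set of subjet axes, assuming the standard normalization τ N = (1/d 0 ) Σ k p T,k min(∆R 1,k , ..., ∆R N,k ) with d 0 = Σ k p T,k R 0. The function names and input formats are assumptions for illustration, and the subjet axes are taken as an input rather than found by a clustering or minimization step as in a real analysis.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Angular separation, with the azimuthal difference wrapped to [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def n_subjettiness(constituents, axes, r0=0.8):
    """tau_N for constituents (pt, eta, phi) and N candidate axes (eta, phi).

    Assumes a non-empty constituent list. r0 = 0.8 corresponds to AK8 jets.
    """
    d0 = sum(pt for pt, _, _ in constituents) * r0
    total = 0.0
    for pt, eta, phi in constituents:
        # Each constituent is weighted by its distance to the nearest axis.
        total += pt * min(delta_r(eta, phi, a_eta, a_phi) for a_eta, a_phi in axes)
    return total / d0
```

For an idealized two-prong jet with two equal-p T constituents at η = ±0.3, a single central axis gives τ 1 = 0.375, while two axes aligned with the constituents give τ 2 = 0, so τ 2 /τ 1 is small, as expected for a W → qq-like jet.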
The AK4 jets and subjets of AK8 jets can be tagged as originating from b quarks based on information about secondary vertices and displaced tracks within the jet. The efficiency for tagging b hadron jets in simulation is approximately 65%, averaged over jet p T (slightly lower for subjets of AK8 jets), and the probability of mistagging a charm (light) quark jet is 13% (1%) [63]. Scale factors, which are functions of jet p T and flavor, are applied to account for efficiency differences between data and simulation.
To account for these differences, pruned jet mass scale factors and mass resolution smearing factors are applied in simulation to all AK8 jets.A τ 2 /τ 1 selection scale factor is applied in simulation to jets that are spatially matched to true boosted products of a hadronic W boson decay.
Higgs boson candidate jets are reconstructed by exploiting the significant branching fraction of the Higgs boson to bb pairs. AK8 jets are marked as "H tagged" if they have p T > 300 GeV, soft drop jet mass in the range 60-160 GeV, and if at least one of the two subjets from the soft drop algorithm is tagged as a bottom subjet.

5 Boosted H channel

Event selection and categorization
In this channel, one electron with p T > 50 GeV and |η| < 2.4, or one muon with p T > 47 GeV and |η| < 2.1 is required. In events with an electron, at least one AK4 jet with p T > 250 GeV and a second AK4 jet with p T > 70 GeV are required to select events with a nearly constant trigger efficiency. Furthermore, selected events must have S T > 800 GeV, at least three AK4 jets, and at least two AK8 jets, since we expect a hadronic decay of a boosted Higgs boson in each event along with at least one other hadronic t quark, W, Z, or further Higgs boson decay.
For the rejection of non top quark backgrounds, at least one b-tagged AK4 jet is required.
Distributions of the variables used in the H-tagging algorithm, as described in Section 4, are shown in Fig. 3.These distributions are from events that pass all selection criteria outlined above except for the b-tagging requirement, and that have the corrections described in Section 5.2 applied.The distribution of the number of b-tagged subjets for the highest p T AK8 jet with soft drop jet mass within 60-160 GeV is shown along with the mass of the highest p T AK8 jet with two b-tagged subjets, before the mass requirement.To illustrate the sensitivity of the H-tagging algorithm to the presence of boosted Higgs bosons, the TT signal with a mass of 1200 GeV is split into two curves: the solid curve shows TT events where at least one Higgs boson is present in the decay chain and the dashed curve shows TT events with only T → tZ or T → bW decays.It can be seen that signal events with at least one T → tH decay produce a clear peak at 125 GeV in the mass distribution of the H-tagged jet.Signal events without a Higgs boson in the decay chain have a less pronounced increase at 90 GeV because of hadronic Z boson decays.
Figure 3: Distributions of the number of b-tagged subjets of the highest p T H-tagged jet candidate with p T > 300 GeV and M jet in the range [60,160] GeV (left), and M jet of the highest p T H-tagged jet candidate with p T > 300 GeV and two subjet b tags (right). A T quark signal with M(T) = 0.8 TeV is shown (right), normalized to the predicted cross section and scaled by a factor of 20, with the singlet benchmark branching fractions assumed. The solid (dashed) curve shows TT events with at least one (zero) Higgs boson decay, where contributions from each decay mode are weighted to reflect the singlet branching fraction scenario. The uncertainty in the background includes the statistical and systematic uncertainties described in Section 7.
After passing the selection defined above, events are split into two exclusive categories, which depend on the number of b-tagged subjets of H-tagged jets, and are defined as follows:
• H2b: events with at least one H-tagged jet with exactly two b-tagged subjets.
• H1b: events with at least one H-tagged jet with exactly one b-tagged subjet.
To avoid overlap between the two categories, each event is first tested for the H2b category; only if it fails that test can it enter the H1b category.
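The categorization, including the ordering that keeps the two categories exclusive, can be sketched in Python as follows. The function names and the `(pt, softdrop_mass, n_btagged_subjets)` jet format are assumptions for illustration, and the mass window boundaries are treated as inclusive, which the text does not specify.

```python
def is_h_tagged(pt, softdrop_mass, n_btagged_subjets):
    """H tag: pT > 300 GeV, soft drop mass in 60-160 GeV, >= 1 b-tagged subjet."""
    return pt > 300.0 and 60.0 <= softdrop_mass <= 160.0 and n_btagged_subjets >= 1


def categorize_event(ak8_jets):
    """Return 'H2b', 'H1b', or None for a list of AK8 jets.

    Each jet is a (pt, softdrop_mass, n_btagged_subjets) tuple, with the
    number of b-tagged subjets being 0, 1, or 2 (hypothetical format).
    """
    counts = [n for pt, mass, n in ak8_jets if is_h_tagged(pt, mass, n)]
    # H2b is tested first so that an event can never enter both categories.
    if 2 in counts:
        return "H2b"
    if 1 in counts:
        return "H1b"
    return None
```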

Background modeling
To evaluate the modeling of tt and W + jets production, the dominant background processes, two control regions that are enriched in events from these processes are defined by modifying the event selection defined in Section 5.1. In the tt control region, at least two b-tagged jets are required instead of at least one. In the W + jets control region, the requirement of at least one b-tagged jet is inverted and events with any b-tagged jets are rejected. Events with an H-tagged jet are rejected in both control regions to reduce the signal contribution in these regions, and E miss T > 100 GeV is required to reject events from multijet production. The signal to background ratio is about six times smaller than the one in the H2b category in the tt control region and about 30 times smaller in the W + jets control region. Events are corrected for all known sources of discrepancies between the data and simulation such as differing reconstruction or tagging efficiencies. It is observed that jets have a harder p T spectrum in simulation, leading to significant discrepancies from observed distributions of quantities such as H T. The discrepancies in both control regions are well described by 2-parameter linear fits with negative slopes to the ratio between data and simulation in the H T distributions [65,66]. Modeling of the tt and W + jets background samples is corrected using the results of these fits. The S T distributions for both control regions are shown in Fig. 4 with all corrections applied.
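The H T -based correction can be illustrated with a minimal Python sketch of a two-parameter linear fit to a data/simulation ratio. This is an unweighted least-squares fit written for illustration; the function names are assumptions, the actual fit procedure and fitted parameters of the analysis are not given in the text, and the numbers in the usage note are invented.

```python
def fit_line(x, y):
    """Unweighted least-squares fit of y = a + b * x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def ht_weight(ht, intercept, slope):
    """Per-event weight applied to simulation as a function of H_T."""
    return intercept + slope * ht
```

With illustrative ratio values of 1.05, 1.00, 0.95, and 0.90 at H T = 500, 1000, 1500, and 2000 GeV, the fit returns an intercept of 1.1 and a slope of −10⁻⁴ per GeV. The negative slope lowers the weight of simulated events at high H T, which is the direction needed when the simulated jet p T spectrum is harder than in data.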
To evaluate the uncertainty in the normalization of the tt and W + jets background processes, a binned maximum likelihood fit [67] of the background-only hypothesis is performed in the two control regions using the THETA framework [68].All systematic uncertainties (discussed in more detail in Section 7) are accounted for, except for uncertainties in the rate of tt and W + jets backgrounds that are constrained using this fit.The resulting uncertainties in the normalizations of the two backgrounds are 8.7% for tt and 6% for W + jets.These uncertainties are included in the final statistical interpretation of the results (discussed in Section 8) as rate uncertainties.In both control regions, data and simulation agree within the systematic uncertainties described in Section 7.

6 Boosted W channel

Event selection
The selection in this channel is optimized for the identification of boosted W boson decays. Selected events are required to have no H-tagged jets, ensuring that the event sample in this channel is complementary to that for the boosted H channel and allowing a straightforward combination of the two channels. Events are selected that have one electron or muon, usually from the decay of a W boson in the T → bW decay mode or from a leptonic top quark decay in the T → tZ or tH decay modes. Electrons (muons) must have p T > 40 GeV, |η| < 2.1 (2.4) and pass the tight identification and isolation requirements described in Section 4. Events having additional loose electrons or muons with p T > 10 GeV are rejected.
Each event must have three or more AK4 jets, and the three highest p T jets must satisfy p T > 300, 150, and 100 GeV, respectively. Since a neutrino is expected from a leptonic W boson decay, E miss T is required to be greater than 75 GeV, which also significantly reduces the background from multijet events. Control regions are separated from the signal region based on the angular separation between the lepton and the second-highest p T jet in the event, ∆R(ℓ, j 2 ).
In both TT and background processes, the lepton is usually observed back-to-back with the highest transverse momentum AK4 jet, and in TT events the second-highest p T jet also tends to be back-to-back with the lepton, as seen in Fig. 5. The signal region selection requires ∆R(ℓ, j 2 ) > 1. Figure 5 shows the distribution of ∆R(ℓ, j 2 ) after all selection requirements except for ∆R(ℓ, j 2 ) > 1. All selection efficiency corrections for differences between data and simulation are applied, as well as the H T -based reweighting described in Section 5.2.
To maximize sensitivity to the presence of TT production, events are divided into 16 categories based on lepton flavor (e, µ), the number of b-tagged jets (0, 1, 2, ≥3), and the number of boosted W-tagged jets (0, ≥1). In events with no W-tagged jet, we require a fourth jet with p T > 30 GeV. Figure 6 shows the distributions used for tagging boosted W bosons as well as the number of b-tagged and W-tagged jets. The pruned mass distribution for AK8 jets with τ 2 /τ 1 < 0.6 shows a significant contribution of boosted W bosons in signal events weighted to correspond to the singlet branching fraction benchmark. The τ 2 /τ 1 distribution in AK8 jets with pruned mass between 65-105 GeV shows that W + jets and multijet backgrounds are concentrated at higher values, as expected for jets without substructure.
We finally analyze the minimum mass constructed from the lepton (ℓ) and a b-tagged AK4 jet, labeled min[M(ℓ, b)]. In leptonic top quark decays, forming a mass from two of the three decay products, the lepton and b quark jet, produces a sharp edge near the top quark mass. Therefore this distribution is particularly suited to identifying T → bW decays, where the corresponding edge forms at much higher masses, near M(T). In the categories with zero b-tagged AK4 jets, we consider the minimum mass of the lepton and any AK4 jet, denoted min[M(ℓ, j)]. This combination of discriminating variables provides the best sensitivity to low mass T quark production (below about 1 TeV) in the singlet branching fraction scenario. Figure 7 shows distributions of min[M(ℓ, j)] and min[M(ℓ, b)] after the final selection but before the likelihood fits described in Section 8.
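The 16-category scheme (2 lepton flavors × 4 b-tag bins × 2 W-tag bins) can be written out in a short Python sketch. The category labels and function name are hypothetical naming choices made for this example.

```python
from itertools import product

LEPTONS = ("e", "mu")
B_BINS = ("0b", "1b", "2b", "3pb")   # "3pb" stands for >= 3 b-tagged jets
W_BINS = ("0W", "1pW")               # "1pW" stands for >= 1 W-tagged jet

# All combinations of lepton flavor, b-tag bin, and W-tag bin.
CATEGORIES = [f"{lep}_{b}_{w}" for lep, b, w in product(LEPTONS, B_BINS, W_BINS)]


def category(lepton_flavor, n_btags, n_wtags):
    """Map an event to its category label; overflow goes to the last bin."""
    b_label = B_BINS[min(n_btags, 3)]
    w_label = W_BINS[min(n_wtags, 1)]
    return f"{lepton_flavor}_{b_label}_{w_label}"
```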
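The discriminating variable can be sketched in Python as below. The function names and the `(E, px, py, pz)` four-momentum tuples are assumptions for illustration; a single `require_btag` switch covers both min[M(ℓ, b)] and the fallback used in the zero-b-tag categories.

```python
import math


def invariant_mass(p4_a, p4_b):
    """Invariant mass of the sum of two (E, px, py, pz) four-momenta in GeV."""
    e, px, py, pz = (a + b for a, b in zip(p4_a, p4_b))
    m2 = e * e - px * px - py * py - pz * pz
    # Guard against small negative values from floating-point rounding.
    return math.sqrt(max(m2, 0.0))


def min_lepton_jet_mass(lepton_p4, jets, require_btag=True):
    """Minimum lepton-jet mass over AK4 jets.

    `jets` is a list of (p4, is_btagged) pairs (hypothetical format). With
    require_btag=False all jets are used, as in the zero-b-tag categories.
    Returns None when no jet qualifies.
    """
    candidates = [p4 for p4, is_btagged in jets if is_btagged or not require_btag]
    if not candidates:
        return None
    return min(invariant_mass(lepton_p4, p4) for p4 in candidates)
```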

Background modeling
To cross check the modeling of background processes, we consider two control regions enriched by two dominant background processes, W + jets and tt. To define these regions we invert the signal region requirement of ∆R(ℓ, j 2 ) > 1 and modify the requirement on the number of b-tagged jets to maximize either W + jets or tt yield. For an 800 GeV T quark we expect only 3 events in both control regions compared to a total background of 444, for a signal to background ratio that is a factor of ≈3 smaller than in the signal region.
The W + jets control region has zero b-tagged jets and events are categorized according to the number of W-tagged jets (0, ≥1). The tt region has one or more b-tagged jets and events are categorized according to the number of b-tagged jets (1, ≥2). The control regions show that simulation-based background predictions agree with data within the systematic uncertainties described in Section 7. Observed and predicted event yields in the control regions for all categories are compared as a closure test, and differences in yields are assigned as an additional systematic uncertainty. This uncertainty accounts for any background mismodeling after selection and scale factor application.
Figure 6: Distributions of (left-to-right, upper-to-lower) pruned jet mass for AK8 jets with τ 2 /τ 1 < 0.6, τ 2 /τ 1 for AK8 jets with pruned mass within 65-105 GeV, number of b-tagged AK4 jets, and number of W-tagged AK8 jets in the boosted W channel with all categories combined. Also shown are the distributions of TT signal events with T quark masses of 0.8 and 1.2 TeV, scaled by factors of 20 and 60, respectively, in the upper figures. The uncertainty in the background includes the statistical and systematic uncertainties described in Section 7.

Systematic uncertainties
We consider sources of systematic uncertainty that can affect the normalization and/or the shape of both background and signal distributions. A summary of these systematic uncertainties along with their numerical values and whether they are applied to signal or background samples can be found in Table 2.
The uncertainty in the integrated luminosity is 2.3% [69] and is applied to all simulated samples. Normalization uncertainties in the rates of SM processes include 20% for single top quark production and 15% for diboson production, based on CMS measurements [70,71]. For multijet production a rate uncertainty of 100% is assigned in the boosted H channel since the simulation used in this channel does not contain either the PDF or matrix element scale uncertainties, unlike those used in the boosted W channel. No rate uncertainty is applied to Z + jets production since for this process experimental and theoretical uncertainties are small compared to the energy scale and PDF uncertainties described below. Additionally, both channels derive normalization uncertainties for tt and W + jets samples from control regions, with values of 5-12% and 4-20% in the boosted W channel, and 8.7% and 6.0% in the boosted H channel. Trigger, lepton identification, and lepton isolation efficiency scale factor uncertainties are also applied as normalization uncertainties.
Uncertainties in both channels affecting the shape and normalization of the distributions include uncertainties related to jet energy scale, jet energy resolution, pruned or soft drop jet mass scale and resolution, and b tagging and light-flavor mistagging efficiencies. These are evaluated by raising and lowering their values with respect to the central values by one standard deviation of the respective uncertainties and recreating a distribution using shifted values at each step of the analysis. An additional uncertainty of 5% is applied in the boosted H channel to account for potential differences when propagating the jet mass scale and resolution scale factors, measured using hadronic W boson decays, to Higgs boson candidate jets. This uncertainty has been determined by comparing samples simulated with the PYTHIA 8 and HERWIG++ [72] (with the CUETP8M1 tune [53,54]) hadronization programs and evaluating the difference between the two programs in the jet mass distributions for hadronically decaying W and Higgs bosons. In the boosted W channel we also apply shape uncertainties to the W boson tagging corrections for the τ_2/τ_1 selection efficiency and its p_T dependence. To account for small differences in the H-tagging efficiency between the boosted W and boosted H channels, a 3% normalization uncertainty is assigned that is correlated with the b tagging uncertainty in the boosted H channel and anticorrelated in the boosted W channel.
The uncertainty due to pileup modeling is evaluated by varying by ±5% the total inelastic cross section used to calculate the pileup distribution. The systematic uncertainty in the H_T-based background reweighting procedure is taken to be the difference between the unweighted distribution and a distribution where the correction factor is applied twice.
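The reweighting uncertainty described above can be sketched as three templates built from the same events: the nominal one with the correction applied once, one with no correction, and one with the correction applied twice (the per-event weight squared). The Python sketch below assumes hypothetical per-event correction factors and a simple fixed-edge histogram; the helper names and numbers are illustrative only.

```python
def fill_histogram(values, weights, edges):
    """Weighted histogram with bin edges `edges`; values outside the range are dropped."""
    counts = [0.0] * (len(edges) - 1)
    for value, weight in zip(values, weights):
        for i in range(len(counts)):
            if edges[i] <= value < edges[i + 1]:
                counts[i] += weight
                break
    return counts

def ht_reweighting_templates(values, correction, edges):
    """Return (nominal, unweighted, twice): the correction applied once,
    not at all, and twice (weight squared)."""
    nominal = fill_histogram(values, correction, edges)
    unweighted = fill_histogram(values, [1.0] * len(values), edges)
    twice = fill_histogram(values, [w * w for w in correction], edges)
    return nominal, unweighted, twice

# Hypothetical inputs: observable values (GeV) and per-event H_T correction factors.
values = [120.0, 250.0, 260.0, 410.0]
correction = [1.0, 0.9, 0.9, 0.8]
edges = [0.0, 200.0, 400.0, 600.0]
nominal, unweighted, twice = ht_reweighting_templates(values, correction, edges)
```

The unweighted and twice-corrected templates bracket the nominal one bin by bin and can serve as the two shifted shapes for this uncertainty.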
The uncertainties in the PDFs used in MC simulation are evaluated from the set of NNPDF3.0 fitted replicas, following the standard procedure [55]. Renormalization and factorization scale uncertainties are calculated by varying the corresponding scales up or down (either independently or simultaneously) by a factor of two and taking as uncertainty the envelope, or largest spread, of all possible variations. These theoretical uncertainties are applied to the signal simulation as shape uncertainties, together with small normalization uncertainty contributions due to changes in acceptance.
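A minimal Python sketch of the scale-variation envelope follows. It enumerates the (renormalization, factorization) scale multipliers in which each scale is varied alone or both are varied together in the same direction, then takes the per-bin minimum and maximum over the varied templates. Excluding the opposite-direction pairs and including the nominal template in the envelope are assumptions of this sketch, and the template values are hypothetical.

```python
from itertools import product

def scale_variations():
    """(mu_R, mu_F) multipliers, dropping the nominal point (1, 1) and the
    opposite-direction pairs (0.5, 2) and (2, 0.5), whose product is also 1."""
    factors = (0.5, 1.0, 2.0)
    return [(r, f) for r, f in product(factors, repeat=2) if r * f != 1.0]

def envelope(nominal, variations):
    """Per-bin (low, high) envelope over the nominal and all varied templates."""
    low = [min([n] + [v[i] for v in variations]) for i, n in enumerate(nominal)]
    high = [max([n] + [v[i] for v in variations]) for i, n in enumerate(nominal)]
    return low, high

# Hypothetical two-bin templates for the nominal and three of the variations.
nominal = [10.0, 5.0]
varied = [[11.0, 4.5], [9.5, 5.2], [12.0, 5.1]]
low, high = envelope(nominal, varied)
```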
The PDF and scale variation uncertainties affect both the normalization and shape of background distributions for multijet (in the boosted W channel), Z + jets, and single top quark MC samples. For the tt and W + jets backgrounds the theoretical and H_T reweighting uncertainties dominate the total uncertainty in this search, and theoretical uncertainties are treated differently across the two channels. Changes of energy scale or parton momentum strongly influence H_T and therefore these uncertainties are correlated with the uncertainty in the H_T reweighting method. In the boosted H channel, only the uncertainty in the H_T reweighting procedure is considered as this uncertainty dominates over energy scale variations and PDF uncertainties, especially in the tails of the S_T distribution. In the boosted W channel the uncertainty in the H_T reweighting dominates over the PDF uncertainty, but is comparable in shape and magnitude to the scale variation uncertainty, with scale variations providing the dominant uncertainty at low values of min[M(ℓ, b)]. In this channel both H_T reweighting and scale variation uncertainties are considered for tt and W + jets backgrounds. All of these shared uncertainties are treated as correlated between the two analysis channels in the statistical interpretation of the results.
Table 2: Summary of the systematic uncertainties, along with numerical values and application to signal and/or background samples. The second column gives the magnitude of normalization uncertainties or the procedure used to evaluate shape uncertainties. The symbol σ indicates one standard deviation of the corresponding systematic uncertainty. Renormalization and factorization energy scale uncertainties are treated as shape-only for signal but include normalization uncertainties in background. Values stated for shape uncertainties indicate a representative range over the categories for the dominant backgrounds and/or signal.

Results
Signal efficiencies for all possible final states of TT and BB production in the boosted W and boosted H channels (after combining all categories in each channel) are listed in Table 3 for two signal hypotheses with a high and a low vector-like quark mass. The values are derived by dividing the number of signal events that have the corresponding decay mode in each category by the number of expected events in the same decay mode before any selection. It can be seen that the selection applied in the boosted H channel is most efficient if a Higgs boson is present in the final state, whereas the selection in the boosted W channel favors T → bW decays, thus showing how the combination of the two channels improves sensitivity to most branching fraction combinations of the T quark. For B quark decays the boosted W channel has high efficiency for the tW decays and reduced efficiency for the bZ/bH decays owing to the lack of semileptonic top quark decays. Similarly, the boosted H channel is most efficient for the bHtW final state since a leptonic decay is required as well as an H-tag.
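To make the efficiency definition concrete, the Python sketch below first splits pair production into its six final states using the singlet benchmark branching fractions for the T quark (50% bW, 25% tH, 25% tZ), then computes an efficiency as selected events divided by events expected in that final state before selection. The function names and the event counts in the example are hypothetical and do not come from Table 3.

```python
from itertools import combinations_with_replacement

def final_state_fractions(branching):
    """Fraction of pair-produced events in each unordered pair of decay modes.
    Mixed final states get a factor of 2 because either quark can take either mode."""
    fractions = {}
    for a, b in combinations_with_replacement(sorted(branching), 2):
        weight = branching[a] * branching[b]
        fractions[a + b] = weight if a == b else 2.0 * weight
    return fractions

def efficiency(n_selected, n_expected_before_selection):
    """Selected signal yield divided by the yield expected before any selection."""
    if n_expected_before_selection <= 0.0:
        raise ValueError("expected yield before selection must be positive")
    return n_selected / n_expected_before_selection

singlet = {"bW": 0.50, "tH": 0.25, "tZ": 0.25}
fractions = final_state_fractions(singlet)

# Hypothetical yields for one final state: 12 selected out of 400 expected.
example_efficiency = efficiency(12.0, 400.0)
```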
In Fig. 9, min[M(ℓ, j)] or min[M(ℓ, b)] distributions are shown for each of the 8 tagging categories in the boosted W channel after the final event selection, with the electron and muon channels combined. Figure 10 shows distributions of S_T in the H1b and H2b categories after combining the electron and muon channels. As these two variables provide good discrimination between signal and background in their respective categories, they are used for the final statistical interpretation of the data. In all plots, the TT signal distributions assume the singlet benchmark branching fractions. The event yields are given in Table 4.
After the final event selection, no significant excess above the SM expectations is observed in data. We set 95% CL upper limits on the cross section of TT production in various branching fraction scenarios. These limits are defined as Bayesian credible intervals [67] and are derived using the THETA [68] program. Statistical uncertainties due to the finite size of the MC samples are accounted for using the Barlow-Beeston lite method [73]. Systematic uncertainties are treated as nuisance parameters with log-normal priors for normalization uncertainties, Gaussian priors for shape uncertainties with shifted templates, and a flat prior on the signal cross section. The limits are then calculated by simultaneously fitting the binned marginal likelihoods obtained from the min[M(ℓ, b)] distributions in all boosted W categories and the S_T distributions in all boosted H categories. This creates a combined search with 20 categories after dividing into electron and muon channels: 16 categories from the boosted W channel and 4 categories with a boosted Higgs boson. The systematic uncertainties for these categories are correlated, as described in Section 7.
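The idea of a Bayesian upper limit with a flat prior on the signal can be illustrated with a deliberately simplified toy: a single-bin counting experiment with a known background and no nuisance parameters. The Python sketch below integrates the Poisson likelihood over the signal yield numerically and returns the yield below which 95% of the posterior lies. It is not the THETA procedure used in the search, which fits binned templates in 20 categories with nuisance parameters; the function names, grid settings, and inputs are assumptions.

```python
import math

def _log_poisson_kernel(n_obs, mu):
    """log of mu^n * exp(-mu); the n! term is dropped because it cancels in ratios."""
    if mu <= 0.0:
        return 0.0 if n_obs == 0 else float("-inf")
    return n_obs * math.log(mu) - mu

def bayesian_upper_limit(n_obs, background, cl=0.95, s_max=50.0, steps=20000):
    """Upper limit on the signal yield s for one counting bin, with a flat prior
    on s >= 0 truncated at s_max and a background known exactly."""
    ds = s_max / steps
    grid = [i * ds for i in range(steps + 1)]
    logs = [_log_poisson_kernel(n_obs, s + background) for s in grid]
    peak = max(logs)
    like = [math.exp(value - peak) for value in logs]  # rescaled to avoid overflow
    # Cumulative posterior via the trapezoidal rule.
    cumulative = [0.0]
    for i in range(1, len(grid)):
        cumulative.append(cumulative[-1] + 0.5 * (like[i - 1] + like[i]) * ds)
    target = cl * cumulative[-1]
    for i in range(1, len(grid)):
        if cumulative[i] >= target:
            # Linear interpolation inside the grid cell that contains the target.
            frac = (target - cumulative[i - 1]) / (cumulative[i] - cumulative[i - 1])
            return grid[i - 1] + frac * ds
    return s_max

limit_zero_observed = bayesian_upper_limit(0, 0.0)
```

For zero observed events the posterior is a pure exponential, so the limit equals −ln(0.05) ≈ 3.0 events whatever the background, which gives a quick check of the numerical integration. Dividing a yield limit by the integrated luminosity and the signal efficiency converts it into a cross section limit.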
Results for the individual channels are shown in Fig. 11. The boosted W channel excludes T quarks decaying only to bW with masses below 910 GeV (870 GeV expected), and the boosted H channel excludes T quarks decaying only to tH for masses below 890 GeV (860 GeV expected). In Fig. 12 we present combined 95% CL upper limits on the TT production cross section for two VLQ benchmark branching fraction combinations: singlet (50% bW, 25% tZ/tH) and doublet (50% tZ/tH). For an electroweak singlet T quark, the observed (expected) upper limits on the production cross section range from 0.26 to 0.04 pb (0.31 to 0.04 pb) and we exclude masses below 860 GeV (790 GeV). For a doublet T quark, the observed (expected) upper limits on the production cross section range from 0.37 to 0.04 pb (0.34 to 0.03 pb) and we exclude masses below 830 GeV (780 GeV). The corresponding benchmarks for B quark production are shown in Fig. 13, and we can exclude masses below 730 GeV (720 GeV expected) for the singlet branching fraction combination while for the doublet scenario, no lower mass limit above 700 GeV was observed. Sensitivity to BB production in this search is limited by the single lepton selection efficiency for bZ and bH decays, as noted above. The combinations benefit from the difference in discriminating variables between the channels: the min[M(ℓ, b)] distributions used in the boosted W channel provide good sensitivity to low-mass T quarks, while the peaking signal shape in the S_T distribution drives the combination at high masses. The observed exclusion limits are stronger than expected due to an over-prediction of the background that remains after the H_T-based reweighting, particularly in categories with a W-tagged jet and several b-tagged jets. This effect is not significant given the systematic uncertainty in the reweighting procedure. Figure 14 shows expected and observed exclusion limits at 95% CL on the T quark mass, for a scan of possible branching fractions: we set lower mass limits with values ranging from 790 to 940 GeV for combinations with B(T → tH) + B(T → bW) ≥ 0.4. Compared to the combination of many leptonic and hadronic search channels in √s = 8 TeV collision data corresponding to an integrated luminosity of 19.7 fb⁻¹, the current combination of two single lepton channels produces similar expected exclusion limits. This represents an improved sensitivity to TT pair production at √s = 13 TeV due to the increase in the TT production cross section from 8 to 13 TeV as well as to significant improvements in techniques for identifying boosted hadronic massive-particle decays. For branching fraction scenarios with B(T → tH) + B(T → bW) ≥ 0.4 these results extend the excluded mass range of the 8 TeV search by up to 160 GeV.
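A lower mass limit of this kind is commonly read off as the mass at which the cross section upper limit curve crosses the predicted cross section. The Python sketch below illustrates that step under the assumption of linear interpolation in the logarithm of the limit-to-theory ratio between adjacent mass points; the interpolation choice, the function name, and all numbers in the example are hypothetical and are not taken from the figures.

```python
import math

def excluded_mass(masses, limit_xsec, theory_xsec):
    """Mass at which the upper limit first rises above the theory cross section.
    Interpolates log(limit / theory) linearly between neighbouring mass points.
    Returns None if no crossing from excluded to allowed is found."""
    ratios = [math.log(lim / thy) for lim, thy in zip(limit_xsec, theory_xsec)]
    for i in range(1, len(masses)):
        if ratios[i - 1] < 0.0 <= ratios[i]:
            frac = -ratios[i - 1] / (ratios[i] - ratios[i - 1])
            return masses[i - 1] + frac * (masses[i] - masses[i - 1])
    return None

# Hypothetical inputs: masses in GeV, cross sections in pb.
masses = [700.0, 800.0, 900.0, 1000.0]
limit_xsec = [0.1, 0.1, 0.1, 0.1]
theory_xsec = [0.4, 0.2, 0.05, 0.0125]
mass_limit = excluded_mass(masses, limit_xsec, theory_xsec)
```

A return value of None corresponds to the case where the limit lies above the prediction at every tested mass, that is, no mass exclusion in the scanned range.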

Summary
The first search by CMS for pair-produced vector-like T and B quarks at √s = 13 TeV is presented, using data from proton-proton collisions recorded in 2015 corresponding to integrated luminosities of 2.3-2.6 fb⁻¹. The search requires at least one lepton in the final state and is optimized for cases where a T quark decays to a boosted W or Higgs boson. No excess above the standard model background is observed and 95% confidence level upper limits are placed on the cross section of TT and BB production. For an electroweak singlet T quark, masses below 860 GeV are excluded, and for a doublet T quark, masses below 830 GeV are excluded. Considering other possible branching fraction combinations for T quarks, and assuming that the sum of the branching fractions to bW, tH and tZ is equal to unity, we set lower mass limits that range from 790 to 940 GeV for combinations with B(T → tH) + B(T → bW) ≥ 0.4. These results extend the sensitivity of previous CMS searches for many possible T quark decay scenarios, and showcase the importance of new techniques for understanding highly-boosted final states in extending searches for new particles to higher masses.

Figure 14: The expected (left) and observed (right) 95% CL lower limits (Bayesian) on the T quark mass for a variety of T → tH and T → bW branching fraction combinations, indicated by the coordinates at the center of each box, after combining the boosted W and boosted H channels. A limit of <700 GeV indicates that this search is not sensitive to T quark decays with that branching fraction combination.
[27] ATLAS Collaboration, "Search for pair production of vector-like top quarks in events with one lepton, jets, and missing transverse momentum in √s = 13 TeV pp collisions with the ATLAS detector", (2017). arXiv:1705.10751. Submitted to JHEP.
[28] ATLAS Collaboration, "Search for pair production of heavy vector-like quarks decaying to high-p_T W bosons and b quarks in the lepton-plus-jets final state in pp collisions at √s = 13 TeV with the ATLAS detector", (2017). arXiv:1707.03347. Submitted to JHEP.

Figure 1: Examples of leading-order Feynman diagrams showing production of a TT pair with the T quark decaying to bW (left), tH (middle), and tZ (right).

Figure 2: Angular separations ∆R between the products of simulated W → qq (left) and H → bb (right) decay processes for three different mass points of the T quark. Even for the lowest mass point shown, the final state particles are typically emitted with a separation of ∆R < 0.8 and are merged into an AK8 jet.

Figure 4: Distributions of S_T in the tt (left) and W + jets (right) control regions of the boosted H channel after applying all corrections to their shape and normalization. The TT signal, shown for T quark masses of 0.8 and 1.2 TeV, is normalized to the theoretical cross section and the singlet benchmark branching fractions are assumed. The uncertainty in the background includes statistical and systematic uncertainties described in Section 7.

Figure 5: Distribution of ∆R(ℓ, j_2) in the boosted W channel after all selection requirements except for ∆R(ℓ, j_2) > 1. Also shown are the distributions of TT signal events with T quark masses of 0.8 and 1.2 TeV, scaled by factors of 20 and 60, respectively. The uncertainty in the background includes the statistical and systematic uncertainties described in Section 7.

Figure 8 shows distributions of min[M(ℓ, j)] in the W + jets control region and min[M(ℓ, b)] in the tt control region.

Figure 6: Distributions of (left-to-right, upper-to-lower) pruned jet mass for AK8 jets with τ_2/τ_1 < 0.6, τ_2/τ_1 for AK8 jets with pruned mass within 65-105 GeV, number of b-tagged AK4 jets, and number of W-tagged AK8 jets in the boosted W channel with all categories combined. Also shown are the distributions of TT signal events with T quark masses of 0.8 and 1.2 TeV, scaled by factors of 20 and 60, respectively, in the upper figures. The uncertainty in the background includes the statistical and systematic uncertainties described in Section 7.

Figure 7: Distributions of min[M(ℓ, j)] in events without b-tagged AK4 jets (left) and min[M(ℓ, b)] in events with ≥1 b-tagged AK4 jets (right) in the boosted W channel with all categories combined. Also shown are the distributions of TT signal events with T quark masses of 0.8 and 1.2 TeV, scaled by factors of 20 and 60, respectively. The uncertainty in the background includes the statistical and systematic uncertainties described in Section 7.

Figure 8: Distributions of min[M(ℓ, j)] in the W + jets control region of the boosted W channel (upper) for 0/≥1 W tag categories (left/right), and min[M(ℓ, b)] in the tt control region of the boosted W channel (lower) for 1/≥2 b tag categories (left/right). Also shown are the distributions of TT signal events with T quark masses of 0.8 and 1.2 TeV. The uncertainty in the background includes the statistical and systematic uncertainties described in Section 7.

Figure 9: Distributions of min[M(ℓ, j)] or min[M(ℓ, b)] in the combination of electron and muon channels in the boosted W categories with 0 (left) or ≥1 (right) W-tagged jets and (upper to lower) 0, 1, 2, or ≥3 b-tagged jets. Also shown are the distributions of TT signal events with T quark masses of 0.8 and 1.2 TeV. The uncertainty in the background includes the statistical and systematic uncertainties described in Section 7.

Figure 10: Distributions of S_T in the H1b (left) and H2b (right) categories in the combination of electron and muon channels. The TT signal, shown for T quark masses of 0.8 and 1.2 TeV, is normalized to the theoretical cross section and the singlet benchmark branching fractions are assumed. The uncertainty in the background includes the statistical and systematic uncertainties described in Section 7.

Figure 11: The expected and observed upper limits (Bayesian) at 95% CL on the cross section of TT production for 100% T → bW in the boosted W channel (left), and 100% T → tH in the boosted H channel (right). The theoretically predicted cross section for TT production calculated at NNLO is shown as a red line, with the uncertainties in the PDFs and renormalization and factorization scales indicated by the shaded area. Masses below 700 GeV were excluded previously.

Figure 12: The expected and observed upper limits (Bayesian) at 95% CL on the cross section of TT production for the singlet benchmark (left) and the doublet benchmark (right) after combining the boosted W and boosted H channels. The theoretically predicted cross section for TT production calculated at NNLO is shown as a red line, with the uncertainties in the PDFs and renormalization and factorization scales indicated by the shaded area. Masses below 700 GeV were excluded previously.

Figure 13: The expected and observed upper limits (Bayesian) at 95% CL on the cross section of BB production for the singlet benchmark (left) and the doublet benchmark (right) after combining the boosted W and boosted H channels. The theoretically predicted cross section for BB production calculated at NNLO is shown as a red line, with the uncertainties in the PDFs and renormalization and factorization scales indicated by the shaded area. Masses below 700 GeV were excluded previously.

Table 1: Predicted cross sections for pair production of T or B quarks for various masses. Uncertainties include contributions from energy scale variations and from the PDFs.

Table 3: Signal efficiencies in the boosted W and boosted H event categories, split into the six possible final states, of both TT and BB production for two illustrative mass points. Efficiencies are calculated with respect to the expected number of events in the corresponding final state before any selection. The relative uncertainty in the efficiencies after combining systematic and statistical uncertainties in the MC samples is about 8% in the boosted W categories and about 12% in the boosted H categories.

Table 4: Number of events in each category after combining the electron and muon channels. Uncertainties include statistical and systematic components from Table 2, with uncertainty in the total background yield accounting for correlations across background processes. Yields of TT signal assume the theoretically predicted production cross section within the singlet branching fraction scenario.