Search for pair production of heavy vector-like quarks decaying into high-$p_T$ $W$ bosons and top quarks in the lepton-plus-jets final state in $pp$ collisions at $\sqrt{s}$=13 TeV with the ATLAS detector

Abstract: A search is presented for the pair production of heavy vector-like $B$ quarks, primarily targeting $B$ quark decays into a $W$ boson and a top quark. The search is based on 36.1 fb$^{-1}$ of $pp$ collisions at $\sqrt{s}$ = 13 TeV recorded in 2015 and 2016 with the ATLAS detector at the CERN Large Hadron Collider. Data are analysed in the lepton-plus-jets final state, characterised by a high-transverse-momentum isolated electron or muon, large missing transverse momentum, and multiple jets, of which at least one is $b$-tagged. No significant deviation from the Standard Model expectation is observed. The 95% confidence level lower limit on the $B$ mass is 1350 GeV assuming a 100% branching ratio to $Wt$. In the SU(2) singlet scenario, the lower mass limit is 1170 GeV. The 100% branching ratio limits are found to be also applicable to heavy vector-like $X$ production, with charge +5/3, that decay into $Wt$. This search is also sensitive to a heavy vector-like $B$ quark decaying into other final states ($Zb$ and $Hb$) and thus mass limits on $B$ production are set as a function of the decay branching ratios.


Introduction
The discovery of the Higgs boson by the ATLAS and CMS collaborations is a major milestone in high-energy physics [1,2]. However, the underlying nature of electroweak symmetry breaking remains unknown. Naturalness arguments [3] require that, to avoid fine-tuning, quadratic divergences arising from radiative corrections to the Higgs boson mass are cancelled out by one or more new particles. Several such mechanisms have been proposed in theories beyond the Standard Model. In supersymmetry, the cancellation comes from assigning superpartners to the Standard Model (SM) bosons and fermions. Alternatively, Little Higgs [4,5] and Composite Higgs [6,7] models introduce a spontaneously broken global symmetry, with the Higgs boson emerging as a pseudo Nambu-Goldstone boson [8]. These latter models predict the existence of vector-like quarks (VLQs), defined as colour-triplet spin-1/2 fermions whose left- and right-handed chiral components have the same transformation properties under the weak-isospin SU(2) gauge group [9,10]. Depending on the model, vector-like quarks are produced in SU(2) singlets, doublets, or triplets of flavours T, B, Y or X, in which the first two have the same charge as the SM top quark and b-quark while the vector-like Y and X quarks have charge$^{1}$ −4/3 and +5/3, respectively. In addition, in these models, VLQs are expected to couple preferentially to third-generation quarks [9,11] and can have flavour-changing neutral-current decays at leading order in addition to the charged-current decays characteristic of chiral quarks. As a result, an up-type T quark can decay not only into a W boson and a b-quark, but also into a Z or Higgs boson and a top quark (T → Wb, Zt, and Ht). Similarly, a down-type B quark can decay into a Z or Higgs boson and a b-quark, in addition to decaying into a W boson and a top quark (B → Wt, Zb, and Hb). For each type, the sum of the three branching fractions is assumed to be 1, i.e. other decays are not considered. Due to their charge, vector-like Y quarks can only decay into Wb while vector-like X quarks must decay into Wt. To be consistent with the results from precision electroweak measurements, the mass-splitting between VLQs belonging to the same SU(2) multiplet is required to be small, but no requirement is placed on which member of the doublet is heavier [12]. Cascade decays such as T → WB → WWt are thus assumed to be kinematically forbidden. Decays of VLQs into final states with first- and second-generation quarks, although not favoured, are not excluded by precision electroweak or flavour measurements [13,14].
This search targets $B\bar{B}$ pair production with the subsequent decay mode $B \to Wt$ using the $pp$ collision data collected at the Large Hadron Collider (LHC) in 2015 and 2016 at a centre-of-mass energy of 13 TeV, although it is also sensitive to a wide range of branching ratios to the other two decay modes as well as to production of vector-like X quarks. Contrary to single production, the $B\bar{B}$ pair-production cross section depends only on the B quark mass. An example of a leading-order production diagram is shown in figure 1. Previous searches in this decay mode by the ATLAS and CMS collaborations did not observe a significant deviation from the SM predictions. Those searches excluded VLQ masses below 740 GeV for any combination of branching ratios and below 1020 GeV for the assumption of $\mathcal{B}(B \to Wt) = 1$ [15,16]. A recent search by the ATLAS Collaboration at $\sqrt{s}$ = 13 TeV, primarily targeting the T quark decaying into Wb, was also found to be sensitive to B and X quarks decaying into Wt. The results included interpretations which provide a 95% confidence-level observed (expected) lower limit on the B quark mass at 1250 (1150) GeV assuming a 100% branching ratio to Wt; in the SU(2) singlet scenario, the limit is 1080 (980) GeV [17]. In this context, the event selection for this new search is optimised for high-mass $B\bar{B}$ production with subsequent decay into two high-$p_T$ W bosons and two top quarks, where one of the four W bosons decays leptonically and the others decay hadronically. To suppress the SM background, boosted-jet reconstruction techniques [18,19] are used to improve the identification of hadronically decaying high-$p_T$ W bosons and top quarks. The decay products of a hadronically decaying high-momentum W boson are likely to be contained within a single large-radius jet. The two signal regions used in this search are based on the number of reconstructed large-radius jets. The first signal region aims to reconstruct the $B\bar{B}$ system using the mass of the purely hadronically decaying B candidate to discriminate between SM and VLQ events. The second, more inclusive, signal region uses a Boosted Decision Tree (BDT) to discriminate between SM and VLQ events. Finally, a profile likelihood fit is used to test for the presence of a VLQ signal as a function of the B quark mass and the decay branching ratios. The results are found to be equally applicable to either singlet or doublet weak-isospin configurations as well as to the decays of X quarks.

Figure 1. Example of a leading-order $B\bar{B}$ production diagram in the targeted $Wt$ decay mode.

$^{1}$ All electric charges are quoted in units of $e$.

ATLAS detector
The ATLAS detector [20] at the LHC is a multipurpose particle detector with a forward-backward symmetric cylindrical geometry that covers nearly the entire solid angle around the collision point. It consists of an inner detector surrounded by a thin superconducting solenoid providing a 2 T axial magnetic field, electromagnetic and hadronic calorimeters, and a muon spectrometer. The inner detector covers the pseudorapidity range$^{2}$ |η| < 2.5. It consists of a silicon pixel detector, including the insertable B-layer installed after Run 1 of the LHC [21,22], and a silicon microstrip detector surrounding the pixel detector, followed by a transition radiation straw-tube tracker. Lead/liquid-argon sampling calorimeters provide electromagnetic energy measurements with high granularity and a hadronic (steel/scintillator-tile) calorimeter covers the central pseudorapidity range (|η| < 1.7). The endcap and forward regions are instrumented with liquid-argon calorimeters for both the electromagnetic and hadronic energy measurements up to |η| = 4.9. The outer part of the detector consists of a muon spectrometer with high-precision tracking chambers for [...]

$^{2}$ The ATLAS Collaboration uses a right-handed coordinate system with its origin at the nominal interaction point (IP) in the centre of the detector and the z-axis along the beam pipe. The x-axis points from the IP to the centre of the LHC ring, and the y-axis points upwards. Cylindrical coordinates (r, φ) are used in the transverse plane, φ being the azimuthal angle around the z-axis. The pseudorapidity is defined in terms of the polar angle θ as $\eta = -\ln\tan(\theta/2)$. Angular distance is measured in units of $\Delta R \equiv \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2}$.

[...] between 1.5 × $m_{\mathrm{top}}$ and 3 × $m_{\mathrm{top}}$. The $t\bar{t}$ samples are normalised to the NNLO cross-section, including NNLO QCD corrections and soft-gluon resummation to NNLL accuracy, as performed for the signal samples.
Single top quark production (called 'single top' in the following) in the s-channel and in $Wt$ final states was also generated with Powheg-Box v2 interfaced with Pythia 6.428 [43], while single top production in the t-channel was generated with Powheg-Box v1 interfaced with Pythia 6.428 for the parton shower and hadronisation. Single-top samples were generated using the Perugia2012 tune [44] and the CT10 PDF set [38]. The diagram removal method was used to remove the overlap between NLO $Wt$ production and LO $t\bar{t}$ production [45]. The single-top cross-sections for the t- and s-channels are normalised to their NLO predictions using Hathor v2.1 [46,47], while for the $Wt$ final states the cross-section is normalised to its NLO+NNLL prediction [48]. For W+jets, Z+jets, and diboson (WW, WZ, ZZ) samples, the Sherpa 2.2.1 generator [49] was used with the CT10 PDF set. The W+jets and Z+jets production samples are normalised to the NNLO cross-sections [50][51][52]. For diboson production, the generator cross-sections (already at NLO) are used for the sample normalisation. The $t\bar{t}V$ background is modelled using samples produced with MadGraph5_aMC@NLO 2.1.1 interfaced with Pythia 8.186, using the A14 tune and the NNPDF2.3 LO PDF set. The $t\bar{t}V$ samples are normalised to their respective NLO cross-sections [42].
All simulated events, except those from Sherpa, use EvtGen v1.2.0 [53] for the modelling of b-hadron decays. All simulated event samples for the nominal predictions were produced using the ATLAS simulation infrastructure [54], using the full Geant 4 [55] simulation of the ATLAS detector. The alternative tt generator samples were processed with a fast simulation [56] of the ATLAS detector with parameterised showers in the calorimeters. Simulated events were then reconstructed with the same software as used for the data. Multiple overlaid pp collisions in the same or nearby bunch crossings (pile-up) were simulated at rates matching those in the data; they were modelled as low-p T multi-jet production using the Pythia 8.186 generator and the A2 tune [57]. Additional corrections are applied to the simulated samples to correct for residual deviations of efficiencies and resolutions from those observed in the data.

Analysis object selection
Reconstructed objects are defined by combining information from different detector subsystems. This section outlines the criteria used to identify and select the reconstructed objects used in the analysis. Events are required to have at least one vertex candidate with at least two tracks with p T > 400 MeV. The primary vertex is taken to be the vertex candidate with the largest sum of squared transverse momenta of all associated tracks.
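The primary-vertex choice described above can be illustrated with a minimal sketch. The input format (one list of track $p_T$ values per vertex candidate) and the function name are hypothetical and are not taken from the ATLAS software; the sum of squared $p_T$ is taken here over all tracks associated with a candidate, which is one reading of the text.

```python
def select_primary_vertex(vertices, min_tracks=2, min_track_pt=0.4):
    """Return the index of the primary vertex, or None if there is no candidate.

    `vertices` is a list of lists of track pT values in GeV (hypothetical
    input format). A vertex is a candidate if it has at least `min_tracks`
    tracks with pT above `min_track_pt` (400 MeV). Among candidates, the one
    with the largest sum of squared track pT is chosen.
    """
    best_index, best_sum = None, -1.0
    for index, track_pts in enumerate(vertices):
        n_good = sum(1 for pt in track_pts if pt > min_track_pt)
        if n_good < min_tracks:
            continue  # not a valid vertex candidate
        sum_pt2 = sum(pt * pt for pt in track_pts)
        if sum_pt2 > best_sum:
            best_index, best_sum = index, sum_pt2
    return best_index
```

An event for which the function returns `None` would fail the vertex requirement.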
To reconstruct jets, three-dimensional energy clusters in the calorimeter [58], assumed to represent massless particles coming from the primary vertex, are grouped together using the anti-k t clustering algorithm [59][60][61] with a radius parameter of 0.4 (1.0) for small-R (large-R) jets. Small-R jets and large-R jets are clustered independently using the same inputs.

JHEP08(2018)048
Small-R jets are calibrated using an energy- and η-dependent calibration scheme, with in situ corrections based on data [62], and are selected if they have $p_T$ > 25 GeV and |η| < 2.5. A multivariate jet vertex tagger (JVT) selectively removes small-R jets below 60 GeV that are identified as having originated from pile-up collisions rather than the hard scatter [63]. Jets containing b-hadrons are identified via an algorithm that uses multivariate techniques to combine information from the impact parameters of displaced tracks as well as topological properties of secondary and tertiary decay vertices reconstructed within the jet [64,65]. A jet is considered b-tagged if the value for the multivariate discriminant is above the threshold corresponding to an efficiency of 77% for tagging a jet containing b-hadrons. The corresponding light-jet rejection factor is ∼ 130 and the charm-jet rejection factor is ∼ 6, as determined in simulated $t\bar{t}$ events [66].
Large-R jets are built using the energy clusters in the calorimeter and then trimmed [67] to mitigate the effects of contamination from pile-up and to improve background rejection. The jet energy and pseudorapidity are further calibrated to account for residual detector effects using energy- and pseudorapidity-dependent calibration factors derived from simulation. The $k_t$-based trimming algorithm reclusters the jet constituents into subjets with a more finely grained resolution with an R-parameter set to $R_{\mathrm{sub}}$ = 0.2. Subjets that contribute less than 5% to the $p_T$ of the large-R jets are discarded. The properties (e.g. transverse momentum and invariant mass) of the jet are recalculated using only the constituents of the remaining subjets. Trimmed large-R jets are only considered if they have $p_T$ > 200 GeV and |η| < 2.0. No dedicated overlap-removal procedure between large-R and small-R jets is performed. To identify large-R jets that are likely to have originated from the hadronic decay of W bosons ($W_{\mathrm{had}}$), jet substructure information is exploited using both the ratio of the energy correlation functions $D_2^{\beta=1}$ [68,69] and the combined jet mass [70]. The combined jet mass is constructed using a combination of the calorimeter-derived jet mass, based on calorimeter cell cluster constituents, and the track-assisted jet mass, where the calorimeter momentum is augmented by information from the tracks associated with the large-R jet. Selected large-R jets must pass both the substructure and mass requirements of the 50%-efficient W-tagging working point [18]. To reduce the contribution from the $t\bar{t}$ background, $W_{\mathrm{had}}$ candidates must not overlap any b-tagged small-R jets within ∆R < 0.75.
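The subjet-removal step of the trimming procedure can be sketched as follows. This is an illustration only: the $k_t$ reclustering into $R_{\mathrm{sub}}$ = 0.2 subjets is assumed to have been done already, the subjets are passed in as hypothetical `(E, px, py, pz)` tuples in GeV, and the 5% threshold is applied relative to the $p_T$ of the untrimmed jet, taken as the vector sum of all subjets.

```python
import math

def trim_jet(subjets, f_cut=0.05):
    """Return (pT, mass) of a large-R jet after trimming.

    `subjets` is a list of (E, px, py, pz) tuples in GeV, assumed to be the
    output of a k_t reclustering with R_sub = 0.2 (not performed here).
    Subjets carrying less than `f_cut` of the untrimmed jet pT are dropped.
    """
    if not subjets:
        return 0.0, 0.0
    # pT of the untrimmed jet from the vector sum of all subjets.
    total_px = sum(s[1] for s in subjets)
    total_py = sum(s[2] for s in subjets)
    jet_pt = math.hypot(total_px, total_py)
    kept = [s for s in subjets if math.hypot(s[1], s[2]) >= f_cut * jet_pt]
    if not kept:
        return 0.0, 0.0
    # Recompute the jet four-momentum from the surviving subjets only.
    e, px, py, pz = (sum(component) for component in zip(*kept))
    mass_squared = e * e - px * px - py * py - pz * pz
    return math.hypot(px, py), math.sqrt(max(mass_squared, 0.0))
```

In the analysis, a trimmed jet would then be kept only if it satisfies $p_T$ > 200 GeV and |η| < 2.0.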
Electrons are reconstructed from energy deposits in the electromagnetic calorimeter matched to inner detector tracks. Electron candidates are required to satisfy likelihood-based identification criteria [71] and must have $p_T^{\mathrm{ele}}$ > 30 GeV and |η| < 2.47. Electron candidates in the transition region between the barrel and endcap electromagnetic calorimeters, 1.37 < |η| < 1.52, are excluded. A lepton isolation requirement is implemented by calculating the quantity $I_R = \sum_{\Delta R(\mathrm{track},\,\mathrm{ele}) < R_{\mathrm{cut}}} p_T^{\mathrm{track}}$, where $R_{\mathrm{cut}}$ is the smaller of 10 GeV/$p_T^{\mathrm{ele}}$ and 0.2, and the track associated with the lepton is excluded from the calculation. The electron must satisfy $I_R < 0.06 \cdot p_T^{\mathrm{ele}}$. Additionally, electrons are required to have a track satisfying $|d_0|/\sigma_{d_0} < 5$ and $|z_0 \sin\theta| < 0.5$ mm, where $d_0$ is the transverse impact parameter and $z_0$ is the r-φ projection of the impact point onto the z-axis. An overlap-removal procedure prevents double-counting of energy between an electron and nearby jets by removing jets if the separation between the electron and jet is within ∆R < 0.2 and removing electrons if the separation is within 0.2 < ∆R < 0.4. Subsequently, a large-R jet is removed if the separation between the electron and the large-R jet is within ∆R < 1.0.
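The track-based isolation requirement described above can be sketched as a short function. The track container (a list of `(pt, eta, phi)` tuples that already excludes the lepton's own track) and the function names are hypothetical; the cone size and the 6% threshold are the values quoted in the text.

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance, with the azimuthal difference wrapped into [-pi, pi]."""
    dphi = math.remainder(phi1 - phi2, 2.0 * math.pi)
    return math.hypot(eta1 - eta2, dphi)

def is_isolated(lep_pt, lep_eta, lep_phi, tracks, max_cone=0.2, max_fraction=0.06):
    """Apply the variable-cone track isolation to a lepton (pT in GeV).

    `tracks` is a list of (pt, eta, phi) tuples; the track associated with
    the lepton itself is assumed to have been removed by the caller.
    """
    r_cut = min(10.0 / lep_pt, max_cone)  # cone shrinks with lepton pT
    i_r = sum(pt for pt, eta, phi in tracks
              if delta_r(lep_eta, lep_phi, eta, phi) < r_cut)
    return i_r < max_fraction * lep_pt
```

For a 100 GeV lepton the cone radius is 0.1, so a hard track at ∆R = 0.15 does not spoil the isolation, whereas for a 40 GeV lepton the cone saturates at 0.2.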
Muons are reconstructed from inner detector tracks matched to muon spectrometer tracks or track segments [72]. Candidate muons are required to satisfy quality specifications based on information from the muon spectrometer and inner detector. Furthermore, muons are required to be isolated using the same criterion that is applied to electrons and their associated tracks must satisfy $|z_0 \sin\theta| < 0.5$ mm and $|d_0|/\sigma_{d_0} < 3$. Muons are selected if they have $p_T$ > 30 GeV and |η| < 2.5. An overlap-removal procedure is also applied to muons and jets. If a muon and a jet with at least three tracks are separated by ∆R < min(0.4, 0.04 + 10 GeV/$p_T^{\mu}$) the muon is removed; if the jet has fewer than three tracks, the jet is removed. For a given reconstructed event, the negative vector sum of the $p_T$ of all reconstructed leptons and small-R jets is defined as the missing transverse momentum ($E_T^{\mathrm{miss}}$) [73]. An extra term is included to account for 'soft' energy from inner detector tracks that are not matched to any of the selected objects but are consistent with originating from the primary vertex.
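The muon-jet overlap removal described in this section amounts to a small decision rule, sketched below. The dictionary keys and return labels are hypothetical choices made for this illustration only.

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance, with the azimuthal difference wrapped into [-pi, pi]."""
    dphi = math.remainder(phi1 - phi2, 2.0 * math.pi)
    return math.hypot(eta1 - eta2, dphi)

def resolve_muon_jet_overlap(muon, jet):
    """Decide which object survives for one muon-jet pair.

    `muon` has keys "pt" (GeV), "eta", "phi"; `jet` has keys "eta", "phi",
    "n_tracks". Returns "keep_both", "remove_muon" or "remove_jet".
    """
    cone = min(0.4, 0.04 + 10.0 / muon["pt"])
    if delta_r(muon["eta"], muon["phi"], jet["eta"], jet["phi"]) >= cone:
        return "keep_both"
    # Overlapping pair: a jet with many tracks is treated as a genuine jet.
    return "remove_muon" if jet["n_tracks"] >= 3 else "remove_jet"
```

For a 50 GeV muon the cone is min(0.4, 0.24) = 0.24, so a jet at ∆R = 0.3 is left untouched.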

Analysis strategy
This search targets the decay of high-mass pair-produced VLQs, $B\bar{B}$, where one B quark decays into $Wt$ and the other decays into $Wt$, $Zb$ or $Hb$. Since a recent search by ATLAS [17], primarily targeting the T quark decays into $Wb$, has been reinterpreted to exclude VLQs decaying into $Wt$ at 95% confidence level (CL) for masses below 1250 GeV, this search focuses on the decays of high-mass VLQs. The final state consists of a high-$p_T$ charged lepton and missing transverse momentum from the decay of one of the W bosons, high-momentum large-R jets from hadronically decaying boosted W bosons, and multiple b-tagged jets. The event preselection is described in section 5.1 and the classification of events into two non-overlapping signal regions follows in section 5.2. The multi-jet background is estimated using a data-driven technique discussed in section 5.3.

Event preselection
Events were recorded using a combination of single-electron or single-muon triggers with isolation requirements. In 2015, the lowest $p_T$ threshold was 24 GeV; in 2016, it ranged from 24 to 26 GeV. Additional triggers without an isolation requirement were used to recover efficiency for leptons with $p_T$ > 60 GeV. Events are required to have exactly one lepton candidate (electron or muon, $N_{\mathrm{lep}}$) that must be geometrically matched to the triggering lepton. Signal events are expected to have a high jet multiplicity ($N_{\mathrm{jets}}$), since they include up to two b-jets ($N_{b\text{-jets}}$) as well as jets from the hadronic decay of up to three W bosons. Therefore, at least four small-R jets are required, of which at least one must be b-tagged. At least one large-R jet candidate, $N_{\mathrm{large\,jets}}$, with no W-tagging requirement applied, is required and the $E_T^{\mathrm{miss}}$ is required to be greater than 60 GeV. Signal events are expected to have characteristic high values in $S_T$, defined by the scalar sum of $E_T^{\mathrm{miss}}$ and the transverse momenta of the lepton and all small-R jets. In this context, $S_T$ is required to be greater than 1200 GeV. Assuming exactly one neutrino is present in each event, its four-momentum can be analytically determined using the missing transverse momentum vector $E_T^{\mathrm{miss}}$ and assuming the lepton-neutrino system has an invariant mass equal to that of the W boson. Nearly half of the events are found to produce two complex solutions. When complex solutions are obtained, a real solution is determined by minimising a $\chi^2$ parameter based on the difference between the mass of the lepton-neutrino system and the nominal value of the W boson mass. In the case of two real solutions, the solution with the smaller absolute value of the longitudinal momentum is used.
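The neutrino reconstruction described above reduces to a quadratic equation in the neutrino longitudinal momentum. The sketch below treats the lepton as massless and uses 80.4 GeV as an approximate W boson mass; both are assumptions of this illustration. For the complex case, the paper's $\chi^2$ minimisation is not specified in detail here, so the sketch swaps in a simplified version: it varies only the neutrino $p_z$ at fixed $E_T^{\mathrm{miss}}$, for which the minimum of the lepton-neutrino mass is reached when the neutrino has the same pseudorapidity as the lepton.

```python
import math

M_W = 80.4  # GeV, approximate W boson mass (assumed value for this sketch)

def neutrino_pz(lep, met_x, met_y, m_w=M_W):
    """Solve m(lepton, neutrino) = m_w for the neutrino p_z.

    `lep` is the (px, py, pz) of a lepton treated as massless, in GeV.
    Two real solutions: return the one with the smaller |p_z|.
    No real solution: return the p_z that brings the lepton-neutrino mass
    closest to m_w when only p_z is varied (a simplifying assumption).
    """
    px, py, pz = lep
    pt2 = px * px + py * py
    energy = math.sqrt(pt2 + pz * pz)
    met2 = met_x * met_x + met_y * met_y
    a = 0.5 * m_w * m_w + px * met_x + py * met_y
    discriminant = a * a - pt2 * met2
    if discriminant < 0.0:
        # Same pseudorapidity as the lepton: p_z(nu) = p_z(lep) * MET / pT(lep).
        return pz * math.sqrt(met2 / pt2)
    root = energy * math.sqrt(discriminant)
    solutions = ((a * pz + root) / pt2, (a * pz - root) / pt2)
    return min(solutions, key=abs)

def lnu_mass(lep, met_x, met_y, nu_pz):
    """Invariant mass of the massless lepton plus massless neutrino system."""
    px, py, pz = lep
    e_lep = math.sqrt(px * px + py * py + pz * pz)
    e_nu = math.sqrt(met_x * met_x + met_y * met_y + nu_pz * nu_pz)
    m2 = 2.0 * (e_lep * e_nu - px * met_x - py * met_y - pz * nu_pz)
    return math.sqrt(max(m2, 0.0))
```

When a real solution exists, `lnu_mass` evaluated at the returned $p_z$ reproduces the input W mass; in the complex case the reconstructed mass stays above it.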
After this selection, backgrounds with large contributions include tt, W + jets, and single-top events. Other SM processes, including diboson, Z + jets, ttV and multi-jet production, make a smaller but non-negligible contribution.
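The preselection requirements of this section can be summarised in a compact sketch. The event representation (a dictionary with the keys shown) is hypothetical; the thresholds are those quoted in the text, and trigger matching is assumed to have been applied upstream.

```python
def scalar_sum_st(met, lepton_pt, small_jet_pts):
    """S_T: scalar sum of MET, the lepton pT and all small-R jet pT (GeV)."""
    return met + lepton_pt + sum(small_jet_pts)

def passes_preselection(event):
    """Apply the preselection cuts to one event.

    `event` is a dict with hypothetical keys: "n_leptons", "lepton_pt",
    "small_jet_pts" (list of pT), "n_bjets", "n_large_jets" and "met".
    """
    st = scalar_sum_st(event["met"], event["lepton_pt"], event["small_jet_pts"])
    return (
        event["n_leptons"] == 1
        and len(event["small_jet_pts"]) >= 4
        and event["n_bjets"] >= 1
        and event["n_large_jets"] >= 1
        and event["met"] > 60.0
        and st > 1200.0
    )
```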

Classification of event topologies
Two orthogonal signal regions are defined. The reconstructed signal region (RECOSR) aims to reconstruct the BB system, whereas the more inclusive signal region (BDTSR) uses a BDT to discriminate between SM and VLQ events. For signal models with B(B → W t) = 1 the relative importance of both signal regions in the final combined fit is roughly equal. In contrast, for SU(2) singlet B scenarios the BDTSR dominates. A summary of the event selection requirements is given in table 1 and the two signal regions are described in detail in section 5.2.1 and section 5.2.2.

RECOSR definition
After the event preselection described in section 5.1, further requirements are applied to reduce the contamination from SM backgrounds in events with at least three reconstructed large-R jets, where at least one is required to be tagged as a $W_{\mathrm{had}}$. Events are required to have ∆R(lep, leading b-jet) ≥ 1, as the leading b-jet is found to be well separated from the lepton in VLQ candidates. In addition, $S_T$ is required to be greater than 1500 GeV. These requirements are found to maximise the expected sensitivity to VLQ masses above 1300 GeV for events with at least three reconstructed large-R jets.

Table 2. Event yields in the two signal regions before and after the background-only fit (see section 7.2). The quoted uncertainties include statistical and systematic uncertainties; for the $t\bar{t}$ background no cross-section uncertainty is included since it is a free parameter of the fit. The contributions from dibosons, Z+jets, $t\bar{t}V$ and multi-jet production are included in the 'Others' category for the RECOSR, whereas they are counted separately within the BDTSR. Modelling errors on the small $t\bar{t}V$ background are neglected. In the post-fit case, the uncertainties in the individual background components can be larger than the uncertainty in the sum of the backgrounds, which is constrained by data. Both signal models correspond to $m_B$ = 1300 GeV.
The expected number of events in the RECOSR for the background processes and signal hypothesis with mass m B = 1300 GeV are shown in table 2. For a signal model with B(B → W t) = 1, the acceptance times efficiency of the full event selection ranges from 0.2% to 4% for VLQ masses from m B = 500 to 1800 GeV. For the SU(2) singlet B scenario, for which B(B → W t) is approximately 50% for this mass range, the signal acceptance ranges from 0.1% to 2%. In this signal region, SM processes such as diboson, Z + jets, ttV , and multi-jet production, make a smaller but non-negligible contribution, and are therefore collectively referred to as 'Others'.
After the event selection, the four-momenta of the hadronic and semileptonic VLQ candidates are reconstructed using the selected large-R jets and the leptonically decaying W boson candidate. The selected large-R jets are proxies for the hadronically decaying W bosons and top quarks. The leptonically decaying W boson (W lep ) candidate is reconstructed from the lepton and reconstructed neutrino. The W lep is paired with a large-R jet to form the semileptonically decaying VLQ candidate. Two additional large-R jets are combined to form the hadronically decaying VLQ candidate. All possible large-R jet permutations are tested and the pairing that minimises the absolute value of the mass difference between the semileptonically and hadronically reconstructed VLQ candidates, |∆m|, is chosen. It should be noted that in cases where the lepton originates from the decay of a top quark, the reconstruction described above neglects the presence of the additional b-jet. This was found to nonetheless provide on average the best separation between signal and background.
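The pairing procedure described above can be sketched as a loop over large-R jet assignments. Four-momenta are represented as hypothetical `(E, px, py, pz)` tuples in GeV, and the leptonic W candidate is assumed to have been built already from the lepton and the reconstructed neutrino.

```python
import math
from itertools import combinations

def add_vectors(*vectors):
    """Component-wise sum of (E, px, py, pz) tuples."""
    return tuple(sum(component) for component in zip(*vectors))

def invariant_mass(p):
    e, px, py, pz = p
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

def reconstruct_vlq_candidates(w_lep, large_jets):
    """Choose the large-R jet assignment minimising |delta m|.

    One jet is paired with the leptonic W to form the semileptonic
    candidate; two of the remaining jets form the hadronic candidate.
    Returns (m_semileptonic, m_hadronic, abs_delta_m), or None if fewer
    than three large-R jets are available.
    """
    best = None
    for i, jet in enumerate(large_jets):
        others = large_jets[:i] + large_jets[i + 1:]
        m_lep = invariant_mass(add_vectors(w_lep, jet))
        for pair in combinations(others, 2):
            m_had = invariant_mass(add_vectors(*pair))
            delta_m = abs(m_lep - m_had)
            if best is None or delta_m < best[2]:
                best = (m_lep, m_had, delta_m)
    return best
```

The second element of the returned tuple corresponds to the hadronic candidate mass used as the final discriminant in this signal region.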
The final discriminating variable used in the statistical analysis is m had B , the reconstructed mass of the hadronically decaying vector-like B quark candidate. This is found to provide good expected signal sensitivity. Figure 2 (left) shows m had B for benchmark B quark signal models and the total expected background in the RECOSR after the reconstruction algorithm is applied. The reconstructed masses for the signal are shown to peak at the generated B quark masses. The tails arise from misreconstructed B candidates.

BDTSR definition
The BDTSR is defined by all events passing the preselection requirements (section 5.1), but vetoing events contained in the RECOSR. It contains events with fewer than three large-R jets and thus it is not possible to reconstruct the full $B\bar{B}$ system from the reconstructed objects alone. As a result, a BDT as implemented in the toolkit for multivariate data analysis with ROOT (TMVA) [74] is used to discriminate between potential signal and background events. For training and testing, a set of signal simulation samples assuming $\mathcal{B}(B \to Wt) = 1$ is used, combining signal masses ranging from 1050 GeV to 1600 GeV. Simulated $t\bar{t}$ events are used as background in the training, as they are the dominant background contribution in this region. Starting from a list of 75 variables describing the kinematics of the event, individual variables are removed through an iterative process and the performance of the BDT is evaluated, until a final set of 20 variables is selected. The procedure for removing variables is based on a combination of poor separation power, or high correlation with a variable with higher separation power, particularly if the correlation is similar between signal and background. Variables with poor agreement between data and simulation are also rejected. The 20 remaining input variables are well modelled by the simulation. The selected variables describe the global event characteristics as well as the kinematics and angular separation of the reconstructed objects. The five highest-ranked variables are: $S_T$, the invariant mass of the highest-$p_T$ large-R jet, the sphericity of the event, ∆R between the lepton and the sub-leading small-R jet, and ∆R between the leading b-tagged jet and the leading large-R jet.
The expected numbers of events in the BDTSR for the background processes and signal hypothesis with mass m B = 1300 GeV are shown in table 2. For a signal model with B(B → W t) = 1, the acceptance times efficiency of the full event selection ranges from 7% to 24% for VLQ masses from m B = 500 to 1800 GeV. For the SU(2) singlet B scenario the signal acceptance ranges from 4% to 16%.
The final discriminating variable used in the statistical analysis is the BDT discriminant, which is shown in figure 2 (right) for benchmark B quark signal models and the total expected background.

Multi-jet background estimation
The multi-jet background originates from either the misidentification of a jet or photon as a lepton candidate (fake lepton) or from the presence of a non-prompt lepton (e.g. from a semileptonic b- or c-hadron decay) that passes the isolation requirement. The multi-jet shape, normalisation, and related systematic uncertainties are estimated from data using the matrix method (MM) [76]. The MM exploits the difference in efficiency for prompt leptons to pass loose and tight quality requirements, obtained from W and Z boson decays, and non-prompt or fake lepton candidates, from the misidentification of photons or jets. The efficiencies, measured in dedicated control regions, are parameterised as functions of the lepton candidate $p_T$ and η, ∆φ between the lepton and jets, and the b-tagged jet multiplicity.
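As an illustration of the idea behind the matrix method, the sketch below uses the commonly quoted two-equation form with constant efficiencies. This form is an assumption of the example and is not spelled out in the text; the analysis itself parameterises the efficiencies in several variables, which in practice turns the calculation into a per-event weight. With $N_{\mathrm{loose}} = N_{\mathrm{real}} + N_{\mathrm{fake}}$ and $N_{\mathrm{tight}} = r N_{\mathrm{real}} + f N_{\mathrm{fake}}$, the fake yield in the tight selection follows by solving for $N_{\mathrm{fake}}$.

```python
def matrix_method_fake_yield(n_loose, n_tight, eff_real, eff_fake):
    """Estimate the fake or non-prompt lepton yield in the tight selection.

    `n_loose` counts events passing the loose lepton selection (tight
    events included) and `n_tight` those also passing the tight one.
    `eff_real` (r) and `eff_fake` (f) are the loose-to-tight efficiencies
    for prompt and for fake or non-prompt leptons, taken as constants here.
    """
    if eff_real <= eff_fake:
        raise ValueError("the method needs eff_real > eff_fake")
    n_fake_loose = (eff_real * n_loose - n_tight) / (eff_real - eff_fake)
    return eff_fake * n_fake_loose
```

For example, 900 prompt and 100 fake loose leptons with r = 0.9 and f = 0.2 give 830 tight events, from which the function recovers the 20 fake events in the tight sample.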
The event selection significantly reduces the contribution of the multi-jet background in the RECOSR, to the point where statistical uncertainties make the MM prediction unreliable. To obtain a reliable prediction, the requirement on the W-tagged large-R jet is removed. In this region the MM prediction and the small simulation-derived backgrounds (diboson, Z+jets and $t\bar{t}V$) are studied and their distribution shapes of the final discriminant $m_B^{\mathrm{had}}$ are found to be compatible. This selection is also used to determine the ratio of the multi-jet production to the small simulation-derived backgrounds. The ratio is then assumed to be the same in the RECOSR and is used to scale those small simulation-derived backgrounds to account for the additional contribution from multi-jet backgrounds. This scaling was found to be stable under small changes to the definition of the looser selection. In the RECOSR, the contribution from the multi-jet background to the total background is around 1.3%. In the BDTSR, in contrast, the contribution of the multi-jet background is taken directly from the MM prediction.

Systematic uncertainties
The systematic uncertainties are broken down into four broad categories: luminosity and cross-section uncertainties, detector-related experimental uncertainties, uncertainties in data-driven background estimations, and modelling uncertainties in simulated background processes. Each source of uncertainty is treated as a nuisance parameter in the fit of the hadronic B mass and BDT discriminant distributions, and shape effects are taken into account where relevant. Due to the tight selection criteria applied, the systematic uncertainties only mildly degrade the sensitivity of the search.

Luminosity and normalisation uncertainties
The uncertainty in the combined 2015+2016 integrated luminosity is 2.1%. It is derived, following a methodology similar to that detailed in ref. [77], from a preliminary calibration of the luminosity scale using x-y beam-separation scans performed in August 2015 and May 2016. This systematic uncertainty is applied to all backgrounds and signals that are estimated using simulated Monte Carlo events, which are normalised to the measured integrated luminosity.
Theoretical cross-section uncertainties are applied to the relevant simulated samples. The uncertainties for W/Z+jets and diboson production are 50% [78,79]. The uncertainty in the W+jets normalisation has a pre-fit impact of 8% on the measured signal strength for a B quark mass of 1.3 TeV ($\mathcal{B}(B \to Wt) = 1$). This same signal mass and branching ratio is used to quantify the impact of the uncertainties for the remainder of this section. For single top production, the uncertainties are taken as 7% [46,47]. The normalisation of $t\bar{t}$ is determined from the fit. For the data-driven multi-jet estimation, an uncertainty of 100% is assigned to the normalisation in the RECOSR, corresponding to the maximum range obtained by varying the requirements on $S_T$ and ∆R(lep, leading b-jet) when obtaining the multi-jet contribution from the 'Others' background. The corresponding uncertainty in the BDTSR is 50% and is evaluated by comparing the data with simulation in a region enriched in multi-jet events.

Detector-related uncertainties
The dominant sources of detector-related uncertainties in the signal and background yields relate to the small-R and large-R jet energy scales and resolutions. The small-R and large-R jet energy scales and their uncertainties are derived by combining information from test-beam data, LHC collision data and simulation [80]. In addition to energy scale and resolution uncertainties, there are also uncertainties in the large-R jet mass and substructure scales and resolutions. These are evaluated in a similar way to the jet energy scale and resolution uncertainties and are propagated to the W-tagging efficiencies. The uncertainty in the large-R jet kinematics due to differences between data and simulation seen in the large-R jet calibration analysis has the largest pre-fit impact on the measured signal strength, at ∼12%.
Other detector-related uncertainties come from lepton trigger efficiencies, identification efficiencies, energy scales and resolutions, the E miss T reconstruction, the b-tagging efficiency, and the JVT requirement. These have negligible pre-fit impact on the measured signal strength (<1%).

Generator modelling uncertainties
Modelling uncertainties are estimated for the dominant tt and single-top backgrounds by comparing simulated samples generated with different configurations, described in section 3. The effects of extra initial- and final-state gluon radiation are estimated by comparing simulated samples generated with enhanced or reduced initial-state radiation, changes to the $h_{\text{damp}}$ parameter, and different values of the radiation parameters. This uncertainty has a 30% and 20% normalisation impact on tt in the RECOSR and BDTSR, respectively, resulting in a pre-fit impact of ∼3% on the measured signal strength. The uncertainty in the fragmentation, hadronisation and underlying-event modelling is estimated by comparing two different parton shower models, Pythia and Herwig 7, while keeping the same hard-scatter matrix-element generator. This causes a 55% and 5% shift in the normalisation of tt in the RECOSR and BDTSR, respectively, resulting in a pre-fit impact of 9% on the measured signal strength. The uncertainty in the hard-scatter generation is estimated by comparing events generated with two different Monte Carlo generators, MadGraph5_aMC@NLO and Powheg-Box, while keeping the same parton shower model. This uncertainty has a 27% normalisation impact on tt in both signal regions, resulting in a pre-fit impact of ∼4% on the measured signal strength.
For single top production, the dominant contribution in this analysis is from Wt production and the largest uncertainty comes from the method used to remove the overlap between NLO Wt production and LO tt production. The default method of diagram removal is compared with the alternative method of diagram subtraction [45]. The full difference between the two methods is assigned as an uncertainty. This uncertainty has a 90% and 80% normalisation impact on single top in the RECOSR and BDTSR, respectively, resulting in a pre-fit impact of ∼16% on the measured signal strength.

Statistical interpretation
The binned distributions of the reconstructed mass of the hadronically decaying B quark candidate, $m_B^{\text{had}}$, in the RECOSR, and of the BDT discriminant in the BDTSR, are used to test for the presence of a signal. Hypothesis testing is performed using a modified frequentist method as implemented in RooStats [81, 82] and is based on a profile likelihood that takes into account the systematic uncertainties as nuisance parameters that are fitted to the data. A simultaneous fit is performed in the two signal regions. The number and edges of the bins are optimised to maximise the expected vector-like B quark sensitivity while ensuring the overall Monte Carlo statistical uncertainty in each bin remains below 30%.
The statistical analysis is based on a binned likelihood function L(µ, θ) constructed as a product of Poisson probability terms over all bins considered in the search. This function depends on the signal strength parameter µ, a multiplicative factor applied to the theoretical signal production cross-section, and θ, a set of nuisance parameters that encode the effect of systematic uncertainties in the signal and background expectations and are implemented in the likelihood function as Gaussian constraints. Uncertainties in each bin of the fitted distributions due to the finite size of the simulated event samples are also taken into account via additional dedicated fit parameters and are propagated to µ. There are enough events in the low mass and low BDT score regions, where the signal contribution is small, to obtain a data-driven estimate of the tt normalisation; hence the normalisation of the dominant tt background is included as an unconstrained nuisance parameter. Nuisance parameters representing systematic uncertainties are only included in the likelihood if either of the following conditions is met: the overall impact on the sample normalisation is larger than 1%, or the variation induces changes of more than 1% between adjacent bins. This reduction of the number of nuisance parameters is done separately for the two signal regions and for each template (signal or background). When the bin-by-bin statistical variation of a given uncertainty is significant, a smoothing algorithm is applied.
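The selection of nuisance parameters described above can be illustrated with a minimal sketch. The function name, the list-of-yields interface and, in particular, the reading of the second criterion (the ratio of varied to nominal yield changing by more than 1% from one bin to the next) are assumptions made for illustration; they are not taken from the analysis code.

```python
def keep_nuisance_parameter(nominal, varied, threshold=0.01):
    """Decide whether a systematic variation is kept for one template.

    nominal, varied: bin yields of one template (signal or background)
    in one signal region. All nominal yields are assumed to be > 0.
    Hypothetical helper, not part of any analysis framework.
    """
    # Criterion 1: overall effect on the template normalisation above threshold.
    norm_effect = abs(sum(varied) / sum(nominal) - 1.0)
    if norm_effect > threshold:
        return True

    # Criterion 2 (assumed interpretation): the relative variation changes
    # by more than the threshold between adjacent bins.
    ratios = [v / n for v, n in zip(varied, nominal)]
    return any(abs(r2 - r1) > threshold for r1, r2 in zip(ratios, ratios[1:]))
```

In such a scheme the function would be called once per systematic source, per template and per signal region, so that a variation can be dropped for one template and kept for another.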
The expected number of events in a given bin depends on µ and θ. The nuisance parameters θ adjust the expectations for signal and background according to the corresponding systematic uncertainties, and their values correspond to the values that best fit the data.
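As an illustration of the structure of L(µ, θ), the sketch below evaluates the negative log-likelihood of a toy binned model with a single Gaussian-constrained nuisance parameter. The linear response of the background to θ, the single nuisance parameter and all function names are simplifying assumptions; the actual fit uses many nuisance parameters, per-bin Monte Carlo statistical terms and a free tt normalisation.

```python
import math

def expected_yield(mu, theta, sig, bkg, frac_unc):
    # nu_i = mu * s_i + b_i * (1 + delta_i * theta), where delta_i is the
    # fractional effect of a one-standard-deviation shift in bin i (assumed linear).
    return [mu * s + b * (1.0 + d * theta) for s, b, d in zip(sig, bkg, frac_unc)]

def nll(mu, theta, data, sig, bkg, frac_unc):
    """Negative log-likelihood: Poisson terms per bin plus a Gaussian constraint."""
    total = 0.0
    for n, nu in zip(data, expected_yield(mu, theta, sig, bkg, frac_unc)):
        # -ln Pois(n | nu) = nu - n ln(nu) + ln(n!)
        total += nu - n * math.log(nu) + math.lgamma(n + 1.0)
    # Unit-width Gaussian constraint on theta (additive constant dropped).
    total += 0.5 * theta ** 2
    return total
```

Minimising this function with respect to θ at fixed µ gives the profiled likelihood that enters the test statistic.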
The test statistic $q_\mu$ is defined as the profile likelihood ratio, $q_\mu = -2\ln\left(L(\mu, \hat{\hat{\theta}}_\mu)/L(\hat{\mu}, \hat{\theta})\right)$, where $\hat{\mu}$ and $\hat{\theta}$ are the values of the parameters that maximise the likelihood function (with the constraint $0 \le \hat{\mu} \le \mu$), and $\hat{\hat{\theta}}_\mu$ are the values of the nuisance parameters that maximise the likelihood function for a given value of $\mu$. The compatibility of the observed data with the background-only hypothesis is tested by setting $\mu = 0$ in the profile likelihood ratio: $q_0 = -2\ln\left(L(0, \hat{\hat{\theta}}_0)/L(\hat{\mu}, \hat{\theta})\right)$. Upper limits on the signal production cross-section for each of the signal scenarios considered are derived by using $q_\mu$ in the CL$_s$ method [83, 84]. For a given signal scenario, values of the production cross-section (parameterised by $\mu$) yielding CL$_s$ < 0.05, where CL$_s$ is computed using the asymptotic approximation [85], are excluded at ≥ 95% CL.
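The procedure above can be sketched for the simplest possible case: a single-bin counting experiment with no nuisance parameters, so that no profiling is needed. This is an illustrative toy under those stated assumptions, not the RooStats implementation used in the analysis; the function names are hypothetical, and the asymptotic expressions are the standard ones for a test statistic with the constraint 0 ≤ µ̂ ≤ µ.

```python
import math

def norm_cdf(x):
    # Standard normal cumulative distribution function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def log_poisson(n, lam):
    # ln Pois(n | lam) without the ln(n!) term, which cancels in ratios.
    return n * math.log(lam) - lam

def q_mu(mu, n, s, b):
    # Best-fit signal strength, constrained to 0 <= mu_hat <= mu.
    mu_hat = min(max((n - b) / s, 0.0), mu)
    q = -2.0 * (log_poisson(n, mu * s + b) - log_poisson(n, mu_hat * s + b))
    return max(q, 0.0)

def cls_asymptotic(mu, n, s, b):
    q = q_mu(mu, n, s, b)
    q_a = q_mu(mu, b, s, b)  # background-only Asimov dataset: n = b
    sq, sqa = math.sqrt(q), math.sqrt(q_a)
    if q <= q_a:
        p_sb = 1.0 - norm_cdf(sq)
        cl_b = norm_cdf(sqa - sq)
    else:
        p_sb = 1.0 - norm_cdf((q + q_a) / (2.0 * sqa))
        cl_b = 1.0 - norm_cdf((q - q_a) / (2.0 * sqa))
    return p_sb / cl_b

def upper_limit(n, s, b, alpha=0.05, lo=0.0, hi=10.0):
    # Bisection for CLs(mu) = alpha, assuming CLs falls monotonically with mu
    # and that the crossing lies between lo and hi.
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if cls_asymptotic(mid, n, s, b) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Values of µ above the number returned by `upper_limit` give CL$_s$ below 0.05 in this toy model and would therefore be excluded at 95% CL; multiplying by the theoretical cross-section converts the result into a cross-section limit.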

Likelihood fit results
[Figure caption, start lost in extraction: … (left) and the BDT discriminant in BDTSR (right). The lower panel shows the ratio of data to the fitted background yields. The band represents the total uncertainty after the maximum-likelihood fit. Events in the overflow and underflow bins are included in the last and first bin of the histograms, respectively. The expected BB signal corresponding to $m_B$ = 1300 GeV for a branching ratio of 100% into WtWt is also shown overlaid.]
The expected and observed event yields in both signal regions after fitting the background-only hypothesis to data, including all uncertainties, are listed in table 2. The total uncertainty shown in the table is the uncertainty obtained from the full fit, and is therefore not identical to the sum in quadrature of all components, due to the correlations between the fit parameters. The probability that the data are compatible with the background-only hypothesis is estimated by integrating the distribution of the test statistic, approximated using the asymptotic formulae [85], above the observed value of $q_0$. This p-value is computed for each signal scenario considered, defined by the assumed mass of the heavy quark and the three decay branching ratios. The lowest p-value is found to be ∼50%, for a B mass of 800 GeV. Thus no significant excess above the background expectation is found.
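In the asymptotic approximation, the background-only p-value follows directly from the observed value of the test statistic. The short sketch below shows this relation; the function names are hypothetical and the formula assumes the large-sample limit with a non-negative best-fit signal strength.

```python
import math

def p0_asymptotic(q0):
    # p0 = 1 - Phi(sqrt(q0)), written with the complementary error function.
    return 0.5 * math.erfc(math.sqrt(q0) / math.sqrt(2.0))

def significance(q0):
    # Equivalent one-sided Gaussian significance, Z = sqrt(q0).
    return math.sqrt(q0)
```

With this relation, a value of the test statistic close to zero (best-fit signal strength at or near zero) gives a p-value close to 50%, which is the situation reported above.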
Individual uncertainties are generally not significantly constrained by data, except for the uncertainty associated with the single top modelling, which is constrained to be within 50% of its initial size.
A comparison of the post-fit agreement between data and prediction for both regions is shown in figure 3. The RECOSR shows a slight deficit of data for the $m_B^{\text{had}}$ distribution above 800 GeV. Hence, the observed upper limits on the BB production cross-section are slightly stronger than the expected sensitivity. The post-fit tt normalisation in these regions is found to be 0.92 ± 0.30 times the Monte Carlo prediction, normalised to the NNLO+NNLL cross-section.

Limits on VLQ pair production
Upper limits at the 95% CL on the BB production cross-section are set for two benchmark scenarios as a function of B quark mass $m_B$ and compared with the theoretical prediction from […] particles of narrow width. Assuming B(B → Wt) = 1, the observed (expected) lower limit is $m_B$ = 1350 GeV (1330 GeV). For branching ratios corresponding to the SU(2) singlet B scenario, the observed (expected) 95% CL lower limit is $m_B$ = 1170 GeV (1140 GeV). These represent a significant improvement over the reinterpreted search [17], for which the observed 95% CL limit was 1250 GeV when assuming B(B → Wt) = 1.
To check that the results do not depend on the weak-isospin of the B quark in the simulated signal events, a sample of BB events with a mass of 1200 GeV was generated for an SU(2) […] generated for an SU(2) singlet B quark. Both the expected number of events and expected excluded cross-section are found to be consistent between those two samples. Thus the limits obtained are also applicable to VLQ models with non-zero weak-isospin. As there is no explicit use of charge identification, the B(B → Wt) = 1 limits are found to be applicable to pair-produced vector-like X quarks of charge +5/3 which decay exclusively into Wt.
[Figure caption, start lost in extraction: … The markers indicate the branching ratios for the SU(2) singlet and both SU(2) doublet scenarios with masses above ∼800 GeV, where they are approximately independent of the VLQ B mass. The small white region in the upper plot is due to the limit falling below 500 GeV, the lowest simulated signal mass.]

JHEP08(2018)048
Exclusion limits on B quark pair production are also obtained for different values of $m_B$ and as a function of branching ratios to each of the three decays. In order to probe the complete branching-ratio plane, the signal samples are weighted by the ratios of the respective branching ratios to the original branching ratios in Protos. Then, the complete analysis is repeated for each point in the branching-ratio plane. Figure 5 shows the corresponding expected and observed B quark mass limits in the plane B(B → Hb) versus B(B → Wt), obtained by linear interpolation of the calculated CL$_s$ versus $m_B$.
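The two steps described above, reweighting to a target point in the branching-ratio plane and interpolating CL$_s$ between simulated mass points, can be sketched as follows. The per-event weight as a product over the two B decays, the dictionary interface, the decay labels and the function names are assumptions made for illustration; the branching ratios actually used at generation are not specified here and must be supplied by the user.

```python
def br_weight(decay1, decay2, target_br, generated_br):
    """Weight for a pair-production event with the given decay of each B quark.

    target_br, generated_br: dicts mapping a decay label (e.g. "Wt", "Zb",
    "Hb") to its branching ratio. Hypothetical helper for illustration.
    """
    return (target_br[decay1] * target_br[decay2]) / (
        generated_br[decay1] * generated_br[decay2]
    )

def mass_limit(masses, cls_values, alpha=0.05):
    """Lower mass limit from linear interpolation of CLs versus mass.

    masses must be in increasing order. Returns the first mass at which CLs
    rises through alpha, or None if no crossing is found in the scanned range.
    """
    points = list(zip(masses, cls_values))
    for (m1, c1), (m2, c2) in zip(points, points[1:]):
        if c1 < alpha <= c2:
            return m1 + (alpha - c1) * (m2 - m1) / (c2 - c1)
    return None
```

Repeating the reweighting, the fit and the interpolation for each point of a grid in the branching-ratio plane would yield a map of mass limits of the kind shown in figure 5.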

Conclusions
A search for the pair production of a heavy vector-like B quark, based on pp collisions at $\sqrt{s}$ = 13 TeV recorded in 2015 (3.2 fb$^{-1}$) and 2016 (32.9 fb$^{-1}$) with the ATLAS detector at the CERN Large Hadron Collider, is presented. Data are analysed in the lepton-plus-jets final state and no significant deviation from the Standard Model expectation is observed. Assuming a branching ratio B(B → Wt) = 1, the observed (expected) 95% CL lower limit on the vector-like quark mass is 1350 GeV (1330 GeV). For the scenario of an SU(2) singlet B quark, the observed (expected) mass limit is 1170 GeV (1140 GeV). Assuming the B quark can only decay into Wt, Zb and Hb, 95% CL lower limits are derived for various masses in the two-dimensional plane of B(B → Wt) versus B(B → Hb). The limit for B(B → Wt) = 1 is found to be equally applicable to VLQ X quarks that decay into Wt.

[81] W. Verkerke and D.P. Kirkby, The RooFit toolkit for data modeling, eConf C 0303241
[86] ATLAS collaboration, ATLAS computing acknowledgements, ATL-GEN-PUB-2016-002 (2016).