Measurement of the cross-section for producing a W boson in association with a single top quark in pp collisions at √ s = 13 TeV with ATLAS

The inclusive cross-section for the associated production of a W boson and top quark is measured using data from proton–proton collisions at $\sqrt{s}$ = 13 TeV. The dataset corresponds to an integrated luminosity of 3.2 fb$^{-1}$, and was collected in 2015 by the ATLAS detector at the Large Hadron Collider at CERN. Events are selected requiring two opposite-sign isolated leptons and at least one jet; they are separated into signal and control regions based on their jet multiplicity and the number of jets that are identified as containing b hadrons. The Wt signal is then separated from the tt̄ background using boosted decision tree discriminants in two regions. The cross-section is extracted by fitting templates to the data distributions, and is measured to be $\sigma_{Wt} = 94 \pm 10\,(\mathrm{stat.})\,^{+28}_{-22}\,(\mathrm{syst.}) \pm 2\,(\mathrm{lumi.})$ pb. The measurement is in agreement with the Standard Model prediction.


Introduction
Top quarks can be produced singly via electroweak interactions involving a Wtb vertex. In the Standard Model (SM), single top quark production proceeds via three channels at leading order (LO), represented in Figures 1 and 2: production in association with a W boson (Wt), the t-channel and the s-channel. At the Large Hadron Collider (LHC), the Wt channel is the mode with the second largest production cross-section, behind the dominant t-channel mode. The Wt channel represents approximately 24 % of the total single-top-quark production rate at $\sqrt{s}$ = 13 TeV, making it experimentally accessible for detailed measurements. The cross-section for each of the three single-top-quark production channels is sensitive to the coupling between the W boson and the top quark. This coupling is parameterised by the relevant Cabibbo-Kobayashi-Maskawa (CKM) matrix element $V_{tb}$ and form factor $f_{\mathrm{LV}}$ [2][3][4] such that the cross-section is proportional to $|f_{\mathrm{LV}} V_{tb}|^2$ [5,6], assuming a left-handed vector interaction as given in the SM. Single top quark production therefore presents an opportunity for testing the structure of the SM, as well as probing classes of new-physics models that can affect the Wtb vertex. In contrast to the t- and s-channels, which are sensitive to both the existence of four-fermion operators and corrections to the Wtb vertex, the Wt channel only depends on the latter; it is therefore important to study this channel separately to provide a comparison with the other channels [7,8].
The Wt channel was not accessible at the Tevatron due to its small cross-section in pp̄ collisions at $\sqrt{s}$ = 1.96 TeV. At the LHC, however, evidence of this process with 7 TeV collision data was presented by the ATLAS Collaboration [9] and by the CMS Collaboration [10]. With 8 TeV collision data, observations were made by the CMS Collaboration [11] and the ATLAS Collaboration [12] with cross-section measurements in good agreement with theoretical predictions.
The Wt cross-section at next-to-leading order (NLO) at $\sqrt{s}$ = 13 TeV, with next-to-next-to-leading logarithmic (NNLL) soft-gluon corrections, is calculated as $\sigma_{\mathrm{theory}}$ = 71.7 ± 1.8 (scale) ± 3.4 (PDF) pb [1], assuming a top quark mass ($m_{\mathrm{top}}$) of 172.5 GeV. The first uncertainty accounts for the renormalisation and factorisation scale variations (from $m_{\mathrm{top}}/2$ to $2\,m_{\mathrm{top}}$), while the second uncertainty originates from uncertainties in the MSTW2008 NLO parton distribution function (PDF) sets [13].
This paper describes a measurement of the cross-section of the Wt process using $\sqrt{s}$ = 13 TeV proton-proton (pp) collisions with an integrated luminosity of 3.2 fb$^{-1}$. The data were recorded with the ATLAS detector in 2015. The measurement is made using events containing at least one b-jet (according to the definition in Section 4) and exactly two oppositely charged leptons in the final state, where a lepton (ℓ) is defined to be either an electron (e) or a muon (µ), whether produced directly from the decay of a W boson or from the decay of an intermediate τ lepton. The Wt signal enters this final state when the top quark decays into a W boson and a quark (which is assumed to be a b-quark), with both W bosons subsequently decaying into a neutrino and a lepton, as depicted in Figure 1. A minimal selection is applied to reduce background contributions from Z/γ* + jets (hereafter called Z + jets) events, diboson events, and events containing leptons that are misidentified or arise from the decay of hadrons. A boosted decision tree (BDT) analysis is performed to construct discriminants capable of separating the Wt signal from the dominant top quark pair (tt̄) background, and these discriminants are used in a profile-likelihood fit to extract the Wt cross-section. The top pair production background is described by simulation, which has been validated in previous ATLAS measurements [14].
The measurement technique is similar to that employed in the corresponding 8 TeV ATLAS measurement [12]. The most significant changes include modifications to the BDT training and the binning of the distribution used in the likelihood fit (discussed in Section 6 and Section 8 respectively), and an optimisation of kinematic requirements to more effectively reject Z + jets and other small backgrounds (discussed in Section 5).

The ATLAS detector
The ATLAS detector [15] at the LHC covers nearly the entire solid angle¹ around the collision point, and consists of an inner tracking detector (ID) surrounded by a thin superconducting solenoid magnet producing a 2 T axial magnetic field, electromagnetic (EM) and hadronic calorimeters, and an external muon spectrometer (MS). The ID consists of a high-granularity silicon pixel detector and a silicon microstrip tracker, together providing precision tracking in the pseudorapidity range |η| < 2.5, complemented by a transition radiation tracker providing tracking and electron identification information for |η| < 2.0. The innermost pixel layer, the insertable B-layer, was added between Run 1 and Run 2 of the LHC, at an innermost radius of 33 mm around a new, thinner, beam pipe [16]. A lead/liquid-argon (LAr) electromagnetic calorimeter covers the region |η| < 3.2, and hadronic calorimetry is provided by steel/scintillator tile calorimeters within |η| < 1.7 and copper/LAr hadronic endcap calorimeters in the range 1.5 < |η| < 3.2. A LAr forward calorimeter with copper and tungsten absorbers covers the range 3.1 < |η| < 4.9. The MS consists of precision tracking chambers covering the region |η| < 2.7, and separate trigger chambers covering |η| < 2.4. A two-level trigger system, using a custom hardware level followed by a software-based level, selects from the 40 MHz of collisions a maximum of around 1 kHz of interesting events for offline storage.

Data and simulation
The data events analysed in this paper correspond to an integrated luminosity of 3.2 fb$^{-1}$ collected from the operation of the LHC in 2015 at $\sqrt{s}$ = 13 TeV with a bunch spacing of 25 ns and an average number of collisions per bunch crossing µ of around 14. They are required to be recorded in periods where all detector systems are flagged as operating normally. Additionally, individual events identified as containing corrupted data are rejected.
Monte Carlo (MC) simulation samples are used to estimate the efficiency to select signal and background events, train and test BDTs, estimate systematic uncertainties, and validate the analysis tools. All simulation samples are normalised to theoretical cross-section predictions. The nominal samples (used for estimating the central values for efficiencies and background templates) were simulated with a full ATLAS detector simulation [17] implemented in Geant4 [18]. Many of the samples used in the estimation of systematic uncertainties were instead produced using Atlfast2 [19], which differs from the full simulation in that the ATLAS calorimeters and their responses are simulated using a faster approximation. Pile-up (additional pp collisions in the same or a nearby bunch crossing) was included in the simulation by overlaying collisions with the soft QCD processes of Pythia 8.186 [20] using a set of tuned parameters called the A2 tune [21] and the MSTW2008LO PDF set. Events were generated with a predefined distribution of the expected number of interactions per bunch crossing, then reweighted to match the actual observed data conditions. In all samples used for this analysis $m_{\mathrm{top}}$ was set to 172.5 GeV and the W → ℓν branching ratio was set to 0.1080 per lepton flavour.
¹ ATLAS uses a right-handed coordinate system with its origin at the nominal interaction point (IP) in the centre of the detector and the z-axis along the beam pipe. The x-axis points from the IP to the centre of the LHC ring, and the y-axis points upward. Cylindrical coordinates (r, φ) are used in the transverse plane, φ being the azimuthal angle around the z-axis. The pseudorapidity is defined in terms of the polar angle θ as $\eta = -\ln\tan(\theta/2)$, while the rapidity is defined in terms of particle energies and the z-component of particle momenta as $y = \frac{1}{2}\ln\left[(E + p_z)/(E - p_z)\right]$.
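As an illustration of the coordinate definitions in the footnote above, the following minimal Python sketch evaluates the pseudorapidity and rapidity. The function names are hypothetical and are not taken from any ATLAS software; the rapidity uses the standard expression in terms of E and p_z.

```python
import math


def pseudorapidity(theta):
    """eta = -ln tan(theta / 2), with theta the polar angle in radians."""
    return -math.log(math.tan(theta / 2.0))


def rapidity(energy, pz):
    """y = 0.5 * ln[(E + pz) / (E - pz)], valid for E > |pz|."""
    return 0.5 * math.log((energy + pz) / (energy - pz))


# For a massless particle (E = |p|) the two quantities coincide.
theta = 0.5                      # polar angle in radians (illustrative value)
momentum = 10.0                  # |p| in GeV (illustrative value)
eta_massless = pseudorapidity(theta)
y_massless = rapidity(momentum, momentum * math.cos(theta))
```

For massive particles the two values differ, which is why the overlap-removal distances in Section 4 are defined with the rapidity while detector acceptances are quoted in pseudorapidity.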
For the generation of Wt and tt̄ event samples [22], the Powheg-Box v1 (v2 for tt̄) [23][24][25][26][27] generator with the CT10 PDF set [28] in the matrix element calculations is used. For these processes, top quark spin correlations are preserved. The parton shower, fragmentation, and underlying event were simulated using Pythia 6.428 [29] with the CTEQ6L1 PDF set [30] and the corresponding Perugia 2012 (P2012) tune [31]. The EvtGen v1.2.0 program [32] was used to simulate properties of the bottom and charmed hadron decays. The renormalisation and factorisation scales are set to $m_{\mathrm{top}}$ for the Wt process and $\sqrt{m_{\mathrm{top}}^2 + p_T(t)^2}$ for the tt̄ process. The diagram removal (DR) scheme [33], in which all next-to-leading order (NLO) diagrams that overlap with the tt̄ definition are removed from the calculation of the Wt amplitude, was employed to handle interference between Wt and tt̄ diagrams, and was applied to the Wt sample. The tt̄ cross-section is set to $\sigma_{t\bar{t}} = 252.9\,^{+6.4}_{-8.6}\,(\mathrm{scale}) \pm 11.7\,(\mathrm{PDF} + \alpha_S)$ pb as calculated with the Top++ 2.0 program to NNLO, including soft-gluon resummation to NNLL [34]. The first uncertainty comes from the independent variation of the factorisation and renormalisation scales, $\mu_F$ and $\mu_R$, while the second one is associated with variations in the PDF and $\alpha_S$, following the PDF4LHC prescription with the MSTW2008 68 % CL NNLO, CT10 NNLO and NNPDF2.3 5f FFN PDF sets [35][36][37][38]. Both calculations assume $m_{\mathrm{top}}$ = 172.5 GeV.
Additional Wt samples were generated to estimate major systematic uncertainties. An alternative Wt sample was generated using the diagram subtraction (DS) scheme instead of DR, where a gauge-invariant subtraction term modifies the NLO Wt cross-section to locally cancel the double-resonant tt̄ contribution [33]. Another sample generated with MadGraph5_aMC@NLO [39] (instead of Powheg-Box) interfaced with Herwig++ 2.7.1 [40] using Atlfast2 fast simulation is used to estimate uncertainties associated with the modelling of the NLO matrix element generator. A sample generated with Powheg-Box interfaced with Herwig++ (instead of Pythia 6) is used to estimate uncertainties associated with the parton shower, hadronisation, and underlying-event models. In both cases the UE-EE-5 tune of Ref. [41] was used for the underlying event, and EvtGen v1.2.0 was used to simulate properties of the bottom and charmed hadron decays. Finally, in order to estimate uncertainties arising from additional QCD radiation in the Wt events, a pair of samples were generated with Powheg-Box interfaced with Pythia 6 using Atlfast2 and the P2012 tune with higher and lower radiation relative to the nominal set, together with varied renormalisation and factorisation scales. In these samples the resummation damping factor was doubled in the case of higher radiation. In order to avoid comparing two different detector response models when estimating systematic uncertainties, another version of the nominal Powheg-Box with Pythia 6 sample was also produced with fast simulation.
Additional tt̄ samples were also generated to estimate major systematic uncertainties. As with the additional Wt samples, these are used to estimate the uncertainties associated with the matrix element generator (a sample produced using Atlfast2 fast simulation with MadGraph5_aMC@NLO interfaced with Herwig++ 2.7.1), parton shower and hadronisation models (a sample produced using Atlfast2 with Powheg-Box interfaced with Herwig++ 2.7.1) and additional QCD radiation (a pair of samples produced using full simulation with the P2012 higher and lower radiation-varied sets of parameters, as well as with varied renormalisation and factorisation scales).
Samples used to model the Z + jets background [42] were simulated with Sherpa 2.1.1 [43]. Matrix elements were calculated for up to two partons at NLO and four partons at LO using the Comix [44] and OpenLoops [45] matrix element generators and merged with the Sherpa parton shower [46] using the ME+PS@NLO prescription [47]. The CT10 PDF set was used in conjunction with Sherpa parton shower tuning, with a generator-level cutoff on the dilepton invariant mass of $m_{\ell\ell}$ > 40 GeV applied. The Z + jets events are normalised to NNLO cross-sections.
Diboson processes with four charged leptons, three charged leptons and one neutrino, or two charged leptons and two neutrinos [48] were simulated using the Sherpa 2.1.1 generator. The matrix elements contain all diagrams with four electroweak vertices. The NLO corrections are used for the purely leptonic final states as well as for final states with two or four charged leptons plus one additional parton. For other final states with up to three additional partons, the LO calculations of the Comix and OpenLoops generators are used. Their outputs are combined with the Sherpa parton shower using the ME+PS@NLO prescription [47]. The PDF set used was CT10 with dedicated parton shower tuning. The generator-calculated cross-sections are used for diboson processes (already at NLO).
Finally, the very small W + jets contribution was simulated using Powheg-Box v2 interfaced to the Pythia 8.186 [20] parton shower model. The CT10 PDF set was used in the matrix element. The AZNLO [49] tune was used, with PDF set CTEQ6L1, for the modelling of non-perturbative effects, and the EvtGen v1.2.0 program was used to simulate properties of the bottom and charmed hadron decays.

Object selection
Electron candidates are reconstructed from energy deposits in the EM calorimeter associated with ID tracks [50]. The deposits are required to be in the |η| < 2.47 region, with the transition region between the barrel and endcap EM calorimeters, 1.37 < |η| < 1.52, excluded. The candidate electrons are required to have transverse momentum $p_T$ > 20 GeV. Further requirements on the electromagnetic shower shape, calorimeter energy to tracker momentum ratio, and other discriminating variables are combined into a likelihood-based object quality selection [50], optimised for strong background rejection. Candidate electrons also must satisfy requirements on the distance of the ID track to the reconstructed primary vertex in the event, which is identified as the vertex with the largest summed $p_T^2$ of associated tracks. The transverse impact parameter significance must satisfy $|d_0|/\sigma_{d_0} < 5$, and the longitudinal impact parameter must satisfy $|\Delta z_0 \sin\theta| < 0.5$ mm. Electrons are further required to be isolated based on ID tracks and topological clusters in the calorimeter [51], with an isolation efficiency of 90 (99) % for $p_T$ = 25 (60) GeV.
Muon candidates are identified by matching MS segments with ID tracks [52]. The candidates must satisfy requirements on hits in the MS and on the compatibility between ID and MS momentum measurements to remove fake muon signatures. Furthermore, they must have $p_T$ > 20 GeV as well as |η| < 2.5 to ensure they are within coverage of the ID. Candidate muons must satisfy the following requirements on the distance of the combined ID and MS track to the primary vertex: the transverse impact parameter significance must satisfy $|d_0|/\sigma_{d_0} < 3$, and the longitudinal impact parameter must satisfy $|\Delta z_0 \sin\theta| < 0.5$ mm. An isolation requirement is imposed based on ID tracks and topological clusters in the calorimeter, and results in an isolation efficiency of 90 (99) % for $p_T$ = 25 (60) GeV.
Single-lepton triggers used in this analysis are designed to select events containing a high-$p_T$, well-identified charged lepton [53]. They require a $p_T$ of at least 20 GeV for muons and 24 GeV for electrons, and also have requirements on the lepton quality and isolation. These are complemented by triggers with higher $p_T$ thresholds and relaxed isolation and identification requirements to ensure maximum efficiency at higher lepton $p_T$.
Jets are reconstructed from topological clusters in the calorimeter [54] using the anti-$k_t$ algorithm [55,56] with a radius parameter of 0.4. They are energy-corrected to account for pile-up and calibrated using a $p_T$- and η-dependent correction derived from simulation [57]. They are required to have $p_T$ > 25 GeV and |η| < 2.5. To suppress pile-up, a discriminant called the jet-vertex-tagger (JVT) is constructed using a two-dimensional likelihood method [58]. For jets with $p_T$ < 60 GeV and |η| < 2.4, a JVT requirement corresponding to a 92 % efficiency while rejecting 98 % of jets from pile-up and noise is imposed.
Jets containing b-hadrons (b-jets) are tagged using a multivariate discriminant which exploits the long lifetime and large invariant mass of b-hadron decay products relative to c-hadrons and unstable light hadrons [59]. The discriminant is calibrated to achieve a 77 % b-tagging efficiency and a rejection factor of about 4.5 against jets containing charm quarks (c-jets) and 140 against light-quark and gluon jets in a sample of simulated tt̄ events [60]. The b-tagging efficiency in simulation is corrected to the efficiency in data [61].
The missing transverse momentum vector is calculated as the negative vector sum of the transverse momenta of particles in the event. Its magnitude $E_T^{\mathrm{miss}}$ is a measure of the transverse momentum imbalance, primarily due to neutrinos that escape detection. Energy deposits in the calorimeters are uniquely assigned in order of priority to electrons, jets, and muons found in the event, thus avoiding double counting of signals. This approach also obviates the need for further overlap removal in the $E_T^{\mathrm{miss}}$ calculation, since a single energy deposit cannot be re-assigned to two nearby reconstructed signals. In addition to the identified electrons, jets and muons, a track-based soft term is included in the $E_T^{\mathrm{miss}}$ calculation by considering tracks associated with the hard-scattering vertex in the event but not with an identified electron, jet, or muon [62,63].
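The definition of the missing transverse momentum as a negative vector sum can be illustrated with a minimal Python sketch. The function name and the (pT, φ) input format are assumptions made for this example; the actual reconstruction [62,63] involves calibrated objects and a dedicated soft term, which are represented here simply as additional entries in the input list.

```python
import math


def missing_transverse_momentum(objects):
    """Return (magnitude, phi) of the negative vector sum of transverse momenta.

    objects: iterable of (pt, phi) pairs in GeV and radians, covering the
    selected electrons, muons, jets and any soft-term contribution.
    """
    px = -sum(pt * math.cos(phi) for pt, phi in objects)
    py = -sum(pt * math.sin(phi) for pt, phi in objects)
    return math.hypot(px, py), math.atan2(py, px)


# Illustrative event: two visible objects at right angles in the transverse plane.
visible = [(40.0, 0.0), (30.0, math.pi / 2)]
met, met_phi = missing_transverse_momentum(visible)
```

In this toy event the imbalance has a magnitude of 50 GeV and points opposite to the visible system.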
To avoid cases where the detector response to a single physical object is reconstructed as two separate final-state objects, several steps are followed to remove such overlaps. Bremsstrahlung radiation by a muon can result in ID tracks and a calorimeter energy deposit that are also reconstructed as an electron candidate. Therefore, in cases where an electron and muon candidate share an ID track, the object is considered to be a muon, and the electron candidate is rejected.
The overlap of objects is measured using the Lorentz-invariant distance $\Delta R_{y,\phi} = \sqrt{(\Delta y)^2 + (\Delta\phi)^2}$. Due to the isolation requirements placed on electron candidates, any jets that closely overlap an electron candidate within a cone $\Delta R_{y,\phi} < 0.2$ are likely to be reconstructions of the electron and so are rejected. When jets and electrons are found within the larger hollow cone $0.2 < \Delta R_{y,\phi} < 0.4$, it is more likely that a real hadronic jet is present and that the electron is a non-prompt constituent of the jet arising from the decay of heavy-flavour hadrons. Hence any electron candidates found within a cone $\Delta R_{y,\phi} < 0.4$ of any remaining jet are rejected.
Muons can be accompanied by a hard photon due to bremsstrahlung or collinear final-state radiation, and the muon-photon system can then be reconstructed as both a jet and a muon candidate. Non-prompt muons can arise from the decay of hadronic jets; however, these muons are associated with a higher ID track multiplicity than those accompanied by hard photons. In order to resolve these ambiguities between nearby jet and muon candidates, first any jets having fewer than three ID tracks and within a cone $\Delta R_{y,\phi} < 0.4$ of any muon candidate are rejected, then any muon candidates within a cone $\Delta R_{y,\phi} < 0.4$ of any remaining jet are rejected.
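The overlap-removal sequence described above can be summarised in a short Python sketch. This is an illustration under stated assumptions rather than the ATLAS implementation: objects are plain dictionaries, and the keys `y`, `phi`, `track` and `n_tracks` are hypothetical names chosen for the example.

```python
import math


def delta_r(a, b):
    """Distance in rapidity-azimuth space, with delta-phi wrapped to [-pi, pi)."""
    dphi = (a["phi"] - b["phi"] + math.pi) % (2 * math.pi) - math.pi
    return math.hypot(a["y"] - b["y"], dphi)


def remove_overlaps(electrons, muons, jets):
    # 1. An electron sharing an ID track with a muon is rejected.
    muon_tracks = {m["track"] for m in muons}
    electrons = [e for e in electrons if e["track"] not in muon_tracks]
    # 2. Jets within dR < 0.2 of an electron are rejected.
    jets = [j for j in jets if all(delta_r(j, e) >= 0.2 for e in electrons)]
    # 3. Electrons within dR < 0.4 of any remaining jet are rejected.
    electrons = [e for e in electrons if all(delta_r(e, j) >= 0.4 for j in jets)]
    # 4. Jets with fewer than three tracks within dR < 0.4 of a muon are rejected.
    jets = [j for j in jets
            if not (j["n_tracks"] < 3 and any(delta_r(j, m) < 0.4 for m in muons))]
    # 5. Muons within dR < 0.4 of any remaining jet are rejected.
    muons = [m for m in muons if all(delta_r(m, j) >= 0.4 for j in jets)]
    return electrons, muons, jets


# Illustrative event: one jet duplicates the electron, one low-multiplicity jet
# sits on top of the muon, and one well-separated jet survives.
kept_e, kept_mu, kept_jets = remove_overlaps(
    electrons=[{"y": 0.0, "phi": 0.0, "track": 1}],
    muons=[{"y": 2.0, "phi": 1.0, "track": 2}],
    jets=[{"y": 0.1, "phi": 0.0, "n_tracks": 5},
          {"y": 2.1, "phi": 1.0, "n_tracks": 1},
          {"y": -1.5, "phi": 3.0, "n_tracks": 6}],
)
```

The order of the steps matters: each jet rejection is applied before the corresponding lepton rejection, so that leptons are only compared with the jets that remain.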

Event selection and background estimation
Events are required to have at least one well-reconstructed interaction vertex, to pass a single-electron or single-muon trigger, and to contain at least one jet with $p_T$ > 25 GeV. Events are required to contain exactly two charged leptons of opposite charge with $p_T$ > 20 GeV; events with a third lepton with $p_T$ > 20 GeV are rejected. At least one lepton must have $p_T$ > 25 GeV, and at least one of the selected electrons (muons) must be matched within a $\Delta R = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2}$ cone of size 0.07 (0.1) to the electron (muon) selected online by the corresponding trigger.
In simulated events, information recorded by the event generator is used to identify events in which any selected lepton does not originate promptly from the hard-scatter process. These non-prompt or fake leptons arise from processes such as the decay of a b-hadron, photon conversion or hadron misidentification, and are identified when the electron or muon does not originate from the decay of a W or Z boson (or a τ lepton itself originating from a W or Z). Events with a selected lepton which is non-prompt or fake are themselves labelled as fake and are treated as a contribution to the background.
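The kinematic part of this preselection can be expressed as a small Python sketch. It is illustrative only: the function name and the input format are assumptions, and the vertex, trigger and trigger-matching requirements are deliberately left out.

```python
def passes_dilepton_preselection(leptons, jet_pts):
    """Kinematic dilepton preselection (vertex and trigger requirements omitted).

    leptons: list of (pt_GeV, charge) for identified, isolated leptons.
    jet_pts: list of jet transverse momenta in GeV.
    """
    selected = [(pt, q) for pt, q in leptons if pt > 20.0]
    if len(selected) != 2:           # exactly two leptons; a third one vetoes the event
        return False
    (pt1, q1), (pt2, q2) = selected
    if q1 * q2 >= 0:                 # the two leptons must have opposite charge
        return False
    if max(pt1, pt2) <= 25.0:        # at least one lepton above 25 GeV
        return False
    return any(pt > 25.0 for pt in jet_pts)   # at least one jet above 25 GeV


accepted = passes_dilepton_preselection([(30.0, +1), (22.0, -1)], [40.0])
```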
After this selection has been made, a further set of requirements is imposed with the aim of reducing the contribution from the Z + jets, diboson and fake/non-prompt lepton backgrounds. The resultant sample is intended to consist almost entirely of Wt signal and tt̄ background (a breakdown of the expected signal contributions and background compositions in all regions can be seen in Figure 3), which are subsequently separated by the BDT analysis. Events in which the two leptons have the same flavour and an invariant mass consistent with a Z boson ($81 < m_{\ell\ell} < 101$ GeV) are vetoed, as well as those with an invariant mass $m_{\ell\ell}$ < 40 GeV. Further requirements on $E_T^{\mathrm{miss}}$ and $m_{\ell\ell}$ are chosen based on the flavour of the selected leptons (as shown in Table 1). Events with different-flavour leptons are required to have $E_T^{\mathrm{miss}}$ [...] and different-flavour events are chosen separately due to the kinematically different processes contributing to the Z + jets background, namely Z → ee/µµ in same-flavour events and Z → ττ in different-flavour events. These requirements reduce the Z + jets contributions in the signal regions to 12 % according to simulation. The partitioning of events into different selections based on lepton flavour, $E_T^{\mathrm{miss}}$, and $m_{\ell\ell}$ is described well by the simulation, motivating the choice to merge these selection regions into the signal regions described below.
The sample of selected events is divided into regions based on the number of jets and b-tagged jets. At LO, the signal process results in a final state with one b-jet arising from the top quark decay, and no additional jets, while the tt̄ process results in two b-jets from the top quark decays. Events with additional jets are also studied since the underlying event, higher order QCD and other effects may produce additional jets in signal events.
Corresponding to these expected final states, two signal regions are defined by the presence of exactly one b-tagged jet and either zero (denoted 1j1b) or one (denoted 2j1b) additional jet. A tt̄-enriched control region is defined by the presence of exactly two jets, which are both b-tagged (denoted 2j2b). This control region is used to constrain the tt̄ background normalisation, and is expected to contain only a small (< 1 %) proportion of signal events. These three regions (1j1b, 2j1b and 2j2b) are called the fit regions, as they are used in the simultaneous fit described in Section 8. The total efficiency in simulation to accept a dilepton Wt signal event into one of the signal or control regions is about 12 %, while the efficiency to accept a dilepton tt̄ background event into the same regions is about 5 %. Event yields for each fit region are presented in Section 9. Two additional regions, in which events are required to contain one (denoted 1j0b) or two (denoted 2j0b) jets but no b-tagged jets, are used to validate the description of the data by the simulation. A schematic view of the region definitions is shown in Figure 4.
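The region definitions above amount to a lookup on the jet and b-tagged-jet multiplicities, as in the following Python sketch. The function and dictionary names are hypothetical; events falling outside the five listed categories are returned as `None`.

```python
# Region label and role for each (number of jets, number of b-tagged jets).
REGIONS = {
    (1, 1): ("1j1b", "signal"),
    (2, 1): ("2j1b", "signal"),
    (2, 2): ("2j2b", "control"),
    (1, 0): ("1j0b", "validation"),
    (2, 0): ("2j0b", "validation"),
}


def classify_region(n_jets, n_btags):
    """Return (label, role) for an event, or None if it enters no region."""
    return REGIONS.get((n_jets, n_btags))


# The three fit regions are the signal and control regions.
fit_regions = sorted(label for label, role in REGIONS.values()
                     if role in ("signal", "control"))
```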

Separation of signal from background
After the event selection is performed, the data sample consists primarily of tt̄ events with a significant number of Wt signal events (see for example Figure 3). As there is no single observable that clearly discriminates between the Wt signal and the tt̄ background, several observables are combined into a single discriminator using a BDT technique [64]. A collection of decision trees is created that weakly separates events into signal and background based on a number of binary decisions considering a single observable at a time. A boosting algorithm is then used to assign weights to each tree such that the ensemble of weak classifiers performs as a strong classifier [65]. In this analysis, the BDT implementation is provided by the TMVA package [66], using the GradientBoost algorithm.
Separate BDTs are prepared for the analysis regions 1j1b and 2j1b. Due to the low efficiency to accept a Wt event in the 2j2b region, the computing cost to simulate events in this region is especially large. Since the expected gain in signal precision from subdividing the 2j2b region is minimal, no BDT is constructed here and a single bin is used. [...] for testing. For each region, a large list of variables is prepared for the BDT. An optimisation procedure is then carried out in each region to select a subset of input variables and a set of BDT parameters (such as the number of trees in the ensemble and the maximum depth of the individual decision trees). The optimisation is designed to provide the best separation between the Wt signal and tt̄ background while avoiding sensitivity to statistical fluctuations in the training sample.
The variables considered are derived from the kinematic properties of subsets of the selected physics objects defined in Section 4 for each event. [...] $E_T^{\mathrm{miss}}$ are assigned four-momenta by assuming zero mass and zero z-component. For two systems of objects $s_1$ and $s_2$: $\Delta R(s_1, s_2)$ is the separation in φ-η space; $\Delta p_T(s_1, s_2)$ is the $p_T$ difference; $\Delta\phi(s_1, s_2)$ is the φ difference; and $C(s_1, s_2)$, the centrality, is the ratio of the scalar sum of $p_T$ to the sum of energy.
The final set of input variables used in each BDT is listed in Table 2 along with the separating power of each variable. In order to check that the variables and their correlations in Wt signal and the background events are well modelled by simulation, the distributions of these variables and the BDTs are compared between the MC prediction and the observed data, using a Kolmogorov-Smirnov (KS) statistical test [67] to check agreement. The distributions of the two most powerful variables in each fit region are shown in Figure 5. The MC predictions describe the data well, within the total systematic uncertainties.
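As an illustration of the kind of comparison a KS test performs, the following Python sketch computes the two-sample KS statistic, that is, the largest distance between two empirical cumulative distributions. It assumes unweighted, unbinned samples, which is a simplification: the analysis compares weighted MC predictions with data, and the function name is hypothetical.

```python
def ks_statistic(sample_a, sample_b):
    """Maximum distance between the empirical CDFs of two unweighted samples."""
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    distance = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both samples past every entry equal to the current value.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        distance = max(distance, abs(i / len(a) - j / len(b)))
    return distance


d_same = ks_statistic([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0])
d_shifted = ks_statistic([1.0, 3.0], [2.0, 4.0])
d_disjoint = ks_statistic([1.0, 2.0], [3.0, 4.0])
```

A statistic close to zero indicates similar shapes, while a value of one corresponds to fully separated samples; converting the statistic into a probability requires the sample sizes and is not attempted here.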

Systematic uncertainties
Systematic uncertainties are divided into experimental and theoretical sources. Each uncertainty is assigned a Gaussian-constrained nuisance parameter, which allows the uncertainty to be constrained by data.
The experimental sources of uncertainty include the measurement of the luminosity, lepton efficiency scale factors used to correct simulation to data, lepton energy scale and resolution, $E_T^{\mathrm{miss}}$ soft-term calculation, jet energy scale and resolution, and the b-tagging efficiency. Among these, the dominant sources of uncertainty are due to the determination of the jet energy scale (JES) and jet energy resolution. Table 3 gives a breakdown of uncertainties in the final fitted cross-section.
The JES uncertainty [57] is divided into a total of 18 components, which are derived using $\sqrt{s}$ = 13 TeV data. The uncertainties from in situ analyses including studies of Z/γ+jet and dijet events are represented with six orthogonal components (JES Eff1-6). The full description of jet uncertainties and correlations is reduced to obtain this set of uncertainty components that can be used as nuisance parameters in a likelihood fit. This is done by diagonalising the covariance matrix describing the jet uncertainties to obtain a set of reduced uncertainties corresponding to the eigenvector-eigenvalue pairs as demonstrated in Ref. [68]. Other components are model uncertainties (such as flavour composition, η intercalibration model), and other systematics in the JES determination (such as pile-up jet area ρ). The most significant JES uncertainty components for this analysis are the in situ calibration and the flavour composition uncertainty, which is the dependence of the jet calibration on the fraction of quark or gluon jets in data. The jet energy resolution uncertainty estimate [57] is based on comparisons of simulation and data using in situ studies with Run-1 data. These studies are then cross-calibrated and checked to confirm good agreement with Run-2 data.
Figure 5 (caption, beginning not recovered): The signal and backgrounds are normalised to their theoretical predictions, and the error bands represent the total systematic uncertainties in the Monte Carlo predictions. The first and last bins of each distribution contain overflow events. The upper panels give the yields in number of events per bin, while the lower panels give the ratios of the numbers of observed events to the total prediction in each bin.
Table 2: The variables used in each BDT and their separating powers (a measure of the difference between probability distributions of signal and background in the variable, denoted S). The variables are derived from the four-momenta of the leading (sub-leading) lepton ℓ1 (ℓ2), the leading (sub-leading) jet j1 (j2) and $E_T^{\mathrm{miss}}$. The last row gives the separation power of the BDT discriminant output. [Table body not recovered; the surviving final-row entry reads: BDT discriminant, 10.9.]
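The separating power S quoted in Table 2 is described only as a measure of the difference between the signal and background distributions. One common binned definition, used for example in TMVA, is $S = \frac{1}{2}\sum_i (s_i - b_i)^2/(s_i + b_i)$ for unit-normalised histograms; the Python sketch below implements that form under the assumption that it matches the definition used in the table. The function name is hypothetical.

```python
def separation(signal_hist, background_hist):
    """Binned separation: 0 for identical shapes, 1 for non-overlapping shapes."""
    s_total = sum(signal_hist)
    b_total = sum(background_hist)
    total = 0.0
    for s, b in zip(signal_hist, background_hist):
        s, b = s / s_total, b / b_total      # normalise each histogram to unit area
        if s + b > 0:                        # empty bins do not contribute
            total += (s - b) ** 2 / (s + b)
    return 0.5 * total


s_identical = separation([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
s_disjoint = separation([5.0, 0.0], [0.0, 7.0])
```

Because the histograms are normalised before comparison, the measure is sensitive only to shape differences and not to the relative yields of signal and background.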
As discussed in Section 4, the $E_T^{\mathrm{miss}}$ calculation includes contributions from hard sources, including leptons and jets, in addition to soft terms which arise primarily from low-$p_T$ pile-up jets and underlying-event activity. The uncertainty associated with the hard terms is propagated from the corresponding uncertainties in the energy/momentum scales and resolutions for jets and leptons, and is classified together with the uncertainty associated with the hard objects. The uncertainty associated with the soft term is estimated by comparing the simulated scale and resolution to that in data, including differences in uncertainties due to model dependence.
[...] to cover changes made in the inner detector and tracking algorithms between Run-1 and Run-2 data, accounting for fake tracks, tracking efficiency and tracking resolution. These c-jet and light-parton jet scale factors were later checked against similar scale factors derived on $\sqrt{s}$ = 13 TeV data, and the scale factors with added uncertainties were found to agree well with the full Run-2 scale factors.
Systematic uncertainties in lepton momentum resolution and scale, trigger efficiency, isolation efficiency, and identification efficiency are also considered. These uncertainties arise from corrections to simulation based on studies of Z → ee and Z → µµ data. In this analysis the effects of the uncertainties in these corrections are relatively small. A 2.1 % uncertainty is assigned to the integrated luminosity determination for 2015 data. It is derived, following a methodology similar to that detailed in Ref. [69], from a calibration of the luminosity scale using x-y beam-separation scans performed in August 2015.
Uncertainties stemming from theoretical models are estimated by comparing a set of predicted distributions produced with different assumptions and applying the difference observed as a weight to the nominal Wt or tt̄ distribution. The main uncertainties are due to the NLO matrix element (ME) generator, parton shower and hadronisation generator, initial- and final-state radiation (I/FSR) tuning and the PDF. The NLO matrix element uncertainty is estimated by comparing two NLO matching methods: the predictions of Powheg-Box and MadGraph5_aMC@NLO, both interfaced with Herwig++. The parton shower, hadronisation, and underlying-event model uncertainty is estimated by comparing Powheg-Box interfaced with either Pythia 6 or Herwig++. The uncertainty from the matrix element generator is treated as uncorrelated between the Wt and tt̄ processes, while the uncertainty from the parton shower generator is treated as correlated. The I/FSR tuning uncertainty is estimated by taking half of the difference between samples with Powheg-Box interfaced with Pythia 6 tuned with either more or less radiation, and is uncorrelated between the Wt and tt̄ processes.
The choice of scheme to account for the interference between the Wt and tt̄ processes constitutes another source of systematic uncertainty for the signal modelling, and it is estimated by comparing samples using either the diagram removal scheme or the diagram subtraction scheme, both generated with Powheg-Box+Pythia 6. The uncertainty due to the choice of PDF is estimated using the PDF4LHC15 combined PDF set [70]. The difference between the central CT10 [28] prediction and the central PDF4LHC15 prediction (PDF central value) is taken and symmetrised together with the internal uncertainty set provided with PDF4LHC15.
For tt̄ and Wt modelling, the NLO matrix element model, parton shower model, and PDF uncertainties are estimated using fast-simulated samples; for Wt, fast simulation is also used for I/FSR. In each case where results from two samples must be compared, fast-simulated samples are only compared to other fast-simulated samples.
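The two prescriptions described above (applying an observed difference as a weight to the nominal distribution, and taking half of the difference between two radiation variations) can be illustrated with a minimal sketch. The function names and bin contents are hypothetical and are not taken from the analysis code; histograms are represented as plain lists of bin yields.

```python
def reweight_nominal(nominal, reference, alternative):
    """Apply the bin-by-bin ratio alternative/reference as a weight to the
    nominal template. The reference and alternative would both be
    fast-simulated samples, so that like is compared with like."""
    varied = []
    for n, r, a in zip(nominal, reference, alternative):
        weight = a / r if r > 0 else 1.0  # leave empty reference bins unchanged
        varied.append(n * weight)
    return varied


def half_difference(more_radiation, less_radiation):
    """Bin-by-bin half-difference between two radiation tunes, used here as a
    stand-in for the I/FSR variation."""
    return [(u - d) / 2.0 for u, d in zip(more_radiation, less_radiation)]


# Hypothetical three-bin templates, for illustration only.
nominal = [100.0, 50.0, 20.0]
reference = [90.0, 45.0, 18.0]
alternative = [99.0, 40.5, 18.0]
varied = reweight_nominal(nominal, reference, alternative)
shift = half_difference([12.0, 8.0, 5.0], [10.0, 9.0, 5.0])
```

Working with ratios rather than absolute differences keeps the variation independent of any normalisation mismatch between the fast-simulated and nominal samples.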
Additionally, normalisation uncertainties of 100 % are assumed for the fake/non-prompt lepton backgrounds. The Z + jets backgrounds with one b-tagged jet are assigned a 50 % uncertainty, while a 100 % uncertainty is assumed for Z + jets events with two b-tagged jets. These uncertainties are chosen to be consistent with previous ATLAS studies of these processes in dedicated validation regions. Diboson backgrounds are assigned an uncertainty of 25 % to cover the difference between the predictions of the Sherpa and Powheg-Box generators. These uncertainties are treated as uncorrelated across the various regions of jet and b-tagged jet multiplicity.

Extraction of signal cross-section
The Wt cross-section is extracted from the data using a profile-likelihood fit that combines inputs from each signal and control region to constrain backgrounds and systematic uncertainties. The fit uses the HistFitter [71] software framework, which is in turn built on the HistFactory, RooStats, and RooFit [72] frameworks.
The fit uses the binned BDT response for MC events in two of the three fit regions (1j1b and 2j1b) and a single bin in the 2j2b region to construct templates for the Wt signal and each modelled background (tt̄, Z + jets, diboson, fake or non-prompt leptons). For each signal and background template, an additional template is constructed for each of the MC sample variations (see Section 7) accounting for a systematic uncertainty. Systematic uncertainties are considered by allowing Gaussian-constrained nuisance parameters to deform fit templates while simultaneously varying the normalisation of the templates. The normalisation of the tt̄ background, µ_tt̄, is also determined in the fit by assigning an unconstrained parameter to the tt̄ normalisation. Other backgrounds are constrained within their systematic uncertainties by Gaussian-constrained nuisance parameters, and all templates are affected by the overall luminosity uncertainty.
A global likelihood function is constructed to describe the level of agreement between data and prediction as a function of the parameter of interest, namely the Wt signal strength µ_Wt, and a list of nuisance parameters each describing the influence of a different source of systematic uncertainty. The Wt cross-section and its uncertainty are extracted from the fitted value of µ_Wt, with a value of unity corresponding to the predicted NLO+NNLL σ_theory value.
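The structure of such a fit can be sketched with a toy binned likelihood. This is only an illustration under stated assumptions, not the HistFitter configuration used in the analysis: there is one signal template, one background template, a single Gaussian-constrained nuisance parameter that scales the background by an assumed 10 % per unit of θ, and a golden-section search standing in for a real minimiser. All names and yields are hypothetical.

```python
import math


def golden_min(f, lo, hi, tol=1e-7):
    """Golden-section search for the minimiser of a unimodal function."""
    invphi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - invphi * (b - a)
    d = a + invphi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:
            b, d, fd = d, c, fc
            c = b - invphi * (b - a)
            fc = f(c)
        else:
            a, c, fc = c, d, fd
            d = a + invphi * (b - a)
            fd = f(d)
    return 0.5 * (a + b)


def nll(mu, theta, signal, background, data, rel_unc=0.10):
    """Negative log-likelihood: Poisson term per bin (constants dropped) plus
    a unit-Gaussian constraint on the nuisance parameter theta."""
    total = 0.5 * theta * theta
    for s, b, n in zip(signal, background, data):
        lam = mu * s + b * (1.0 + rel_unc * theta)  # expected yield in the bin
        total += lam - n * math.log(lam)
    return total


def fit_signal_strength(signal, background, data):
    """Minimise the profiled NLL: theta is re-minimised for every mu."""
    def profiled(mu):
        theta_hat = golden_min(
            lambda t: nll(mu, t, signal, background, data), -5.0, 5.0)
        return nll(mu, theta_hat, signal, background, data)
    return golden_min(profiled, 0.0, 5.0, tol=1e-6)


# Hypothetical templates; the pseudo-data are built with mu = 1.3, theta = 0.
signal = [20.0, 40.0, 10.0]
background = [200.0, 150.0, 300.0]
data = [226.0, 202.0, 313.0]
mu_hat = fit_signal_strength(signal, background, data)
```

Because the pseudo-data are constructed exactly from the templates, the fit should recover the injected signal strength, which makes the sketch a convenient closure test before adding further nuisance parameters.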

Results
The expected and fitted yields from data are measured in the three fit regions. The majority of signal events fall in the 1j1b and 2j1b regions, with the former giving the better signal-to-background ratio as well as the larger yield of signal events. Table 4 shows the fitted yields of each process. From the fitted Wt yield, a cross-section is then extracted. The result is a measured cross-section of σ_Wt = 94 ± 10 (stat.) +28/−22 (syst.) ± 2 (lumi.) pb, corresponding to an observed (expected) significance of 4.5σ (3.9σ). Most pairs of parameters in the fit show small correlations, generally at the 25 % level or less. The most correlated pairs of nuisance parameters are the modelling uncertainties due to matrix element and parton shower (57 %), and the parameters related to JES flavour composition and parton shower uncertainties (45 %).
Figure 6 shows the fit parameters (θ) with the highest post-fit impact on the signal strength, and also gives the pre-fit impacts as well as post-fit parameter values. The post-fit parameters, θ̂, are shifted and re-scaled as (θ̂ − θ0)/∆θ, where θ0 and θ̂ are the pre- and post-fit values of θ, while ∆θ is the pre-fit uncertainty on θ. Here the impact (∆µ) of a parameter is defined as the change in signal strength observed when fixing this parameter to its ±1σ values, fixing all other parameters to their nominal values, and fitting the signal strength. The change is taken with respect to the nominal pre-fit value for the pre-fit impact and with respect to the nominal post-fit value for the post-fit impact. The pre-fit and post-fit impacts are differentiated based on whether pre-fit or post-fit values of the ±1σ variations are assumed for the parameter under consideration. The parameters with the highest post-fit impact are jet energy scale uncertainties and the modelling uncertainties due to parton shower and tt̄ initial- and final-state radiation. Some parameters fit to values which are significantly different from unity; tt̄ initial- and final-state radiation and the component JES Eff1 each exhibit this effect. This behaviour is expected, as a few parameters could be pulled outside of the ±1σ band when there are a large number of parameters being fitted, while the majority should fall within the ±1σ range.
Certain parameters are assigned post-fit uncertainties significantly smaller than the nominal pre-fit uncertainty values and are thus profiled or constrained by the observed data. For example, the uncertainty due to the parton shower generator would be among the most dominant uncertainties without the constraints from profiling, with pre-fit impacts exceeding 60 % of the signal strength. However, information from the 2j2b region about the tt̄ normalisation and the relative yields in the signal regions significantly constrains these uncertainties. Another feature observed in this plot is that the sum in quadrature of the individual impacts is substantially smaller than the final uncertainty shown in Table 3. This is due to the correlations between the uncertainties, and in particular to the constraint provided by the tt̄ normalisation. Some of the larger uncertainties, such as the parton shower generator uncertainty, have an asymmetric impact on the signal strength. These asymmetries have been traced to the large normalisation uncertainty on the tt̄ background.
Table 3: Relative uncertainties in the Wt cross-section. These are estimated by fixing each uncertainty parameter to its post-fit ±1σ uncertainties, re-fitting, and assessing the change in the signal strength. Due to correlations between parameters, the individual uncertainty categories are not expected to add up to the total systematic uncertainty. The statistical uncertainty is evaluated by fitting without any nuisance parameters corresponding to systematic uncertainties in the fit, and the total systematic uncertainty is evaluated by subtracting the statistical uncertainty from the total uncertainty in quadrature.
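The quadrature subtraction mentioned in the Table 3 caption can be written out explicitly. The sketch below is illustrative only: the helper names are hypothetical, and the total uncertainty is rebuilt here from the quoted statistical (10 pb) and upward systematic (28 pb) components purely to show that the subtraction inverts the sum; the text does not quote that total.

```python
import math


def add_in_quadrature(*parts):
    """Combine independent uncertainty components."""
    return math.sqrt(sum(p * p for p in parts))


def subtract_in_quadrature(total, stat):
    """Systematic component left after removing the statistical part."""
    if stat > total:
        raise ValueError("statistical uncertainty exceeds total uncertainty")
    return math.sqrt(total * total - stat * stat)


# Illustration with the quoted upward uncertainties, in pb.
stat_up = 10.0
syst_up = 28.0
total_up = add_in_quadrature(stat_up, syst_up)          # assumed total
recovered_syst = subtract_in_quadrature(total_up, stat_up)
```

The same subtraction would be applied separately to the upward and downward total uncertainties, which is how an asymmetric systematic uncertainty can arise alongside a symmetric statistical one.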

The MC predictions and data yields for the BDT response after setting all fit parameters to their final best-fit values are shown in Figure 7, with error bands representing the total uncertainties in the fitted results. The NLO+NNLL cross-section prediction agrees well with the measured value, and µ_tt̄ is fitted to 0.98 ± 0.05. The upper panels give the yields in number of events per bin, while the lower panels give the ratios of the numbers of observed events to the total prediction in each bin.

Conclusion
The inclusive cross-section for the associated production of a W boson and top quark is measured using 3.2 fb⁻¹ of pp collision data collected at √s = 13 TeV by the ATLAS detector at the LHC. The analysis uses dilepton events with at least one b-tagged jet. Events are separated into signal and control regions based on the number of jets and b-tagged jets, and the Wt signal is separated from the tt̄ background using a BDT discriminant. The cross-section is extracted by fitting templates to the BDT output distribution, and is measured to be σ_Wt = 94 ± 10 (stat.) +28/−22 (syst.) ± 2 (lumi.) pb. The measured value is in good agreement with the SM prediction of σ_theory = 71.7 ± 1.8 (scale) ± 3.4 (PDF) pb [1].
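The level of agreement can be gauged with a back-of-the-envelope calculation. The sketch below is a simplification that does not appear in the analysis: it treats all uncertainties as Gaussian and uncorrelated, and uses the downward systematic uncertainty because the measurement lies above the prediction. The function name is hypothetical.

```python
import math


def naive_pull(measured, measured_unc, predicted, predicted_unc):
    """Difference between measurement and prediction in units of the
    quadrature sum of all listed uncertainty components."""
    components = list(measured_unc) + list(predicted_unc)
    sigma = math.sqrt(sum(u * u for u in components))
    return (measured - predicted) / sigma


# Values in pb, taken from the quoted result and prediction.
sigma_measured = 94.0
sigma_theory = 71.7
pull = naive_pull(sigma_measured, (10.0, 22.0, 2.0),   # stat, syst (down), lumi
                  sigma_theory, (1.8, 3.4))            # scale, PDF
signal_strength = sigma_measured / sigma_theory
```

Under these assumptions the measurement sits about 0.9 standard deviations above the prediction, with a corresponding signal strength of about 1.3.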

Figure 1: A representative leading-order Feynman diagram for the production of a single top quark in the Wt channel and the subsequent leptonic decay of both the W boson and top quark.

Figure 2: Representative leading-order Feynman diagrams for the production of a single top quark in (a) the t-channel and (b) the s-channel.

E_T^miss > 20 GeV, with the requirement raised to E_T^miss > 50 GeV when the dilepton invariant mass satisfies m_ℓℓ < 80 GeV. All events with same-flavour leptons must satisfy E_T^miss > 40 GeV. For same-flavour events, the Z + jets background is concentrated in a region of the m_ℓℓ-E_T^miss plane corresponding to values of m_ℓℓ near the Z mass, and towards low values of E_T^miss. Therefore, a selection in E_T^miss and m_ℓℓ is used to remove these backgrounds: events with 40 GeV < m_ℓℓ < 81 GeV are required to satisfy 4 × E_T^miss > 5 × m_ℓℓ, while events with m_ℓℓ > 101 GeV are required to satisfy 2 × m_ℓℓ + E_T^miss > 300 GeV. The requirements for the same-

Figure 3: Expected event yields for signal and backgrounds with their total systematic uncertainty (discussed in Section 8) and the number of observed events in the data are shown in the three fit regions (1j1b, 2j1b, and 2j2b) and the two additional regions (1j0b and 2j0b). The signal and backgrounds are normalised to their theoretical predictions, and the error bands represent the total systematic uncertainties which are used in this analysis. The upper panel gives the yields in number of events per bin, while the lower panel gives the ratios of the numbers of observed events to the total prediction in each bin.

Figure 4: A schematic view of signal, control and validation regions. Signal and control regions are used in the fit.
For a set of objects o_1 … o_n: p_T^sys(o_1 … o_n) is the transverse momentum of various subsets; H_T(o_1 … o_n) is the scalar sum of transverse momenta; ΣE_T is the scalar sum of the transverse momenta of all objects which contribute to the E_T^miss calculation; σ(p_T^sys) is the ratio of p_T^sys to (H_T + ΣE_T); m(o_1 … o_n) is the invariant mass of various subsets; m_T(o_1 … o_n) is the transverse mass (i.e. the sum of the invariant masses of o_1 … o_n each projected onto the transverse plane); and E/m(o_1 … o_n) is the ratio of energy to invariant mass. Two-dimensional vectors such as E⃗^miss

Figure 5: Distributions of the two most powerful BDT input variables in each fit region: in the 1j1b region (a) p_T^sys(ℓ1 ℓ2 E_T^miss j1) and (b) ∆p_T(ℓ1 ℓ2 E_T^miss j1); in the 2j1b region (c) p_T^sys(ℓ1, ℓ2) and (d) ∆R(ℓ1, E_T^miss j1 j2). The signal and backgrounds are normalised to their theoretical predictions, and the error bands represent the total systematic uncertainties in the Monte Carlo predictions. The first and last bins of each distribution contain overflow events. The upper panels give the yields in number of events per bin, while the lower panels give the ratios of the numbers of observed events to the total prediction in each bin.
Uncertainties in the scale factors to correct the b-tagging efficiency in simulation to the efficiency in data are assessed using independent eigenvectors for the efficiency of b-jets, c-jets, light-parton jets, and two extrapolation uncertainty factors. These b-tagging uncertainties are determined with √s = 13 TeV data for b-jets, while for c-jets and light-parton jets they are determined in √s = 8 TeV data, then extrapolated to and checked with √s = 13 TeV data. The extrapolation is performed by adding additional uncertainties

Figure 6: List of fit parameters ranked by post-fit impact on the signal strength. The fit parameters (θ) here correspond to the nuisance parameters from Section 8. Impact (∆µ) is calculated by fixing the parameter to its ±1σ values, fixing all other parameters to their nominal values, re-fitting the signal strength, and evaluating the change in signal strength with respect to the nominal fit. Green bands indicate the impacts computed with σ corresponding to the pre-fit uncertainty, and hatched purple bands indicate the impacts computed with σ corresponding to the post-fit uncertainty. The black points represent (θ̂ − θ0)/∆θ, the shifted and scaled post-fit parameter values, while the error bars are the post-fit errors of the fit parameter. The meanings of the labels and abbreviations are detailed in Section 7.

Figure 7: Post-fit distributions in the signal and control regions 1j1b, 2j1b, and 2j2b. The error bands represent the total uncertainties in the fitted results. The upper panels give the yields in number of events per bin, while the lower panels give the ratios of the numbers of observed events to the total prediction in each bin.

aj Also at Departement de Physique Nucleaire et Corpusculaire, Université de Genève, Geneva, Switzerland
ak Also at Eotvos Lorand University, Budapest, Hungary
al Also at Departments of Physics & Astronomy and Chemistry, Stony Brook University, Stony Brook NY, United States of America
am Also at International School for Advanced Studies (SISSA), Trieste, Italy
an Also at Department of Physics and Astronomy, University of South Carolina, Columbia SC, United States of America
ao Also at Institut de Física d'Altes Energies (IFAE), The Barcelona Institute of Science and Technology, Barcelona, Spain
ap Also at School of Physics, Sun Yat-sen University, Guangzhou, China
aq Also at Institute for Nuclear Research and Nuclear Energy (INRNE) of the Bulgarian Academy of Sciences, Sofia, Bulgaria
ar Also at Faculty of Physics, M.V. Lomonosov Moscow State University, Moscow, Russia
as Also at Institute of Physics, Academia Sinica, Taipei, Taiwan
at Also at National Research Nuclear University MEPhI, Moscow, Russia
au Also at Department of Physics, Stanford University, Stanford CA, United States of America
av Also at Institute for Particle and Nuclear Physics, Wigner Research Centre for Physics, Budapest, Hungary
aw Also at Giresun University, Faculty of Engineering, Turkey
ax Also at Flensburg University of Applied Sciences, Flensburg, Germany
ay Also at CPPM, Aix-Marseille Université and CNRS/IN2P3, Marseille, France
az Also at University of Malaya, Department of Physics, Kuala Lumpur, Malaysia
ba Also at LAL, Univ. Paris-Sud, CNRS/IN2P3, Université Paris-Saclay, Orsay, France
* Deceased

Table 1: Summary of event selection criteria used in the analysis.
At least one jet with p_T > 25 GeV, |η| < 2.5
Exactly two leptons of opposite charge with p_T > 20 GeV, |η| < 2.5 for muons and |η| < 2.47 excluding 1.37 < |η| < 1.52 for electrons
At least one lepton with p_T > 25 GeV, veto if third lepton with p_T > 20 GeV
At least one lepton matched to the trigger object

The BDTs are optimised to distinguish between Wt and tt̄ by using the nominal Wt MC sample, the alternative Wt MC sample with diagram subtraction scheme and the nominal tt̄ MC sample; for each sample, half of the events are used for training while the other half is reserved

Table 4: Fit results for an integrated luminosity of 3.2 fb⁻¹. The errors shown are the final fitted uncertainties in the yields, including uncertainties in the fitted signal strength, systematic uncertainties, and statistical uncertainties, taking into account correlations and constraints induced by the fit.