Measurement of inclusive and differential cross sections for single top quark production in association with a W boson in proton-proton collisions at $\sqrt{s}$ = 13 TeV

Measurements of the inclusive and normalised differential cross sections are presented for the production of single top quarks in association with a W boson in proton-proton collisions at a centre-of-mass energy of 13 TeV. The data used were recorded with the CMS detector at the LHC during 2016-2018, and correspond to an integrated luminosity of 138 fb$^{-1}$. Events containing one electron and one muon in the final state are analysed. For the inclusive measurement, a multivariate discriminant, exploiting the kinematic properties of the events, is used to separate the signal from the dominant $\mathrm{t\bar{t}}$ background. A cross section of 79.2 $\pm$ 0.9 (stat) $^{+7.7}_{-8.0}$ (syst) $\pm$ 1.2 (lumi) pb is obtained, consistent with the predictions of the standard model. For the differential measurements, a fiducial region is defined according to the detector acceptance and the requirement of exactly one jet coming from the fragmentation of a bottom quark. The resulting distributions are unfolded to particle level and agree with the predictions at next-to-leading order in perturbative quantum chromodynamics.


Introduction
The electroweak production of single top quarks was first observed by the D0 [1] and CDF [2] Collaborations at the Fermilab Tevatron. At leading order, single top quark production proceeds mainly via three processes: the t-channel exchange of a virtual W boson, s-channel production, and the associated production of a top quark and a W boson (tW). The last channel, which was negligible at the Tevatron, represents a significant contribution to single top quark production at the CERN LHC. It is a very interesting production mechanism because of its interference with top quark pair (tt) production [3][4][5], its sensitivity to beyond the standard model (SM) physics [6][7][8][9][10][11], and its role as a background in several other analyses.
At next-to-leading order (NLO) in perturbative quantum chromodynamics (QCD), the definition of tW production overlaps with that of tt production, since the two processes share final states [3][4][5]. The cross section for tW production has been computed at approximate next-to-next-to-leading order (NNLO) in QCD. Assuming a top quark mass ($m_{\mathrm{t}}$) of 172.5 GeV, the theoretical approximate NNLO cross section for tW production in proton-proton (pp) collisions at $\sqrt{s}$ = 13 TeV is $\sigma^{\mathrm{SM}}_{\mathrm{tW}}$ = 71.7 $\pm$ 1.8 (scale) $\pm$ 3.4 (PDF) pb [12], where the first uncertainty corresponds to the renormalisation ($\mu_{\mathrm{R}}$) and factorisation ($\mu_{\mathrm{F}}$) scale variations, and the second to using different parton distribution function (PDF) sets. The leading-order (LO) Feynman diagrams for tW production are shown in Fig. 1. The tW channel was not accessible at the Tevatron due to its small cross section in proton-antiproton collisions at $\sqrt{s}$ = 1.96 TeV. At the LHC, however, evidence for this process in 7 TeV collision data was presented by ATLAS [13] and CMS [14]. At $\sqrt{s}$ = 8 TeV, measurements by CMS [15] and ATLAS [16] were in good agreement with theoretical predictions.

At $\sqrt{s}$ = 13 TeV, the inclusive tW production cross section has been measured by both ATLAS [17] and CMS [18] using data recorded during 2016. Both measurements employed dileptonic channels ($\mathrm{e}^{\pm}\mu^{\mp}$ for CMS and $\mathrm{e}^{+}\mathrm{e}^{-}$, $\mu^{+}\mu^{-}$, and $\mathrm{e}^{\pm}\mu^{\mp}$ for ATLAS). The measurement of the differential tW production cross section is particularly challenging due to the overwhelming background from tt events, which amounts to 80% [17] of the events in the most tW-enriched category. The ATLAS experiment performed the first measurement of the tW production differential cross section [19], and a study [20] of the WWbb final state (that includes tW and tt). Studies in the semileptonic channel have also been done recently at $\sqrt{s}$ = 13 TeV by CMS [21] and at $\sqrt{s}$ = 8 TeV by ATLAS [22]. This paper reports a measurement of inclusive and normalised differential tW production cross sections at $\sqrt{s}$ = 13 TeV with dilepton final states ($\mathrm{e}^{\pm}\mu^{\mp}$), using data collected with the CMS detector during 2016-2018, corresponding to an integrated luminosity of 138 fb$^{-1}$.
The paper is structured as follows. A summary of the data and Monte Carlo (MC) samples used is provided in Section 2. The object and event selection criteria are discussed in Section 3. The signal extraction for the inclusive and differential measurements is detailed in Sections 4 and 5, respectively. The systematic uncertainties are discussed in Section 6. The inclusive and differential results are presented in Section 7. Finally, a summary of both measurements is given in Section 8. Tabulated results are provided in the HEPData record for this analysis [23].

The CMS detector and MC simulation
The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter, and a brass and scintillator hadron calorimeter, each composed of a barrel and two endcap sections. Forward calorimeters extend the pseudorapidity (η) coverage provided by the barrel and endcap detectors. Muons are detected in gas-ionisation chambers embedded in the steel flux-return yoke outside the solenoid. A more detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in Ref. [24]. Events of interest are selected using a two-tiered trigger system. The first level (L1), composed of custom hardware processors, uses information from the calorimeters and muon detectors to select events at a rate of around 100 kHz within a fixed latency of about 4 µs [25]. The second level, known as the high-level trigger, consists of a farm of processors running a version of the full event reconstruction software optimized for fast processing, and reduces the event rate to around 1 kHz before data storage [26]. We rely on MC simulations to estimate the contributions from both signal and background processes. The tW signal samples are simulated at NLO in QCD using POWHEG v2 [27][28][29]. Two schemes are proposed to describe the tW signal: "diagram removal" (DR) [3], where all NLO diagrams that are doubly resonant (i.e. that can have two top quarks on-shell), such as those in Fig. 2, are excluded from the signal definition; and "diagram subtraction" (DS) [3,30], in which the differential cross section is modified with a gauge-invariant subtraction term, which locally cancels the contribution of tt diagrams. The DR scheme is used as the nominal model in this analysis. 
However, the difference in the results obtained for the two schemes is also evaluated. Both DR and DS schemes are used for comparison with the differential particle-level result. In addition, we consider signal samples generated with MADGRAPH5_aMC@NLO v2.6.5 [31] to compare, at the particle level, the predictions that use the DR and DS schemes. Two other variants of those approaches are also considered: the so-called "DR2" approach that includes the terms corresponding to the interference between tW and tt processes, and an alternative way of implementing DS (later referenced as "DS dyn."), where a dynamic factor is used to model the top quark resonance, instead of a fixed one [32]. The NLO QCD setup in POWHEG v2 is also used to simulate tt events. The Drell-Yan (DY) background samples are simulated at NLO in QCD using MADGRAPH5_aMC@NLO v2.2.2, except in 2017, where they are simulated at LO in QCD using MADGRAPH5_aMC@NLO v2.4.2. The contributions from WW, WZ, and ZZ (referred to as VV) processes are simulated at NLO in QCD with POWHEG v2 or MADGRAPH5_aMC@NLO v2.2.2. Other contributions from W, Z, and γ boson production in association with tt events (referred to as ttV) are simulated at NLO in QCD using MADGRAPH5_aMC@NLO v2.2.2. The small cross section processes of triboson production with different W, Z, and γ combinations are grouped together with the VV and ttV contributions and simulated at NLO in QCD with MADGRAPH5_aMC@NLO v2.2.2.
The lepton+jets events in the tt samples are used to estimate the contribution to the background from events with a jet incorrectly reconstructed as a lepton (electron or muon). Since these background contributions contain a lepton candidate that does not originate from a leptonic decay of a gauge boson, they are labelled non-W/Z. The NNPDF 3.1 NNLO [33] PDF set is used in the simulation of all samples, except for the 2016 non-tt backgrounds, for which NNPDF 3.0 NLO [34] is applied. The generators are interfaced in all cases with PYTHIA v8.230 [35], which is used to model the hadronisation and parton showering (PS). The underlying event is modelled with the CP5 tune [36] in all of the 2017-2018 samples. For 2016, the CP5 tune is used for the signal and tt background samples, whereas for the rest of the samples the CUETP8M1 tune [37] is taken. For comparison at the particle level, another signal sample where the POWHEG v2 generator is interfaced with HERWIG++ v2.7.1 (2016) and HERWIG 7 (2017-2018) [38][39][40] is used. This uses the EE5C tune [41] for 2016 and CH3 [42] for 2017-2018. For the samples generated with MADGRAPH5_aMC@NLO at LO (NLO) accuracy, double counting of partons from the matrix element calculations and PS described by PYTHIA v8.230 is removed using the MLM [43] (FXFX [44]) matching scheme. The nominal $m_{\mathrm{t}}$ is set to 172.5 GeV for all samples. For both the tW signal and the tt background, alternative samples are generated with the same generator (POWHEG v2) and PS simulation (PYTHIA v8.230) to estimate systematic uncertainties. These uncertainties are described in detail in Section 6. The GEANT4 package [45] is used to simulate the CMS detector response for all simulated samples. To compare with the measured data, the event yields in the simulated samples are normalised to the integrated luminosity using their theoretical cross sections.
These are taken from calculations at NNLO for DY [46], approximate NNLO for tW events [12], and NLO for diboson events [47]. For the simulated tt sample, the full NNLO plus next-to-next-to-leading-logarithmic calculation [48], performed with the TOP++ 2.0 program [49], is used. The PDF uncertainty is added in quadrature to the uncertainty associated with the strong coupling constant ($\alpha_S$) to obtain a tt production cross section of $832^{+20}_{-29}$ (scale) $\pm$ 35 (PDF+$\alpha_S$) pb, for $m_{\mathrm{t}}$ = 172.5 GeV.
The simulated samples include additional pp interactions in the same or nearby bunch crossings (pileup). A reweighting is applied in simulations to match the distribution of bunch crossings observed in the data. Assuming a total inelastic pp cross section of 69.2 mb [50], the average number of pileup interactions per bunch crossing is 23, 33, and 32 in 2016, 2017, and 2018, respectively.
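The reweighting idea can be illustrated with a minimal sketch, assuming the pileup distributions are available as binned histograms. The function name `pileup_weights` and the toy histogram contents are hypothetical and are not taken from the analysis software.

```python
def pileup_weights(data_hist, mc_hist):
    """Per-bin weights that reshape the simulated pileup distribution
    so that its normalised shape matches the one observed in data."""
    data_total = sum(data_hist)
    mc_total = sum(mc_hist)
    weights = []
    for n_data, n_mc in zip(data_hist, mc_hist):
        if n_mc == 0:
            # No simulated events in this bin, so nothing can be reweighted.
            weights.append(0.0)
        else:
            weights.append((n_data / data_total) / (n_mc / mc_total))
    return weights


# Toy histograms of the number of interactions per crossing (hypothetical values).
data_hist = [10.0, 30.0, 40.0, 20.0]
mc_hist = [25.0, 25.0, 25.0, 25.0]
weights = pileup_weights(data_hist, mc_hist)
reweighted_mc = [w * n for w, n in zip(weights, mc_hist)]
```

Each simulated event would then be multiplied by the weight of the bin corresponding to its generated number of interactions.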

Event selection
In the SM, top quarks decay nearly 100% of the time into a W boson and a bottom quark. The analysis described here uses events in the e ± µ ∓ final state, in which the W boson from the decay of the top quark and the W boson produced in association with the top quark both decay leptonically, one into an electron and a neutrino, and the other into a muon and a neutrino. This leads to a final state composed of two different-flavour leptons with opposite electric charge, one jet resulting from the fragmentation of a bottom quark, and two neutrinos.
Events are required to pass either a dilepton or single-lepton trigger with isolation requirements [26]. The dilepton triggers require events to contain either one electron with transverse momentum p T > 12 GeV and one muon with p T > 23 GeV, or one muon with p T > 8 GeV and one electron with p T > 23 GeV. The single-lepton triggers with one electron (muon) with p T > 27 (24), 35 (24), and 32 (24) GeV for 2016, 2017, and 2018 are used to increase the efficiency. The combined trigger efficiency is measured using data events which pass the selection criteria given below, and which were collected with triggers based on the p T imbalance in the event. It is found to be ≈98%. The trigger efficiency in simulated events is corrected to match that observed in data.
The particle-flow (PF) algorithm [51] aims to reconstruct and identify each individual particle in an event with an optimised combination of information from the various elements of the CMS detector. The primary vertex, which is the vertex corresponding to the hardest scattering in the event, is evaluated using tracking information alone, as described in Section 9.4.1 of Ref. [52]. Further requirements are imposed on the reconstructed and identified lepton and jet candidates obtained from the PF algorithm. Electrons [53] and muons [54] in the event are required to be well isolated and have p T > 20 GeV and |η| < 2.4. Additional criteria are imposed on the impact parameter of the leptons in order to ensure that they come from the primary vertex. Electron candidates in the transition region between the barrel and endcap calorimeters, corresponding to 1.444 < |η| < 1.566, are ignored because the electron reconstruction in this region is not optimal.
Events with W bosons decaying into τ leptons are considered as signal only if the τ leptons decay into electrons or muons that satisfy the selection requirements. In events with more than two leptons passing the selection, the two with the largest p T are kept for further study. Jets are reconstructed from the PF candidates using the anti-k T clustering algorithm [55, 56] with a distance parameter of 0.4. Jet energy corrections, derived from simulation, are applied so that the average response to jets matches the particle-level jets [57]. In situ measurements of the momentum balance in dijet, photon+jet, Z+jet, and multijet events are used to account for residual differences in the jet energy scale (JES) between data and simulation. In addition, jet energy resolution (JER) is corrected to reproduce that obtained from data [58].
Jets are required to have $p_{\mathrm{T}} > 30$ GeV and |η| < 2.4. Jets with $p_{\mathrm{T}}$ between 20 and 30 GeV and |η| < 2.4 are referred to as "loose jets". Differences between the tW and tt distributions of these lower-$p_{\mathrm{T}}$ jets can be exploited to separate the two processes. Jets are identified as coming from the fragmentation of bottom quarks (b jets) using the DeepJet algorithm [59, 60], with an operating point that yields identification efficiencies of ≈70% and misidentification probabilities of about 1% for light-quark and gluon jets. The missing transverse momentum vector $\vec{p}_{\mathrm{T}}^{\,\mathrm{miss}}$ is defined as the negative vector sum of the momenta of all reconstructed PF candidates in an event, including the jet energy corrections described above, projected onto the plane perpendicular to the direction of the beam axis. Its magnitude is referred to as $p_{\mathrm{T}}^{\mathrm{miss}}$.
Events are selected as belonging to the $\mathrm{e}^{\pm}\mu^{\mp}$ final state if the two leptons with highest $p_{\mathrm{T}}$ passing the above selection criteria are an electron and a muon of opposite charge. The highest $p_{\mathrm{T}}$ (leading) lepton is required to have $p_{\mathrm{T}} > 25$ GeV. To reduce the contamination from DY production of τ lepton pairs with low dilepton invariant mass, the minimum invariant mass of all pairs of identified leptons (including leptons beyond the leading two) is required to be greater than 20 GeV. The remaining events are classified by the number of jets and identified b jets in the event, as shown in Fig. 3 (left). In the following, the notation njmb represents events with exactly n jets where m of them are identified as b jets. The most signal-enriched region is 1j1b, but the contribution from tW is still only 20% of that from tt. For the inclusive measurement, we take advantage of the information from various regions (1j1b, 2j1b, and 2j2b), whereas for the differential measurement, only the 1j1b region is used. An additional selection criterion is applied to enhance the signal-to-background ratio in the differential measurement.
Figure 3 (right) shows the distribution of the number of loose jets in the 1j1b events. The signal-to-background ratio is larger for events with zero loose jets. To minimise the relative contribution from the tt background, the events in the 1j1b region with zero loose jets are used for the differential measurement. Signal events contribute up to 22% of the total expected events in that region.
Figure 3: Left: the number of events from data (points) and predicted from simulation (coloured histograms) before the fit in the $\mathrm{e}^{\pm}\mu^{\mp}$ sample as a function of the number of jets and b-tagged jets. Right: the number of loose jets per event in the $\mathrm{e}^{\pm}\mu^{\mp}$ sample from the 1j1b region. The vertical bar on the points shows the statistical uncertainty in the data. The hatched band represents the sum of the statistical and systematic uncertainties before the fit. The lower panels show the ratio of data to the sum of the expected yields. The MC simulations are normalised to their theoretical cross section values as described in Section 2.
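The njmb classification and the zero-loose-jet requirement can be sketched as follows. This is only an illustration under stated assumptions: the jet representation (a dictionary with `pt`, `eta`, and `btag` keys), the function names, and the treatment of the exact 30 GeV boundary are hypothetical choices, not the analysis code.

```python
def classify_event(jets):
    """Return the njmb region label and the number of loose jets.

    jets: list of dicts with 'pt' (GeV), 'eta', and 'btag' (bool).
    """
    central = [j for j in jets if abs(j["eta"]) < 2.4]
    selected = [j for j in central if j["pt"] > 30.0]
    # Loose jets: 20 < pT <= 30 GeV (boundary handling is an assumption).
    loose = [j for j in central if 20.0 < j["pt"] <= 30.0]
    n_b = sum(1 for j in selected if j["btag"])
    return f"{len(selected)}j{n_b}b", len(loose)


def in_differential_signal_region(jets):
    """1j1b events with zero loose jets, as used for the differential measurement."""
    region, n_loose = classify_event(jets)
    return region == "1j1b" and n_loose == 0


example_jets = [
    {"pt": 45.0, "eta": 0.5, "btag": True},
    {"pt": 25.0, "eta": -1.0, "btag": False},  # counted as a loose jet
    {"pt": 50.0, "eta": 3.0, "btag": False},   # outside the |eta| acceptance
]
region, n_loose = classify_event(example_jets)
```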

Methodology for the inclusive measurement
After the baseline event selection is performed, the contribution from tt events is considerably larger than the tW signal in all event categories, as can be seen in Fig. 3 (left). The region with the best signal-to-background ratio is 1j1b, which consists mainly of tW and tt events. This region is used together with the 2j1b region, which also contains a significant tW contribution, and the 2j2b region in a maximum likelihood (ML) fit to extract the tW signal. The 2j2b region is dominated by tt events and is used in the fit to constrain this background.
As there is no single observable that gives strong discrimination between tt and tW events, two independent boosted decision trees (BDTs [61, 62]), one for the 1j1b region and the other for the 2j1b region, are trained to discriminate between the tW signal and tt background. The BDTs outperform single-tree classifiers [63] by training a set of trees (forest) and taking their weighted vote as the prediction. Each tree is derived from the same training ensemble by reweighting its events, which mitigates the statistical fluctuations and increases the overall stability. In this analysis, the BDT implementation is provided by the TMVA package [62], using gradient boost [62] as the boosting algorithm. The BDTs are trained and tested using a set of simulated samples that are statistically independent from the ones used in the signal extraction.
The input variables used in the BDTs are chosen depending on how well the MC simulation models the data and on their discrimination power. For the BDT in the 1j1b region, the variables used in the training are, in order of importance:
• $p_{\mathrm{T}}(\mathrm{e}^{\pm}, \mu^{\mp}, \mathrm{j})$: the magnitude of the transverse momentum of the dilepton + jet system.
• $C(\mathrm{e}^{\pm}, \mu^{\mp}, \mathrm{j})$: centrality, which is defined as sin θ, where θ is the polar angle of the total momentum of the system.
• Leading loose jet $p_{\mathrm{T}}$: if there are no loose jets, this variable is set to 0.
• Jet $p_{\mathrm{T}}$.
• Presence of loose jets in the event: the result is either yes or no.
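The first two variables in the list above can be computed from the summed momentum of the system. The sketch below assumes each object is given by its Cartesian momentum components; the function name `system_pt_and_centrality` is hypothetical, and sin θ is evaluated as the ratio of the transverse to the total momentum of the system.

```python
import math


def system_pt_and_centrality(momenta):
    """Transverse momentum and centrality (sin theta) of a system of objects.

    momenta: list of (px, py, pz) tuples in GeV, e.g. electron, muon, and jet.
    """
    px = sum(p[0] for p in momenta)
    py = sum(p[1] for p in momenta)
    pz = sum(p[2] for p in momenta)
    pt = math.hypot(px, py)
    p_total = math.sqrt(px * px + py * py + pz * pz)
    # sin(theta) of the total momentum equals pT / |p|.
    centrality = pt / p_total if p_total > 0.0 else 0.0
    return pt, centrality


# Toy momenta (hypothetical values): the sum is (3, 4, 12) GeV.
pt, centrality = system_pt_and_centrality([(3.0, 4.0, 0.0), (0.0, 0.0, 12.0)])
```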
The order of importance is determined by counting how often each variable is used to split the decision tree nodes. The counts are weighted by the square of the separation gain achieved by the variable and by the number of events in the node. Figure 4 shows the data-MC agreement of the four most discriminating variables in the BDT for 1j1b, where good agreement is observed. The same applies for the remaining distributions of the presence of loose jets and the jet p T , and for the input variables of the BDT trained in the 2j1b region.
The input variables in the BDT for the 2j1b region are, in order of importance:
• $\Delta R(\ell_1, \mathrm{j}_1)$: separation in η-ϕ space between the leading lepton and leading jet, where ϕ is the azimuthal angle.
• $\Delta R(\ell_{12}, \mathrm{j}_{12})$: separation in η-ϕ space between the dilepton and dijet systems.
Three distributions are considered in the ML fit: the BDT output distributions in the 1j1b and 2j1b regions, and the $p_{\mathrm{T}}$ distribution of the subleading jet in the 2j2b region. This last variable is sensitive to JES variations and is useful in constraining this systematic uncertainty. The binning of the BDT output distribution is chosen such that each bin contains about the same number of tt events. This avoids the presence of low-statistics bins in the background estimation, helping to constrain the systematic uncertainties. The fit is performed simultaneously in the three regions. The uncertainties are included using nuisance parameters, one for each source of systematic uncertainty, correlated across all regions, which parameterise the effect of the given source on the expected signal and background yields.
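A binning with roughly equal tt content per bin can be obtained from the quantiles of the BDT output in the simulated tt sample. The sketch below is a simplified illustration that assumes unweighted events; the function name `equal_content_edges` is hypothetical, and a realistic implementation would need to account for MC event weights.

```python
def equal_content_edges(scores, n_bins):
    """Bin edges such that each bin holds about the same number of entries.

    scores: BDT outputs of the simulated background events (unweighted).
    """
    ordered = sorted(scores)
    n = len(ordered)
    edges = [ordered[0]]
    for k in range(1, n_bins):
        # The k-th interior edge sits at the k/n_bins quantile of the sample.
        edges.append(ordered[(k * n) // n_bins])
    edges.append(ordered[-1])
    return edges


# Toy example with 100 evenly spread scores (hypothetical values).
edges = equal_content_edges(list(range(100)), 4)
```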
The likelihood used in this ML fit, $L(\vec{n}\,|\,\mu, \vec{\theta})$, is a function of the signal strength µ, defined as the ratio of the measured to the expected SM cross section, $\mu = \sigma_{\mathrm{tW}}/\sigma^{\mathrm{SM}}_{\mathrm{tW}}$, the observed number of events in each bin, $\vec{n}$, and a set of nuisance parameters $\vec{\theta}$ that parameterise the systematic uncertainties. It is constructed as the product of the Poisson probabilities corresponding to the number of events in each bin of the distributions, multiplied by the prior $p_j(\theta_j)$ of each nuisance parameter $\theta_j$, through which the systematic uncertainties enter the likelihood. In this analysis, a log-normal probability density function is used for nuisance parameters affecting the normalisation of different signal and background processes, and a Gaussian distribution is employed for the shape uncertainties. The best fit value for µ is obtained by maximising the likelihood function with respect to all its parameters. The ML fits are implemented with software based on ROOSTATS [64]. The MC statistical uncertainties are incorporated using the Barlow-Beeston method [65,66]. Other uncertainties that might affect both the normalisation and shape of the distributions are introduced using specific nuisance parameters [66,67] with a Gaussian prior.
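The structure of such a likelihood can be illustrated with a toy example. The sketch below is not the ROOSTATS-based fit: it assumes a single background with one log-normal normalisation nuisance parameter (an illustrative κ = 1.04), hypothetical two-bin templates, and a coarse grid scan in place of a proper minimiser.

```python
import math


def nll(mu, theta, observed, signal, background, kappa=1.04):
    """Negative log-likelihood: one Poisson term per bin plus a unit-Gaussian
    constraint on theta. The background scales as kappa**theta (log-normal)."""
    total = 0.5 * theta * theta
    for n, s, b in zip(observed, signal, background):
        expected = mu * s + b * kappa ** theta
        # Poisson term, dropping the constant log(n!) contribution.
        total += expected - n * math.log(expected)
    return total


def fit(observed, signal, background):
    """Coarse grid scan for the minimum of the NLL (illustration only)."""
    best = None
    for i in range(0, 301):          # mu from 0.00 to 3.00
        mu = i * 0.01
        for j in range(-30, 31):     # theta from -3.0 to 3.0
            theta = j * 0.1
            value = nll(mu, theta, observed, signal, background)
            if best is None or value < best[0]:
                best = (value, mu, theta)
    return best[1], best[2]


# Toy templates (hypothetical yields) and an Asimov-like data set with mu = 1.
signal = [20.0, 50.0]
background = [200.0, 100.0]
observed = [s + b for s, b in zip(signal, background)]
mu_hat, theta_hat = fit(observed, signal, background)
```

With data set equal to the expectation for µ = 1, the scan recovers µ = 1 and θ = 0, which is the kind of closure check an Asimov data set provides.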

Methodology for the differential measurement
The spectra of observables are distorted by the response and acceptance of the detector. Unfolding techniques must, therefore, be used to determine the actual distributions without the detector effects so that these can be directly compared with theoretical predictions. The data affected by these distortions are said to be at detector level. The parton level is defined by the particles produced after the generation of the hard-scattering process. If the information from the PS and hadronisation simulations is added, this then gives the particle level.
The measured distributions are unfolded from the detector level to the particle level. Unfolding to particle level instead of parton level reduces the migration and efficiency corrections, and allows the fiducial region to be defined in close correspondence with the event selection of the analysis.
The identification of particle-level objects is summarised in Table 1. These objects are constructed using stable (i.e. with a lifetime larger than 30 ps) generated particles following the conventions given in Ref. [68]. Muons and electrons not coming from hadronic decays (prompt leptons) are "dressed" by taking into account the momenta of nearby photons within a ∆R < 0.1 cone, where ∆R is the separation in η-ϕ space between the muon or electron and the photon. Jets are clustered from all of the stable particles excluding prompt electrons, prompt muons, prompt photons, and neutrinos, using the anti-$k_{\mathrm{T}}$ algorithm with a distance parameter of R = 0.4. Information on the intermediate hadrons and τ leptons is preserved inside the jets and used to determine whether a jet originates from the fragmentation of a heavy-flavour quark (bottom or charm) or whether it is a decay product of a τ lepton. With these requirements, a fiducial region is defined as described in Table 2. Requiring exactly one b jet reduces the contribution of events from the doubly-resonant diagrams. Those events, more affected by the interference between processes, are expected to have a larger jet multiplicity. The differences between the various models used to treat the interference are expected to be larger when the contribution of events from doubly-resonant diagrams is larger, and vice versa. For the unfolded distributions in the fiducial region, as shown in Fig. 8, all MC simulations show very similar distributions. Therefore, this choice of fiducial region reduces these effects and the accompanying modelling uncertainty associated with the interference treatment (see Section 6). The signal extraction and the unfolding for the fiducial differential cross section measurement are performed with an ML fit designed as follows. The parameters of interest are the strengths of the signal process in each bin of the particle-level distribution.
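The lepton dressing step described above can be sketched as follows. The particle representation (dictionaries with four-momentum components plus `eta` and `phi`) and the function names are hypothetical; the only inputs taken from the text are the ∆R < 0.1 cone and the addition of the photon momenta to the lepton.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Separation in eta-phi space, with the azimuthal difference wrapped to [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def dress_lepton(lepton, photons, cone=0.1):
    """Add the four-momenta of photons within the cone to the prompt lepton."""
    dressed = {key: lepton[key] for key in ("px", "py", "pz", "e")}
    for photon in photons:
        if delta_r(lepton["eta"], lepton["phi"], photon["eta"], photon["phi"]) < cone:
            for key in dressed:
                dressed[key] += photon[key]
    return dressed


# Toy particles (hypothetical values). The first photon is close to the lepton
# across the phi = +-pi boundary, the second one is far away in eta.
lepton = {"px": -40.0, "py": 1.7, "pz": 0.0, "e": 40.04, "eta": 0.0, "phi": 3.1}
photons = [
    {"px": -1.0, "py": -0.04, "pz": 0.05, "e": 1.0, "eta": 0.05, "phi": -3.1},
    {"px": 2.0, "py": 0.0, "pz": 2.35, "e": 3.09, "eta": 1.0, "phi": 0.0},
]
dressed = dress_lepton(lepton, photons)
```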
The signal sample is divided into as many contributions as there are particle-level bins. There is a 7% contribution to the signal region from nonfiducial events. We treat these events as a background so the strength associated with them is not a parameter of interest. One nuisance parameter for each systematic uncertainty source is added to the fit.
The differential cross section is measured as a function of the following physical observables:
• leading lepton $p_{\mathrm{T}}$;
• jet $p_{\mathrm{T}}$;
• $\Delta\phi(\mathrm{e}^{\pm}, \mu^{\mp})$: the azimuthal angle difference between the two leptons;
• $p_z(\mathrm{e}^{\pm}, \mu^{\mp}, \mathrm{j})$: the longitudinal momentum component of the dilepton + jet system;
• $m(\mathrm{e}^{\pm}, \mu^{\mp}, \mathrm{j})$: the invariant mass of the dilepton + jet system; and
• $m_{\mathrm{T}}(\mathrm{e}^{\pm}, \mu^{\mp}, \mathrm{j}, p_{\mathrm{T}}^{\mathrm{miss}})$: the transverse mass of the dilepton + jet + $p_{\mathrm{T}}^{\mathrm{miss}}$ system. For a collection of particles with transverse momentum $\vec{p}_{\mathrm{T},i}$, $m_{\mathrm{T}}$ is defined as: (1)
The first two variables shown above provide information on the kinematic properties of the events. The $\Delta\phi(\mathrm{e}^{\pm}, \mu^{\mp})$ variable probes the kinematic and polarisation correlations between the top quark and W boson. The $p_z$ distribution can be used to study the boost of the tW system. The last two variables, the dilepton + jet invariant mass and $m_{\mathrm{T}}$, are sensitive to the invariant mass of the tW system. The distributions from the data and simulation for these six variables in the signal region are shown in Fig. 5. As in the case of Figs. 3 and 4, overall there is good agreement within the uncertainties between the data and simulation, though the data are consistently lower than the predicted values.
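As an illustration of the $m_{\mathrm{T}}$ observable, the sketch below assumes a commonly used definition for a collection of objects treated as massless, $m_{\mathrm{T}}^2 = (\sum_i |\vec{p}_{\mathrm{T},i}|)^2 - |\sum_i \vec{p}_{\mathrm{T},i}|^2$. This form is an assumption made for the example and may differ from the definition used in the analysis; the function name `transverse_mass` is hypothetical.

```python
import math


def transverse_mass(pt_vectors):
    """Transverse mass of a collection of objects given as (px, py) in GeV.

    Assumed definition: mT^2 = (sum_i |pT_i|)^2 - |sum_i pT_i|^2.
    """
    scalar_sum = sum(math.hypot(px, py) for px, py in pt_vectors)
    sum_x = sum(px for px, _ in pt_vectors)
    sum_y = sum(py for _, py in pt_vectors)
    mt_squared = scalar_sum ** 2 - (sum_x ** 2 + sum_y ** 2)
    # Guard against tiny negative values from floating-point rounding.
    return math.sqrt(max(mt_squared, 0.0))


# Toy inputs (hypothetical values): two back-to-back objects and two collinear ones.
mt_back_to_back = transverse_mass([(30.0, 0.0), (-30.0, 0.0)])
mt_collinear = transverse_mass([(30.0, 0.0), (10.0, 0.0)])
```

Under this assumed definition, collinear objects give zero transverse mass, while back-to-back objects give the scalar sum of their transverse momenta.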
The measurement is performed using all three years of data taking. The detector response is estimated using the response matrices, which are similar for the three years, with the matrices being almost diagonal. Thus, the measurement is performed directly using the combined data set and without need of regularisation. After the ML fit, the result is normalised to the fiducial cross section (obtained from the summation of the contents of the bins), and the bin width. The uncertainties are propagated taking into account the correlations across bins after the fit. The Asimov data set has been used to verify the closure and performance of the unfolding procedure.
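The normalisation step can be sketched as below, assuming the fitted cross section in each particle-level bin is already available. The function name `normalised_differential` and the toy numbers are hypothetical, and the propagation of uncertainties with their bin-to-bin correlations is not covered by this sketch.

```python
def normalised_differential(bin_cross_sections, bin_edges):
    """Divide each bin by the fiducial cross section (sum over bins) and the bin width."""
    fiducial = sum(bin_cross_sections)
    lows = bin_edges[:-1]
    highs = bin_edges[1:]
    return [
        xsec / fiducial / (high - low)
        for xsec, low, high in zip(bin_cross_sections, lows, highs)
    ]


# Toy per-bin cross sections in pb and bin edges in GeV (hypothetical values).
values = normalised_differential([2.0, 6.0, 2.0], [0.0, 50.0, 100.0, 200.0])
widths = [50.0, 50.0, 100.0]
integral = sum(v * w for v, w in zip(values, widths))
```

By construction, the normalised distribution integrates to one over the measured range.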

Systematic uncertainties
In addition to statistical uncertainties, the measurements of the inclusive and differential tW cross sections are affected by systematic uncertainties that originate from both detector effects and theoretical assumptions. Each source of systematic uncertainty is assessed individually by suitable variations of the simulations or by variations of parameter values in the analysis within their estimated uncertainties. The systematic uncertainties are introduced in the ML fits as nuisance parameters. All experimental sources are applied to all processes.  The following text describes the sources considered for both the inclusive and differential analyses. Since we are considering data and simulation samples from different years, we indicate whether each uncertainty is correlated or not from year to year. In the case of the modelling uncertainties, we also indicate if they are correlated among the processes or not. Both normalisation and shape effects of all sources are considered apart from the background normalisation and integrated luminosity, which have only rate uncertainties. Effects of all sources in the estimation of the number of events that are not in the fiducial region are taken into account in the maximum likelihood fit, as well as their correlation with other uncertainties. The uncertainty is partially correlated across the years.

L1 prefiring:
In 2016-2017, the L1 trigger from the electromagnetic calorimeter from the forward endcap region (|η| > 2.4) of the CMS detector showed a gradual shift in the timing of its inputs. This caused an effect known as "prefiring", where particles were assigned to previous collisions. This inefficiency only affects events having jets with high pseudorapidity (2.4 < |η| < 3.0) and high $p_{\mathrm{T}}$ (>100 GeV). Events from these years are corrected through a reweighting determined from an unbiased data sample. An uncertainty equal to 20% of the correction is taken. This uncertainty source is uncorrelated across the two years.

Modelling uncertainties
The impact of theoretical assumptions in the modelling is determined by repeating the analysis and replacing the standard POWHEG + PYTHIA8 tt or tW simulation by dedicated simulation samples with altered parameters, or by reweighting the nominal samples.

Matrix element (ME) scales:
The uncertainty in the modelling of the hard process is only considered for tt and tW events and is assessed by changing independently the $\mu_{\mathrm{R}}$ and $\mu_{\mathrm{F}}$ scales in the POWHEG sample by factors of 2 and 0.5 relative to their common nominal value. Unphysical variations of $\mu_{\mathrm{R}}$ and $\mu_{\mathrm{F}}$, where the nominal values are shifted in opposite directions, are not considered. This uncertainty is correlated across years, and is evaluated separately for tt and tW events.
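One way to enumerate the scale variations described above is sketched below. It assumes that simultaneous shifts of both scales in the same direction are included, which gives six variations once the nominal point and the two opposite-direction combinations are removed; this reading and the function name `scale_variations` are assumptions made for illustration.

```python
from itertools import product


def scale_variations(factors=(0.5, 1.0, 2.0)):
    """(mu_R, mu_F) factor pairs, excluding the nominal point and the
    combinations where the two scales are shifted in opposite directions."""
    pairs = []
    for mu_r, mu_f in product(factors, repeat=2):
        if mu_r == 1.0 and mu_f == 1.0:
            continue  # nominal scales, not a variation
        if max(mu_r, mu_f) / min(mu_r, mu_f) > 2.0:
            continue  # (0.5, 2) and (2, 0.5): opposite directions
        pairs.append((mu_r, mu_f))
    return pairs


variations = scale_variations()
```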

Parton shower (PS):
To take into account the PS uncertainties, different effects are studied:
• Underlying event: PYTHIA parameters are tuned to the measurements of the underlying event [36,77]. These account for nonperturbative QCD effects. They are varied up and down in simulated tt and tW events. This variation is correlated across years and between tt and tW events.
• ME/PS matching: the uncertainty in the combination of the ME calculation with the PS in simulated tt events is estimated from the variation, within its uncertainties, of the POWHEG parameter $h_{\mathrm{damp}} = 1.379^{+0.926}_{-0.505}\,m_{\mathrm{t}}$ [36,78]. This parameter regulates the damping of real emissions in the NLO calculation when matching to the PS [77]. This variation is correlated across years and is only considered for tt events.
• Initial- and final-state radiation scales: the PS scale used for the simulation of the initial- and final-state radiations is varied up and down by a factor of two and only considered for tt and tW events. These variations are motivated by the uncertainties in the PS tuning [77]. This variation is correlated across years for the final-state radiation in the tt and tW events, and is treated separately for initial-state radiation.
• Colour reconnection: the parameterisation of colour reconnection has been studied in Ref. [79]. A simulation including colour reconnection of early resonant decays (ERD) is used as the reference model. The uncertainties that arise from ambiguities in modelling are estimated by comparing with two alternative models of colour reconnection: a model with string formation beyond leading colour, and a model in which the gluons can be moved to another string [80]. All models are tuned to measurements of the underlying event [77,78]. The different models are included in the ML fits. This variation is correlated across years and between tt and tW events.
PDFs and $\alpha_S$:
The uncertainty from the choice of PDF set is determined by reweighting the samples of simulated tt and tW events using 100 NNPDF3.1 replicas [33]. Since they represent the contents of a diagonalised Hessian matrix, the variations are summed quadratically. We then quadratically add the uncertainty in the $\alpha_S$ parameter [33]. This uncertainty is correlated across years and between tt and tW events.
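The quadrature sums described above can be sketched as follows for a single bin yield. The function name `pdf_alpha_s_uncertainty` and the toy numbers are hypothetical, and taking half the difference between the up and down $\alpha_S$ variations as the $\alpha_S$ uncertainty is an assumption of this example.

```python
import math


def pdf_alpha_s_uncertainty(nominal, pdf_variations, alpha_s_up, alpha_s_down):
    """Combine Hessian PDF variations and the alpha_S variation in quadrature."""
    # Hessian members: sum the squared differences to the nominal yield.
    pdf_unc = math.sqrt(sum((value - nominal) ** 2 for value in pdf_variations))
    # Symmetrised alpha_S uncertainty (assumption: half the up-down difference).
    alpha_s_unc = 0.5 * abs(alpha_s_up - alpha_s_down)
    return math.hypot(pdf_unc, alpha_s_unc)


# Toy yields (hypothetical values): two PDF members and the alpha_S up/down shifts.
total_unc = pdf_alpha_s_uncertainty(100.0, [103.0, 96.0], 112.0, 88.0)
```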

Top quark mass:
The nominal $m_{\mathrm{t}}$ of 172.5 GeV is modified by ±1 GeV and this is propagated to the MC simulations. This corresponds to twice the uncertainty in $m_{\mathrm{t}}$ from CMS [81]. The difference with respect to the nominal results is taken as the uncertainty and is only considered for tt and tW events. This variation is correlated across years and between tt and tW events.
Top quark $p_{\mathrm{T}}$:
Previous measurements of the differential cross section for tt production have shown that the top quark has a lower average $p_{\mathrm{T}}$ value than predicted by the POWHEG simulation [82-84]. Scale factors are obtained by comparing the generated distributions of the top quark $p_{\mathrm{T}}$ with data unfolded to parton level. The difference between corrected and uncorrected shapes is taken as the uncertainty associated with the mismodelling of the top quark $p_{\mathrm{T}}$, leaving the nominal tt events untouched. This variation is correlated across years and only affects tt events.

DR/DS methods:
The difference between the DR and DS methods is used to estimate the uncertainty in the tW simulation. This variation is correlated across years.

Background normalisation uncertainties
A 4% uncertainty [85] is assigned to the normalisation of the inclusive tt cross section. For the ttV, VV, and the non-W/Z backgrounds, a normalisation uncertainty of 50% is used, as in Ref. [18]. For the DY background, a 10% uncertainty is used, based on the differences between the data and simulations in regions of phase space close to the signal region.

Inclusive measurement
The measured value for $\mu = \sigma_{\mathrm{tW}}/\sigma^{\mathrm{SM}}_{\mathrm{tW}}$ is obtained by maximising the likelihood function with respect to all its parameters. The fit is performed using the BDT discriminants in the 1j1b and 2j1b regions and the subleading jet $p_{\mathrm{T}}$ distribution in the 2j2b region. The resulting signal strength is consistent with the SM expectations, corresponding to an inclusive cross section of $\sigma_{\mathrm{tW}}$ = 79.2 $\pm$ 0.9 (stat) $^{+7.7}_{-8.0}$ (syst) $\pm$ 1.2 (lumi) pb.
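As an illustration of how a signal strength is extracted by maximising a likelihood, the following minimal sketch fits $\mu$ in a binned Poisson model with fixed signal and background templates. It is a toy under stated assumptions: it has no nuisance parameters, the templates and bin contents are invented, and the function names and the golden-section minimiser are choices of this sketch, not of the analysis.

```python
import math


def nll(mu, data, signal, background):
    """Negative log-likelihood of a binned Poisson model (constants dropped).

    The expected yield in each bin is mu * signal + background.
    """
    total = 0.0
    for n, s, b in zip(data, signal, background):
        expected = mu * s + b
        total += expected - n * math.log(expected)
    return total


def fit_mu(data, signal, background, lo=0.0, hi=5.0, tol=1e-8):
    """Minimise the NLL in mu with a golden-section search on [lo, hi].

    The NLL is convex in mu for positive expected yields, so the search
    converges to the unique minimum inside the interval.
    """
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if nll(c, data, signal, background) < nll(d, data, signal, background):
            b = d
        else:
            a = c
    return 0.5 * (a + b)


# Hypothetical templates; the pseudo-data are built with mu = 1.2 exactly.
signal = [10.0, 20.0, 30.0]
background = [100.0, 80.0, 40.0]
data = [1.2 * s + b for s, b in zip(signal, background)]
mu_hat = fit_mu(data, signal, background)
```

Because the pseudo-data equal the expectation for $\mu = 1.2$, the fit returns that value up to numerical precision. The cross section then follows by multiplying the fitted $\mu$ by the SM prediction.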
The distributions of the BDT discriminants in the 1j1b and 2j1b regions and the subleading jet $p_{\mathrm{T}}$ distribution in the 2j2b region after the fit are shown in Fig. 6. The observed and MC predicted event yields in the three regions are given in Table 3.

The 20 largest impacts on the signal strength and the corresponding nuisance parameters are shown in Fig. 7. The impact is defined as the shift $\Delta\hat{\mu}$ induced in $\mu$ when the nuisance parameter $\theta$ is varied by ±1 standard deviation ($\sigma$) around its best fit value. The leading uncertainties are the JES corrections, the normalisation of the non-W/Z background, the ME scales of the tW process, and the modelling of the final-state radiation for tt and tW. Figure 7 also shows the pulls of the nuisance parameters, $(\hat{\theta} - \theta_0)/\Delta\theta$, where $\hat{\theta}$ and $\theta_0$ are the values of the nuisance parameter $\theta$ after and before the fit, and $\Delta\theta$ is its uncertainty before the fit. Several nuisance parameters, such as the b tagging efficiency and jet energy corrections, are significantly constrained in the fit due to their effect on the jet multiplicity. The ME scales of the tt process are also constrained because of the large presence of tt events in all the regions used in the fit. These constraints reduce the uncertainties in the fit, yielding the most precise measurement of the inclusive tW cross section yet published.

Figure 7: The 20 largest impacts $\Delta\hat{\mu}$ (right column) and pulls $(\hat{\theta} - \theta_0)/\Delta\theta$ (middle column) of the nuisance parameters listed in the left column from the ML fit used to determine the inclusive tW cross section. The horizontal bars on the pulls show the ratio of the post-fit to the pre-fit uncertainties, effectively giving the constraint on the nuisance parameter. The label "corr." refers to the correlated component of the uncertainty over the three years and "uncorr." to the uncorrelated component for each year. The JES uncertainties are divided into several sources, where "JES-Absolute" groups contributions from scale corrections in the barrel, pileup corrections, and initial- and final-state radiation corrections; "JES-Relative sample" encodes the uncertainty in the η-dependent calibration of the jets; "JES-BBEC1" refers to pileup removal in the barrel (BB) and the first part of the endcaps (1.3 < |η| < 2.5; EC1) and also a contribution from scale corrections in the barrel; and "JES-Flavour QCD" comes from the corrections applied to account for the different detector response to gluon and quark jets.

Normalised fiducial differential cross section measurements
The tW differential cross sections, normalised to the total fiducial cross section $\sigma_{\mathrm{fid.}}$ (obtained by summing the contents of the particle-level bins), are shown in Fig. 8 for the data and the MC predictions. Tables 4 and 5 give the p-values from the $\chi^2$ goodness-of-fit tests done for the six distributions, using the different MC generators and taking into account the full covariance matrix of each result, as well as the statistical uncertainties of the MC predictions. The full covariance matrix is obtained by normalising the covariance matrix extracted from the maximum likelihood fit to the measured fiducial cross section.

[Table fragment: $m(\mathrm{e}^{\pm}, \mu^{\mp}, j)$ 0.09 0.12 0.10 0.14]

Predictions from POWHEG (PH) + PYTHIA 8 (P8) DR and DS, POWHEG + HERWIG 7 (H7) DR, MADGRAPH5_aMC@NLO (aMC) + PYTHIA 8 DR, DR2, DS, and DS with a dynamic factor are also shown. The grey band represents the statistical uncertainty and the orange band the total uncertainty. In the lower panels, the ratio of the predictions to the data is shown.

These tests show a poorer compatibility in the leading lepton $p_{\mathrm{T}}$, $m(\mathrm{e}^{\pm}, \mu^{\mp}, j)$, and jet $p_{\mathrm{T}}$ distributions with the nominal POWHEG + PYTHIA 8 DR prediction than in the other variables. In most cases, the p-values for all the variables are similar for the other predictions. When comparing data to the predictions, there is a slight disagreement in the leading lepton $p_{\mathrm{T}}$ and the $\Delta\phi(\mathrm{e}^{\pm}, \mu^{\mp})$ differential cross sections. Other CMS measurements have observed similar tensions in the top quark $p_{\mathrm{T}}$ [86] and $\Delta\phi(\mathrm{e}^{\pm}, \mu^{\mp})$ [87] variables.
All methods, DR, DR2, DS, and DS with a dynamic factor, show similar compatibility with the measurements, as well as small differences among them. This is also true for the DR predictions interfaced with HERWIG 7. The uncertainties, roughly 10-50% in most cases, depending on the distributions and bins, are dominated by the systematic uncertainties, as in the case of the inclusive measurement.
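The normalisation and the $\chi^2$ comparison with the full covariance matrix described in this section can be sketched as follows. This is a minimal illustration under stated assumptions: the normalisation of the covariance matrix is read as a division by $\sigma_{\mathrm{fid.}}^2$, the statistical uncertainty of the MC prediction is omitted, and all function names and input numbers are hypothetical.

```python
def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        # Move the row with the largest pivot into place.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def normalise(values, covariance):
    """Divide bin contents by their sum and the covariance by its square."""
    fid = sum(values)  # fiducial cross section as the sum of the bins
    norm_values = [v / fid for v in values]
    norm_cov = [[c / fid ** 2 for c in row] for row in covariance]
    return norm_values, norm_cov


def chi2(measured, covariance, predicted):
    """Compute delta^T C^-1 delta without forming the inverse explicitly."""
    delta = [m - p for m, p in zip(measured, predicted)]
    weighted = solve(covariance, delta)
    return sum(d * w for d, w in zip(delta, weighted))


# Hypothetical three-bin measurement with uncorrelated uncertainties.
values = [2.0, 6.0, 2.0]
covariance = [[0.04, 0.0, 0.0], [0.0, 0.36, 0.0], [0.0, 0.0, 0.04]]
norm_values, norm_cov = normalise(values, covariance)
chi2_value = chi2(norm_values, norm_cov, [0.22, 0.54, 0.24])
```

The p-value then follows from the $\chi^2$ distribution with the appropriate number of degrees of freedom. Correlated bins enter through the off-diagonal terms of the covariance matrix, which the linear solve handles without further changes.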

Summary
Inclusive and normalised differential cross sections for the production of a top quark in association with a W boson are measured in proton-proton collision data at $\sqrt{s}$ = 13 TeV. The data, corresponding to an integrated luminosity of 138 fb$^{-1}$, were recorded by the CMS detector. The analysed events contain an electron and a muon of opposite charge.
For the inclusive measurement, the events have been categorised depending on the number of jets and of jets originating from the fragmentation of bottom quarks. The signal is measured using a maximum likelihood fit to the distribution of boosted decision tree discriminants in two of the categories, and to the transverse momentum ($p_{\mathrm{T}}$) distribution of the second-highest-$p_{\mathrm{T}}$ jet in a third category. The measured inclusive cross section is 79.2 $\pm$ 0.9 (stat) $^{+7.7}_{-8.0}$ (syst) $\pm$ 1.2 (lumi) pb, with a total relative uncertainty of about 10%. This is the most precise measurement of this quantity yet published. The leading uncertainty sources are the jet energy scale corrections, the normalisation of the non-W/Z background, the matrix element scales of the tW process, and the modelling of the final-state radiation in the tt and tW processes.
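The quoted total relative uncertainty of about 10% can be cross-checked with a short calculation. The sketch assumes that the statistical, systematic, and luminosity components are uncorrelated and can be added in quadrature; the function name is hypothetical.

```python
import math


def total_uncertainty(*components):
    """Add uncorrelated uncertainty components in quadrature."""
    return math.sqrt(sum(c ** 2 for c in components))


central = 79.2  # pb, measured inclusive cross section
stat, syst_up, syst_down, lumi = 0.9, 7.7, 8.0, 1.2  # pb

total_up = total_uncertainty(stat, syst_up, lumi)      # about 7.8 pb
total_down = total_uncertainty(stat, syst_down, lumi)  # about 8.1 pb
rel_up = total_up / central      # about 0.099
rel_down = total_down / central  # about 0.103
```

Both directions give a relative uncertainty close to 10%, with the systematic component dominating the sum.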
The differential cross section measurements are performed as a function of six kinematic observables of the events in the fiducial phase space corresponding to the selection criteria. The results have relative uncertainties in the range of 10-50%, depending on the measured observable, with larger values in the tails of the distributions. The uncertainties are dominated by systematic sources, mainly those related to modelling. There is overall good agreement between the measurements and the predictions from the different event generators. The different approaches used to simulate the tW events give similar values in all the distributions, which points to small effects of tW/tt interference on these distributions in the defined fiducial region.

[19] ATLAS Collaboration, "Measurement of differential cross-sections of a single top quark produced in association with a W boson at $\sqrt{s}$ = 13 TeV with ATLAS", Eur. Phys. J. C 78 (2018) 186, doi:10.1140/epjc/s10052-018-5649-8, arXiv:1712.01602.
[78] CMS Collaboration, "Investigations of the impact of the parton shower tuning in PYTHIA 8 in the modelling of tt at $\sqrt{s}$ = 8 and 13 TeV", CMS Physics Analysis Summary CMS-PAS-TOP-16-021, 2016.
[82] CMS Collaboration, "Measurement of differential cross sections for top quark pair production using the lepton+jets final state in proton-proton collisions at 13 TeV", Phys.