Measurements of the inclusive and differential production cross sections of a top-quark-antiquark pair in association with a $Z$ boson at $\sqrt{s} = 13$ TeV with the ATLAS detector

Measurements of both the inclusive and differential production cross sections of a top-quark-antiquark pair in association with a $Z$ boson ($t\bar{t}Z$) are presented. The measurements are performed by targeting final states with three or four isolated leptons (electrons or muons) and are based on $\sqrt{s} = 13$ TeV proton-proton collision data with an integrated luminosity of 139 fb$^{-1}$, recorded from 2015 to 2018 with the ATLAS detector at the CERN Large Hadron Collider. The inclusive cross section is measured to be $\sigma_{t\bar{t}Z} = 0.99 \pm 0.05$ (stat.) $\pm 0.08$ (syst.) pb, in agreement with the most precise theoretical predictions. The differential measurements are presented as a function of a number of variables that probe the kinematics of the $t\bar{t}Z$ system. Both absolute and normalised differential cross-section measurements are performed at particle and parton levels for specific fiducial volumes and are compared with theoretical predictions at different levels of precision, based on a $\chi^{2}/$ndf and $p$-value computation. Overall, good agreement is observed between the unfolded data and the predictions.


Introduction
Precise measurements of the properties of the top quark, the heaviest known elementary particle, are an important check of the internal consistency of the Standard Model (SM) of particle physics and could provide hints of possible new physics beyond the SM (BSM). The production cross sections of top-quark-antiquark pairs ($t\bar{t}$) and single top quarks, as well as the top-quark mass, have been measured with a great level of precision [1][2][3][4]. The large centre-of-mass energy and luminosity of the Large Hadron Collider (LHC) enable precise and differential cross-section measurements for SM processes with small production rates, such as the associated production of a $t\bar{t}$ pair and a $Z$ boson ($t\bar{t}Z$).
The $t\bar{t}Z$ production process is particularly interesting, as it provides direct access to the neutral coupling of the top quark to the electroweak (EW) gauge bosons [5,6]. Deviations of the coupling strength of the top quark to the $Z$ boson ($t$-$Z$ coupling) from its SM value might imply the existence of new effects in the EW symmetry breaking mechanism, which could be probed in the context of effective field theory (EFT) [7]. Various BSM models predict large deviations of the top quark's EW couplings from the SM values, which were probed by the previous generation of lepton colliders [8,9]. Precise measurements of the inclusive and differential cross sections of the $t\bar{t}Z$ process are thus of particular interest. Differential cross-section measurements can also offer sensitivity to differences among the predictions from various Monte Carlo (MC) generators and can therefore serve as an important input to the tuning of MC parameter values (MC tunes). Furthermore, the $t\bar{t}Z$ process is an irreducible background in several searches for BSM phenomena [10,11], as well as in measurements of important SM processes, such as $t\bar{t}$ production in association with a Higgs boson [12] or single top-quark production in association with a $Z$ boson [13]. The ATLAS Collaboration measured the inclusive $t\bar{t}Z$ cross section using a subset of the LHC Run 2 data, collected in 2015 and 2016 [14], and a first differential measurement of the $t\bar{t}Z$ process was carried out by the CMS Collaboration using the 2016 and 2017 data sets [15].
Theoretical predictions of the $t\bar{t}Z$ cross section exist at next-to-leading order (NLO) in perturbative quantum chromodynamics (QCD), with the resummation of soft-gluon corrections computed at next-to-next-to-leading-logarithm (NNLL) precision [16,17] and with EW corrections added. Recently, they have been matched to the complete set of NLO corrections of both QCD and EW origin [18,19] using the MadGraph5_aMC@NLO (MG5_aMC@NLO) framework [20].
This paper presents measurements of the inclusive and differential $t\bar{t}Z$ production cross sections in final states with three or four isolated leptons (electrons or muons) with the ATLAS detector [21] at the LHC. The measurements were performed with $\sqrt{s} = 13$ TeV proton-proton ($pp$) collision data collected during Run 2 of the LHC (2015-2018) and corresponding to an integrated luminosity of 139 fb$^{-1}$. The $Z$ boson is identified by targeting events featuring an oppositely charged electron ($e^+e^-$) or muon ($\mu^+\mu^-$) pair. The detector signatures resulting from the hadronisation of final-state quarks from the decay of the $t\bar{t}$ system, in particular those from bottom (anti)quarks, are exploited by constructing target regions with different jet and $b$-jet multiplicities. The inclusive measurement follows an analysis strategy similar to the previous ATLAS $t\bar{t}Z$ measurement [14]. The production cross section is extracted by performing a simultaneous maximum-likelihood fit in the targeted analysis regions with the signal normalisation as the parameter of interest. In addition, a set of normalised and absolute differential measurements are presented as a function of different variables which probe the SM predictions for the kinematics of the $t\bar{t}Z$ system. Some of these variables are found to be sensitive to potential EFT signals [7], while others are more interesting in the context of MC tuning [22,23]. The differential measurements were performed at both particle and parton level in different fiducial volumes in order to correct for various acceptance effects. For the first time, the
$^1$ ATLAS uses a right-handed coordinate system with its origin at the nominal interaction point (IP) in the centre of the detector and the $z$-axis along the beam pipe. The $x$-axis points from the IP to the centre of the LHC ring, and the $y$-axis points upwards. Cylindrical coordinates $(r, \phi)$ are used in the transverse plane, $\phi$ being the azimuthal angle around the $z$-axis. The pseudorapidity is defined in terms of the polar angle $\theta$ as $\eta = -\ln \tan(\theta/2)$.
Angular distance is measured in units of $\Delta R \equiv \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2}$.
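As an illustrative sketch (not part of the ATLAS software), the angular distance $\Delta R$ can be computed as follows; note that the azimuthal difference must be wrapped into $[-\pi, \pi]$ before being combined with $\Delta\eta$. The function names are ours.

```python
import math

def delta_phi(phi1, phi2):
    """Azimuthal separation wrapped into [-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    return dphi

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance Delta R = sqrt((Delta eta)^2 + (Delta phi)^2)."""
    deta = eta1 - eta2
    return math.hypot(deta, delta_phi(phi1, phi2))
```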
the range $|\eta| < 2.4$ with resistive-plate chambers in the barrel, and thin-gap chambers in the endcap regions.
Interesting events are selected to be recorded by the first-level trigger system implemented in custom hardware, followed by selections made by algorithms implemented in software in the high-level trigger [26]. The first-level trigger accepts events from the 40 MHz bunch crossings at a rate below 100 kHz, which the high-level trigger reduces in order to record events to disk at about 1 kHz.

Data and simulated event samples
The analysis is performed on data from $pp$ collisions at $\sqrt{s} = 13$ TeV delivered by the LHC and recorded by the ATLAS detector between 2015 and 2018. The bunch spacing for this data-taking period was 25 ns with an average number of interactions per bunch crossing ('pile-up') which varies by year and LHC beam conditions and was in the range from 10 to 70 for almost all events. After requirements on the stability of the beams, the operational status of all ATLAS detector components, and the quality of the recorded data, the total integrated luminosity of the data set corresponds to 139 fb$^{-1}$. This value is derived from the calibration of the luminosity scale using $x$-$y$ beam-separation scans, following a methodology similar to that detailed in Ref. [27], and using the LUCID-2 detector [28] for the baseline luminosity measurements.
The data were collected using a combination of single-electron and single-muon triggers, with requirements on the identification, isolation, and $p_T$ of the leptons to maintain efficiency across the full momentum range while controlling the trigger rates [26]. For electrons, the trigger $p_T$ thresholds were 26, 60 and 140 GeV, whereas for muons the thresholds were 26 and 50 GeV.$^2$ Identification and isolation requirements were applied to the triggers with the lower $p_T$ thresholds [29-31].
Signal and background processes considered in this analysis were modelled using simulated MC samples. The effect of pile-up interactions was modelled by overlaying the hard-scattering event with simulated minimum-bias events generated with Pythia 8.186 [32] using the NNPDF2.3 set of parton distribution functions (PDFs) [33] and the A3 set of tuned MC parameters [34]. The simulated events were reweighted to match the pile-up conditions observed in the measured data. For processes featuring $W$-boson, $Z$-boson or top-quark production, the $W$-boson, $Z$-boson and top-quark masses were set to 80.4 GeV, 91.2 GeV [35] and 172.5 GeV, respectively. The decays of bottom and charm hadrons were simulated using the EvtGen program [36]. The MC samples were either processed through a full simulation of the ATLAS detector based on Geant4 [37,38] or a fast simulation (AtlFast-II) relying on parameterised showers in the calorimeter [39,40].
The $t\bar{t}Z$ signal process was modelled using the MG5_aMC@NLO 2.3.3 [41] generator together with EvtGen 1.2.0, which provided a matrix element (ME) calculation at NLO in the strong coupling constant ($\alpha_s$) with the NNPDF3.0 [42] PDF set. The functional form of the renormalisation and factorisation scales ($\mu_r$ and $\mu_f$) was set to $\mu_{r,f} = 0.5 \times \sum_i \sqrt{m_i^2 + p_{T,i}^2}$, where $i$ runs over all final-state particles generated from the ME calculation. The $t\bar{t}\gamma^*$ contribution and the $Z/\gamma^*$ interference were included with dilepton invariant masses ($m_{\ell\ell}$) down to 5 GeV. Top-quark decays were simulated at leading order (LO) using MadSpin [43,44] to preserve all spin correlations. The events were interfaced with Pythia 8.210 [45] for the parton shower and hadronisation, using the A14 set of tuned parameters [46] and the NNPDF2.3 PDF set.
$^2$ Lower $p_T$ thresholds of 24 GeV for electrons and 20 GeV for muons were applied for 2015 data.
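The dynamic scale choice above, half the sum of the transverse masses of the final-state particles, can be sketched as follows. This is an illustrative calculation under the assumption that each particle is characterised by its mass and transverse momentum in GeV; the function name and input structure are ours, not those of any generator.

```python
import math

def dynamic_scale(particles):
    """mu_r = mu_f = 0.5 * sum_i sqrt(m_i^2 + pT_i^2), summed over the
    final-state particles of the ME calculation.
    `particles` is a list of (mass, pT) tuples in GeV (illustrative model)."""
    return 0.5 * sum(math.hypot(m, pt) for m, pt in particles)
```

For a $t\bar{t}Z$ final state at rest ($p_T = 0$), the scale reduces to half the sum of the masses, $0.5 \times (172.5 + 172.5 + 91.2)$ GeV.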
The SM theoretical prediction of the production cross section for the $t\bar{t}Z$ process, including all $Z$ boson decay modes and taking into account the $t\bar{t}\gamma^*$ contribution and the $Z/\gamma^*$ interference, is $\sigma_{t\bar{t}Z} = 0.88^{+0.09}_{-0.10}$ pb and includes NLO QCD+EW corrections [47]. This value is an off-shell extension of a cross-section calculation of $\sigma_{t\bar{t}Z} = 0.84^{+0.09}_{-0.10}$ pb, which was reported in Ref. [48] (based on Ref. [49]). The uncertainties are due to the QCD scales, the proton PDFs, and $\alpha_s$. The measured differential cross sections are compared with theoretical expectations obtained with different generators. Alternative $t\bar{t}Z$ samples were simulated with the Sherpa 2.2.1 [50] generator at NLO QCD accuracy, using both inclusive and multi-leg set-ups. In both cases, dynamic $\mu_r$ and $\mu_f$ scales were used as in the nominal samples. The default Sherpa 2.2.1 parton shower was used together with the NNPDF3.0 PDF set [42]. The multi-leg sample was simulated using the MEPS@NLO prescription [51][52][53][54] with up to one additional parton at NLO and with a merging scale of 30 GeV. Another sample was generated with the same MG5_aMC@NLO and EvtGen versions as the nominal sample but using a different MC program for the modelling of the parton shower and hadronisation: Herwig 7 [55,56] instead of Pythia 8. In addition, two alternative samples with the same settings as the nominal sample, but using a set of variations of the A14 tune's parameters (A14 eigentune variation Var3c [46]), were employed to evaluate the uncertainty associated with the amount of initial-state radiation (ISR).
The production of a single top quark or antiquark in association with a $Z$ boson and one extra parton ($tZq$) was simulated using the MG5_aMC@NLO 2.3.3 generator at NLO QCD with the NNPDF3.0 PDF set. The events were interfaced with Pythia 8.230 using the A14 tune and the NNPDF2.3 PDF set. The sample also includes off-shell $Z$ decays to dilepton pairs with invariant masses $m_{\ell\ell} > 30$ GeV. Single top-quark or top-antiquark production in association with both a $W$ and a $Z$ boson ($tWZ$) was simulated at NLO with MG5_aMC@NLO 2.2.2 and the NNPDF3.0 PDF set, using Pythia 8.235 for the parton-shower simulation. The interference between $t\bar{t}Z$ and $tWZ$ was removed following a diagram removal approach referred to as the DR1 scheme [57].
Events featuring the production of a $t\bar{t}$ pair in association with a $W$ or Higgs boson ($t\bar{t}W$ and $t\bar{t}H$) were generated using NLO QCD MEs in MG5_aMC@NLO 2.3.3 (for $t\bar{t}W$) or 2.6.0 (for $t\bar{t}H$) with the NNPDF3.0 PDF set and showered with Pythia 8.210 or 8.230 using the A14 tune. MC samples featuring Higgs boson production in association with a $W$ or $Z$ boson ($WH/ZH$) were generated at LO with Pythia 8.186 using the A14 tune and the NNPDF2.3 PDF set.
Diboson processes featuring the production of three charged leptons and one neutrino or four charged leptons ($WZ$ + jets or $ZZ$ + jets, respectively) were simulated using the Sherpa 2.2.2 generator. In this set-up, multiple MEs were matched and merged with the Sherpa parton shower based on the Catani-Seymour dipole factorisation scheme [58,59] using the MEPS@NLO prescription [51][52][53][54]. The virtual QCD corrections for MEs at NLO accuracy were provided by the OpenLoops library [60,61]. Samples were generated using the NNPDF3.0 PDF set, along with the dedicated set of tuned parton-shower parameters developed by the Sherpa authors. The $WZ/ZZ$ + jets events with up to one additional parton were simulated at NLO, whereas events with two or three additional partons were simulated at LO precision.
The production of three or four top quarks ($t\bar{t}t$ and $t\bar{t}t\bar{t}$) and the production of a $t\bar{t}$ pair with two $W$ bosons ($t\bar{t}WW$) were simulated at LO using MG5_aMC@NLO 2.2.2 interfaced to Pythia 8.186 with the A14 tune and the NNPDF2.3 PDF set. Fully leptonic triboson processes ($WWW$, $WWZ$, $WZZ$ and $ZZZ$) with up to six leptons in the final states were simulated with Sherpa 2.2.2 and the NNPDF3.0 PDF set. Final states with no additional partons were calculated at NLO, whereas final states with one, two or three additional partons were calculated at LO.

Object reconstruction
The following subsections describe the definitions of final-state objects at reconstruction (detector), particle, and parton levels, which are used to characterise the final-state event topologies and to define the phase-space regions for the cross-section measurements.

Reconstruction of detector-level objects
Electron candidates are reconstructed from clusters of energy deposits in the electromagnetic calorimeter that are matched to a track in the ID. They are required to satisfy $p_T > 7$ GeV and $|\eta| < 2.47$ and to pass a 'Medium' likelihood-based identification requirement [62,63]. Electron candidates are excluded if their calorimeter clusters lie within the transition region between the barrel and the endcap of the electromagnetic calorimeter, $1.37 < |\eta| < 1.52$. The track associated with the electron must satisfy $|z_0 \sin\theta| < 0.5$ mm and $|d_0|/\sigma(d_0) < 5$, where $z_0$ is the longitudinal impact parameter relative to the reconstructed primary vertex,$^3$ $d_0$ is the transverse impact parameter relative to the beam axis, and $\sigma(d_0)$ is the uncertainty on $d_0$.
Muon candidates are reconstructed from MS tracks matched to ID tracks in the pseudorapidity range $|\eta| < 2.5$. They must satisfy $p_T > 7$ GeV along with the 'Medium' identification requirements defined in Refs. [64,65]. This criterion places requirements on the number of hits in the different ID and MS subsystems and on the significance of the charge-to-momentum ratio $q/p$. In addition, the track associated with the muon candidate must have $|z_0 \sin\theta| < 0.5$ mm and $|d_0|/\sigma(d_0) < 3$.
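The impact-parameter requirements shared by the electron and muon selections can be sketched in a few lines; the function below is an illustrative condensation of the cuts quoted above (maximum $d_0$ significance of 5 for electrons and 3 for muons), with names of our choosing.

```python
import math

def passes_ip_cuts(z0, theta, d0, sigma_d0, max_d0_sig):
    """Track-to-primary-vertex association cuts:
    |z0 * sin(theta)| < 0.5 mm and |d0| / sigma(d0) below the
    flavour-dependent bound (5 for electrons, 3 for muons).
    z0 and d0 are in mm; theta is the track polar angle."""
    return abs(z0 * math.sin(theta)) < 0.5 and abs(d0) / sigma_d0 < max_d0_sig
```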
Isolation criteria are applied to the selected electrons and muons. For electrons, the scalar sum of the $p_T$ of tracks within a variable-size cone around the electron, excluding tracks originating from the electron itself, must be less than 6% of the electron $p_T$. The track isolation cone radius $\Delta R = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2}$ is given by the smaller of $\Delta R = 10\,\mathrm{GeV}/p_T$ and $\Delta R = 0.2$. In addition, the sum of the transverse energy of the calorimeter topo-clusters$^4$ in a cone of $\Delta R = 0.2$ around the electron is required to be less than 6% of the electron $p_T$, excluding clusters originating from the electron itself. For muons, the scalar sum of the $p_T$ of tracks within a variable-size cone around the muon, excluding its own track, must be less than 6% of the muon $p_T$, with the track isolation cone radius given by the smaller of $\Delta R = 10\,\mathrm{GeV}/p_T$ and $\Delta R = 0.3$.
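The variable-size ('mini-isolation') cone and the 6% track-isolation criterion described above can be sketched as follows; this is an illustrative rendering of the quoted requirements, with function names of our choosing ($R_{\max} = 0.2$ for electrons, $0.3$ for muons).

```python
def isolation_cone_radius(pt_gev, r_max):
    """Variable-size isolation cone: Delta R = min(10 GeV / pT, R_max).
    The cone shrinks as the lepton pT grows, down from the fixed maximum."""
    return min(10.0 / pt_gev, r_max)

def is_track_isolated(lepton_pt, sum_track_pt):
    """Track isolation: the summed track pT inside the cone (excluding the
    lepton's own track) must be below 6% of the lepton pT."""
    return sum_track_pt < 0.06 * lepton_pt
```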
Jets are reconstructed from topo-clusters using the anti-$k_t$ jet clustering algorithm [67], as implemented in the FastJet package [68], with a radius parameter of $R = 0.4$. They are calibrated through the application of a jet energy scale derived from 13 TeV data and simulation [69]. Only jet candidates with $p_T > 25$ GeV and $|\eta| < 2.5$ are considered in this analysis. To mitigate the impact of jets arising from additional collisions in a given bunch crossing, an additional selection criterion using a likelihood-based 'jet-vertex-tagging' (JVT) discriminant is applied to jets with $p_T < 120$ GeV and $|\eta| < 2.5$ [70].
$^3$ The primary vertex is defined as the vertex with the highest scalar sum of the squared transverse momenta of associated tracks with $p_T > 400$ MeV.
$^4$ Topo-clusters are constructed from calorimeter cells that are combined using a topological clustering algorithm [66]. These objects provide a three-dimensional representation of energy depositions in the calorimeter and implement a nearest-neighbour noise suppression algorithm. The resulting clusters are classified as either electromagnetic or hadronic based on their shape, depth and energy density. Energy corrections are applied to the topo-clusters in order to calibrate them to the appropriate energy scale for their classification.
Jets containing $b$-hadrons ('$b$-jets') are identified (tagged) using the MV2c10 $b$-tagging algorithm [71]. The algorithm uses a multivariate discriminant built from quantities such as the impact parameters of associated tracks and well-reconstructed secondary vertices. For the differential measurements, a selection that provides an 85% efficiency for identifying $b$-jets in simulated $t\bar{t}$ events, with rejection factors against light-flavour jets and $c$-jets of 28 and 2, respectively, is used. Different calibrated $b$-tagging working points (WPs), corresponding to different $b$-jet selection efficiencies, are used for the inclusive cross-section measurement: a method is employed in which exclusive bins of the $b$-tagging discriminant correspond to different identification efficiencies. In the following, this approach is referred to as pseudo-continuous $b$-tagging (PCBT).
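The PCBT idea, assigning each jet to the exclusive bin between adjacent working-point cuts, can be sketched as follows. The discriminant cut values below are placeholders (the calibrated MV2c10 thresholds are not given in the text), and the inclusion of a 77% working point alongside the 60%, 70% and 85% points mentioned above is our assumption for illustration.

```python
# Hypothetical discriminant thresholds per working point, tightest first.
# The numerical cut values are placeholders, not calibrated ATLAS values.
WP_CUTS = [("60%", 0.94), ("70%", 0.83), ("77%", 0.64), ("85%", 0.11)]

def pcbt_bin(discriminant):
    """Assign a jet to an exclusive pseudo-continuous b-tagging bin:
    the tightest working point whose discriminant cut is exceeded,
    or 'untagged' if the jet fails even the loosest working point."""
    for name, cut in WP_CUTS:
        if discriminant > cut:
            return name
    return "untagged"
```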
Scale factors are applied as weights to MC events to correct for the mismodelling of efficiencies associated with the reconstruction, identification and trigger selection of electrons and muons, as well as the JVT and $b$-tagging requirements for jets. The $b$-tagging scale factors are derived from a pseudo-continuous calibration as outlined above.
The missing transverse momentum is defined as the negative vector sum of the transverse momenta of all selected and calibrated physics objects (electrons, photons, muons and jets). Low-momentum tracks from the primary vertex that are not associated with any of the reconstructed physics objects described previously are also included as a 'soft term' in the calculation [72]. The magnitude of the missing transverse momentum vector is denoted by $E_T^{\mathrm{miss}}$. Ambiguities can arise from the independent reconstruction of electron, muon and jet candidates in the detector. A sequential procedure (overlap removal) is applied to resolve these ambiguities and thus avoid double counting of physics objects.$^5$ It is applied as follows. If an electron candidate and a muon candidate share a track, the electron candidate is removed. Jet candidates within a distance of $\Delta R = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2} = 0.2$ from a remaining electron candidate are discarded; if multiple jets are found in this area, only the closest jet is removed. If the electron-jet distance is between 0.2 and 0.4, the electron candidate is removed. If the $\Delta R$ between any remaining jet and a muon candidate is less than 0.4, the muon candidate is removed if the jet has more than two associated tracks; otherwise, the jet is discarded.
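The sequential overlap-removal steps above can be sketched as follows. The dict-based event model ('eta', 'phi', 'track', 'ntracks') and the function names are illustrative simplifications of our own, not the ATLAS event data model.

```python
import math

def _dr(a, b):
    """Angular distance between two objects with 'eta' and 'phi' keys."""
    dphi = abs(a["phi"] - b["phi"])
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return math.hypot(a["eta"] - b["eta"], dphi)

def overlap_removal(electrons, muons, jets):
    """Sequential overlap removal in the order described in the text."""
    # 1) An electron sharing an ID track with a muon is removed.
    electrons = [e for e in electrons
                 if not any(e["track"] == m["track"] for m in muons)]
    # 2) For each surviving electron, remove only the closest jet within dR < 0.2.
    for e in electrons:
        close = [j for j in jets if _dr(e, j) < 0.2]
        if close:
            jets.remove(min(close, key=lambda j: _dr(e, j)))
    # 3) An electron within dR < 0.4 of a surviving jet is removed
    #    (this covers the 0.2-0.4 window, since closer jets were just removed).
    electrons = [e for e in electrons
                 if not any(_dr(e, j) < 0.4 for j in jets)]
    # 4) Muon-jet ambiguity within dR < 0.4: drop the muon if the jet has
    #    more than two associated tracks, otherwise drop the jet.
    kept_muons = []
    for m in muons:
        close = [j for j in jets if _dr(m, j) < 0.4]
        if close and any(j["ntracks"] > 2 for j in close):
            continue  # muon removed
        for j in close:
            jets.remove(j)
        kept_muons.append(m)
    return electrons, kept_muons, jets
```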

Particle-and parton-level objects and definitions of fiducial regions
In the measurements of differential $t\bar{t}Z$ cross sections, the measured spectra are corrected for detector effects to so-called particle and parton levels using an unfolding procedure.
Parton-level objects were obtained from the MC record of the $t\bar{t}Z$ event. The top quarks (antiquarks) and $Z$ bosons were selected after final-state radiation and just before their corresponding decay, $t \to Wb$ or $Z \to \ell\ell$, respectively. The leptons originating from $W$ and $Z$ bosons were selected directly from the decay vertex of the parent bosons.
The parton-level fiducial volumes for final states with three or four leptons were defined as follows: the $Z$ boson was required to decay into leptons, whereas the $t\bar{t}$ pair was required to decay via $t\bar{t} \to W^+ b \, W^- \bar{b}$, with either one or both $W$ bosons subsequently decaying leptonically.
The particular decay chains of interest are therefore $t\bar{t}Z \to (\ell\nu b)(q\bar{q}'\bar{b})(\ell\ell)$ or $t\bar{t}Z \to (q\bar{q}'b)(\ell\nu\bar{b})(\ell\ell)$ for a three-lepton final state, and $t\bar{t}Z \to (\ell\nu b)(\ell\nu\bar{b})(\ell\ell)$ for a four-lepton final state. The two decay chains for the three-lepton final state differ only in terms of whether the top quark or the top antiquark decayed hadronically.
$^5$ The lepton candidates considered for the overlap removal are electrons selected with the 'Loose' identification requirement [62,63] and muons selected with the 'Medium' identification requirement, but before placing isolation requirements on the leptons.
The invariant mass of the lepton pair originating from the $Z$ boson has to be within $\pm 15$ GeV of the nominal $Z$ boson mass (91.2 GeV) [35] in order to be sensitive to on-shell $Z$ decays. Prompt $\tau$-leptons from $Z$ or $W$ boson decays were not included in the parton-level fiducial volume, regardless of their subsequent decay into leptons or hadrons. No kinematic requirements were applied to the parton-level objects, so that the unfolded differential results at parton level can be more easily compared with fixed-order predictions.
Particle-level objects in simulated events were defined using quasi-stable particles with a mean lifetime greater than 30 ps originating from $pp$ collisions. They were selected after hadronisation but before the interaction of these particles with the detector components or consideration of pile-up effects. Electrons and muons were required not to have originated from a hadron in the MC generator event record, whether directly or through a $\tau$-lepton decay. This ensures that they originated from the $Z$ boson or from the $W$ bosons from top-quark decays, without requiring a direct match with the parent boson. The four-momenta of the bare leptons were modified ('dressed') by adding the four-momenta of all radiated photons within a cone of size $\Delta R = 0.1$, excluding photons from hadron decays, to take into account final-state photon radiation. Particle-level jets were reconstructed by applying the anti-$k_t$ algorithm with a radius parameter of $R = 0.4$ to all stable particles, excluding both the neutrinos originating from the $W$ bosons or top quarks and the selected electrons, muons and photons used in the definition of the charged leptons. If $b$-hadrons with $p_T > 5$ GeV were found in the MC event record, they were clustered into the stable-particle jets with their energies set to a negligible positive value ('ghost-matching') [73]. Particle-level jets containing one or more of these $b$-hadrons were considered to originate from a $b$-quark. The particle-level missing transverse momentum was defined as the vector sum of the transverse momenta of all neutrinos found in the simulation history of the event, excluding those originating from hadron decays.
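The lepton-dressing step, adding the four-momenta of nearby final-state photons to the bare lepton, can be sketched as follows. The (px, py, pz, E) tuple representation and the assumption that photons from hadron decays have already been filtered out are illustrative simplifications of our own.

```python
import math

def _dr(eta1, phi1, eta2, phi2):
    """Angular distance Delta R with the azimuthal difference wrapped."""
    dphi = abs(phi1 - phi2)
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return math.hypot(eta1 - eta2, dphi)

def dress_lepton(lepton, photons, cone=0.1):
    """Add the four-momenta (px, py, pz, E) of all photons within
    Delta R < cone of the bare lepton. Photons from hadron decays are
    assumed to have been excluded upstream (illustrative event model)."""
    px, py, pz, e = lepton["p4"]
    for ph in photons:
        if _dr(lepton["eta"], lepton["phi"], ph["eta"], ph["phi"]) < cone:
            px += ph["p4"][0]
            py += ph["p4"][1]
            pz += ph["p4"][2]
            e += ph["p4"][3]
    return dict(lepton, p4=(px, py, pz, e))
```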
The particle-level fiducial volume for final states with three or four leptons was defined by applying the same $p_T$ and $|\eta|$ requirements as those summarised for the detector-level selection in Tables 1 and 2, respectively. In addition, the same requirements were placed on the number of jets and $b$-jets, and the same requirements were placed on the opposite-sign-same-flavour (OSSF) lepton pair, along with the same invariant mass requirement for the $Z$-mass window as that used in the detector-level selection described in the following section ($|m_{\ell\ell} - m_Z| < 10$ GeV). For the particle-level fiducial volume for four-lepton final states, only one OSSF lepton pair was required to be within the $Z$-mass window; the remaining lepton pair was required only to be of opposite sign. Only one of the jets was required to have originated from a $b$-quark.

Event selection and signal regions
Only final states with exactly three or four isolated leptons (electrons or muons) and at least two jets, as defined in Section 4, are considered. All selected events are required to pass a single-electron or single-muon trigger. In addition, at least one reconstructed lepton with $p_T > 27$ GeV is required to be matched to the lepton reconstructed by the trigger algorithm and to be of the same flavour. Different signal regions are defined and optimised to achieve the best sensitivity to $t\bar{t}Z$ production with one or both top quarks decaying via $t \to Wb \to \ell\nu b$. Furthermore, the regions are designed to contain a sufficient number of signal events in order to reduce the statistical uncertainties of the differential $t\bar{t}Z$ cross-section measurements. The signal regions are referred to as 'trilepton' (3ℓ) and 'tetralepton' (4ℓ) signal regions, depending on the number of reconstructed leptons, and are meant to target events with one or two prompt leptons, respectively, from the $t\bar{t}$ decay.

Trilepton signal regions
A summary of the definitions of the trilepton signal regions is provided in Table 1. The minimum transverse momentum requirements on the leading, sub-leading and third lepton are 27, 20 and 20 GeV, respectively. The sum of the three lepton charges is required to be $\pm 1$. The OSSF lepton pair with the invariant mass closest to the $Z$ boson mass is considered to originate from the $Z$ decay, and its invariant mass (labelled $m_{\ell\ell}$) is required to be compatible with the mass of the $Z$ boson ($|m_{\ell\ell} - m_Z| < 10$ GeV). Furthermore, all OSSF lepton combinations are required to have $m_{\mathrm{OSSF}} > 10$ GeV to remove contributions arising from low-mass resonances. Additional requirements are imposed on the total number of reconstructed jets ($N_{\mathrm{jets}}$) and $b$-tagged jets ($N_{b\text{-jets}}$) in the event.
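The $Z$-candidate choice above, the OSSF pair closest to $m_Z$, retained only if it falls inside the 10 GeV mass window, can be sketched as follows. The dict-based lepton model and function names are illustrative, not the analysis code.

```python
from itertools import combinations

Z_MASS = 91.2  # GeV

def invariant_mass(p1, p2):
    """Invariant mass of the sum of two (px, py, pz, E) four-vectors, in GeV."""
    px = p1[0] + p2[0]
    py = p1[1] + p2[1]
    pz = p1[2] + p2[2]
    e = p1[3] + p2[3]
    return max(e * e - px * px - py * py - pz * pz, 0.0) ** 0.5

def best_z_candidate(leptons):
    """Return the opposite-sign-same-flavour (OSSF) pair with invariant mass
    closest to m_Z, but only if |m_ll - m_Z| < 10 GeV; otherwise None.
    Leptons are dicts with 'flavour', 'charge' and 'p4' (illustrative model)."""
    pairs = [(a, b) for a, b in combinations(leptons, 2)
             if a["flavour"] == b["flavour"] and a["charge"] * b["charge"] < 0]
    if not pairs:
        return None
    best = min(pairs,
               key=lambda p: abs(invariant_mass(p[0]["p4"], p[1]["p4"]) - Z_MASS))
    m = invariant_mass(best[0]["p4"], best[1]["p4"])
    return best if abs(m - Z_MASS) < 10.0 else None
```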
Different $b$-jet requirements are used for the inclusive and differential cross-section measurements. For the inclusive measurement, a combination of two orthogonal regions which use different $b$-tagging WPs (PCBT, see Section 4) with 60% and 70% efficiency is employed. The tighter $b$-tagging WPs are used to suppress the $WZ$ + jets background and reduce its impact on the overall precision of the measurement. The two regions, referred to as 3ℓ-Z-1b4j-PCBT and 3ℓ-Z-2b3j-PCBT, are kept distinct from one another during the fitting procedure used to perform the cross-section measurement. For the differential measurements, a looser, fixed WP corresponding to an 85% efficiency is used in order to increase the data statistics.
Table 1: The definitions of the trilepton signal regions: for the inclusive measurement, a combination of the pseudo-continuous $b$-tagging regions 3ℓ-Z-1b4j-PCBT and 3ℓ-Z-2b3j-PCBT is used, whereas for the differential measurement only the region 3ℓ-Z-2b3j with a fixed $b$-tagging WP is employed.

Tetralepton signal regions
The definitions of the tetralepton signal regions are summarised in Table 2. The transverse momentum requirements on the four leading leptons in all regions are 27, 20, 10 and 7 GeV, respectively. As in the case of the trilepton signal regions, all events are required to have at least one OSSF lepton pair with an invariant mass satisfying $|m_{\ell\ell} - m_Z| < 10$ GeV. Furthermore, the remaining leptons, which are not associated with the $Z$ boson (non-$Z$), are required to have opposite charges, such that the sum of the four lepton charges is zero. As in the trilepton selection, all OSSF lepton combinations are required to satisfy $m_{\mathrm{OSSF}} > 10$ GeV in order to suppress background contributions from low-mass resonances.
The tetralepton signal regions are separated into different-flavour (DF) and same-flavour (SF) signal regions, according to the $b$-jet multiplicities and the flavour composition of the non-$Z$ lepton pair. The $ZZ$ + jets background is suppressed by setting requirements on the jet and $b$-jet multiplicities, as well as by applying cuts on $E_T^{\mathrm{miss}}$ and the invariant mass of the non-$Z$ lepton pair ($m_{\ell\ell}^{\text{non-}Z}$) in the case of the SF regions. In the SF regions, events with $m_{\ell\ell}^{\text{non-}Z}$ close to the $Z$ mass are accepted, but the $E_T^{\mathrm{miss}}$ requirement is tightened to reduce the $ZZ$ + jets background; if $m_{\ell\ell}^{\text{non-}Z}$ is not close to the $Z$ mass, the $E_T^{\mathrm{miss}}$ cut is relaxed. For the inclusive cross-section measurement, the four tetralepton regions are included as separate bins in the fit, whereas for the differential measurements all the events are combined. Unlike in the trilepton signal regions, the $b$-jets are all selected using a fixed 85% $b$-tagging efficiency WP. The tetralepton signal region selections are identical for the inclusive and differential measurements.
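The SF-region logic, a harder $E_T^{\mathrm{miss}}$ cut when the non-$Z$ pair mass lies near $m_Z$ and a relaxed one otherwise, can be sketched as below. The mass-window half-width and both MET thresholds are placeholders of our own; the actual values are specified in Table 2, not in the text.

```python
def passes_sf_met_cuts(m_nonz, met, z_mass=91.2,
                       window=10.0, met_high=80.0, met_low=40.0):
    """Same-flavour tetralepton selection logic (sketch):
    if the non-Z lepton pair mass is inside the Z window, apply the
    tightened E_T^miss cut; otherwise apply the relaxed one.
    window, met_high and met_low are placeholder values, not those
    used in the analysis."""
    if abs(m_nonz - z_mass) < window:
        return met > met_high
    return met > met_low
```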

Background estimation
Several processes can lead to background contamination in the signal regions. The contributions from SM processes featuring the production of three or four prompt leptons are discussed in Section 6.1, whereas the estimation of processes where at least one of the reconstructed leptons is a fake lepton is explained in Section 6.2.

Prompt lepton background
The dominant SM background processes in the trilepton and tetralepton regions are $WZ$ + jets and $ZZ$ + jets production with $WZ \to \ell\nu\ell\ell$ and $ZZ \to \ell\ell\ell\ell$ decays, respectively. The normalisations of these processes are obtained from data in dedicated $WZ$ + jets and $ZZ$ + jets control regions (CRs), defined in Table 3. The CRs are common to both the inclusive and the differential cross-section measurements. Invariant mass requirements on the OSSF lepton pairs are applied to select the $Z$ bosons expected in both regions. A $b$-jet veto is applied in the $WZ$ control region (3ℓ-WZ-CR) to suppress the $t\bar{t}$ contribution and to ensure orthogonality with the trilepton signal regions. In the $ZZ$ control region (4ℓ-ZZ-CR), no requirements are placed on the number of jets or $b$-jets; the invariant mass requirements on the two OSSF lepton pairs are sufficient to yield a very high $ZZ$ + jets purity in this region. Orthogonality with the tetralepton signal regions is ensured through an $E_T^{\mathrm{miss}}$ requirement ($20\,\mathrm{GeV} < E_T^{\mathrm{miss}} < 40\,\mathrm{GeV}$), where the lower bound is set so that the selected events are kinematically more similar to those in the signal regions. The $WZ$ + jets purity in 3ℓ-WZ-CR is approximately 80%, while the $ZZ$ + jets purity in 4ℓ-ZZ-CR is approximately 97%.
The event yields in these control regions are extrapolated to the signal regions using the simulation.
As the control regions are mostly populated by $WZ/ZZ$ plus light-flavour jet events, only the predictions for these light-flavour components in the signal regions are constrained by the observed data yields in the control regions. The $WZ/ZZ$ + $b$- and $c$-jet$^8$ backgrounds are constrained to their MC predictions, but with additional normalisation uncertainties assigned (more details are provided in Section 7.3). Figures 1(a) and 1(b) show, respectively, the $p_T$ and $\eta$ distributions of the leading lepton in the $WZ$ + jets control region. The $p_T$ distribution of the leading lepton and the number of selected jets in the $ZZ$ + jets control region are shown in Figures 2(a) and 2(b). All distributions in the control regions are shown before the simultaneous fit to data is applied (pre-fit).
Another important background in the signal regions is $tWZ$ production, which can lead to final states very similar to those of the $t\bar{t}Z$ signal. A relevant background process in the trilepton regions is $tZq$ production, which contributes more at lower jet multiplicities. Other background processes, such as $t\bar{t}W$, $t\bar{t}H$, $t\bar{t}WW$, three/four top-quark production, $WH/ZH$ or triboson production, can also contribute to the signal regions, but are significantly smaller than the processes mentioned above.
The MC samples used to simulate these processes are described in Section 3. Apart from the $WZ/ZZ$ plus light-flavour jets background, for which control regions are employed to obtain the normalisation, the contributions from all SM processes leading to three or four prompt leptons are estimated entirely from MC simulation and normalised to their theoretical cross-section predictions.

Fake lepton background
Objects that are misidentified as prompt leptons, as well as genuine but non-prompt leptons, are referred to as 'fake leptons' throughout the rest of the document. In the signal regions of this analysis, this background arises mainly from dileptonic $t\bar{t}$ decays in which additional non-prompt leptons originate from heavy-flavour hadron decays.
To estimate the contribution of fake leptons in the signal regions, a fully data-driven method, called the 'matrix method', is employed. Descriptions of this technique can be found in Refs. [74,75]. It relies on the prompt and fake leptons having different probabilities of passing the identification, isolation and impact parameter requirements. The method uses data events selected with the same criteria as in the signal regions, but with looser lepton selections⁹ than the ones defined in Section 4.

⁸ These backgrounds are defined by separating $WZ/ZZ$ + jets events into light-flavour jet, $b$-jet and $c$-jet components, depending on whether a $b$- or $c$-hadron is found in the MC event record of any of the selected jets.

Figure 1: Distribution of (a) the $p_\mathrm{T}$ and (b) the $\eta$ of the leading lepton in the $WZ$ + jets control region. The shaded band corresponds to the total uncertainty (systematic and statistical) of the total SM prediction. The lower panel shows the ratio of the data to the SM prediction. The results and uncertainties are shown before the fit to data is performed. The category 'other' contains all processes mentioned in Section 3 which are not listed separately. Events with a leading lepton $p_\mathrm{T}$ above 300 GeV are included in the uppermost bin of (a).

Figure 2: Distribution of (a) the $p_\mathrm{T}$ of the leading lepton and (b) the number of selected jets in the $ZZ$ + jets control region. The shaded band corresponds to the total uncertainty (systematic and statistical) of the total SM prediction. The lower panel shows the ratio of the data to the SM prediction. The results and uncertainties are shown before the fit to data is performed. The category 'other' contains all processes mentioned in Section 3 which are not listed separately. Events with a leading lepton $p_\mathrm{T}$ above 300 GeV are included in the uppermost bin of (a).
An alternative version of the matrix method is described in Ref. [76]. It evaluates the total number of fake electrons and muons entering the signal regions via the maximisation of a likelihood function.
The likelihood function is constructed from a product of Poisson probability functions that represent the numbers of leptons passing different quality criteria for the signal regions. The observed numbers of leptons selected with the looser criteria and the probabilities (efficiencies) for fake or prompt leptons to satisfy the nominal lepton requirements are fixed, while the expectation values of the Poisson functions - the numbers of fake leptons in the signal regions - are obtained from the likelihood maximisation. This method offers a more robust fake-lepton estimation for statistically limited regions. For the differential measurements, the estimations are performed separately for each bin of the measured variables.
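As an illustration of the counting logic behind the matrix method, a minimal single-lepton sketch is shown below; the event counts, efficiency values and the helper name `fake_yield_tight` are invented for this sketch and are not taken from the analysis:

```python
# Sketch of the single-lepton matrix method. With N_loose events passing a
# loose lepton selection, N_tight of them also passing the tight (signal)
# selection, and the efficiencies for prompt (eps_prompt) and fake (eps_fake)
# leptons to pass the tight cuts, the linear system
#   N_loose = N_prompt + N_fake
#   N_tight = eps_prompt * N_prompt + eps_fake * N_fake
# is inverted to estimate the fake-lepton yield in the tight selection.

def fake_yield_tight(n_loose, n_tight, eps_prompt, eps_fake):
    """Estimated number of fake leptons passing the tight selection."""
    if eps_prompt <= eps_fake:
        raise ValueError("the method requires eps_prompt > eps_fake")
    n_fake_loose = (eps_prompt * n_loose - n_tight) / (eps_prompt - eps_fake)
    return eps_fake * n_fake_loose

# Invented example: 1000 loose events, 870 tight events, eps_prompt = 0.95
# (cf. the >90% prompt efficiencies quoted below), eps_fake = 0.15.
estimate = fake_yield_tight(1000, 870, 0.95, 0.15)
```

In the analysis itself the efficiencies are binned in $p_\mathrm{T}$ and $|\eta|$ and the likelihood variant described above is used, but the inversion principle is the same.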
The probabilities for fake leptons to satisfy the nominal lepton requirements (fake-lepton efficiencies) are obtained from data. They are measured separately for electrons and muons in events with exactly two leptons of the same charge (same-sign) and at least one $b$-jet identified at the 85% efficiency WP. The measurements are performed after subtracting the contributions, estimated from MC events, of charge-misidentified electrons and prompt leptons in the same-sign region. It has been checked that the dominant fake-lepton source in this region is heavy-flavour hadron decays, as in the signal regions. The fake-lepton efficiencies are approximately 10% for electrons and 15% for muons, with an increase to around 20% for muons with $p_\mathrm{T}$ < 12 GeV or $p_\mathrm{T}$ > 35 GeV. The equivalent probabilities for prompt leptons (prompt-lepton efficiencies) are obtained from $Z \to \ell\ell$ simulation and the respective scale factors for electrons or muons. The prompt-lepton efficiencies are in most cases higher than 90% for both electrons and muons. They increase with lepton $p_\mathrm{T}$ and reach > 98% for $p_\mathrm{T}$ > 35 GeV. Both the fake- and prompt-lepton efficiencies are parameterised as functions of the $p_\mathrm{T}$ and $|\eta|$ of the respective lepton.
Systematic uncertainties are assigned to the fake-lepton estimates to account for differences in the relative contributions of the various fake-lepton sources between the signal regions and the regions used for the efficiency measurements. Further uncertainties arise from the subtraction of prompt and charge-misidentified leptons, as well as from the dependence of the fake-lepton efficiencies on the number of jets/$b$-jets in the events. The method is also affected by statistical uncertainties arising from the limited number of events in the data sample used to evaluate the fake-lepton yields, as well as from the statistical limitations of the efficiency measurements. Like the nominal values, the uncertainties of the fake- and prompt-lepton efficiencies are binned in the $p_\mathrm{T}$ and $|\eta|$ of the leptons and propagated to the fake-lepton estimates in the signal regions. The overall uncertainties are approximately 50% for both electrons and muons, but they can fluctuate for the differential measurements, depending on the variable and the kinematic region.

Pre-fit event yields in signal and control regions
To validate the SM background modelling described in the previous sections, Table 4 compares the total expected background prediction with the observed data events in the trilepton and tetralepton signal regions, as well as in the $WZ$ + jets and $ZZ$ + jets control regions. The event yields and uncertainties are shown before applying the fitting procedure. The statistical uncertainties and all systematic uncertainties explained in Section 7 are considered, except the normalisation uncertainties of the processes which are free parameters in the fit. Within the uncertainties, agreement between data and the SM predictions is observed in nearly all regions.

Table 4: The observed and expected numbers of events in the trilepton and tetralepton signal regions, as well as in the $WZ/ZZ$ + jets control regions. The predictions are shown before the fit to data. The $WZ$ + jets and $ZZ$ + jets backgrounds are listed separately for their light-flavour ($l$), $b$-jet and $c$-jet components. The category 'fake leptons' refers to the contributions from fake and non-prompt leptons. 'Other' includes the contributions from $WH/ZH$, $t\bar{t}H$, three/four top-quark production and triboson processes. Background categories with event yields shown as '-' do not contribute significantly to a region. The indicated uncertainties include statistical errors as well as all experimental and theoretical systematic uncertainties, except the normalisation uncertainties of the fitted background components.

Systematic uncertainties
The signal and background predictions in all signal regions are affected by several sources of experimental and theoretical systematic uncertainty. These are considered for both the inclusive and the differential measurements presented in Sections 8 and 9. The uncertainties can be classified into the categories described in the following subsections.

Detector-related uncertainties
The uncertainty in the combined 2015-2018 integrated luminosity is 1.7% [77], obtained using the LUCID-2 detector [28] for the primary luminosity measurements. This systematic uncertainty affects all processes modelled using MC simulations apart from the light-flavour components of the $WZ/ZZ$ + jets backgrounds, whose normalisations are taken from data in control regions.
The uncertainty in the reweighting of the MC pile-up distribution to match the distribution in data is evaluated by varying the pile-up correction factors used to perform the reweighting.
Uncertainties associated with the lepton selection arise from the trigger, reconstruction, identification and isolation efficiencies, and from the lepton momentum scale and resolution [63][64][65]. They are below 1% for the individual sources and have a total impact of 2-2.5% on the measurements. Uncertainties associated with the jet selection arise from the jet energy scale (JES), the JVT requirement and the jet energy resolution (JER). The JES and its uncertainties are derived by combining information from test-beam data, collision data and simulation [78]. The uncertainties in the JER and JVT increase at lower jet $p_\mathrm{T}$. The overall effect of uncertainties related to jet selection and calibration is approximately 2%.
The efficiency of the flavour-tagging algorithm is measured for each jet flavour using control samples in data and in simulation. From these measurements, correction factors are derived to correct the tagging rates in the simulation. In the case of $b$-tagged jets, the correction factors and their uncertainties are estimated from data using dileptonic $t\bar{t}$ events [71]. In the case of $c$-jets, they are derived from jets arising from $W$ boson decays in $t\bar{t}$ events [79]. In the case of light-flavour jets, the correction factors are derived using dijet events [80]. Sources of uncertainty affecting the $b$- and $c$-tagging efficiencies are evaluated as a function of jet $p_\mathrm{T}$, including bin-to-bin correlations. The uncertainties in the efficiency for tagging light-flavour jets depend on the jet $p_\mathrm{T}$ and on $\eta$. An additional uncertainty is assigned to account for the extrapolation of the $b$-tagging efficiency measurement from the $p_\mathrm{T}$ region used to determine the correction factors to regions with higher $p_\mathrm{T}$. The impact of the flavour-tagging uncertainties on the measurements depends on the signal region and is 2-3% in total.

Signal modelling uncertainties
Different sources of systematic uncertainty in the theoretical predictions of the $t\bar{t}Z$ process are considered. To evaluate the effect of $\mu_\mathrm{r}$ and $\mu_\mathrm{f}$ uncertainties, the scales used in the ME of the MG5_aMC@NLO + Pythia 8 samples are varied simultaneously, as well as individually, by factors of 2.0 and 0.5 relative to their nominal values. The uncertainty due to ISR is estimated using a set of variations of the A14 tune's parameter values. Uncertainties associated with the choice of PDF set are evaluated according to the PDF4LHC prescription [81] using eigenvector variations from multiple NLO PDF sets, the effects of which are added in quadrature.
The systematic uncertainty due to the modelling of the parton shower, the hadronisation and the underlying event - called the parton-shower uncertainty in the following - is quantified by employing an alternative $t\bar{t}Z$ sample generated with MG5_aMC@NLO, but interfaced to Herwig 7 instead of Pythia 8.

Background modelling uncertainties
The normalisations of the $WZ$ + jets and $ZZ$ + jets backgrounds with light-flavour jets are obtained from data, as discussed in Section 6.1. The $WZ/ZZ$ + jets components with $b$- or $c$-jets are constrained to their MC predictions, and normalisation uncertainties of 50% ($WZ/ZZ$ + $b$) and 30% ($WZ/ZZ$ + $c$) are assigned to them. These uncertainties are evaluated from data/MC comparisons in $Z$ + $b/c$-jets events [82], but also take into account differences in the heavy-flavour jet fractions between $Z$ + jets and $WZ/ZZ$ + jets events. Modelling uncertainties of $WZ/ZZ$ + jets related to the $\mu_\mathrm{r}$ and $\mu_\mathrm{f}$ scales and the PDF choice are obtained with the same prescription as for the signal. Uncertainties attributed to the resummation scale and the CKKW matching scale [52,54] are evaluated from alternative $WZ/ZZ$ + jets samples with variations of these scale choices.
Uncertainties related to the $\mu_\mathrm{r}$ and $\mu_\mathrm{f}$ scales and the PDF of the $tWZ$ background are evaluated in the same way as for the $t\bar{t}Z$ and $WZ/ZZ$ + jets samples. An additional uncertainty is assigned to the $tWZ$ process to account for the interference between the $t\bar{t}Z$ and $tWZ$ processes. It is evaluated by switching to an alternative diagram removal scheme (DR1 vs DR2) [57] and taking the differences observed in the signal regions as the uncertainty.
Scale and PDF uncertainties of the $tZq$ background are obtained in the same way as for the previously described samples. In addition, a normalisation uncertainty of 30% is assigned, motivated by the measurements of this process presented in Refs. [83,84]. As for the $t\bar{t}Z$ signal, the uncertainty due to ISR is evaluated using alternative samples with Pythia 8 A14 Var3c eigentune variations.
For the fake-lepton background, statistical as well as systematic uncertainties are considered as explained in Section 6. They are evaluated for each signal region independently and applied as normalisation uncertainties of the total fake-lepton background contribution in each region.
For the $t\bar{t}W$ background, a normalisation uncertainty of approximately 10%, due to the choice of QCD scales and PDF, is used [48]. For processes giving smaller backgrounds, namely $WH/ZH$, $t\bar{t}H$, triboson and three/four top-quark production, a conservative overall normalisation uncertainty of 50% is applied.

Results of the inclusive cross-section measurement
The ratio of the measured inclusive $t\bar{t}Z$ production cross section to its corresponding SM prediction, the signal strength $\mu_{t\bar{t}Z}$, is obtained from a simultaneous fit to the numbers of events in the trilepton and tetralepton signal regions (as defined in Tables 1 and 2), as well as in the $WZ$ + jets and $ZZ$ + jets control regions (defined in Table 3). For trilepton events, only the dedicated regions for the inclusive measurement are included in the fit. The fit is based on the profile-likelihood technique, with a likelihood function constructed as a product of Poisson probability functions for the observed event yields in the signal and control regions. The value of $\mu_{t\bar{t}Z}$ as well as the normalisations of the light-flavour components of the $WZ/ZZ$ + jets backgrounds are treated as free parameters in the fit. The systematic uncertainties described in Section 7 are included in the fit as nuisance parameters constrained by Gaussian functions. None of the uncertainty parameters is found to be significantly constrained or pulled by the fit. The calculation of confidence intervals and hypothesis testing is performed using a modified frequentist method as implemented in the RooStats framework [85][86][87].
Within their uncertainties, the fitted normalisations of the light-flavour components of the $WZ$ + jets and $ZZ$ + jets backgrounds are compatible with unity, but can vary by up to 10% from their initial values. The observed and expected total event yields in the signal regions and the $WZ/ZZ$ + jets control regions after the combined fit (post-fit) are shown in Figure 3 and detailed in Table 5. The strong anti-correlation between the $WZ$ + $b$ and $WZ$ + $c$ backgrounds results in a smaller total uncertainty of the fitted SM background expectation in the 3ℓ-$WZ$-CR compared with the uncertainties of the individual $WZ$ + $b$ and $WZ$ + $c$ components.
Comparisons between data and the post-fit SM predictions for selected variables which are sensitive to the quality of the background modelling in the signal regions are also presented. The number of selected jets with $p_\mathrm{T}$ > 25 GeV in signal region 3ℓ-$Z$-1$b$4$j$-PCBT is shown in Figure 4(a). The $p_\mathrm{T}$ of the leading lepton in 3ℓ-$Z$-2$b$3$j$-PCBT is given in Figure 4(b). Figure 5(a) depicts the number of selected jets and Figure 5(b) the $p_\mathrm{T}$ of the leading lepton in the combination of the four tetralepton regions. Figure 6(a) shows the $p_\mathrm{T}$ and Figure 6(b) the rapidity ($y$) of the reconstructed $Z$ boson in the combination of the trilepton and tetralepton regions.

Table 6 summarises the measured $\mu_{t\bar{t}Z}$ values obtained from the individual fits in the trilepton and tetralepton regions, as well as the value from the combined 3ℓ + 4ℓ fit. The values obtained from the fits in the different regions are compatible within their uncertainties. The 3ℓ-channel events represent the dominant contribution to the combined result, and the individual 3ℓ result differs only slightly from that using the combined selections. The total systematic uncertainties in the 4ℓ channel are smaller than those in the 3ℓ channel, but the overall precision is poorer in the 4ℓ channel due to the limited number of data events.

Table 5: The observed and expected numbers of events in the trilepton and tetralepton signal regions, as well as in the $WZ/ZZ$ + jets control regions, after the combined fit. The definitions of the background categories are the same as in Table 4. Categories with event yields shown as '-' do not contribute significantly to a region. The indicated uncertainties include all experimental and theoretical systematic uncertainties as well as the statistical errors. As systematic uncertainties may be correlated between different processes, the individual uncertainties do not necessarily add in quadrature to the uncertainty of the total SM prediction.
The contributions of the relevant uncertainties to the measured cross section are summarised in Table 7. In this table, the uncertainties are grouped into several type-related categories and are shown together with the total uncertainty. As none of the uncertainties shows a significant asymmetry, they are symmetrised.
The dominant uncertainty sources can be attributed to the $t\bar{t}Z$ parton shower, the modelling of the $WZ$ background, and jet flavour-tagging. It should be noted that the uncertainty in the cross section due to the systematic uncertainty in the luminosity is larger than the 1.7% mentioned in Section 7.1, as the luminosity affects both the signal and background normalisations.

Table 7: List of relative uncertainties of the measured inclusive $t\bar{t}Z$ cross section from the combined fit. The uncertainties are symmetrised for presentation and grouped into the categories described in the text. The quadrature sum of the individual uncertainties is not equal to the total uncertainty due to correlations introduced by the fit.

Differential cross-section measurements

Description of the observables and reconstruction of the $t\bar{t}$ system
A set of ten observables which probe the kinematics of the $t\bar{t}Z$ system is selected for the differential cross-section measurements. The definitions of these variables are summarised in Table 8. With the exception of the number of reconstructed jets ($N_\mathrm{jets}$), which is unfolded to particle level only, all distributions are unfolded to both particle and parton level. Two of the variables, namely the transverse momentum and the absolute value of the rapidity of the $Z$ boson ($p_\mathrm{T}^{Z}$ and $|y^{Z}|$), which are sensitive to the $t\bar{t}Z$ generator modelling and to various BSM effects, are defined identically for the trilepton and tetralepton selections. The differential measurements for these variables are therefore performed using an inclusive selection denoted by 3ℓ + 4ℓ.
The jet multiplicity is a natural variable with which to probe the modelling of QCD radiation and hadronisation in MC generators. It is measured separately for the trilepton and tetralepton selections because of the different numbers of final-state quarks from the decay of the $t\bar{t}$ system in the two channels. The transverse momentum of the lepton which is not associated with the $Z$ boson ($p_\mathrm{T}^{\ell,\mathrm{non-}Z}$) in the trilepton signal regions provides sensitivity to the kinematics of the top-quark decays. The absolute azimuthal separation between the two leptons associated with the top quarks ($|\Delta\phi(\ell^{+}, \ell^{-})|$) in tetralepton events provides sensitivity to BSM effects modifying the spin correlations between the two top quarks. The transverse momentum of the $t\bar{t}$ system ($p_\mathrm{T}^{t\bar{t}}$) is sensitive to the MC modelling of the hard-scattering process as well as to the modelling of the QCD radiation in the parton shower.

Table 8: Summary of the variables used for the differential measurements. Some variables are considered for the trilepton or tetralepton signal regions only, as indicated. The jet multiplicity is measured for the two topologies separately, whereas for the variables related only to the kinematics of the $Z$ boson ($p_\mathrm{T}^{Z}$ and $|y^{Z}|$), the trilepton and tetralepton regions are combined.
In order to construct the $|\Delta\phi(Z, t_\mathrm{lep})|$ and $|\Delta y(Z, t_\mathrm{lep})|$ variables in the trilepton regions, the full four-vector of the leptonically decaying top quark from the $t\bar{t}$ system ($t_\mathrm{lep}$) is required.¹⁰ For both the detector- and particle-level quantities, the reconstructed $E_\mathrm{T}^\mathrm{miss}$ (both its magnitude and azimuthal angle) is first attributed to the neutrino from the associated $W$ boson decay. The SM value of the $W$ boson mass [35] is then used to determine the $z$-component of the neutrino momentum by analytically solving the corresponding quadratic equation. In many cases the solution is ambiguous; in those cases, both real solutions are considered. For cases in which the discriminant of the quadratic equation is negative, the $p_\mathrm{T}$ of the neutrino is set to the particular value which yields a single solution. In order to form the final top-quark candidate, the reconstructed leptonically decaying $W$ boson candidate - or candidates in the case of two neutrino solutions - is added, via a four-vector sum, to the closer (in $\Delta R$) of the two reconstructed jets in the event with the highest output from the $b$-tagging algorithm (MV2c10). At particle level, the two jets which are ghost-matched to a $b$-hadron (as described in Section 4.2) are considered. In the case of only a single such ghost-matched jet, that jet is selected to form the top-quark candidate. Events with two distinct neutrino solutions have two possible top-quark candidates; the one with an invariant mass of the $Wb$ system more consistent with a top-quark decay is chosen.
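The quadratic equation above follows from imposing $(p_\ell + p_\nu)^2 = m_W^2$ with a massless neutrino whose transverse momentum is identified with $E_\mathrm{T}^\mathrm{miss}$. A generic sketch of this constraint (not the ATLAS implementation; the lepton mass is neglected, and the negative-discriminant case is simplified to keeping the single real part of the solutions rather than rescaling the neutrino $p_\mathrm{T}$):

```python
import math

M_W = 80.4  # W boson mass in GeV (rounded PDG value)

def neutrino_pz(lep_px, lep_py, lep_pz, met_x, met_y):
    """Solve (p_lep + p_nu)^2 = M_W^2 for the neutrino p_z.

    Returns two solutions when the discriminant is positive; otherwise the
    discriminant is set to zero, which approximates adjusting the neutrino
    p_T to the value that yields a single solution.
    """
    lep_e = math.sqrt(lep_px**2 + lep_py**2 + lep_pz**2)  # massless lepton
    pt_lep2 = lep_px**2 + lep_py**2
    # mu collects the mass term and the transverse lepton-neutrino product
    mu = 0.5 * M_W**2 + lep_px * met_x + lep_py * met_y
    a = mu * lep_pz / pt_lep2
    disc = a**2 - (lep_e**2 * (met_x**2 + met_y**2) - mu**2) / pt_lep2
    if disc <= 0.0:
        return [a]
    root = math.sqrt(disc)
    return [a - root, a + root]
```

Both solutions would then be propagated to the top-candidate building described above, keeping the candidate whose invariant mass is more consistent with a top-quark decay.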
In the tetralepton channel, the $t\bar{t}$ system is reconstructed in the transverse plane only. The underlying assumption is that the two neutrinos from the $t\bar{t}$ decay represent the dominant source of missing transverse momentum in the event; the reconstructed $E_\mathrm{T}^\mathrm{miss}$ can therefore be taken as a reasonable proxy for the vector sum of the neutrino momenta in the transverse plane. Such a partial reconstruction avoids the need to determine the full kinematics of both neutrinos separately, while still allowing the reconstruction of the $p_\mathrm{T}^{t\bar{t}}$ and $|\Delta\phi(t\bar{t}, Z)|$ variables for the differential measurements. The selection of the two $b$-tagged jets is performed analogously to the trilepton case. At detector level, the two reconstructed jets with the highest $b$-tagging score are selected. At particle level, the two jets ghost-matched to a $b$-hadron are selected; in the case of only one ghost-matched jet, the jet with the highest $p_\mathrm{T}$ of those remaining is selected as the second $b$-jet.
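This transverse-plane reconstruction amounts to a two-dimensional vector sum. A toy sketch is given below; all kinematic values are invented, and $E_\mathrm{T}^\mathrm{miss}$ enters simply as one more $(p_\mathrm{T}, \phi)$ pair:

```python
import math

def vector_sum_pt(objects):
    """Transverse-plane vector sum of (pt, phi) pairs; returns (pt, phi)."""
    px = sum(pt * math.cos(phi) for pt, phi in objects)
    py = sum(pt * math.sin(phi) for pt, phi in objects)
    return math.hypot(px, py), math.atan2(py, px)

def delta_phi(phi1, phi2):
    """Azimuthal separation folded into [0, pi]."""
    d = (phi1 - phi2) % (2.0 * math.pi)
    return d if d <= math.pi else 2.0 * math.pi - d

# Invented (pt [GeV], phi) values: the two non-Z leptons, the two b-tagged
# jets, and the missing transverse momentum as a proxy for the two neutrinos.
ttbar_pt, ttbar_phi = vector_sum_pt(
    [(45.0, 0.1), (30.0, -0.4), (80.0, 0.6), (60.0, -1.0), (55.0, 0.2)])
dphi_ttbar_z = delta_phi(ttbar_phi, 2.9)  # 2.9 = invented phi of the Z boson
```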

Unfolding procedure
To measure the differential cross-section distributions at particle and parton level in the specific fiducial phase-spaces defined in Section 4.2, an iterative Bayesian unfolding procedure is used [88]. It relies on Bayes' theorem, starting from a given prior for the particle- or parton-level distribution and iteratively updating it with the posterior distribution. The unfolding is performed using the RooUnfold package [89]. The differential $t\bar{t}Z$ cross sections are calculated using the following equation:

$$\frac{\mathrm{d}\sigma_{t\bar{t}Z}}{\mathrm{d}x^{i}} = \frac{1}{\mathcal{L} \cdot \mathcal{B} \cdot \Delta x^{i} \cdot \epsilon^{i}_\mathrm{eff}} \sum_{j} M^{-1}_{ij} \, f^{j}_\mathrm{acc} \left( N^{j}_\mathrm{obs} - N^{j}_\mathrm{bkg} \right),$$

where $x$ denotes the variable used for the differential measurement (with bin width $\Delta x^{i}$), the index $i$ indicates the bin at particle (or parton) level and $j$ the detector-level bin.
The migration matrix $M$ quantifies the detector response and is derived from the bin-to-bin migrations of events from particle or parton level to detector level in the nominal $t\bar{t}Z$ simulation, for each of the considered differential variables. Its inverse, $M^{-1}$, is determined through the iterative unfolding procedure. For each bin $j$, $N^{j}_\mathrm{obs}$ denotes the number of observed data events and $N^{j}_\mathrm{bkg}$ the expected background contribution. The various background contributions are estimated in the same way as for the inclusive measurement (see Section 6). In this case, the $WZ/ZZ$ + light-flavour jets backgrounds are corrected by normalisation parameters obtained from an inclusive fit based on the combined 3ℓ + 4ℓ channels. A statistics-only version of the fit was performed solely for the extraction of the normalisation parameters in this case. The acceptance corrections, $f^{j}_\mathrm{acc}$, account for events that are generated outside the fiducial phase-space but pass the detector-level selection, whereas the efficiency corrections, $\epsilon^{i}_\mathrm{eff}$, correct for events that are in the fiducial phase-space but are not reconstructed in the detector. In either case, the term 'fiducial' refers to the type of unfolding being performed - either to parton or to particle level. The integrated luminosity is denoted by $\mathcal{L}$. The branching ratio $\mathcal{B}$ is that of the $t\bar{t}Z$ system to final states with three or four charged electrons or muons, originating directly from either the $Z$ boson decay or the decays of the $W$ bosons from the $t\bar{t}$ system, and is used to extrapolate the measurements to cover all $t\bar{t}$ and $Z$ decays. The branching-ratio correction is only applied for the parton-level measurements and corresponds to the decay channels applicable for the fiducial region of the particular variable involved (see Table 8). The values of $\mathcal{B}$, calculated using inputs taken from Ref. [35], are 0.0193 (3ℓ), 0.0030 (4ℓ), and 0.0223 (3ℓ + 4ℓ).
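A minimal sketch of the iterative Bayesian (D'Agostini) unfolding implemented in RooUnfold is given below, here without backgrounds or acceptance corrections and with an invented two-bin response:

```python
# `response[j][i]` is the probability for an event in truth bin i to be
# reconstructed in detector bin j; its column sums act as reconstruction
# efficiencies. The posterior of each iteration becomes the next prior.

def bayes_unfold(data, response, prior, iterations=4):
    n_truth, n_reco = len(prior), len(data)
    truth = list(prior)
    for _ in range(iterations):
        # fold the current truth estimate through the response
        folded = [sum(response[j][i] * truth[i] for i in range(n_truth))
                  for j in range(n_reco)]
        updated = []
        for i in range(n_truth):
            eff = sum(response[j][i] for j in range(n_reco))
            s = sum(response[j][i] * truth[i] / folded[j] * data[j]
                    for j in range(n_reco) if folded[j] > 0.0)
            updated.append(s / eff if eff > 0.0 else 0.0)
        truth = updated
    return truth

# Closure test with an invented 2-bin response and a flat prior: the data are
# exactly the folded true spectrum [100, 50], which should be recovered.
response = [[0.9, 0.2],
            [0.1, 0.8]]
data = [0.9 * 100.0 + 0.2 * 50.0, 0.1 * 100.0 + 0.8 * 50.0]
unfolded = bayes_unfold(data, response, prior=[75.0, 75.0], iterations=20)
```

In the measurement itself the number of iterations is optimised per variable (between two and five, as described below), and the acceptance, efficiency and luminosity/branching-ratio factors of the equation above are applied around this core step.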
Figure 7 shows the particle- and parton-level migration matrices that are used for the differential cross-section measurements as a function of the $p_\mathrm{T}$ of the $Z$ boson. The matrices are normalised such that the sum of the entries in each row is equal to one. The entries in the matrices represent the fraction of events at either particle or parton level in a $y$-axis bin that are reconstructed at detector level in an $x$-axis bin. The fraction of events in the diagonal elements therefore indicates the quality of the resolution for a specific variable. In the case of $p_\mathrm{T}^{Z}$, these fractions lie between 90% and 96% at both particle and parton level. For some of the other variables which do not depend only on the $Z$ boson reconstruction (e.g. $N_\mathrm{jets}$, $p_\mathrm{T}^{t\bar{t}}$), the migrations between bins can be significantly larger and reach a level of 20-25%. Figure 8 depicts the corresponding correction factors as a function of $p_\mathrm{T}^{Z}$: $\epsilon_\mathrm{eff}$ increases for larger $p_\mathrm{T}^{Z}$ due to higher lepton reconstruction efficiencies at increasing transverse momenta. It lies between 33% and 43% at particle level and between 10% and 22% at parton level. The values of $f_\mathrm{acc}$ are higher than 80% in all bins at both particle and parton level and show no notable dependence on $p_\mathrm{T}^{Z}$.

Figure 7: Migration matrices for the $p_\mathrm{T}$ of the $Z$ boson, quantifying the migrations from (a) particle or (b) parton level to detector level. The quoted values are expressed as a percentage. The matrices are normalised such that the sum of any given row is 100%, although small differences may be present due to rounding.
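The row normalisation of such a migration matrix can be sketched as follows, with invented (truth bin, detector bin) pairs standing in for the simulated events:

```python
def migration_matrix(pairs, n_bins):
    """Row-normalised (percent) migration matrix from (truth_bin, reco_bin) pairs."""
    counts = [[0] * n_bins for _ in range(n_bins)]
    for truth_bin, reco_bin in pairs:
        counts[truth_bin][reco_bin] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # each truth row is normalised to 100%, so the diagonal gives the
        # fraction of events reconstructed in the same bin
        matrix.append([100.0 * c / total if total else 0.0 for c in row])
    return matrix

# Invented migrations: 10 events generated in truth bin 0, 20 in truth bin 1.
pairs = [(0, 0)] * 9 + [(0, 1)] * 1 + [(1, 0)] * 2 + [(1, 1)] * 18
matrix = migration_matrix(pairs, 2)
```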
The choice of binning is determined separately for each variable by performing a multi-dimensional scan in order to strike a reasonable balance between three partially competing aspects: retaining a large number of bins; limiting the relative impact of the statistical uncertainties of the measured data; and ensuring large values (> 50%) for the diagonal elements of the matrices associated with the bin migrations between particle/parton and detector level. As a result, the binning for the differential measurements differs from that shown in Figures 4-6. The stability of the unfolding procedure is assessed by constructing pseudo-data sets, obtained by randomly sampling events from the nominal $t\bar{t}Z$ MC sample such that the pseudo-data sets contain approximately the same number of events as the measured data. So-called 'pull tests' are performed as part of the binning optimisation to verify that the unfolding is stable for the selected number and ranges of bins. In addition, linear re-weightings are applied to the pseudo-data to test the ability of the unfolding procedure to correct the pseudo-data back to their underlying true spectra, obtained from the MC event record. The number of iterations used in the iterative Bayesian unfolding is also optimised with pseudo-experiments: for each iteration, a $\chi^2$ per degree of freedom (ndf) is calculated by comparing the bin contents of the unfolded pseudo-data with those from the previous iteration. In the case of the first iteration, the unfolded pseudo-data are instead compared with the corresponding generator-level distribution. Iterations are performed until the $\chi^2$/ndf value of a given distribution stabilises at a constant value, while the statistical uncertainty returned by the unfolding procedure is kept as low as possible. For all variables, the number of iterations used lies between two and five.
Systematic uncertainties are propagated to the unfolded distributions by varying the detector-level distributions within their uncertainties and repeating the unfolding procedure. The normalised differential cross sections are obtained by dividing the distributions by the integrated fiducial cross section, which is computed by summing the contributions from all bins. The evaluation of the systematic uncertainties is performed after this normalisation and is based on the same prescriptions as employed for the absolute differential measurements.

Results of the differential measurements
The measured differential $t\bar{t}Z$ cross sections unfolded to particle and parton level for the $p_\mathrm{T}$ of the reconstructed $Z$ boson are presented in Figure 9.
The results are displayed in the seven $p_\mathrm{T}^{Z}$ bins used when performing the unfolding, with any additional contributions beyond 400 GeV included in the uppermost bin. The relative contributions of the statistical and systematic uncertainties in each bin are shown in the theory-to-data ratio panels of the upper figures, where the net effect corresponds to the sum in quadrature of the two. In the lower figures, the same relative contributions are shown, together with a decomposition of the systematic uncertainties into various categories.¹¹ The black data points in the upper figures correspond to the measured unfolded data, with error bars representing the sum in quadrature of the statistical and systematic uncertainties. The total uncertainty of this measurement is between 20% and 40%, depending on the bin, with the dominant uncertainty arising from the limited number of data events. Other significant sources of systematic uncertainty are associated with the $t\bar{t}Z$ modelling and $b$-tagging. Figure 10 shows the same set of results for the normalised distributions of this variable. The uncertainties in the normalised cross sections are notably smaller (15-35%) than those of the absolute cross sections because several systematic uncertainties cancel out. The differential cross sections measured in data are compared with the NLO QCD predictions from different $t\bar{t}Z$ generators, as described in Section 3. The predictions are shown for MG5_aMC@NLO interfaced to Pythia 8 (red) or Herwig 7 (magenta), as well as for Sherpa 2.2.1 inclusive (blue) and multi-leg (green). For the $p_\mathrm{T}^{Z}$ measurement, the different generators provide very similar predictions.

Figure 9: Absolute differential $t\bar{t}Z$ cross sections measured at (a) particle level and (b) parton level as a function of the transverse momentum of the reconstructed $Z$ boson. Panels (c) and (d) show the relative contributions of the different categories of systematic uncertainties per bin. The large difference between the $y$-axis scales in (a) and (b) results from the different efficiency and acceptance corrections of the particle- and parton-level measurements, together with the branching-ratio correction of $\mathcal{B}$ = 0.0223 for the combined 3ℓ + 4ℓ channels, which is applied only for the parton-level result.
Results for the other observables described in Table 8 are presented in Figures 11-15. For these variables, only the absolute parton-level differential measurements are shown, with the exception of $N_\mathrm{jets}$, which is unfolded only to particle level. Additional differential $t\bar{t}Z$ predictions at NLO, NLO+NNLL or approximate next-to-next-to-leading-order (nNLO) precision - including EW corrections - are shown in grey for the parton-level results for most of the observables. The calculations were carried out in a similar fashion to that described in Ref. [18], but were specifically performed in the context of this analysis in order to provide predictions for the measured observables and to match the number and ranges of bins for the different variables.¹² These additional parton-level predictions are not provided for two of the observables, namely $p_\mathrm{T}^{\ell,\mathrm{non-}Z}$ and $|\Delta\phi(\ell^{+}, \ell^{-})|$, since the decays of the $t\bar{t}$ pair and the $Z$ boson were not included in the theoretical calculations.
In order to test the overall compatibility between the unfolded measurements and the various predictions, a $\chi^2$/ndf and a corresponding $p$-value are evaluated for each of the differential measurements, separately at parton and particle level and for both the absolute and the normalised cases.
The $\chi^{2}$ value is defined as:
$$\chi^{2} = \sum_{i,j} \left( x_{i}^{\mathrm{data}} - x_{i}^{\mathrm{pred}} \right) C^{-1}_{ij} \left( x_{j}^{\mathrm{data}} - x_{j}^{\mathrm{pred}} \right),$$
where $x_{i}^{\mathrm{data}}$ and $x_{i}^{\mathrm{pred}}$ correspond to the content in bin $i$ of the distributions from the unfolded data and the given prediction, respectively, and $C^{-1}_{ij}$ to the element in row $i$ and column $j$ of the inverse of the covariance matrix for the particular variable. The values $x_{i}^{\mathrm{data}}$ and $x_{i}^{\mathrm{pred}}$ are, therefore, notational shorthands for $\mathrm{d}\sigma_{t\bar{t}Z}/\mathrm{d}X$, or $1/\sigma_{t\bar{t}Z} \cdot \mathrm{d}\sigma_{t\bar{t}Z}/\mathrm{d}X$ in the normalised case.
A given $p$-value can be interpreted as the probability of obtaining a value of $\chi^{2}$ greater than or equal to the quoted value for a particular number of degrees of freedom, where the latter is equal to the number of bins ($N_{\mathrm{bins}}$) in the case of the absolute measurements and $N_{\mathrm{bins}} - 1$ for the normalised measurements.
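As an illustration, the $\chi^{2}$/ndf and $p$-value computation described above can be sketched as follows. This is a minimal example using NumPy and SciPy; the function and variable names are illustrative and not taken from the analysis code, and the handling of the (potentially singular) normalised-case covariance is simplified.

```python
import numpy as np
from scipy.stats import chi2 as chi2_dist

def compatibility(data, pred, cov, normalised=False):
    """Compute chi2/ndf and p-value between an unfolded spectrum and a
    prediction, given the measurement covariance matrix.

    ndf equals the number of bins for absolute measurements and
    (bins - 1) for normalised ones, where one degree of freedom is
    removed by the unit-normalisation constraint.
    """
    diff = np.asarray(data, dtype=float) - np.asarray(pred, dtype=float)
    # chi2 = diff^T C^{-1} diff
    chi2 = float(diff @ np.linalg.inv(np.asarray(cov, dtype=float)) @ diff)
    ndf = len(diff) - (1 if normalised else 0)
    # p-value: probability of a chi2 at least this large for ndf degrees of freedom
    return chi2 / ndf, chi2_dist.sf(chi2, ndf)
```

In a real normalised measurement the covariance matrix is rank-deficient, so one bin is typically dropped (or a pseudo-inverse used) before the inversion; that detail is omitted here for brevity.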
The construction of the covariance matrix is based on the approach described in Ref. [90], and it includes both the statistical and systematic uncertainties. The latter comprise detector-related uncertainties as well as those related to the modelling of the signal and the various background processes. While all sources of uncertainty related to the measurements are incorporated in the covariance matrix elements, uncertainties in the theoretical predictions themselves are omitted, and their impact is therefore not reflected in the quoted $\chi^{2}$ and corresponding $p$-values.
For a given variable, the elements of the covariance matrix, $C_{ij}$, are evaluated using a bootstrap technique, whereby 150 000 Poisson-fluctuated distributions are produced, each corresponding to a pre-unfolded distribution for a given pseudo-experiment. For the detector-related uncertainties, Gaussian-distributed shifts are added coherently to each of the Poisson-fluctuated bin contents, with each shift corresponding to a particular uncertainty source. The shifts are applied as a multiplicative scale relative to the particular bin content, with the magnitude and direction of each shift dictated by the corresponding uncertainty source.
Each of the varied distributions is subsequently unfolded using the nominal acceptance and efficiency corrections, as well as the nominal migration matrix, i.e. those derived from the nominal MadGraph5_aMC@NLO + Pythia 8 signal sample. Gaussian-distributed shifts are then added coherently to the post-unfolding distributions for each of the signal- and background-modelling uncertainty sources. These shifts are also determined and applied as relative variations for each particular source. The relative variations in this case are defined according to the difference between the generated and the unfolded cross section of a given alternative signal or background model, using the nominal corrections in the unfolding.
The resulting changes to the unfolded distributions are then used to determine the elements of the covariance matrix for that particular variable. The covariance matrices for the normalised measurements are constructed in an analogous fashion; in this case, the distributions are normalised to unity after all effects are included. In order to avoid performing the unfolding on distributions with negative bin contents, which can arise from the effects of systematic uncertainties and the subtraction of backgrounds, any such bin contents are set to zero prior to the unfolding.
The uncertainties reflected in the elements $C_{ij}$, initially evaluated as relative values, are then multiplied by the differential cross-section values from the measured data in each particular bin in order to yield absolute uncertainties.
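The toy-based procedure outlined above can be illustrated schematically. The sketch below is a simplified assumption-laden stand-in for the analysis machinery: the unfolding step between the Poisson fluctuation and the post-unfolding modelling shifts is omitted, and the function and argument names are hypothetical. Only the core idea is shown, namely Poisson-fluctuated pseudo-experiments with coherent, Gaussian-scaled systematic shift templates, from which the sample covariance is computed.

```python
import numpy as np

def bootstrap_covariance(nominal, syst_rel_shifts, n_toys=150_000, seed=1):
    """Toy-based covariance matrix for a binned distribution.

    nominal:          expected bin contents of the distribution.
    syst_rel_shifts:  one relative (fractional) shift template per
                      systematic source; each toy receives one Gaussian
                      pull per source, applied coherently across bins.
    """
    rng = np.random.default_rng(seed)
    nominal = np.asarray(nominal, dtype=float)
    # Statistical component: Poisson fluctuation of each bin per toy.
    toys = rng.poisson(nominal, size=(n_toys, nominal.size)).astype(float)
    # Systematic components: multiplicative, coherent across bins.
    for rel in syst_rel_shifts:
        pulls = rng.standard_normal(n_toys)  # one shared pull per toy
        toys *= 1.0 + np.outer(pulls, np.asarray(rel, dtype=float))
    # Sample covariance across the ensemble of toys.
    return np.cov(toys, rowvar=False)
```

For a pure-Poisson bin of expectation 100, the diagonal of the returned matrix converges to roughly 100 (variance equals the mean), while a coherent systematic template induces positive off-diagonal correlations, mirroring the behaviour described in the text.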
In general, each of the individual sources contributing to the full covariance matrix contributes to the off-diagonal terms, including even the limited data sample size, for which non-diagonal contributions arise from correlations between the bins introduced during the unfolding process.
The correlations between the bins in the absolute differential measurements for the trilepton and combined channels are sizeable, in many cases in the 20-55% range, and positive. In the case of the tetralepton channel, where statistical uncertainties are more dominant, the correlations are generally below 20% in absolute value. The correlations in the case of the normalised measurements are generally negative and reach absolute values even larger than in the absolute cross-section measurements, with the most extreme case being a value of $\rho = -0.76$ between the first and second bins of the $|\Delta\phi(Z, t_{\mathrm{lep}})|$ variable in the trilepton channel. For the $Z$-related variables in the combined channel, the effect of the larger data sample is partially balanced by the increase in the number of bins, such that the correlations in the absolute measurements for $p_{\mathrm{T}}^{Z}$ and $|y^{Z}|$ are also positive, but lie in the 15-45% range. In the normalised measurements for these two variables, the correlations are also mostly negative but are smaller in magnitude than for other variables (strictly $|\rho| < 40\%$, but in most cases $|\rho| < 20\%$).

Table 9 summarises the evaluated $\chi^{2}$/ndf and $p$-values used to quantify the compatibility between the measured unfolded data and the various predictions. For the parton-level measurements, the values for the additional theory predictions at NLO, NLO+NNLL or nNLO are also shown for those variables for which predictions are available [18].
Overall, the unfolded spectra from the measured data are compatible with the various predictions for most of the variables considered. For the $p_{\mathrm{T}}^{Z}$ variable in the combined 3ℓ + 4ℓ channel, as well as for $p_{\mathrm{T}}^{\ell,\mathrm{non-}Z}$ and $|\Delta y(Z, t_{\mathrm{lep}})|$ in the trilepton channel, slightly lower $p$-values are obtained for several predictions, but in all cases they are found to be greater than 0.05. For the $p_{\mathrm{T}}^{Z}$ variable in the combined channel, the slightly poorer agreement is driven in large part by the sixth bin ($220\,\mathrm{GeV} \leq p_{\mathrm{T}}^{Z} < 290\,\mathrm{GeV}$). For this variable, however, the $p$-value is larger (0.17) for the additional NLO+NNLL prediction in the absolute differential measurement. For the $|\Delta\phi(t\bar{t}, Z)|$ variable in the tetralepton channel, for which the data exhibit a greater relative fraction of events with larger azimuthal separation between the $Z$ boson and the $t\bar{t}$ system, a slightly better level of agreement is observed for the Sherpa predictions compared with those from MadGraph5_aMC@NLO. As the statistical uncertainties of the measured data are almost always significantly larger than the differences between the predictions, no definite conclusion about the overall compatibility for these observables can be drawn.

The effect of the uncertainties in the fixed-order parton-level theoretical predictions in the rightmost column of Table 9 was also evaluated. This was done by adding terms of the form $\delta_{i}\delta_{j}$ to the covariance matrix, where $\delta_{i}$ ($\delta_{j}$) is the sum in quadrature of the uncertainties associated with the scale and PDF choices for bin $i$ ($j$); the bin contents of the theoretical predictions were, therefore, treated as 100% correlated. The inclusion of these uncertainties leads to a relative increase of 20-50% in the $p$-values for the variables $|\Delta y(Z, t_{\mathrm{lep}})|$, $|\Delta\phi(t\bar{t}, Z)|$ and $p_{\mathrm{T}}^{t\bar{t}}$ relative to those quoted in Table 9. For the variables $|\Delta\phi(Z, t_{\mathrm{lep}})|$, $p_{\mathrm{T}}^{Z}$ and $|y^{Z}|$, the impact is negligible.

Table 9: Summary of the tests of compatibility between the unfolded differential measurements and the various predictions.
Quoted are the $\chi^{2}$/ndf and corresponding $p$-values incorporating all bins for the given variable and based on the assumption that all sources of uncertainty are Gaussian-distributed. The values associated with the additional theory predictions (last column) are included where applicable. These additional predictions are obtained as described in Ref. [18], with their precision depending on the particular variable: NLO for $|\Delta\phi(t\bar{t}, Z)|$ and $p_{\mathrm{T}}^{t\bar{t}}$; NLO+NNLL for $|\Delta\phi(Z, t_{\mathrm{lep}})|$, $|\Delta y(Z, t_{\mathrm{lep}})|$ and $p_{\mathrm{T}}^{Z}$; and nNLO for $|y^{Z}|$.

A difference between the measured inclusive cross section quoted in Section 8 and the cross section based on the integrated absolute parton-level spectra in the combined 3ℓ + 4ℓ channel is observed. The two measurements differ both in terms of the method used and in their selection, due to the use of different $b$-tagging working points (refer to Table 1). Approximately 67% of selected data events are common to both measurements. The compatibility between the two cross-section measurements is evaluated using pseudo-experiments that take into account the correlations between uncertainties, including all sources of statistical and systematic effects, and is found to be at the level of two standard deviations.
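The treatment of fixed-order theory uncertainties described above, in which per-bin scale and PDF uncertainties are taken as 100% correlated across bins, amounts to a rank-one update of the measurement covariance matrix. A minimal sketch (the function name is illustrative, not from the analysis code):

```python
import numpy as np

def add_theory_uncertainty(cov, delta):
    """Return C_ij + delta_i * delta_j, where delta is the per-bin
    theory uncertainty (scale and PDF variations summed in quadrature)
    and the theory bin contents are treated as fully correlated."""
    delta = np.asarray(delta, dtype=float)
    return np.asarray(cov, dtype=float) + np.outer(delta, delta)
```

Because the added term is positive semi-definite, it can only enlarge the covariance, which is why its inclusion increases (or leaves unchanged) the quoted $p$-values.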

Conclusions
Inclusive and differential measurements of the production cross section of a $t\bar{t}$ pair in association with a $Z$ boson are presented. The full $\sqrt{s} = 13$ TeV proton-proton collision data set collected by the ATLAS detector during Run 2 of the LHC between 2015 and 2018, corresponding to 139 fb$^{-1}$, was used for this analysis. Only final states with three or four isolated charged leptons (electrons or muons) were considered for the measurements. The measured inclusive cross section of the $t\bar{t}Z$ process is $\sigma_{t\bar{t}Z} = 0.99 \pm 0.05$ (stat.) $\pm 0.08$ (syst.) pb, in agreement with the SM prediction. The dominant sources of systematic uncertainty in this measurement are associated with the $t\bar{t}Z$ parton-shower modelling, $b$-tagging, and background modelling.
Absolute and normalised differential cross sections were measured as functions of nine different observables sensitive to the MC modelling of the $t\bar{t}Z$ process and to potential BSM effects. The differential cross-section measurements were performed at particle and parton levels in specific fiducial volumes. The unfolded spectra from the measured data are compared with the predictions of different NLO QCD $t\bar{t}Z$ MC generators and with theoretical predictions at NLO, NLO+NNLL and nNLO precision, based on a $\chi^{2}$/ndf and $p$-value compatibility test. For most of the considered observables, good agreement between data and the predictions is observed. The differences between the various predictions are determined to be smaller than the uncertainties of the unfolded data. For the variables $p_{\mathrm{T}}^{Z}$, $p_{\mathrm{T}}^{\ell,\mathrm{non-}Z}$, $|\Delta y(Z, t_{\mathrm{lep}})|$ and $|\Delta\phi(t\bar{t}, Z)|$, the observed and predicted differential results show slightly poorer agreement, but $p$-values greater than 0.05 are obtained in all cases.