Measurements of differential cross-sections in four-lepton events in 13 TeV proton-proton collisions with the ATLAS detector

Measurements of four-lepton differential and integrated fiducial cross-sections in events with two same-flavour, opposite-charge electron or muon pairs are presented. The data correspond to 139 fb−1 of s\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$ \sqrt{s} $$\end{document} = 13 TeV proton-proton collisions, collected by the ATLAS detector during Run 2 of the Large Hadron Collider (2015–2018). The final state has contributions from a number of interesting Standard Model processes that dominate in different four-lepton invariant mass regions, including single Z boson production, Higgs boson production and on-shell ZZ production, with a complex mix of interference terms, and possible contributions from physics beyond the Standard Model. The differential cross-sections include the four-lepton invariant mass inclusively, in slices of other kinematic variables, and in different lepton flavour categories. Also measured are dilepton invariant masses, transverse momenta, and angular correlation variables, in four regions of four-lepton invariant mass, each dominated by different processes. The measurements are corrected for detector effects and are compared with state-of-the-art Standard Model calculations, which are found to be consistent with the data. The Z → 4ℓ branching fraction is extracted, giving a value of (4.41 ± 0.30) × 10−6. Constraints on effective field theory parameters and a model based on a spontaneously broken B − L gauge symmetry are also evaluated. Further reinterpretations can be performed with the provided information.


Introduction
This paper presents measurements of differential and integrated fiducial cross-sections of four-lepton events, containing two same-flavour, opposite-charge electron or muon pairs. The data used correspond to 139 fb −1 of √ = 13 TeV proton-proton collisions, collected by the ATLAS detector [1,2] during Run 2 of the Large Hadron Collider (LHC) [3] between 2015 and 2018.
Several interesting Standard Model (SM) processes contribute to this final state, with the possibility of additional contributions from beyond-the-SM (BSM) physics. The dominant SM contribution is from the quark-induced -channel¯→ 4ℓ process, shown in Figure 1(a). Gluon-induced → 4ℓ production contributes at next-to-next-to-leading order (NNLO) in quantum chromodynamics (QCD), via a quark loop, as shown in Figure 1(b). The → 4ℓ process, shown in Figure 1(c), dominates in the four-lepton invariant mass, 4ℓ , region close to the boson mass, [4]. The → ( * ) ( * ) → 4ℓ process, shown in Figure 1(d) for the gluon-gluon production mode, dominates in the 4ℓ region close to the Higgs boson mass, [4]. Here the superscript ( * ) refers to a particle that can be either on-shell or off-shell.
BSM contributions can arise from modifications to the SM couplings of the Higgs boson, the gauge bosons and from possible four-fermion interactions. Contributions are also possible from models producing four leptons either via the decay of bosons or of new BSM particles. For example, cascade decays of new particles introduced by the Minimal Supersymmetric SM, with parameters set such that searches based on missing transverse momentum [5] are insensitive, can nevertheless contribute to four-lepton final states [6].
Other examples include generic models with additional gauge boson(s), , which may be pair-produced and decay into lepton pairs, or models with additional Higgs bosons which may decay into a pair of bosons.  Figure 1: Main contributions to the → 4ℓ(ℓ = , ) process: (a) -channel¯→ 4ℓ production, (b) gluon-induced → 4ℓ production via a quark loop, (c) internal conversion in boson decays and (d) Higgs-boson-mediated -channel production (here: gluon-gluon fusion). The superscript ( * ) refers to a particle that can be either on-shell or off-shell, whereas * indicates that it is always off-shell.
The measurements are corrected for the effects of the detector. Observables are defined in terms of final-state particles, rather than in terms of a particular process. The definition is inclusive of particles in addition to the two lepton pairs, including neutrinos, other leptons, hadrons, photons and any possible BSM particles, although the leptons are required to be isolated from other particles. There are therefore small SM contributions from top-quark pair production in association with a dilepton pair, from triboson processes, where at least two bosons decay leptonically, and from events where -leptons decay to muons or electrons. Using this definition results in minimal dependence of the measurement on the modelling of these other SM processes. The dependence on SM modelling is transferred to the theoretical predictions that are used in comparisons with the data.
Cross-sections are measured differentially as a function of various kinematic variables, and integrated fiducial cross-sections are also provided. The primary observable, 4ℓ , is measured inclusively, sliced in other kinematic variables, and in different lepton flavour categories. Additional differential cross-sections are measured, including dilepton invariant masses and transverse momenta, and angular correlation variables between leptons, in four regions of 4ℓ , each dominated by different processes. The 4ℓ distributions and a subset of the other variables, measured in a region where 4ℓ is above the on-shell production threshold, have been presented previously at this centre-of-mass energy, using a smaller dataset [7][8][9] and by the CMS collaboration using a similar dataset [10]. The current result is more inclusive than previous measurements, in particular the previous requirement that the invariant mass of at least one of the dilepton pairs be close to is removed. This gives sensitivity to BSM processes where there are sources of dilepton pairs other than boson decays. The cross-section measurement in the region close to is used to extract the → 4ℓ branching fraction.
BSM contributions to this final state may lead to discrepancies between the measured cross-sections and the SM predictions. The measurements can therefore be used to set limits on a wide range of BSM models. The methods used to correct the data for the effects of the detector are shown to be robust against the addition of BSM contributions, as is discussed in Section 5.4. Since the cross-sections are defined at particle-level, they can be compared with BSM theory predictions or improved SM predictions without the need to simulate the ATLAS detector. The data and the SM predictions are available in HEPData [11], with the analysis included in the Rivet [12] library, allowing straightforward comparison with other predictions.

ATLAS detector
The ATLAS detector at the LHC covers nearly the entire solid angle around the collision point, as detailed below for each sub-detector. 1 It consists of an inner tracking detector (ID) surrounded by a thin superconducting solenoid, electromagnetic and hadronic calorimeters, and a muon spectrometer incorporating three large superconducting toroidal magnets.
The ID, immersed in a 2 T axial magnetic field, provides charged-particle tracking for | | < 2.5. It consists of a high-granularity silicon pixel detector covering the vertex region and typically provides four measurements per track. This is followed by the silicon microstrip tracker, which usually provides eight measurements per track. These silicon detectors are complemented by the transition radiation tracker (TRT), which enables radially extended track reconstruction up to | | = 2.0. The TRT also provides electron identification information based on the fraction of hits (typically 30 in total) above a higher energy-deposit threshold corresponding to transition radiation.
The calorimeter system covers the pseudorapidity range | | < 4.9. Within | | < 3.2, electromagnetic calorimetry is provided by barrel and endcap high-granularity lead/liquid-argon (LAr) calorimeters, with an additional thin LAr presampler covering | | < 1.8 to correct for energy loss in material upstream of the calorimeters. Hadronic calorimetry is provided by the steel/scintillator-tile calorimeter, segmented into three barrel structures within | | < 1.7, and two copper/LAr hadronic endcap calorimeters covering 1.5 < | | < 3.2. The solid angle coverage is completed with forward copper/LAr and tungsten/LAr calorimeter modules optimised for electromagnetic and hadronic measurements respectively.
The muon spectrometer (MS) comprises separate trigger and high-precision tracking chambers measuring the deflection of muons in a magnetic field generated by the superconducting air-core toroids. A set of precision chambers covers the region | | < 2.7 with three layers of monitored drift tubes, complemented by cathode-strip chambers in the forward region, where the background is highest. The muon trigger system covers the region | | < 2.4 with resistive-plate chambers in the barrel, and thin-gap chambers in the endcap regions.
Interesting events are selected by the first-level trigger system implemented in custom hardware, followed by algorithms implemented in software in the high-level trigger [13]. 1 ATLAS uses a right-handed coordinate system with its origin at the nominal interaction point (IP) in the centre of the detector and the -axis along the beam pipe. The -axis points from the IP to the centre of the LHC ring, and the -axis points upwards. Cylindrical coordinates ( , ) are used in the transverse plane, being the azimuthal angle around the -axis. The rapidity is defined as = (1/2) ln [( + )/( − )], where is the energy of a particle and is the momentum component in the beam direction. The pseudorapidity is defined in terms of the polar angle as = − ln tan( /2). Angular distance is measured in units of Δ ≡ √︁ (Δ ) 2 + (Δ ) 2 .

Fiducial definition
The fiducial phase space is defined according to the kinematic acceptance of the detector, with kinematics that ensure a high efficiency to trigger on the events, and is designed to be as inclusive as possible while keeping backgrounds from non-prompt 2 leptons relatively small. This phase space is defined by a kinematic selection applied at particle-level 3 using final-state, prompt leptons (including those from -lepton decays). In order to align the particle-level definition of lepton kinematics as closely as possible to their measurements in the detector, prompt electrons are 'dressed' by adding to their four-momenta the four-momenta of prompt photons within a cone of size Δ = 0.1 around the electron. This is done because the experimental measurement of electron energies in the calorimeter includes the energy of nearby photons. If the photon is near more than one electron, it is assigned to the nearest. This dressing is not applied to prompt muons because their four-momenta are measured using the ID and the MS, and the four-momenta of nearby photons are not included in the measurement. Prompt leptons are required to be isolated from other particles. Their isolation is tested by forming a scalar sum of the transverse momenta, T , of all charged particles within a cone of Δ = 0.3 around the lepton. The ratio of this sum to the T of the lepton is required to be less than 0.16. If another selected lepton is within the cone, the momentum of this lepton is not included in the sum.
Electrons (muons) are required to have T > 7 (5) GeV and | | < 2.47 (2.7). Events are required to contain at least four such leptons that can be grouped into at least two same-flavour, opposite-charge pairs. This results in three possible flavour configurations: + − + − (4 ), + − + − (2 2 ) and + − + − (4 ). Additional particles (leptons, neutrinos, photons, hadrons and possible BSM particles) are allowed to be present in the event. Events are also required to satisfy the following: • T > 20 GeV for the leading lepton (in T ).
• T > 10 GeV for the sub-leading lepton (in T ).
• The invariant mass of any same-flavour, opposite-charge lepton pair that can be formed in the event is required to satisfy ℓℓ > 5 GeV.
• The angular separation between any two leptons in the event is required to satisfy Δ > 0.05.
These requirements are made to minimise experimental uncertainties and are justified in Section 5.1.

Lepton pairing
Leptons are paired in order to define some of the measured variables. The same-flavour, opposite-charge pair with an invariant mass closest to is selected as the primary pair in the event. Of the remaining leptons, the same-flavour, opposite-charge pair with an invariant mass closest to is selected as the secondary pair, completing a quadruplet of leptons. Therefore, only one quadruplet is defined even in events containing more than four leptons. This selection strategy ensures that pairs corresponding to on-shell bosons for the dominant pair production process are formed preferentially.
A number of differential fiducial cross-sections are also measured, providing kinematic information about the events, which are each varyingly sensitive to the modelling of the SM processes, including QCD and electroweak corrections, and to possible BSM contributions. The 4ℓ distribution is measured single-differentially as well as double-differentially with the four-lepton transverse momentum, T,4ℓ , and the absolute four-lepton rapidity, | 4ℓ |. The single-differential 4ℓ distribution is also measured in 4 , 4 and 2 2 events separately.
The following variables are measured in the four regions of 4ℓ defined above: • The invariant mass of the primary (secondary) lepton pair: 12 ( 34 ).
• A variable sensitive to the polarisation of the decaying particle: cos * 12 (cos * 34 ) is the cosine of the angle between the negative lepton in the primary (secondary) dilepton rest frame, and the primary (secondary) lepton pair in the laboratory frame.
• The absolute rapidity difference between the primary and secondary lepton pairs, |Δ pairs |.
• The difference in the azimuthal angle between the primary and secondary lepton pairs, |Δ pairs |.
• The difference in the azimuthal angle between the leading lepton and sub-leading lepton (in T ) of the quadruplet, |Δ ℓℓ |.

Theoretical predictions and simulation
Simulated events are used to correct the observed events for detector effects, as well as to provide the particle-level predictions with systematic uncertainties for comparisons with the measured data. The various simulated samples and their uncertainties are described in this section.
The¯→ 4ℓ process, including → 4ℓ, was simulated with the S v2.2.2 event generator [15]. Matrix elements were calculated at next-to-leading-order (NLO) accuracy in QCD for up to one additional parton and at leading-order (LO) accuracy for up to three additional parton emissions. The higher-order corrections include initial states with a gluon, but the¯→ 4ℓ notation is kept for simplicity. The calculations were matched and merged with the S parton shower based on Catani-Seymour dipole factorisation [16,17], using the MEPS@NLO prescription [18][19][20][21]. The virtual QCD corrections were provided by the O L library [22,23]. The NNPDF3.0nnlo set of PDFs [24] was used, along with the dedicated set of tuned parton-shower parameters (tune) developed by the S authors. All S v2.2.2 samples discussed below use this same PDF set and tune.
An alternative¯→ 4ℓ sample was generated at NLO accuracy in QCD using P -B v2 [25][26][27]. Events were interfaced to P 8.186 [28] for the modelling of the parton shower, hadronisation, and underlying event, with parameters set according to the AZNLO tune [29]. The CT10 PDF set [30] was used for the hard-scattering processes, whereas the CTEQ6L1 PDF set [31] was used for the parton shower. A correction to higher-order precision, defined for this process as the ratio of the cross-section at NNLO QCD accuracy to the one at NLO QCD accuracy, was obtained using a M NNLO QCD prediction [32][33][34][35], and applied as a function of 4ℓ .
A reweighting for virtual NLO electroweak effects [36,37] was applied as a function of 4ℓ to both¯→ 4ℓ samples. A 100% uncertainty was assigned to this reweighting function to account for non-factorising effects in events with high QCD activity [38]. The real higher-order electroweak contribution to 4ℓ production in association with two jets (which includes vector-boson scattering, but excludes processes involving the Higgs boson) was not included in the sample discussed above but was simulated separately with the S v2.2.2 generator. The LO-accurate matrix elements were matched to a parton shower based on Catani-Seymour dipole factorisation using the MEPS@NLO prescription.
Uncertainties due to missing higher-order QCD corrections are evaluated for both¯→ 4ℓ samples [39] using seven variations of the QCD factorisation and renormalisation scales in the matrix elements by factors of one half and two, avoiding variations in opposite directions. The envelope of the effects of these variations is taken as the uncertainty. For the S sample, uncertainties from the choice of nominal PDF set are evaluated using 100 replica variations, as well as by reweighting to the alternative CT14nnlo [40] and MMHT2014nnlo [41] PDF sets, and taking the envelope of these contributions as a combined PDF uncertainty. The effect of the uncertainty in the strong coupling constant S is assessed by variations of ±0.001. For the P + P 8 sample, the PDF uncertainty is evaluated using the 26 pairs of upwards and downwards internal PDF variations within CT10 NLO, as well as reweighting to the NNPDF3.0nnlo and MSTW2008 [42] PDF sets, and taking the envelope of the variations.
The gluon-initiated 4ℓ production process was simulated using S v2.2.2 [43] at LO precision for up to one additional parton emission, with the parton-shower modelling being the same as for the¯→ 4ℓ sample described above. The generator requires ℓℓ > 10 GeV for any same-flavour and opposite-charge lepton pair, which removes a small amount of the phase space included in the measurement. This is recovered by the correction to the NLO QCD calculation described below, for all but the 12 and 34 distributions in the region below 10 GeV, where the prediction from this sample is missing. This is a few percent of the total prediction in the fiducial phase space in this region. The sample includes the → 4ℓ box diagram, the -channel process proceeding via a Higgs boson, → ( * ) → ( * ) ( * ) → 4ℓ, and the interference between the two. In the region 4ℓ > 130 GeV this sample is used to simulate all three contributions. In the region 4ℓ < 130 GeV, where on-shell Higgs boson production dominates, this sample is only used to simulate the → 4ℓ box diagram, with the dedicated samples described below used to simulate Higgs boson production. In this region the interference between the two is negligible and is not simulated. A NLO QCD calculation [44,45] allowing 4ℓ differential -factors to be calculated is used to correct each of these contributions separately, together with an associated uncertainty. The details are the same as those described in Ref. [7]. An additional correction factor of 1.2, taken from the ratio of a NNLO QCD calculation [46,47] to the NLO prediction for off-shell Higgs production, is assumed to be the same for all three components. Scale and PDF uncertainties are obtained in the same way as for the S ¯→ 4ℓ sample described above.
In the region 4ℓ < 130 GeV, where on-shell Higgs boson production is important, and the effect of interference is negligible, dedicated samples are used to model the Higgs boson production processes as accurately as possible. Higgs boson production via gluon-gluon fusion [48], which dominates, was simulated at NNLO accuracy in QCD using the P NNLOPS program [25,[49][50][51][52]. The simulation achieves NNLO accuracy for arbitrary inclusive → observables by reweighting the Higgs boson rapidity spectrum in Hj-MiNLO [48,53,54] to that of HNNLO [55]. P 8 [56] was used with parameters set according to the AZNLO tune to simulate the parton shower and non-perturbative effects. The P prediction used the PDF4LHC15nnlo PDF set [57]. The prediction from the Monte Carlo samples is normalised to the next-to-next-to-next-to-leading-order QCD cross-section plus electroweak corrections at NLO [47,[58][59][60][61][62][63][64][65][66][67]. Higgs boson production via vector-boson fusion (VBF) [68], in association with a vector boson ( ) [69], and in association with a top-quark pair were all simulated with P [25,51,52,68] and interfaced with P 8 for the parton shower and non-perturbative effects, with parameters set according to the AZNLO tune. The P prediction used the PDF4LHC15nlo PDF set [57]. For VBF production, the P prediction is accurate to NLO in QCD, and is normalised to an approximate-NNLO QCD cross-section with NLO electroweak corrections [70][71][72]. For production, the P prediction is accurate to NLO in QCD for up to one additional jet, and is normalised to a cross-section calculated at NNLO in QCD with NLO electroweak corrections [73][74][75][76][77]. The uncertainties in on-shell Higgs boson production are the same as reported in Ref. [78]. The largest components are from the QCD scale and PDF uncertainties affecting the gluon-gluon fusion component.
Other SM processes making smaller contributions to the final state used in the analysis include triboson production ( , and ), collectively referred to as , and¯pairs produced in association with vector bosons (¯,¯), collectively referred to as¯( ). The production of triboson events was simulated with S v2.2.2 using factorised gauge-boson decays. Matrix elements, accurate to NLO for the inclusive process and to LO for up to two additional parton emissions, were matched and merged with the S parton shower based on Catani-Seymour dipole factorisation using the MEPS@NLO prescription. The virtual QCD corrections for matrix elements at NLO accuracy were provided by the O L library. Uncertainties are evaluated in the same way as discussed above for the S ¯→ 4ℓ sample. Two sets of¯( ) samples are used. A sample produced with the M G 5_aMC@NLO 2.3.3 [79] generator at NLO with the NNPDF3.0nlo [24] PDF set is used to provide the particle-level predictions to compare with the measurements. The events were interfaced to P 8.210 [56] using the A14 tune [80] and the NNPDF2.3lo PDF set [81]. Uncertainties due to missing higher-order corrections are evaluated as in the S samples discussed above. Uncertainties in the PDFs are evaluated using the 100 replicas of the NNPDF3.0nlo PDF set. A second sample produced with S v2.2.0 at LO accuracy, using the MEPS@LO set-up with up to one additional parton, is used to perform the detector corrections. The default S v2.2.0 parton shower was used along with the NNPDF3.0nnlo PDF set. A flat uncertainty of ±15% is assigned to these samples to cover any differences between them and the M G 5_aMC@NLO samples, and also their uncertainties.
A small contribution from double-parton scattering, with the dilepton pairs produced in different partonparton interactions, is expected to contribute at the level of 0.1%. This is included in the definition of the final state but neglected in the predictions due to the negligible contribution.
To correct for detector effects, generated events were passed through a G 4-based simulation of the ATLAS detector and trigger [82,83], and then through the same reconstruction and analysis software as applied to the data. Corrections are applied to simulated leptons to account for differences seen between simulation and data. These include differences in lepton reconstruction, identification, isolation and vertex-matching efficiencies, as well as in the momentum scale and resolution, with associated uncertainties [84,85]. The effect of multiple proton-proton interactions in the same bunch crossing, known as pile-up, was emulated by overlaying inelastic proton-proton collisions, simulated with P 8.186 using the NNPDF2.3lo set of PDFs and the A3 tune [86]. The events are then reweighted to reproduce the distribution of the number of collisions per bunch-crossing observed in the data, which have an average of 33.7 for the total dataset, with an associated uncertainty coming from the uncertainty in the inelastic cross-section.

Event selection
Events are selected by requiring at least one of a set of triggers to accept the event. Each trigger requires the presence of one, two or three electrons or muons satisfying a variety of T thresholds [87,88]. The trigger efficiency increases from around 80% for 4ℓ below 80 GeV, to nearly 100% at 4ℓ ∼ 200 GeV and above. Events are required to contain at least one reconstructed collision vertex candidate with at least two associated ID tracks. The vertex with the largest sum of 2 T of tracks is considered to be the primary interaction vertex.
Electron identification utilises a likelihood-based method, combining information from the shower shapes of clusters in the electromagnetic calorimeters, properties of tracks in the ID, and the quality of the track-cluster matching, the latter being based on spatial separation as well the ratio of the cluster energy to the track momentum. The shower shape variables include variables sensitive to the lateral and longitudinal development of the electromagnetic shower, and variables designed to reject clusters from multiple incident particles. A 'loose' identification working point is used [84], with the additional requirement of a hit in the innermost layer of the pixel detector. Muons are identified using information from various combinations of the MS, the ID and the calorimeters. As with electrons, a 'loose' identification working point is used [85], which focuses on recovering efficiency in poorly instrumented detector regions. In particular, ID tracks identified as muons on the basis of their calorimetric energy deposits or the presence of individual muon segments are included in the region | | < 0.1, where the muon spectrometer is only partially instrumented. In addition, stand-alone MS tracks, supplemented using short tracks in the forward pixel detector where possible, are added in the region 2.5 < | | < 2.7, where ID coverage does not permit full independent ID track reconstruction.
The kinematic requirements described in Section 3 for the particle-level selection are also applied to these reconstruction-level 4 electrons and muons, ensuring very little extrapolation into unmeasured regions when correcting for detector effects. The leading and sub-leading leptons must have T > 20 GeV and > 10 GeV, respectively, to ensure efficient triggering of events. The requirements of ℓℓ > 5 GeV and Δ > 0.05 between the leptons suppress contributions from / decays and conversion electrons respectively. To select leptons originating from the primary proton-proton interaction, their tracks are required to have a longitudinal impact parameter, 0 , satisfying | 0 sin( )| < 0.5 mm from the primary interaction vertex. For MS-only muons, no such requirement is made. To avoid the double-counting of particles, if leptons share an ID track, only one survives the selection. Preference is given to higher-T leptons and to muons over electrons, unless the muon has no associated MS track, in which case the electron survives. The leptons satisfying the above criteria are referred to as baseline leptons and are used to form the quadruplet, as detailed in Section 3. Once the quadruplet is formed the following further selection requirements are made. The electrons and muons are required to be isolated from other particles using information from the ID and the calorimeters. The isolation variables form a ratio of the scalar sum of the transverse momenta of all charged particle tracks within a lepton-T dependent cone size, with an additional contribution from calorimeter measurements, to the T of the lepton. The variables have a correction for pile-up, and another correction for tracks or energy deposits originating from other leptons in the event, in order to retain events with closely spaced prompt leptons. Additionally, the transverse impact parameter significance, | 0 |/ 0 , calculated relative to the measured beam-line position must be less than 5 (3) for electrons (muons). The leptons satisfying the above criteria are referred to as signal leptons, and they are a subset of the baseline leptons.
The overall efficiency to reconstruct, identify, isolate and vertex-match electrons varies from 30% at low T and high | | to 98% at high T . For muons it varies from 30% at low T and high | | to more than 99% at high T , and is higher on average than for electrons [89].

Background estimation
Backgrounds where one or more of the reconstructed leptons entering a quadruplet did not originate from a prompt lepton (non-prompt leptons) are estimated using data-driven methods rather than simulation. The backgrounds are subtracted from the data prior to correcting for detector effects. Simulations suggest that the main source of background events is where the quadruplet is formed by a combination of prompt leptons from / * or¯production processes, with additional leptons originating from the decay of hadrons.
The background is estimated with a fake factor method. The method uses three classes of leptons, the signal and baseline leptons defined in Section 5.1, as well as baseline-not-signal leptons, defined as those which pass the baseline selection but fail the signal selection. A quantity called the fake factor is defined as the ratio of the number of signal leptons to the number of baseline-not-signal leptons and measured in a dedicated control region in data, which is enriched in non-prompt leptons. The background yield is estimated by applying the fake factor to each baseline-not-signal lepton in events passing an event selection requiring only baseline instead of signal leptons. Four-lepton events containing prompt baseline-not-signal leptons are removed from the estimation by using simulation predictions.
The control region in which the fake factor is evaluated is defined by selecting events containing a same-flavour and opposite-charge lepton pair, with an invariant mass within 15 GeV of , and at least one other baseline lepton. The leptons from this pair are required to have triggered the event. Events with → ℓℓ decays compose 90% of this sample. The remaining leptons in an event that do not form the candidate pair are likely to be non-prompt and to have originated from hadron decays or, to a lesser extent, from jets misidentified as leptons. There is a small contribution of prompt leptons, primarily from decays, and an even smaller contribution from four-prompt-lepton events. These contributions are subtracted using S v2.2.2 simulations. After this subtraction, the fake factor is measured in bins of lepton T and the number of jets in the event.
An important assumption of the fake-factor method is that the probability of any given lepton to be prompt is uncorrelated with the equivalent probabilities for other leptons in the event. Since this analysis accepts leptons which are separated by as little as Δ = 0.05, this assumption can break down. For example, cascade decays of -hadrons can lead to two or more leptons being close to each other. In such cases, the leptons' prompt/non-prompt probabilities are highly correlated. To account for this, if two leptons are found to be within Δ = 0.4 of each other, one of them is omitted when determining the fake contribution expected from that event. The choice of which lepton to omit depends on the lepton flavour and T . If the leptons are different flavours, the electron is omitted, and if the leptons are of the same flavour, the one with the lower transverse momentum is omitted. This approach was validated in closure tests of the method using simulated events.
In order to limit the impact of statistical fluctuations in the non-prompt background, a smoothing procedure based on a variable-span smoothing technique [90] is used to obtain a smoother background shape and reduce the impact of single outlier events. The overall predicted background is 4% of the selected data in the signal region, and falls quickly from 38% for 4ℓ between 20 and 60 GeV to around 2.5% for 4ℓ above 200 GeV.
Five sources of uncertainty in the background yield are considered. First, the dominant uncertainty for 4ℓ < 150 GeV and 4ℓ > 350 GeV comes from the statistical uncertainty of the number of events with four baseline leptons, used as input to the fake-factor method. Second, dominant in the region 150 < 4ℓ < 350 GeV, are the theory uncertainties of the prompt-lepton subtraction in the control region, predominantly from events, which are dominated by QCD scale variations. Third, the uncertainties in the subtracted contribution of genuine four-prompt-lepton events containing baseline-not-signal leptons (which is estimated from simulation) are propagated to the background estimate, which is a subdominant contribution. Fourth is the statistical uncertainty of the data in the control region used to measure the fake factors, which is also subdominant. Finally, a very small uncertainty in the smoothing technique is included. The total size of the uncertainty in the predicted number of background events varies between 15% for 4ℓ between 20 and 60 GeV and 150% for 4ℓ of 1000 GeV.
The background estimation was validated by comparing the prediction of the method to data in dedicated background-enriched phase-space regions. The first validation region is the 'different-flavour validation region', which is defined to be like the signal region, but requires the leptons of one of the pairs to have different flavours. The second is the 'same-charge validation region', which instead requires the leptons in one of the pairs to have the same charge. After application of the estimated background yield, the data and simulation in both validation regions are in agreement within the statistical uncertainties of the data, as shown in Figure 2 for the 4ℓ observable.
The fake factor method does not account for a very small background contribution from + Υ events. These events amount to 0.2% of the selected data, and 0.8% in the → 4ℓ region, and are removed prior to correcting for detector effects, using an estimate from P 8 simulation. Table 1 shows the number of selected events over the full fiducial phase space and in each 4ℓ region. Also shown is the predicted contribution from each SM process contributing to the final state, as well as the predicted background contribution from non-prompt leptons. The S simulation is used for thē → 4ℓ process. Uncertainties in the predictions arise from the sources discussed in Sections 4, 5.2 and 5.5. Figure 3 shows the 4ℓ distribution, comparing data with these predictions at reconstruction-level, together with the uncertainties. The distribution shows a number of interesting features. There is a peak in the region 4ℓ ∼ , dominated by the → 4ℓ process, and a peak in the region 4ℓ ∼ , dominated by the → 4ℓ processes. At the threshold for producing two on-shell bosons, 4ℓ ∼ 180 GeV, there is an increase in the cross-section. The cross-section then falls steeply.

Detector corrections
After subtracting the background due to non-prompt leptons, the measured event yields are corrected for detector effects using a combination of a per-lepton efficiency correction and an iterative Bayesian unfolding technique [91]. The detector effects include the resolution of the measured kinematic variables and the inefficiencies of reconstructing leptons and triggering on the events. The sum of the SM simulations described in Section 4 are used to provide the relationship between the particle-level observables defined in Section 3 and the reconstruction-level observables defined in Section 5.1.
The first step is a correction for the reconstruction, identification, isolation and vertex-matching efficiency of each lepton in the quadruplet, which is referred to as a pre-unfolding efficiency correction. The efficiency is measured in the simulation as a function of and T for electrons and muons, treating those from decays separately, due to a lower | 0 |/ 0 efficiency. It is defined as the ratio of the number of reconstruction-level leptons (that are matched with Δ < 0.05 to particle-level leptons) to the number of particle-level leptons.
A per-event weight is given by: is the efficiency for the ℎ lepton in the quadruplet, treated as uncorrelated between leptons. The efficiency for a given lepton-flavour is obtained from a weighted-average of the efficiency from leptons that originate from decays and those that do not, using the 4ℓ -dependent admixture expected from the simulation. The per-event weight defined above is applied to events in both data and simulation, taking the and T values from the leptons in data and simulation respectively. This ensures there is minimal dependence on the SM description of the lepton kinematics when correcting the data.
The data are then corrected for events that pass the reconstruction-level selection but fail the particle-level selection. This primarily occurs due to resolution effects, and is corrected by a multiplicative factor, known as a fiducial correction, in each bin of each distribution. Then, the iterative Bayesian procedure is Table 1: Predicted reconstruction-level yields per process and in total, compared with observed data counts, over the full fiducial phase space and in the following regions of 4ℓ : → 4ℓ (60 < 4ℓ < 100 GeV), → 4ℓ (120 < 4ℓ < 130 GeV), off-shell (20 < 4ℓ < 60 GeV or 100 < 4ℓ < 120 GeV or 130 < 4ℓ < 180 GeV) and on-shell (180 < 4ℓ < 2000 GeV). Uncertainties in the predictions include both the statistical and systematic sources. The uncertainty in the total prediction takes into account correlations between processes, and therefore contributions in a given column do not trivially add up in quadrature to give the total. The background row is events with non-prompt leptons, including those from + Υ events. The → 4ℓ row includes only the on-shell Higgs boson contribution, with off-shell contributions included in → 4ℓ. applied using the SM particle-level distribution as the initial prior. The data are unfolded using a migration matrix containing probabilities that an event in a given particle-level bin of a distribution is found in a particular reconstruction-level bin of that distribution. This matrix is formed from events that pass both the particle-level and reconstruction-level event selections. This process is iterated with the prior being replaced by the unfolded data from the previous iteration: three iterations are used for all distributions, with the exception of 12 , |Δ pairs | and |Δ ℓℓ | where only two iterations are applied. The iterations chosen represent the optimal compromise between the increase of the statistical uncertainty with too many iterations, and the increase of the residual bias due to over-regularisation with too few iterations.

Region
The final step is to divide the resulting unfolded distribution by the ratio of the number of events passing both particle-and reconstruction-level selections to the number passing the particle-level selections. This is known as an efficiency correction. Since the reconstruction-level selections have already had the pre-unfolding weights applied, this correction is fairly close to one, but it accounts for any residual effects such as resolution and trigger efficiencies.
For sliced variables the distributions are unfolded simultaneously, properly taking into account the migration between regions. The integrated cross-section is obtained by correcting the total number of observed events, after background subtraction and application of pre-unfolding weights, with the fiducial and efficiency corrections calculated for the inclusive phase space. When unfolding the per-region cross-sections the inter-region migrations are accounted for.
The binning of each distribution is driven by the requirement that the fraction of events in a reconstructionlevel bin that originate from the same bin at particle-level is at least 60% (70%, 80%) if 25 (20,14) or more events are predicted for the reconstruction-level yield of the bin. At least 14 events must be expected in each bin. These requirements ensure that the number of events is approximately Gaussian distributed. There is an additional constraint that bins should be centred on the various resonant peaks in the distributions.
The unfolding method is designed to minimise the dependence on the simulation of the underlying kinematics of the particles. To validate this, a data-driven closure test is performed, where a simulated  Observed reconstruction-level 4ℓ distribution compared with the SM prediction, using S for thē → 4ℓ simulation. The statistical uncertainty of the data is displayed as error bars and systematic uncertainties in the prediction are shown as a grey hashed band. The ratio of the data to the prediction is shown in the lower panel. The -axis is on a linear scale until 4ℓ = 225 GeV, where it switches to a logarithmic scale, as indicated by the double dashes on the axis. There is one additional data event reconstructed with 4ℓ = 2.14 TeV, while 0.4 events are expected from simulation for 4ℓ > 2 TeV. pseudo-data sample is formed by reweighting the simulation such that the distribution agrees with the data. This pseudo-data sample is then unfolded with the nominal simulation and the resulting unfolded distribution is compared with the input reweighted particle-level prediction. They are in good agreement with each other, and the small differences, averaging much less than 1% but reaching 3% in a few bins in some distributions, are taken as a systematic uncertainty of the final result.
In order to demonstrate that these measurements are robust against the presence of BSM physics in the data and can be used to constrain BSM models, a number of BSM signal injection tests are performed. Pseudo-data consisting of SM+BSM simulations are unfolded with the nominal SM simulation and the result is compared with the particle-level SM+BSM simulation. Any differences are interpreted as a bias in the unfolded SM+BSM result. Overall, the results are found to be extremely robust with only small differences seen, all of them within the experimental uncertainties. Models that predict a broad excess over the SM prediction, such as wide resonances and modifications to SM couplings, lead to very small biases in the unfolded distributions. As an example, the addition of a heavy Higgs boson, with various Higgs masses ranging from 300 GeV to 1400 GeV, and a width of 15% of the mass, is studied. The cross-section of the process is scaled such that the change relative to the SM prediction is equivalent to 2 of the data uncertainty. The bias in the unfolded result is always less than 20% of the total experimental uncertainty in any given bin, with most cases and bins being considerably less affected. Without the pre-unfolding corrections applied the bias is up to a factor of two larger, indicating that the pre-unfolding step improves the robustness of the unfolding. The same tests are performed with narrow-width heavy Higgs bosons, which lead to predicted enhancements in a single bin only. These tests lead to slightly larger biases due to the more drastic changes in the predicted shapes of distributions. In this case the bias observed is still always within experimental uncertainties, with the maximum being 50% of the total experimental uncertainty (which is dominated by the statistical component) in any given bin. This bias slightly reduces the cross-section in the bin of interest. This implies that limits placed on narrow resonances will be slightly more stringent, or that claims of an excess would be slightly less significant, than they would be without the bias.

Uncertainties
The statistical uncertainty of the data is dominant for the vast majority of the differential cross-section bins. It is estimated with two approaches: the first is a model-independent approach, using the observed number of events, which is used as the quoted uncertainty in the measurements, and the second uses the expected number of SM events, which is appropriate when testing the observed cross-section against the SM prediction. In the first approach, 3500 pseudo-datasets are generated by assigning random Poissondistributed weights of mean one to the data events, and taking the root mean square of the differences between all the unfolded results obtained using the pseudo-datasets. The statistical uncertainties obtained in this way are equivalent to frequentist confidence intervals in the large-sample limit, while in the bins with few entries, the quoted bands are known to be up to 10% narrower than a frequentist confidence interval. In the second approach, 3500 Poisson-distributed pseudo-datasets are generated with a mean equal to the predicted reconstruction-level SM event yield in each bin, unfolding each one and taking the root mean square of the differences between all the unfolded results.
The systematic uncertainties of the measured cross-sections are evaluated by repeating the measurement after applying each associated variation and comparing the unfolded result with the nominal one. Significant contributions arise from uncertainties in the lepton reconstruction, identification, isolation and trackto-vertex matching efficiencies, and momentum resolution and scale. These uncertainties are derived from the data-driven measurements used to determine the factors applied to the simulation [84,85], as discussed in Section 4. Another important source of uncertainty arises from the choice of generator in the simulation of the¯→ 4ℓ process (the nominal S prediction and the alternative P + P 8 prediction, both introduced in Section 4) used to unfold the results. The predominant effect on the measured distributions comes from a known difference between the two generators' modelling of the final-state radiation of photons. When this difference is evaluated the P + P 8 prediction is first reweighted to match the S prediction for the distribution being unfolded. This is to avoid double counting with the data-driven closure test, described in Section 5.4. Particularly important in the tails is the uncertainty in the estimate of the background from non-prompt leptons, as described in Section 5.2. A flat uncertainty of ±1.7% is assigned as a result of the uncertainty in determining the luminosity for the Run 2 dataset [92,93]. Smaller uncertainties come from the slight non-closure in the data-driven test for the unfolding, described in Section 5.4, the statistical uncertainty of the simulated samples, the uncertainty in the inelastic cross-section, and the other theory uncertainties, the last two both introduced in Section 4. The theory uncertainties have small effects on the unfolding due to changes in the shape and normalisation of the various SM contributions. They have a much larger effect on the particle-level predictions that are compared with the data. The dominant uncertainty comes from the scale variations for each process. Figure 4 shows the breakdown of the uncertainties for the measured 4ℓ distribution. The statistical uncertainty of the data is the dominant source of uncertainty in all but the third mass bin (at 4ℓ ≈ ), where the uncertainty in lepton efficiencies dominates. The generator uncertainty shows some bin-to-bin fluctuations due to the limited Monte Carlo sample sizes when comparing the unfolding performed with the S and P + P 8 predictions, as well as some real features due to the 4ℓ -dependence of the differences in the generators' modelling of the final-state radiation of photons. 30   The covariance matrices for the statistical and systematic uncertainties are available in HEPData for each distribution. The dominant parts of the lepton uncertainties are highly correlated across bins. The background uncertainties are mostly uncorrelated as they are driven by limited sample size. The systematic uncertainty arising from the choice of generator is highly correlated across bins.
For the theoretical uncertainties of the particle-level predictions, the uncertainties from each individual process are treated as uncorrelated and an additional decorrelation of the scale uncertainty between the four 4ℓ regions for the¯→ 4ℓ process is introduced. This strategy is motivated by the fact that each region probes a very different interaction scale, and a different type of process is dominant. This correlation scheme also leads to the most conservative results when interpreting the data. Table 2 gives the measured cross-sections in the full fiducial phase space and in the four 4ℓ regions, each dominated by a different process, compared with the theoretical predictions described in Section 4. Two predictions are shown, one where the¯→ 4ℓ process is simulated with S at NLO accuracy Table 2: Fiducial cross-sections in fb in the full fiducial phase space and in the following regions of 4ℓ : → 4ℓ (60 < 4ℓ < 100 GeV), → 4ℓ (120 < 4ℓ < 130 GeV), off-shell (20 < 4ℓ < 60 GeV or 100 < 4ℓ < 120 GeV or 130 < 4ℓ < 180 GeV) and on-shell (180 < 4ℓ < 2000 GeV), compared with particle-level predictions and their uncertainties as described in Section 4. Two predictions are shown for the¯→ 4ℓ process simulated with S or with P + P 8. All other SM processes are the same for the two predictions. in QCD and one where it is simulated with P + P 8 normalised to a prediction at NNLO accuracy in QCD, as described in Section 4. All the other SM processes are the same in the two predictions. The S prediction is generally higher than the P + P 8 prediction in all but the on-shell region, where the predictions are very close. The cross-sections measured in data generally agree with both predictions within the quoted uncertainties. The data central values are above the P + P 8 predictions in all regions, and in all but the → 4ℓ region for S . In the on-shell region the S prediction is a bit more than 1 below the data. In Ref.

Region
[94] the → 4ℓ cross-section is measured by ATLAS in a fiducial phase space that differs slightly from the → 4ℓ region measured here. The phase space is designed to minimise the contribution from non-→ 4ℓ processes. In the dedicated Higgs boson measurement the cross-section is found to be slightly below the SM prediction. The dedicated Higgs boson measurement differs from the present measurement in using a slightly different phase space, in subtracting non-Higgs boson processes using a data-driven approach, and in including a ∼ 1% contribution from Higgs boson production in association with a -quark pair in the prediction.
The differential cross-section as a function of 4ℓ is shown in Figure 5, in much finer bins than those in Table 2. The breakdown of the contribution from different SM processes is also shown. The features seen in the reconstruction-level distribution in Figure 3 are also present here. The SM predictions agree well with the measurement within uncertainties over the entire 4ℓ spectrum, with the same features seen as in the comparisons in Table 2. For this distribution, and all the others shown below, two -values for the observed data given the predicted SM cross-section (using either S or P to model the¯→ 4ℓ contribution) are obtained from the 2 . This is defined as 2 where ì meas and ì pred are -dimensional vectors from the measured and predicted differential cross-sections of a given observable respectively, and is the × total covariance matrix defined by the sum of the statistical and systematic covariances in ì meas and ì pred . The statistical covariance on ì meas is obtained from the expected number of SM events, as described in Section 5.5. The -value is the probability for the 2 , with degrees of freedom, to have at least the observed value.
In order to study the different 4ℓ regions in more detail, Figures 6 and 7 show the cross-section versus 12 and 34 respectively in each region. In the → 4ℓ region, the contribution from Higgs boson production is shown separately. The different regions show peaks in different places due to the kinematic constraints of the 4ℓ requirements. For all regions but → 4ℓ there is a clear enhancement at for 12 . Conversely, 34 only has contributions from on-shell bosons in the on-shell region. The → 4ℓ 4l and off-shell regions are dominated by off-shell photon and boson exchange, and the → 4ℓ region is dominated by off-shell production. The data are generally well modelled by the SM predictions within uncertainties, although, as already discussed, the normalisation of the predictions in the on-shell region is lower than the observed measurement, and for the P + P 8 prediction it is lower in all the regions, especially the off-shell region. Combined with some statistical fluctuations in the data this gives some low -values for some of the distributions, especially for the P + P 8 prediction. This is also true for the measurements in Figures 8-11. For 12 in the on-shell region, the predictions are below the data for 12 below , and above the data for 12 above . Here the shapes of the two SM predictions also deviate from each other, indicating differences in the modelling, perhaps related to the modelling of the final-state radiation of photons. The 12 and 34 measurements provide particular sensitivity to BSM models in which the lepton pairs do not come from boson decays, as discussed in Section 6.3.      Figure 6: Differential cross-section as a function of 12 in the four 4ℓ regions. The measured data (black points) are compared with the SM prediction using either S (red, with red hashed band for the uncertainty) or P + P 8 (blue, with blue hashed band for the uncertainty) to model the¯→ 4ℓ contribution. In (b) the contribution from Higgs production is shown in addition to the total SM prediction. The error bars on the data points give the total uncertainty and the grey hashed band gives the systematic uncertainty. The -value is the probability for the 2 , with the number of degrees of freedom equal to the number of bins in the distribution, to have at least the observed value, given the SM prediction. The lower panel shows the ratio of the SM predictions to the data. Figure 8 shows the differential cross-sections as a function of |Δ pairs |, |Δ pairs |, T,12 , and T,34 in the highest cross-section, on-shell region. The |Δ pairs | distribution peaks at , with the dilepton pairs back-to-back. The |Δ pairs | distribution peaks at zero, with a tail going out to five. The T,12 and T,34 distributions peak at around 40 GeV. Overall, the SM gives a reasonable description of the kinematics in this region, although the SM prediction is about 20% lower than the data for 2.6 < |Δ pairs | < 3.2 and 50% lower for |Δ pairs | > 3.2, indicating mis-modelling by the simulation in this region of phase space.    : Differential cross-section as a function of 34 in the four 4ℓ regions. The measured data (black points) are compared with the SM prediction using either S (red, with red hashed band for the uncertainty) or P + P 8 (blue, with blue hashed band for the uncertainty) to model the¯→ 4ℓ contribution. In (b) the contribution from Higgs production is shown in addition to the total SM prediction. The error bars on the data points give the total uncertainty and the grey hashed band gives the systematic uncertainty. The -value is the probability for the 2 , with the number of degrees of freedom equal to the number of bins in the distribution, to have at least the observed value, given the SM prediction. The lower panel shows the ratio of the SM predictions to the data. Figure 9 shows the cross-section versus |Δ ℓℓ | for each region, and good agreement with the SM prediction is seen. In each region the cross-section peaks when the two leading leptons are back-to-back. This variable is sensitive to electroweak corrections to the four-lepton final state [95].    region. The measured data (black points) are compared with the SM prediction using either S (red, with red hashed band for the uncertainty) or P + P 8 (blue, with blue hashed band for the uncertainty) to model the¯→ 4ℓ contribution. The error bars on the data points give the total uncertainty and the grey hashed band gives the systematic uncertainty. The -value is the probability for the 2 , with the number of degrees of freedom equal to the number of bins in the distribution, to have at least the observed value, given the SM prediction. The lower panel shows the ratio of the SM predictions to the data. In (c) the -axis is on a linear scale until T,12 = 90 GeV, where it switches to a logarithmic scale; in (d) it switches at T,34 = 86 GeV.      Figure 9: Differential cross-section as a function of |Δ ℓℓ | in the four 4ℓ regions. The measured data (black points) are compared to the SM prediction using either S (red, with red hashed band for the uncertainty) or P + P 8 (blue, with blue hashed band for the uncertainty) to model the¯→ 4ℓ contribution. The error bars on the data points give the total uncertainty and the grey hashed band gives the systematic uncertainty. The -value is the probability for the 2 , with the number of degrees of freedom equal to the number of bins in the distribution, to have at least the observed value, given the SM prediction. The lower panel shows the ratio of the SM predictions to the data. =13 TeV, 139 fb s =13 TeV, 139 fb s    Figure 10: Differential cross-section as a function of cos * 12 in the four 4ℓ regions. The measured data (black points) are compared with the SM prediction using either S (red, with red hashed band for the uncertainty) or P + P 8 (blue, with blue hashed band for the uncertainty) to model the¯→ 4ℓ contribution. The error bars on the data points give the total uncertainty and the grey hashed band gives the systematic uncertainty. The -value is the probability for the 2 , with the number of degrees of freedom equal to the number of bins in the distribution, to have at least the observed value, given the SM prediction. The lower panel shows the ratio of the SM predictions to the data. =13 TeV, 139 fb s =13 TeV, 139 fb s    Figure 11: Differential cross-sections as a function of cos * 34 in the four 4ℓ regions. The measured data (black points) are compared with the SM prediction using either S (red, with red hashed band for the uncertainty) or P + P 8 (blue, with blue hashed band for the uncertainty) to model the¯→ 4ℓ contribution. The error bars on the data points give the total uncertainty and the grey hashed band gives the systematic uncertainty. The -value is the probability for the 2 , with the number of degrees of freedom equal to the number of bins in the distribution, to have at least the observed value, given the SM prediction. The lower panel shows the ratio of the SM predictions to the data.

Extraction of the → 4ℓ branching fraction
The → 4ℓ branching fraction is extracted using the measured cross-section in the → 4ℓ region, meas = 22.1 ± 1.3 fb, shown in Table 2. An extrapolation is performed using particle-level simulation to a phase space that was used in previous measurements of the branching fraction [7, 9,96], allowing direct comparison. This phase space is defined by only two requirements: 80 < 4ℓ < 100 GeV and ℓℓ > 4 GeV, with no requirements made on the kinematics of the individual leptons. The definition of the final state is based on leptons before photon radiation (Born level), with no isolation applied and excludes leptons from decays. An acceptance factor fid = 0.0852 ± 0.0015 is defined as the ratio of simulated particle-level events passing the fiducial selection used for meas to those in the extended phase space, calculated using S . The uncertainty comes from the same sources as discussed in Section 4 and is dominated by the scale variations.
The branching fraction is calculated as where pred non-¯→ 4ℓ = 0.22 ± 0.04 fb is the predicted fiducial cross-section from sources other than → 4ℓ, obtained from the respective simulations; Z = 0.952 ± 0.005 is the fraction of¯→ 4ℓ coming from single production rather than the -channel process, obtained from P -B v2; non-= 0.99186±0.00014 is the fraction of events where no leptons originate from decays, obtained from S ; and is the total cross-section for single-production, taken from the ATLAS measurement in Ref. [97] using 81 pb −1 , with a correction of 0.935 ± 0.001 to account for the different mass window. The uncertainties in where the statistical, systematic and luminosity uncertainties come from the measurements of meas and , and the theory uncertainty includes contributions from fid , Z , non-and pred non-¯→ 4ℓ . Since the measurement of was performed with only 81 pb −1 of collision data, all detector-related systematic uncertainties as well as the luminosity uncertainty are conservatively treated as uncorrelated between meas and . The measurement is compatible with previous measurements [7, 9,96] and with the SM prediction of (4.50 ± 0.01) × 10 −6 , calculated with P [96]. The current measurement is the most precise to date, and benefits from an acceptance gain of 130% relative to the previous ATLAS measurement in Ref. [7].

BSM interpretation
The measured cross-sections can be used to constrain BSM models by using the central values and covariance matrices made available in HEPData. For convenience, the SM particle-level predictions described in this paper are also included, so that comparisons can be made simply by adding BSM simulations to the provided SM predictions. In this section, two well-motivated interpretations are given as examples: constraints on effective field theory parameters and constraints on a model with a boson, where the bosons may be produced in pairs. Limits are set by constructing a likelihood function, where ì meas and ì pred and are defined in Section 6.1. In this case only the uncertainty in the SM contribution to ì pred is included in . In order to avoid an underestimation of the statistical uncertainty in bins where the data has fluctuated downward, the statistical uncertainty of the expected SM yield is used. It was shown that using the observed statistical uncertainty leads to test statistics with reduced exclusion power and with probability distributions that are far from asymptotic. The BSM theoretical uncertainties are included as nuisance parameters, ì , with Gaussian constraints, G ( , 0, 1). Since there are a number of measured differential cross-sections, and the statistical correlation between them is not determined, the variable providing the best expected sensitivity is chosen to set the limits for a given point in BSM parameter space. A variable measured in slices of another variable counts as one observable.

Effective field theory constraints
A SM effective field theory (SMEFT) formalism using the Warsaw basis [98] is considered. The effective Lagrangian is defined as where O correspond to operators of dimension describing the new interactions, the coefficients specify the strength of the interactions and are known as Wilson coefficients, and Λ is the scale of the new physics. Only dimension-six operators are considered for this paper as they are the lowest-dimension operators that conserve lepton and baryon number, and the effect of higher-dimensional operators is expected to be suppressed. Since at energies well below Λ only the ratio ( =6) /Λ 2 is accessible, Λ is absorbed into the Wilson coefficients, and they are redefined as = ( =6) /Λ 2 .
The SMEFTsim package [99] is used for the SMEFT implementations of FeynRules [100]. Monte Carlo samples were simulated with M G 5_aMC@NLO2.6.5 + P 8.243 at LO precision in QCD [79], with the NNPDF3.0nlo set of PDFs. Uncertainties come from missing higher-order QCD corrections, PDFs and from the uncertainty in S , using the same strategy as is used for the SM¯→ 4ℓ predictions, described in Section 4.
The set-up allows the separate generation of the SM only, BSM only and interference contributions, with the total predicted cross-section given by where ì SM is the most accurate SM predicted cross-section, described in Section 4, using S for the¯→ 4ℓ prediction, · ì INT is the interference term between the SM and BSM contributions, also referred to as the linear term, and 2 · ì BSM is the BSM-only cross-section, also referred to as the quadratic term. Since the BSM contributions are LO cross-sections, they are scaled by the ratio of ì SM to ì LO SM , which is the LO SM prediction given in the above set-up, where the assumption is that higher-order effects are the same for SM contributions as they are for contributions from new physics. A minimal flavour-violating scenario is assumed and the full list of 59 dimension-six operators can be found in Ref. [98]. Only the following 22 coefficients that give non-negligible contributions to the four-lepton final state are considered: • three affecting Higgs couplings: ,˜, ; • one affecting gauge boson couplings: ; • seven affecting the → ℓℓ vertex: , , , (1) , (3) , (1) , (3) ; • eleven from four-fermion interactions (contact terms): , , , , , , (1) , (1) , (3) , , .
Each one is considered separately with the others set to zero.
The test statistic is based on a ratio of profiled likelihoods [101], where is the Wilson coefficient,ˆandì are the unconditional maximum-likelihood estimators, andì is the conditional maximum likelihood estimated under the hypothesis. The value of the test statistic is used to compare the different probabilities of hypotheses to construct a likelihood scan over . For simplicity the uncertainties are taken to be fixed at the SM value, = 0, rather than scaling with . It has been shown that this leads to very similar results and is much less computationally expensive. In the case that the signal results in a small enhancement relative to the SM yield, with a dominant linear (interference) term, the use of uncertainties fixed at the SM value results in an asymptotic test statistic. For large, local enhancements or cases where the quadratic term dominates, the behaviour becomes non-asymptotic. For this reason Monte Carlo toys are used to evaluate the distribution of the test statistic when obtaining 95% confidence level (CL) intervals.
Two sets of results are presented: those where the full predicted cross-section from Eq. (3) is used, referred to as the full EFT model, and those where the quadratic term in Eq. (3) is neglected, referred to as the linear-only model. Since the linear terms of the missing dimension-eight operators in the effective Lagrangian expansion are at the same order in an expansion in 1/Λ as the quadratic terms of the dimension-six operators, large differences indicate that the neglected dimension-eight operators may play a non-negligible role. Tables 3 and 4 show the observed and expected exclusion limits at 95% CL for the 22 Wilson coefficients, for the full EFT model and for the model with only linear terms, respectively. Also shown in the tables is the most sensitive observable that is used to set the limit. The results are also presented graphically in Figure 12. When comparing the results in Tables 3 and 4, the coefficients fall into four categories: those ( , , , (1) , (3) , (3) and (1) ) that have a negligible contribution from the quadratic term and therefore have very similar limits whether or not the quadratic term is included; those ( (1) and (3) ) that, because of the quadratic term, make small positive contributions to the total prediction, enhancing (lessening) the positive (negative) linear contribution at positive (negative) values of the coefficient, and hence shifting both the lower and upper limits to slightly lower values; those ( and ) that have more stringent limits when using the linear-only terms because of the presence of double maxima in the likelihood scans when the quadratic term is included; and the remaining 11 coefficients that have non-negligible contributions from the quadratic term and hence more stringent limits when they are included (including˜, which has zero contribution from the linear term). The differences seen in the limits in the last category indicate that dimension-eight terms may not be negligible.
In general the observed limits are compatible with the expected limits. One exception is the observed lower limit on (1) , which is significantly less stringent than the expected limit for the results using linear-only terms. This is an artifact of the simple model of the scale uncertainty affecting the EFT prediction. In general, for a large EFT signal that is incompatible with the data, the EFT scale uncertainty's nuisance parameter is pulled in order to bring the prediction closer to the observation. The nuisance parameter's constraint term penalises this behaviour in the likelihood. However, for large negative values of (1) the size and shape of the scale uncertainty's effect on the signal prediction are related in a way that produces a prediction precisely imitating the statistical fluctuations in the measured cross-section, increasing the likelihood. This leads to a larger than expected observed lower limit on (1) .
Limits have already been placed on ,˜and , in a measurement of the → 4ℓ crosssection [102]. For the limits are more stringent in the → 4ℓ cross-section analysis, but for˜the limits are very similar, with this paper providing slightly tighter constraints. For the constraints from this paper are significantly more stringent with [−0.20, 0.21] expected and [−0.29, 0.13] observed compared to [−1.09, 0.99] expected and [−1.06, 0.99] observed in the → 4ℓ cross-section paper. The improvement can be understood as being due to the fact that changes in affect the entire 4ℓ spectrum, not just the region close to . Limits on the coefficients affecting the → ℓℓ vertex have been obtained previously from a global fit to LEP and LHC data [103], and are generally one or two orders of magnitude more stringent than the limits in this paper.

− gauge model constraints
The data are used to set limits on a model in which the global baryon-number-minus-lepton-number ( − ) symmetry is treated as a local gauge symmetry and spontaneously broken [104]. This model predicts a , the gauge boson of the new local symmetry, and an exotic Higgs boson ℎ 2 , from the symmetry breaking, that mixes with the SM Higgs boson, with a mixing angle . The new Higgs boson and the SM Higgs boson can decay into or , with the gauge bosons decaying into lepton pairs.
The sensitivity of previous LHC measurements to this model was presented in Ref. [105], where four-lepton measurements were seen to provide significant constraints. One of the scans from that study is repeated here using all the measured observables, to derive exclusion limits in the plane of sin versus the mass of ℎ 2 , ℎ 2 , for a low-mass (35 GeV) which is weakly coupled to the SM ( = 10 −3 ). In this scenario, depending on the other parameters, and production via one of the Higgs bosons may lead to contributions to the cross-sections measured in this paper.
BSM events were generated using Herwig 7.2 [106,107]. Interference terms, and the impact of the model on SM decays of the SM Higgs, are not taken into account. The impact of this on the limits is expected to be negligible.
where determines the strength of the signal process, with = 0 corresponding to the SM hypothesis and = 1 being the nominal signal hypothesis. The CL s method [108] is then used to derive the 95% CL exclusion contour in the (sin , ℎ 2 ) plane as shown in Figure 13(a). 5 Figure 13(b) displays for each (sin , ℎ 2 ) point the observable which has the highest expected sensitivity and was therefore used to set the limit. The most sensitive observable changes in the region ℎ 2 ∼ 125 GeV since the phenomenology of the model changes as the mass of the exotic Higgs boson approaches that of the SM Higgs boson. In order to demonstrate the power of the different variables, a similar scan was performed using the 4ℓ distribution only, and the expected limits on sin weakened from 0.28 to 0.46 at high ℎ 2 . As can be seen in Figure 13(b), improvements at high ℎ 2 come mostly from including measurements of 12 . The previous study [105] only covers sin > 0.4 for most values of ℎ 2 , and provides no constraints for ℎ 2 > 600 GeV. The new overall limits significantly improve on this, excluding sin > 0.28 for most of the plane, even at high ℎ 2 , and, for example, for ℎ 2 = 500 GeV, the limit on sin improves from 0.5 to 0.26.  Figure 13: (a) Exclusion contour at 95% confidence level for the − gauge model [104] in the plane of sin versus ℎ 2 for = 35 GeV and = 10 −3 . The red line shows the observed exclusion and the solid black line shows the expected exclusion, with the dashed black lines indicating the ±1 and ±2 uncertainty bands. The region above and to the left of the lines is excluded, up to the grey band, which shows the region ℎ 2 < 2 , where the four-lepton final state has no sensitivity. (b) Colour palette indicating which observable has the highest expected sensitivity at each point in the parameter space and is used to set the limit.

Conclusion
Measurements of inclusive and differential fiducial cross-sections in four-lepton events are presented using 139 fb −1 of √ = 13 TeV proton-proton collisions, collected by the ATLAS detector during Run 2 of the LHC. Included are measurements of the four-lepton invariant mass, the dilepton masses, and other kinematic variables of the four leptons and dilepton pairs, including angular correlations. The measured phase-space is significantly increased compared to previous measurements with four leptons, in particular reducing requirements that specifically enhance the contribution from SM production. The methodology used to correct for detector effects was developed to have minimal bias in the case of possible BSM contributions to the final-state. The measured cross-sections are compared with state-of-the-art SM predictions and are found to be in agreement.
The final state has contributions from a number of SM processes, including → 4ℓ, → 4ℓ and the dominant quark-induced -channel¯→ 4ℓ process. The region dominated by → 4ℓ production is used to extract the most precise measurement of the → 4ℓ branching fraction to date, B →4ℓ = (4.41 ± 0.13 (stat.) ± 0.23 (syst.) ± 0.09 (theory) ± 0.12 (lumi.)) × 10 −6 . The result is consistent with previous measurements and with the SM prediction.
Various BSM effects can contribute to the final state. The data are used to constrain SMEFT coefficients and a model based on a spontaneously broken − gauge symmetry. All necessary information for a detailed reinterpretation has been made public so it can be used to probe further BSM models in the future.

Appendix
This Appendix shows the differential cross-sections that are not included in Section 6.1. In each case the data are compared with the SM predictions. Two predictions are shown, one where the¯→ 4ℓ process is simulated with S and one where it is simulated with P + P 8. All the other processes are the same for both predictions. Figure 14 shows the 4ℓ distribution separately for the three channels 4 , 4 and 2 2 ; Figures 15 and 16 show it in slices of T,4ℓ and | 4ℓ | respectively. The SM predictions describe the data well for most of the distributions. However, the P + P 8 prediction in the highest-T slice in Figure 15, and in the off-shell region in the 0.9 < | 4ℓ | < 1.2 slice in Figure 16, is below the data. Figures 17,18,19, and 20 respectively show T,12 , T,34 , |Δ pairs | and |Δ pairs | separately in the → 4ℓ, → 4ℓ and off-shell regions. The on-shell region is shown in Section 6.1. The data are well described by the SM predictions. The -values for the |Δ pairs | distribution in the off-shell region are low, which is mainly driven by statistical fluctuations in the data, together with predictions that are lower than the observations at low |Δ pairs | values, especially for the P + P 8 prediction.  Figure 14: Differential cross-section as a function of 4ℓ for each lepton flavour channel. The measured data (black points) are compared with the SM prediction using either S (red, with red hashed band for the uncertainty) or P + P 8 (blue, with blue hashed band for the uncertainty) to model the¯→ 4ℓ contribution. The error bars on the data points give the total uncertainty and the grey hashed band gives the systematic uncertainty. The -value is the probability for the 2 , with the number of degrees of freedom equal to the number of bins in the distribution, to have at least the observed value, given the SM prediction. The lower panel shows the ratio of the SM predictions to the data. The -axis is on a linear scale until 4ℓ = 225 GeV, where it switches to a logarithmic scale.  Figure 15: Differential cross-section as a function of 4ℓ in T,4ℓ slices. The measured data (black points) are compared with the SM prediction using either S (red, with red hashed band for the uncertainty) or P + P 8 (blue, with blue hashed band for the uncertainty) to model the¯→ 4ℓ contribution. The error bars on the data points give the total uncertainty and the grey hashed band gives the systematic uncertainty. The -value is the probability for the 2 , with the number of degrees of freedom equal to the number of bins in the distribution, to have at least the observed value, given the SM prediction. The lower panel shows the ratio of the SM predictions to the data. The -axis is on a linear scale until 4ℓ = 234 GeV for the T,4ℓ < 10 GeV slice and 4ℓ = 225 GeV for all other slices, when it switches to a logarithmic scale.   Figure 16: Differential cross-section as a function of 4ℓ in | 4ℓ | slices. The measured data (black points) are compared with the SM prediction using either S (red, with red hashed band for the uncertainty) or P + P 8 (blue, with blue hashed band for the uncertainty) to model the¯→ 4ℓ contribution. The error bars on the data points give the total uncertainty and the grey hashed band gives the systematic uncertainty. The -value is the probability for the 2 , with the number of degrees of freedom equal to the number of bins in the distribution, to have at least the observed value, given the SM prediction. The lower panel shows the ratio of the SM predictions to the data. The -axis is on a linear scale until 4ℓ = 225 GeV, where it switches to a logarithmic scale.   Figure 17: Differential cross-section as a function of T,12 in 4ℓ regions. The measured data (black points) are compared with the SM prediction using either S (red, with red hashed band for the uncertainty) or P + P 8 (blue, with blue hashed band for the uncertainty) to model the¯→ 4ℓ contribution. The error bars on the data points give the total uncertainty and the grey hashed band gives the systematic uncertainty. The -value is the probability for the 2 , with the number of degrees of freedom equal to the number of bins in the distribution, to have at least the observed value, given the SM prediction. The lower panel shows the ratio of the SM predictions to the data. In (a) the -axis is on a linear scale until T,12 = 44 GeV, where it switches to a logarithmic scale; in (b) it switches at T,12 = 50 GeV, and in (c) at T,12 = 54 GeV.    Figure 18: Differential cross-section as a function of T,34 in 4ℓ regions. The measured data (black points) are compared with the SM prediction using either S (red, with red hashed band for the uncertainty) or P + P 8 (blue, with blue hashed band for the uncertainty) to model the¯→ 4ℓ contribution. The error bars on the data points give the total uncertainty and the grey hashed band gives the systematic uncertainty. The -value is the probability for the 2 , with the number of degrees of freedom equal to the number of bins in the distribution, to have at least the observed value, given the SM prediction. The lower panel shows the ratio of the SM predictions to the data. In (a) the -axis is on a linear scale until T,34 = 36 GeV, where it switches to a logarithmic scale; in (b) it switches at T,34 = 34 GeV, and in (c) at T,34 = 54 GeV.   Figure 19: Differential cross-section as a function of |Δ pairs | in 4ℓ regions. The measured data (black points) are compared with the SM prediction using either S (red, with red hashed band for the uncertainty) or P + P 8 (blue, with blue hashed band for the uncertainty) to model the¯→ 4ℓ contribution. The error bars on the data points give the total uncertainty and the grey hashed band gives the systematic uncertainty. The -value is the probability for the 2 , with the number of degrees of freedom equal to the number of bins in the distribution, to have at least the observed value, given the SM prediction. The lower panel shows the ratio of the SM predictions to the data.   Figure 20: Differential cross-section as a function of |Δ pairs | in 4ℓ regions. The measured data (black points) are compared with the SM prediction using either S (red, with red hashed band for the uncertainty) or P + P 8 (blue, with blue hashed band for the uncertainty) to model the¯→ 4ℓ contribution. The error bars on the data points give the total uncertainty and the grey hashed band gives the systematic uncertainty. The -value is the probability for the 2 , with the number of degrees of freedom equal to the number of bins in the distribution, to have at least the observed value, given the SM prediction. The lower panel shows the ratio of the SM predictions to the data.         The ATLAS Collaboration