Measurement of the four-lepton invariant mass spectrum in 13 TeV proton-proton collisions with the ATLAS detector

A measurement of the four-lepton invariant mass spectrum is made with the ATLAS detector, using an integrated luminosity of 36.1 fb$^{-1}$ of proton-proton collisions at $\sqrt{s}$ = 13 TeV delivered by the Large Hadron Collider. The differential cross-section is measured for events containing two same-flavour opposite-sign lepton pairs. It exhibits a rich structure, with different mass regions dominated in the Standard Model by single $Z$ boson production, Higgs boson production, and $Z$ boson pair production, and non-negligible interference effects at high invariant masses. The measurement is compared with state-of-the-art Standard Model calculations, which are found to be consistent with the data. These calculations are used to interpret the data in terms of $gg\rightarrow ZZ \rightarrow 4\ell$ and $Z \rightarrow 4\ell$ subprocesses, and to place constraints on a possible contribution from physics beyond the Standard Model.

The box diagram gg → 4 and gg → H ( * ) → 4 processes interfere destructively in the SM. While interference is maximal around m 4 = 220 GeV [3], the relative effect of the gg → H ( * ) → 4 contribution to the overall gg → 4 lineshape is most pronounced above 350 GeV, as is visible in Figure 2.
The off-shell Higgs production rate may be affected by beyond-the-SM (BSM) processes involving additional heavy particles, or modifications of the Higgs couplings, even if there is no effect on on-shell Higgs boson production [4].
Previous measurements in this final state were carried out at √ s = 13 TeV by the ATLAS [5] and CMS [6] collaborations with a focus on Z Z production. The CMS result additionally includes a determination of the Z → 4 branching ratio using a dedicated detector-level analysis. The ATLAS Collaboration performed a measurement of inclusive four-lepton production at √ s = 8 TeV [7] and set constraints on the contribution from gg → 4 . An analysis using √ s = 7 TeV and 8 TeV data [8] to determine the Z → 4 branching fraction has also been published by ATLAS. Constraints on off-shell Higgs boson production have recently been set by ATLAS [9] using the 4 and 2 2ν final states in a dedicated detector-level analysis. This measurement is carried out in a fiducial phase space based on the kinematic acceptance of the detector to ensure a high selection efficiency. The fiducial phase space and all observables are defined using stable final-state particles to minimise model dependence. The observation at detector level is corrected for experimental effects such as the detector and trigger system efficiencies and the detector resolution to provide results which may be used and reinterpreted without requiring a full simulation of the ATLAS detector. Electrons or muons originating from leptonic decays of the τ-lepton are not considered to be part of the signal and their contribution to the observation at detector level is subtracted.
Cross-sections are measured differentially in the invariant four-lepton mass m 4 , and double-differentially with respect to both m 4 and the following kinematic variables: the transverse momentum of the four-lepton system p 4 T , the rapidity of the four-lepton system y 4 , and a matrix-element discriminant (introduced in Ref. [3] and denoted by D ME in this paper) designed to distinguish the s-channel Higgs-mediated production process from all other processes. The m 4 measurement is also made separately for each flavour combination of leptons in the event; 4e, 4µ and 2e2µ. The double-differential cross-sections can provide additional sensitivity to the various subprocesses contributing to the measured final state; for example, the p 4 T is expected to discriminate gg → Z Z from qq → Z Z. They are also of interest for future interpretation; for example, some BSM contributions can have an impact which depends upon the final-state lepton flavours [10]. The measurements are compared with SM predictions. To explore the potential of reinterpreting differential cross-section measurements, they are also used to constrain the gg → 4 process and set a limit on the gg → H * → 4 off-shell signal strength, to extract the Z → 4 contribution and to place limits on a selected BSM scenario.
[GeV] The total gg → 4 includes contributions from gg → H ( * ) → 4 as well as gg → 4 and the interference between the two. The qq → 4 and gg → 4 processes including off-shell Higgs boson production are modelled using S 2.2.2 including all corrections described in Section 5, while on-shell Higgs production is modelled using the dedicated samples based on P + P 8 and M G 5_aMC@NLO + Herwig++ described in the same section.
The angular separation between opposite flavour leptons in the quadruplet is required to satisfy ∆R > 0.2, while any same flavour leptons have to be separated by ∆R > 0.1 from each other. The latter condition enhances the acceptance for boosted topologies in high-m 4 Z boson pair production. To exclude leptons originating from quarkonia decays, the invariant mass of any same-flavour, opposite-sign lepton pair in the event is required to exceed 5 GeV. The full list of selection criteria is given in Table 1 and largely follows Refs. [16,17]. The overall range in m 4 considered for this measurement is 70 GeV < m 4 < 1200 GeV and was chosen based on the yields predicted in MC simulation. All candidates observed in the collision data fall into this interval. In addition to the invariant mass m 4 , transverse momentum p 4 T , rapidity y 4 and flavour composition of the selected quadruplet, the observables measured in this paper also include a matrix-element discriminant (D ME ) defined as where M 2 X p µ 1,2,3,4 indicates the squared matrix element for process X evaluated for the specific fourmomenta and flavours of the leptons in the given event, and M 2 X (m 4 ) represents the average squared matrix element for process X in the fiducial region for the given four-lepton invariant mass. The first squared matrix elementM 2 gg(→H ( * ) )→ZZ ( * ) →4 in the denominator of Eq. (1) includes the non-Higgs box diagram ( Figure 1(b)), Higgs-mediated production (Figure 1(d)), as well as the interference of the two, whereas the squared matrix element in the numeratorM 2 gg→H ( * ) →Z Z ( * ) →4 only includes Higgs-mediated production. The constant factor multiplying the t-channel matrix element in the denominator affects the shape of the observable, but does not have a significant impact on its separation power. The value of 0.1 is chosen to keep the peak of the distribution sufficiently distant from the maximum possible value of 0 while also limiting tails in the negative direction. The numerator represents the s-channel matrix element involving the Higgs boson produced via gluon-gluon fusion. The squared matrix elements are computed at leading-order QCD precision using the MCFM [18] program version 8.0. The strong coupling constant is evaluated at the scale of half the four-lepton invariant mass. The Higgs boson mass is set to m H = 125.0 GeV, and its width to the Standard Model prediction for this mass. Given the leading-order QCD precision, the incoming parton momenta are approximated by assuming the four-lepton centre-of-mass system is produced at rest.

Data sample and event selection
This measurement uses 36.1 fb −1 of proton-proton collision data with a centre-of-mass energy √ s = 13 TeV, collected during 2015 and 2016 with the ATLAS detector.
Events are selected in the online trigger system by requiring that one of several triggers be passed, in which one, two or three leptons (electrons or muons) are required with a range of lepton p T requirements dependent upon the multiplicity [19]. The combined efficiency of these triggers for events within the detector-level phase space of the measurement is above 96% for 70 GeV < m 4 < 180 GeV and increases beyond 99% for m 4 > 180 GeV as the final-state leptons become more likely to satisfy the trigger thresholds.
Electron identification is based on variables describing the longitudinal and transverse shapes of the electromagnetic showers in the calorimeters, properties of tracks in the inner detector, and track-cluster matching [20,21]. Muons are identified using information from the muon spectrometer, the inner tracking detector and calorimeters, with the requirements depending upon the angular region and p T of the muon [22].
Using the candidates identified in this way, the detector-level event selection looks for four prompt leptons, as detailed in Table 2. Electrons are required to satisfy a loose-identification working point for which the efficiency is about 95% [23], have E T > 7 GeV and |η| < 2.47. Muons must likewise satisfy a loose-identification working point, designed to achieve high efficiencies of about 99% with relatively low backgrounds [22], and have p T > 5 GeV, or p T > 15 GeV if they are tagged solely in the calorimeter ("calorimeter-tagged muon"). To select leptons originating from the primary proton-proton interaction, their tracks are required to have a longitudinal impact parameter (z 0 ) satisfying |z 0 sin(θ)| < 0.5 mm from the primary interaction vertex. Background from cosmic-ray muons is rejected by requiring each muon track's transverse impact parameter (d 0 ) to satisfy |d 0 | < 1 mm. This additionally discriminates against non-prompt muons.
Using the leptons selected in this way, a quadruplet is formed according to the kinematic selection criteria defining the fiducial phase space described in Section 3. The quadruplet is then subjected to further requirements in order to suppress the contribution of leptons from secondary decays or misidentifications related to jet activity. It must not contain more than one muon identified solely in the calorimeter or solely in the muon spectrometer. None of the leptons constituting the quadruplet may have a transverse impact parameter significance d 0 /σ d 0 > 5 (3) for electrons (muons). All leptons of the quadruplet are required to satisfy isolation criteria based on particle-tracks measured in the inner detector and energy deposits in the electromagnetic calorimeter. When evaluating these criteria, tracks or deposits originating from leptons in the quadruplet are not considered in order to retain events with close-by prompt leptons. Finally, the four leptons of the quadruplet are required to be loosely compatible with originating from a common vertex, evaluated by means of the reduced-χ 2 vertex fit using the four lepton trajectories. This further suppresses the contribution of secondary leptons from band c-hadron decays.

Theoretical predictions and simulation
Simulated events are used to correct the observed events for detector effects, as well as to estimate the expected numbers of signal and background events and the systematic uncertainty of the final results.
Events from Monte Carlo simulation (MC) were passed through a detailed simulation of the ATLAS detector and trigger [24], and the same reconstruction and analysis software as applied to the data. The effect of multiple pp interactions per bunch crossing, as well as the effect on the detector response due to interactions from bunch crossings before or after the one containing the hard interaction, referred to as "pile-up", is emulated by overlaying inelastic pp collisions onto the generated events. The events are then reweighted to reproduce the distribution of the number of collisions per bunch-crossing observed in the data. This procedure is known as "pile-up reweighting". To allow the contamination from events with τ-leptons to be evaluated, generated samples include τ-leptons.
The pair production of two Z bosons via the qq → 4 process was simulated with the S 2.  [30] was used, and the QCD renormalisation and factorisation scales were set to m 4 /2. The total cross-section from this calculation agrees within scale uncertainties with an NNLO QCD prediction obtained using the MATRIX program [31][32][33][34]. A reweighting for virtual NLO EW effects [35,36] was applied as a function of the four-lepton invariant mass, m 4 , which modifies the differential cross-section by between +3% (for m 4 ∼ 130 GeV) and −20% for m 4 > 800 GeV. The real higher-order electroweak contribution to 4 production in association with two jets (which includes vector-boson scattering) is not included in the sample discussed above but it was modelled separately using S 2.2.2 with the NNPDF3.0NNLO PDF set. A second qq → 4 sample was generated at NLO precision in QCD using P -B v2 [37][38][39] configured with the CT10 PDF set [40] and interfaced to P 8.186 [41,42] for parton showering. A correction to higher-order precision (K-factor), defined for this process as the ratio of the cross-section at NNLO QCD accuracy to the one at NLO QCD accuracy, was obtained using the MATRIX NNLO QCD prediction and applied to this sample as a function of m 4 , modifying the inclusive cross-section by between +10% for m 4 < 180 GeV and +25% for m 4 > 800 GeV. The reweighting for virtual NLO EW effects discussed above for the S case was also applied to this sample.
The purely gluon-initiated Z Z production process enters at next-to-next-to-leading order (NNLO) in α S . It was modelled using S 2.2.2 [43], at LO precision for zero-and one-jet final states, and the NNPDF3.0NNLO PDF set was chosen. This sample includes the box diagram, the s-channel process proceeding via a Higgs boson, and the interference between the two. Recently, a NLO QCD calculation for the three components became available [44,45] allowing m 4 differential K-factors to be calculated with the 1/m t expansion below 2m t , and assuming a massless quark approximation above this threshold. This NLO QCD calculation was used to correct the s-channel process gg → H * → Z Z ( * ) → 4 , the box diagram gg → 4 and the interference with separate K-factors. These represent significant corrections of the order of +100% to the leading-order cross-section. There are, however, NNLO QCD precision calculations for the off-shell Higgs boson production cross-section [46,47] which show additional enhancement of the cross-section. Since these corrections are not known differentially in m 4 for all three components, the prediction for each component is scaled by an additional overall correction factor of 1.2, assumed to be the same for the signal, background and interference. This additional constant scale factor is justified by the approximately constant behaviour of the NNLO/NLO QCD prediction. In addition, a purely leading-order prediction for the gg → 4 process was obtained using the MCFM program [18] with the CT10 PDF set [40], interfaced to P 8 [41,42].
In the mass range 100 GeV < m 4 < 150 GeV, where on-shell Higgs production dominates and the effect of interference is negligible, dedicated samples are used to model the on-shell Higgs and box diagram continuum Z Z production processes. In the case of the box diagram, the same combination of NLO QCD K-factor and a factor of 1.2 to account for higher-order effects, as described above, is applied to correct the cross-section. The Higgs production processes via gluon-gluon fusion (ggF) [48] (which dominates the on-shell Higgs production), via vector-boson fusion (VBF) [49] and in association with a vector boson (V H) [50] were all simulated at NLO precision in QCD using P -B v2 with the PDF4LHC next-to-leading-order (NLO) set of parton distribution functions [51] and interfaced to P 8.186. The decay of the Higgs and Z bosons was performed within P . The description of the gluon-gluon fusion process was further improved by reweighting to NNLO QCD accuracy using the HNNLO program [52][53][54], referred to as the NNLOPS method [55], and the resulting prediction was normalised using cross-sections calculated at N3LO precision in QCD [47]. For VBF production, full NLO QCD and EW calculations were used with approximate NNLO QCD corrections. The VH production was calculated at NNLO in QCD and NLO EW corrections are applied. Production in association with a top-quark pair was simulated to NLO accuracy in QCD using M G 5_aMC@NLO [56,57] configured with the CT10 PDF set and interfaced to Herwig++ [58,59]. The contribution from this process is very small in the analysis.
Other SM processes resulting in four prompt leptons in the final state are considered as irreducible backgrounds, and were also simulated using MC generators. These include triboson production (ZWW, Z ZW and Z Z Z) and tt pairs produced in association with vector bosons (tt Z, ttWW) collectively referred to as ttV(V). The triboson processes were generated with S 2.1.1 using the CT10 PDF set. The WW Z predicted has leading-order QCD precision for up to two additional outgoing partons while the W Z Z and Z Z Z prediction has next-to-leading-order QCD precision for zero additional outgoing partons and leading-order QCD precision for up to two partons. The ttV processes were generated with S 2.2.0 at leading-order QCD precision and the NNPDF3.0NNLO PDF set.
In addition to these contributions, reducible background processes which can contribute to the final event selection but contain at least one non-prompt or mis-reconstructed lepton are estimated using a partially data-driven method detailed in Refs. [16,17]. These processes include one or more leptons produced from heavy-flavour hadron decays, muons from pion or kaon decays, or electrons from either photon conversion or hadron misidentification. The majority of these events originate from Z bosons produced in association with jets, tt production with leptons from heavy-flavour decay, and W Z production in association with jets. Contributions from these processes are estimated separately depending on the flavour of the leptons in the secondary pair and the source of the non-prompt lepton(s). This estimation procedure uses a number of different control regions and simultaneous fits, and for some specific processes the estimation is taken directly from MC simulation. The data-driven results were validated in separate control regions using data. This contribution is small compared to that of prompt four-lepton production, and negligible for m 4 > 200 GeV.

Unfolding for detector effects
The measured four-lepton mass spectrum and additional double-differential spectra are "unfolded" to correct for experimental effects, including the resolution and efficiency of the detector and trigger system. This allows direct comparison with particle-level predictions within the fiducial phase space.
The unfolding procedure is based on describing the relationship between the number of events measured in a bin d of a particular detector-level differential distribution and the yield in bin p of the corresponding particle-level distribution using a single response matrix R dp . This matrix consists of three contributions: • The reconstruction efficiency is measured as the ratio of the number of events which pass both the fiducial and detector event selections to the number passing the fiducial selection, as a function of the kinematic observable(s) at particle level. Above m 4 = 200 GeV, it is typically between 60% and 80%, while for lower values of m 4 , values as low as 30% are reached for the 4e final state, due to reduced detector efficiency when reconstructing leptons of low transverse momenta. It enters R dp as a diagonal matrix.
• A "migration matrix" which contains the probabilities that a particle-level event from a given fiducial bin which passes the detector selection will be found in a particular reconstructed bin. It accounts for bin-to-bin migrations. For all measurements, the diagonal elements of this matrix, also referred to as the "fiducial purity" in each bin, have values above 80%, with most of the small amount of migration occurring between neighbouring mass bins.
• Finally, the fiducial fraction accounts for events which pass the detector selection but fail the fiducial event selection. This can occur due to the resolution of the detector, or leptons originating from leptonically decaying τ-leptons. It is measured by taking the ratio of events which pass both the fiducial and detector selection to the total passing the detector selection. It is close to unity for m 4 > 200 GeV, and above 90% below this threshold. It enters R dp as a diagonal matrix.
In the unfolding procedure, first, the fiducial fraction is accounted for by multiplying the backgroundsubtracted observation in each bin of the measurement with the fiducial fraction for that particular bin. Then, an iterative Bayesian procedure [60], using the particle-level predicted distribution as the initial prior and the migration matrix, is used to correct for bin migration. The iteration procedure reduces the dependence on the initial prior. The number of iterations is used as a regularisation parameter and controls the statistical uncertainty. Two iterations are found to be optimal for all distributions by MC studies aiming to minimise both the statistical uncertainty and the bias. Finally, the resulting estimate of the particle-level distribution is divided by the reconstruction efficiency bin by bin to obtain the final result. This approach represents a compromise between accounting for the small migration effects that occur and minimising the effect of small fluctuations in the detector-level distributions through the regularisation approach.
The binning used for the measurements presented in this paper is driven by the requirements of the procedure described above. Bin edges are placed to cover as wide as possible a phase-space interval with fine granularity while ensuring a fiducial purity of at least 80%. In addition, a minimum predicted detector-level yield of 10 events is required in each bin to ensure the numerical stability of the unfolding procedure and the viability for reinterpretation.
The robustness of the unfolding procedure to possible deviations of the data from the SM prediction was studied to ensure the model-independence of the analysis. Three scenarios were checked by unfolding pseudo-data after including the following: a greatly varied rate from off-shell Higgs production, or gluon-induced Z Z production, (−75% / +200% and −100% / +400% respectively) and the injection of an additional scalar resonance (masses of 200, 400 and 900 GeV were used). For the smooth, non-resonant modifications of the lineshape, the true lineshape was reproduced by unfolding with the SM-based response matrix with excellent accuracy, with residual biases far less than statistical precision. For large, resonant BSM contributions the bias is larger, up to the order of the statistical uncertainty when using the high-D ME region (defined in Section 8). This type of interpretation is not considered here, but it is noted for any reinterpretations which may be affected.

Uncertainties
The limiting source of uncertainty in this measurement is the statistical uncertainty, which is many times greater than the total systematic uncertainty in some bins. Experimental and theoretical sources both contribute to the systematic uncertainty, and their relative impact varies depending on the bin.
The statistical uncertainty of the data is estimated using 2000 Poisson-distributed pseudo-datasets centred on the observed value in each bin, and repeating the unfolding procedure for each set. The root mean square of the differences between the resulting unfolded distributions and the unfolded data is taken as the statistical uncertainty in each bin.
Experimental systematic uncertainties affect the response matrix used in the unfolding procedure. They are dominated by the reconstruction, identification and isolation efficiency uncertainties for electrons [23,61] and muons [22]. There are smaller contributions from lepton momentum resolution and scale uncertainties, and the uncertainty in the pile-up reweighting.
The uncertainty in the combined 2015+2016 integrated luminosity is 2.1%. It is derived, following a methodology similar to that detailed in Ref. [62], and using the LUCID-2 detector for the baseline luminosity measurements [63], from calibration of the luminosity scale using x-y beam-separation scans. This uncertainty is fully correlated across all measured cross-section bins and is propagated to the limit setting in the interpretations of the results. All other sources of systematic uncertainty are propagated to the final unfolded distributions by varying the inputs within their uncertainty, repeating the unfolding, and taking in each bin the resulting deviation from the nominal response matrix as the uncertainty.
Theoretical uncertainties primarily affect the particle-level predictions obtained from simulation. Since they affect the contribution of individual subprocesses to the total cross-section and the final-state lepton kinematics, they also impact the response matrix and hence the measured cross-sections. However, this is a very small effect compared to the experimental uncertainties and the statistical uncertainty. The most significant sources of theoretical uncertainty are the choice of factorisation and renormalisation scales, PDF set, and parton showering model within the event generator for the qq → 4 and gg → 4 MC samples.
In the case of qq → 4 , the full uncertainty due to the scale choice was estimated using seven sets of values for the the renormalisation and factorisation scales obtained by independently varying each to either one half, one, or two times the nominal value while keeping their ratio in the range of [0.5, 2]. Since a NLO QCD K-factor obtained within the fiducial phase space is applied in the gg → 4 samples, the uncertainty due to the scale choice for this production process within the fiducial phase space is evaluated using the differential scale uncertainty of this K-factor. In addition, seven sets of two values for the scales as described above are used to evaluate the impact of the scale choice on the acceptance for gg → 4 .
Due to the reweighting of the purely gluon-induced Z Z production processes described in Section 5, there are several other uncertainties affecting the normalisation in addition to the scale-induced uncertainties calculated together with the NLO QCD K-factors discussed above. In the m 4 region below 2m t , the higher-order corrections were computed solely for events not featuring jets with p T > 150 GeVto ensure a good description by the 1/m t expansion. Therefore, the default scale uncertainty is doubled for about 8% of the events in this region which contain such jets. Likewise, the scale uncertainty is also doubled at 2m t , with a Gaussian-smoothed transition from this maximal value down to the default uncertainty within a distance of 50 GeV to either side of the threshold. The inflated uncertainty is intended to account for potential effects as the top quarks become on-shell. It is assumed that the relative NLO QCD corrections for massless and massive loops behave similarly beyond 2m t and that the NNLO QCD correction calculated for the off-shell Higgs production process mimics the continuum production and the interference well, so no further uncertainty is considered. It is expected that the NLO QCD scale uncertainty covers these effects, as it is larger than the one calculated at NNLO QCD.
The uncertainty due to the choice of PDF set was estimated for both qq → 4 and gg → 4 by reweighting the sample to the alternative PDF sets CT10 and MSTW [64] as well as evaluating eigenvector variations of the default NNPDF3.0NNLO PDF set. In the case of qq → 4 , the envelope of these three variations is used to assign an uncertainty. For gg → 4 , the envelope is formed using only the effect of the variations on the shapes, as the cross-section is taken from the higher-order reweighting.
The impact on the detector corrections originating from differences in the showering model was assessed for both processes by varying the CKKW matching scale [65,66] from the S 2.2.2 default, changing the dipole recoil scheme in the shower to the one in [67] and by varying the resummation scale up and down by a factor of two. Furthermore, in order to account for non-factorising effects, qq → 4 events with high QCD activity [68] were assigned an additional uncertainty of the size of the NLO EW correction. As the NLO EW reweighting is only applied for qq → 4 , this last uncertainty is not applied to the gg → 4 or gg → H ( * ) → 4 processes.
Theoretical uncertainties in the modelling of resonant Higgs boson production do not have a significant effect on the response matrix, since this process is confined to a single bin in the m 4 spectrum. They mainly affect the predicted particle-level differential cross-sections. The same uncertainties as reported in Ref.
[16] are applied in this paper. They are dominated by QCD scale and PDF uncertainties affecting the gluon-gluon fusion component.
In order to cross-check and estimate the uncertainty due to the choice of generator used to model the qq → 4 process, the difference between the unfolded results using the nominal S 2.2.2 samples and the alternative P + P 8 sample is taken as a systematic uncertainty.
The MC statistical uncertainty in the unfolding procedure is evaluated using a bootstrap method with 2000 toy samples, each assigning a Poisson weight with an expected value of one to every MC event used in the analysis. The RMS of the unfolded result in each bin for all toy samples is then taken as an uncertainty, and is typically between 0.5% and 1.5% per bin.
The uncertainty due to the unfolding method itself is estimated as follows. The MC events are reweighted with fitted functions of the particle-level observables to give good agreement between the reconstructed MC distribution and the observed data distribution. The reconstructed MC distribution is then unfolded using the nominal response matrix and compared with the reweighted particle-level distribution, with the difference between the two taken as a systematic uncertainty in each bin. For the majority of bins this is less than 1%, with the exception of two bins with the fewest number of events in the double-differential m 4 -p 4 T distribution (defined in Section 8) which result in 3% and 5% uncertainties. For comparison, the statistical uncertainty is around 25% and 45% in those respective bins.  the first and third bins correspond to single Z boson production and radiative decay, and on-shell Higgs production, respectively. An enhancement at around 180 GeV due to the onset of on-shell Z Z production is also clearly visible. Overall, no significant discrepancy between the prediction and observation is found.

Measured distributions
The observed distributions are then corrected for detector effects by unfolding as described in Section 6. The resulting measured differential cross-section as a function of m 4 and double-differential cross-sections as functions of m 4 and p 4 T , |y 4 |, the D ME discriminant, or the final-state lepton flavour configuration are shown in Figures 10-14, and compared with particle-level predictions.
Overall the predictions are consistent with the measurement when using either S 2.2.2 or P + P 8 to describe the dominant qq → 4 component, considering the systematic and statistical uncertainties.
Furthermore, the predictions from S 2.2.2 and P + P 8 are in excellent agreement. This gives confidence in the validity of the procedure used to reweight P -B events to NNLO QCD accuracy by applying m 4 -based K-factors calculated with MATRIX [31][32][33][34]. It also indicates that, at least for this observable, an analogous reweighting of S events is not required due to this generator's intrinsic higher accuracy. The fixed-order NNLO QCD prediction by MATRIX shows an expected underestimation at and below the on-shell m Z Z threshold. This underestimation is mainly due to missing real, wide-angle QED emission effects in events where both Z bosons are on-shell, and amounts to several tens of percent of the total population in the region just below the on-shell threshold [36]. For the S 2.2.2 and P + P 8 samples, QED effects are included from estimates taken from QED shower programs. Moreover, the fixed-order MATRIX prediction is equivalent to having leading-order precision for the continuum gg → 4 process and on-shell Higgs boson production, while the event generator samples include sizeable higher-order contributions. The predictions from S ,  [GeV]    Prediction / Observation Figure 10: Measured differential cross-section (black dots) compared with particle-level SM predictions (coloured lines) for the m 4 distribution. The total systematic plus statistical uncertainty of the measured cross-section is displayed as a grey band. Two SM predictions with different event generator samples for qq → 4 (described in Section 5) are shown with different line colours and styles. In addition, an unmodified NNLO-precision fixed-order calculation using the MATRIX program is shown with a grey histogram, to illustrate the effects of additional higher-order corrections and QED final state radiation included in the event generator predictions. The ratio of the particle-level MC predictions to the unfolded data is shown in the lower panel.  Figure 11: Measured differential cross-section (black dots) compared with particle-level SM predictions (coloured lines) as a function of m 4 in slices of p 4 T . The total systematic plus statistical uncertainty of the measured cross-section is displayed as a grey band. Two SM predictions with different event generator samples for qq → 4 (described in Section 5) are shown with different line colours and styles. In addition, an unmodified NNLO-precision fixed-order calculation using the MATRIX program is shown with a grey histogram, to illustrate the effects of additional higher-order corrections and QED final state radiation included in the event generator predictions.The m 4 bins are given along the horizontal axis, and the bins of the secondary variable are stacked vertically and labelled with the bin range values. The ratio of the particle-level MC predictions to the unfolded data as a function of m 4 for each secondary variable bin is given in the panel to the right-hand side.  Figure 12: Measured differential cross-section (black dots) compared with particle-level SM predictions (coloured lines) as a function of m 4 in slices of |y 4 |. The total systematic plus statistical uncertainty of the measured cross-section is displayed as a grey band. Two SM predictions with different event generator samples for qq → 4 (described in Section 5) are shown with different line colours and styles. In addition, an unmodified NNLO-precision fixed-order calculation using the MATRIX program is shown with a grey histogram, to illustrate the effects of additional higher-order corrections and QED final state radiation included in the event generator predictions.The m 4 bins are given along the horizontal axis, and the bins of the secondary variable are stacked vertically and labelled with the bin range values. The ratio of the particle-level MC predictions to the unfolded data as a function of m 4 for each secondary variable bin is given in the panel to the right-hand side.  Figure 13: Measured differential cross-section (black dots) compared with particle-level SM predictions (coloured lines) as a function of m 4 in slices of the D ME discriminant. The total systematic plus statistical uncertainty of the measured cross-section is displayed as a grey band. Two SM predictions with different generator samples for qq → 4 (described in Section 5) are shown with different line colours and styles. The m 4 bins are given along the horizontal axis, and the bins of the secondary variable are stacked vertically and labelled with the bin range values. The ratio of the particle-level MC predictions to the unfolded data as a function of m 4 for each secondary variable bin is given in the panel to the right-hand side.  Figure 14: Measured differential cross-section (black dots) compared with particle-level SM predictions (coloured lines) as a function of m 4 for each final-state lepton flavour configuration. The total systematic plus statistical uncertainty of the measured cross-section is displayed as a grey band. Two SM predictions with different generator samples for qq → 4 (described in Section 5) are shown with different line colours and styles. In addition, an unmodified NNLO-precision fixed-order calculation using the MATRIX program is shown with a grey histogram, to illustrate the effects of additional higher-order corrections and QED final state radiation included in the event generator predictions. The m 4 bins are given along the horizontal axis, and the bins of the secondary variable are stacked vertically and labelled with the bin range values. The ratio of the particle-level MC predictions to the unfolded data as a function of m 4 for each secondary variable bin is given in the panel to the right-hand side.

Interpretations
The measured particle-level differential and double-differential fiducial cross-sections can be interpreted to measure SM parameters and set limits on BSM contributions. To explore and demonstrate this potential, a range of interpretations are presented in this paper. The production rate of gg → 4 is extracted with respect to the SM prediction using the differential cross-section measured as a function of m 4 . The Z → 4 branching fraction is estimated from the measured fiducial cross-section in the mass bin corresponding to m Z . Constraints on the rate of off-shell Higgs boson production (gg → H * → 4 ) are derived using the double-differential cross-section measured as a function of m 4 and the D ME discriminant, which greatly enhances sensitivity to this type of process. Constraints on modified couplings of the Higgs boson to top quarks and gluons in the off-shell region are also derived, using the measured differential cross-section as a function of m 4 .
All interpretations use a common statistical approach. A multivariate Gaussian likelihood function is used to quantify the level of agreement between a given prediction and observed data simultaneously across all bins of a measurement, taking into account correlations due to bin migration. The χ 2 function defining the exponential component of the likelihood takes the form: where y data is a vector of unfolded observed values in each of the distribution bins, y pred is a vector of the predicted values in each of the distribution bins, which is a function of the parameter of interest (POI) and nuisance parameters (NP), and C −1 is the inverse of the total covariance matrix for the prediction being tested. This covariance matrix is obtained by rescaling the covariance matrix resulting from unfolding the detector-level SM prediction, to account for the change in the predicted yield relative to the original prediction for the values of the POI and NP under consideration. Each element C(i, j) of the rescaled matrix corresponding to bins i and j can be expressed using the systematic, statistical and background components C SM syst , C SM stat and C SM bkg of the covariance matrix corresponding to the SM prediction: where R k = N pred k (POI, NP)/N pred k (POI = SM, NP = 0) is the ratio of the predicted yield in bin k assuming the given values of the parameter of interest and nuisance parameters to the yield in bin k using the SM value of the POI and a nominal value of the NP. All sources of experimental uncertainty, including those related to the unfolding procedure itself, are included in the systematic covariance matrix. The background component includes any uncertainties in the estimated background subtracted prior to unfolding and does not vary with the POI or NP. Theoretical uncertainties in the predictions do not enter the covariance matrix but are modelled with a nuisance parameter for each of the shape and normalisation components, constrained with Gaussian probability density functions.
Upper limits on the values of the parameters of interest are set using the CL s method [69] with a confidence level of 95%.

Signal strength for gluon-induced 4 production
The best prediction for the fiducial cross-section for gluon-induced 4 production (gg → 4 ) in the interval 180 GeV < m 4 < 1200 GeV, where the Higgs resonance is not dominant, is approximately 6.5 fb, compared to a leading order MCFM prediction of 3.0 fb. The relative contribution of gg → 4 to the differential pp → 4 cross-section is greatest in the region 180 GeV < m 4 400 GeV, contributing around 18% at m 4 ∼ 200 GeV, as visible in Figure 6. For a comparison with the best theoretical prediction, the signal strength for this process, µ gg = σ measured gg→4 /σ SM gg→4 , is extracted. The differential m 4 distribution is used for this interpretation, as NLO QCD precision is available in the description of this variable. A likelihood scan is performed using the procedure outlined above. The contribution from qq → 4 production is set to the theoretical prediction as described in Section 5 and allowed to vary within the associated theoretical uncertainties described in Section 7 by means of nuisance parameters with Gaussian constraints. The best available simulation of gg → 4 as described in Section 5 is scaled by the parameter of interest, µ gg , and in addition also allowed to vary within the associated theoretical uncertainties. A signal strength µ gg = 1.3 ± 0.5 is measured with an expected value of 1.0 ± 0.4. In addition, a signal strength µ LO gg = σ measured gg→4 /σ SM, LO QCD gg→4 , is extracted relative to an uncorrected leading-order precision MCFM prediction of gg → 4 as µ LO gg = 2.7 ± 0.9, with an expected value of 2.2 ± 0.9. This value can be compared with a previous ATLAS measurement of µ LO gg = 2.4 ± 1.4 performed at √ s = 8 TeV [7]. In both cases, the uncertainty is dominated by data statistics. The largest systematic uncertainty contribution is the QCD scale choice in the qq → 4 prediction, and is small compared to the statistical uncertainty. Consistent results were also obtained when using the double-differential m 4 -p 4 T or m 4 -y 4 distributions and the m 4 measurement per final-state flavour configuration, all of which showed comparable sensitivity.

Extraction of the Z → 4 branching fraction
The branching fraction of Z → 4 is extracted using the lowest m 4 bin (75-100 GeV) in the unfolded m 4 distribution shown in Figure 10. This bin is dominated by single Z boson production (Figure 1(c)), but there are minor non-resonant contributions from t-channel qq production (Figure 1(a)) and gg → Z Z ( * ) (Figure 1(b)). The measurement is performed in an extended phase space defined by values of the invariant mass of the four-lepton system m 4 and the lowest dilepton invariant mass in the event, m , satisfying 80 < m 4 < 100 GeV and m >4 GeV. The branching fraction is then calculated as: where N fid is the number of unfolded events in this bin, A fid is the fiducial acceptance, defined as the ratio of the events passing the fiducial selection to those in the extended phase space, σ Z is the total cross-section for single Z production, L is the integrated luminosity, and f non-res is the fraction of non-resonant events in the extended phase space, calculated using P -B . The acceptance (including the non-resonant contribution) is calculated using MC simulation as A fid = (4.75 ± 0.02)% and the fraction of non-resonant events as f non-res = (4.8 ± 0.5)%, where the uncertainty includes the statistical uncertainty of the samples used and the systematic uncertainty from the theoretical variations described in Section 7.
The branching fraction is measured to be B Z→4 = [4.70 ± 0.32(stat) ± 0.21(syst) ± 0.14(lumi)] × 10 −6 using the measured value for σ Z from Ref. [70]. Here, the systematic uncertainty includes the systematic uncertainty of the measured σ Z and the systematic uncertainty of the unfolded cross-section in the bin used for the measurement, as well as the uncertainty in A fid and f non-res . As Ref. [70] is based on 81 pb −1 of pp collision data taken during the 2015 LHC run while this measurement uses the full 2015-2016 ATLAS dataset comprising 36 fb −1 , all detector-related systematic uncertainties as well as the luminosity uncertainty of σ Z are conservatively treated as uncorrelated with the equivalent uncertainties in the measured cross-section in the lowest m 4 bin.
This result is compared with previous dedicated measurements by the ATLAS [8] and CMS [6] collaborations in Table 3. The largest contributing systematic uncertainties in this mass region come from lepton identification and reconstruction efficiencies, as shown in Figure 3. The difference in systematic uncertainties compared to Ref.
[8] is due to the assumptions of non-correlation between uncertainties in the two contributing measurements discussed above. The larger statistical uncertainty compared to Ref. [6] arises from an acceptance which has not been fully optimised for this interpretation. Nevertheless, the final precision including all error sources allows this measurement to contribute an improvement in the total precision of the Z → 4 branching fraction.

Constraint on off-shell Higgs boson signal strength
The double-differential distribution for m 4 -D ME is used to constrain the off-shell Higgs production process at high mass (m 4 >180 GeV), assuming that the contribution of the box diagram is as predicted by the Standard Model. As in the extraction of the signal strength for gluon-induced 4 production, a likelihood scan is performed where the contribution of qq → 4 is set to the Standard Model prediction and allowed to float within the associated theoretical uncertainties. The total yield from gg → 4 is then parameterised [9] as where µ OS H = σ gg→H * →4 /σ SM gg→H * →4 is the signal strength for the off-shell Higgs production process, the parameter of interest for this measurement. The yields N gg→H * →Z Z ( * ) →4 SM , N gg→4 (box) SM , and N gg→4 SM are those predicted by the Standard Model for only the off-shell Higgs production process, only the box diagram, and the total gg → 4 contribution including interference, respectively, and are set to the best available prediction as discussed in Section 5. They are allowed to float within the associated theoretical uncertainties discussed in Section 7. The observed 95% CL upper limit on the signal strength obtained in this way is 6.5. This agrees with the expected 95% CL upper limit of 5.4 within the range of [4.2, 7.2] for ±1σ uncertainty. This extraction demonstrates the degree to which an interpretation of measured cross-sections can approach the precision of dedicated measurements performed at detector level. The result can be compared to the upper limit of 4.5 obtained by the dedicated detector-level measurement [9] in the 4 final state using the same dataset and the same model. The sensitivity of this interpretation is slightly lower in comparison, due to the restrictions the unfolding procedure imposes on the binning of observables, the D ME discriminant in particular.

Constraint on modified Higgs boson couplings
Finally, the detector-corrected four-lepton mass distribution is used to constrain possible BSM modifications of the couplings of the Higgs boson to top quarks (c t ) and gluons (c g , zero in the SM) [71]. On-shell rates for Higgs production via gluon-gluon fusion are only sensitive to |c t + c g | 2 , but measurements at higher mass (> 180 GeV) can be used to probe these parameters independently, as the partonic centre-of-mass energy of the process becomes larger than the top-quark mass. This provides an interesting test of the off-shell behaviour beyond dedicated measurements based on the rare ttH production mode [72]. Again, the yield from qq → 4 is set to the Standard Model prediction and allowed to float within the associated theoretical uncertainties, while the yield from gg → 4 is parameterised as a function of c t and c g using the procedure described in Ref. [71]. The observed and expected 95% CL exclusion contours obtained using the CL s method [69] are shown in Figure 15, and the expected limit has green and yellow bands indicating uncertainties of ±1σ and ±2σ. The parameter space which lies outside of the observed contour is excluded at 95% CL.  Figure 15: Observed (solid) and expected (dashed) exclusion limits at 95% CL in the c g versus c t plane for modified ttH and ggH couplings. The uncertainties in the expected limit corresponding to one and two standard deviations are displayed as green and yellow bands respectively. The hollow circle denotes the tree-level SM values of the parameters: c g = 0 and c t = 1.
Exclusion limits were also explored for a model of anomalous triple gauge couplings considered in a dedicated search region of the ATLAS on-shell Z Z → 4 measurement [5]. Here, it was found that the present detector-corrected analysis is far less sensitive. This is a general feature of cross-section measurement reinterpretations in terms of models with effects that appear in the very poorly populated tails of distributions: the statistical requirements of unfolding mean that bins will need to be wide in these regions, and therefore sensitivity will be decreased.

Conclusion
The four-lepton mass distribution has been measured using 36.1 fb −1 of proton-proton collision data at a centre-of-mass energy of √ s =13 TeV recorded with the ATLAS detector at the LHC. The measurement is made differentially in the invariant mass m 4 of the four-lepton system, and double-differentially as a function of m 4 versus the transverse momentum of the four-lepton system, the rapidity of the system, the matrix-element discriminant D ME designed to isolate off-shell Higgs boson contributions, and the final state lepton flavour channel.
The measurements are consistent with the predictions of the SM. All measurements made are readily reinterpretable in terms of improved SM calculations or additional BSM scenarios. A range of example interpretations are presented to demonstrate and explore this potential. The signal strength of the gluongluon fusion production process is measured to be µ gg = 1.3 ± 0.5 compared to an expected value of 1.0 ± 0.4. A value for the Z → 4 branching fraction of [4.70±0.32(stat)±0.25(syst)]×10 −6 is obtained, consistent with existing measurements and exceeding the precision of previous ATLAS results. An upper limit on the signal strength for the off-shell Higgs production process of µ OS H < 6.5 is obtained at 95% CL. Finally, limits on anomalous couplings of the Higgs boson to gluons and top quarks are derived.