Measurements of the pp$\to$WZ inclusive and differential production cross section and constraints on charged anomalous triple gauge couplings at $\sqrt{s} =$ 13 TeV

The WZ production cross section is measured in proton-proton collisions at a centre-of-mass energy $\sqrt{s} =$ 13 TeV using data collected with the CMS detector, corresponding to an integrated luminosity of 35.9 fb$^{-1}$. The inclusive cross section is measured to be $\sigma_{\text{tot}}($pp$\to$WZ$) =$ 48.09 $^{+1.00}_{-0.96}$ (stat) $^{+0.44}_{-0.37}$ (theo) $^{+2.39}_{-2.17}$ (syst) $\pm$ 1.39 (lumi) pb, resulting in a total uncertainty of $-$2.78/$+$2.98 pb. Fiducial cross section and charge asymmetry measurements are provided. Differential cross section measurements are also presented with respect to three variables: the Z boson transverse momentum p$_\mathrm{T}$, the leading jet p$_\mathrm{T}$, and the $m$(WZ) variable, defined as the invariant mass of the system composed of the three leptons and the missing transverse momentum. Differential measurements with respect to the W boson p$_\mathrm{T}$, separated by charge, are also shown. Results are consistent with standard model predictions, favouring next-to-next-to-leading-order predictions over those at next-to-leading order. Constraints on anomalous triple gauge couplings are derived via a binned maximum likelihood fit to the $m${WZ) variable.


Introduction
The measurement of the diboson production cross section is sensitive to the self-interaction between gauge bosons via triple gauge couplings (TGCs) and is therefore an important test of the standard model (SM). Such couplings directly result from the nonabelian SU(2)×U(1) gauge symmetry of the SM. In the SM, the values of the couplings are fully determined by the structure of the Lagrangian; any deviation of the observed diboson production strength from the SM prediction, typically manifested as a change in the cross section, would indicate new physics. The expected change would lead to an overall increase Figure 1. Feynman diagrams for WZ production at leading order in perturbative QCD in protonproton collisions for the s-channel (left), t-channel (middle), and u-channel (right). The contribution from s-channel proceeds through TGC. of the cross section, although in some portions of phase space there will be a negative interference between the SM and new physics beyond the SM (BSM).
Associated WZ production is particularly interesting, as it is the only process directly sensitive to the WWZ coupling with a Z boson in the final state. Furthermore, WZ production is a major background to searches for new physics in multilepton final states; a precise determination of its cross section is crucial to improve the sensitivity of these searches. In addition, initial state radiation can be used as a probe of the boost of the WZ system through a differential study of the leading jet transverse momentum, since an initial state particle can radiate a jet and this jet will recoil against the WZ system.
In the SM at leading order (LO) in perturbative quantum chromodynamics (QCD), WZ production in proton-proton (pp) collisions proceeds via quark-antiquark interactions in the s-, t-, and u-channels. Figure 1 shows the tree-level production diagrams for each channel. The s-channel, which proceeds through the WWZ TGC, is the only channel sensitive to anomalous values of this coupling.
After a first inconclusive observation of candidate events for WZ production at UA1 [1], standard model WZ production has been studied at √ s = 1.96 TeV at the Fermilab Tevatron [2,3] and also in pp collisions at the CERN LHC by the ATLAS [4-10] and CMS [11][12][13][14][15][16] Collaborations. The most relevant of these results to this paper is the CMS analysis reporting the pp → WZ production cross section at √ s = 7 and 8 TeV as well as a search for anomalous TGCs (aTGCs) at √ s = 8 TeV in the multilepton final state using the full 2011 and 2012 data sets [15]. The ATLAS Collaboration has similarly analyzed the full 8 TeV data set [7], measured the inclusive and differential cross section, and set limits on aTGCs. The inclusive pp → WZ production cross section at √ s = 13 TeV was measured in the multilepton final state by the ATLAS [8] and CMS [14] Collaborations, using the full 2015 data set.
This paper presents a new analysis of pp → WZ production at √ s = 13 TeV using multilepton final states in which the Z boson decays into a pair of electrons or muons, and the W boson decays into a neutrino and either an electron or a muon. Compared to the previous results, the inclusive and differential cross sections are measured with increased precision (the overall uncertainty in the inclusive cross section is reduced by half), and more stringent confidence intervals on aTGCs are set, yielding the current best limits in two of the parameters.
-2 -This paper is organized as follows: the detector is described in section 2; the data and Monte Carlo (MC) simulated samples are described in section 3; the object definition and the event selection are described in section 4 and section 5, respectively; the background estimation is described in section 6, and the systematic uncertainties affecting the analysis are described in section 7. Finally, the inclusive cross section measurement is presented in section 8, the differential cross section measurement is presented in section 9, and the confidence regions for aTGCs are presented in section 10. A summary of the results is shown in section 11.

The CMS detector
The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter (HCAL), each composed of a barrel and two endcap sections. Forward calorimeters extend the pseudorapidity coverage provided by the barrel and endcap detectors. Muons are detected in gas-ionization chambers embedded in the steel flux-return yoke outside the solenoid.
Events of interest are selected using a two-tiered trigger system [17]. The first level (L1), composed of custom hardware processors, uses information from the calorimeters and muon detectors to select events at a rate of around 100 kHz within a time interval of less than 4 µs. The second level, known as the high-level trigger (HLT), consists of a farm of processors running a version of the full event reconstruction software optimized for fast processing, and reduces the event rate to around 1 kHz before data storage.
A more detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in ref. [18].

Data and simulated samples
This study is performed using proton-proton (pp) collisions at a centre-of-mass energy of 13 TeV at the LHC. Data taken in 2016 with the CMS detector are analyzed, corresponding to a total integrated luminosity of 35.9 fb −1 . The data are filtered to remove detector noise and unphysical events.
Event generators based on the MC method are used to simulate the behaviour of signal and background processes. The powheg v2.0 [19,20] software is used to generate both the WZ signal and the ZZ background samples without additional partons besides the ones included in the matrix element calculations at next-to-leading order (NLO) in perturbative QCD. The rest of the SM background samples are produced with the Mad-Graph5 amc@nlo v2.3.3 generator [21] at LO or NLO accuracy, including up to one or two additional partons in the matrix element calculations. The procedure for accounting correctly for parton multiplicities larger than one is referred to as the merging scheme; where applicable, the FxFx merging scheme [22] is used for the NLO samples, and the MLM merging scheme [23] is used for the LO samples. The modelling of the aTGCs -3 -is done by applying the matrix element reweighting method [24] at LO accuracy to a signal sample generated with the MadGraph5 amc@nlo v2.3.3 generator [21] at NLO accuracy. The procedure is applied to a set of samples produced for different ranges of the Z boson transverse momentum (p Z T ) such that the statistical power of the MC at higher energies, where anomalous couplings are expected to dominate, is enhanced. The NNPDF3.0LO (NNPDF3.0NLO) [25] parton distribution functions (PDFs) are used for the simulated samples generated at LO (NLO). The computations are interfaced with the pythia v8.205 generator [26] to include the effects of parton showering and hadronization using the CUETP8M1 tune [27,28].
The effect of additional interactions in the same or adjacent bunch crossing (referred to as pileup) is accounted for by simulating additional minimum bias interactions for each hard scattering event. Simulated events are then reweighted so that the pileup distribution matches that observed in data, which is characterized by an average of 23 collisions per bunch crossing. The generated events are interfaced with a model of the CMS detector response implemented using the Geant4 package [29] and reconstructed using the same software as the real data.

Event reconstruction
Events are reconstructed using the particle-flow (PF) algorithm [30] by matching information from all CMS subdetectors to obtain a global description of the event. The resulting objects are classified into mutually exclusive categories: charged hadrons, neutral hadrons, photons, electrons, and muons.
Interaction vertices are identified by grouping tracks consistent with originating from the same location in the beam interaction region. The reconstructed vertex with the largest value of summed physics-object p 2 T is taken to be the primary pp interaction vertex. The aforementioned physics objects are the jets, clustered using the jet finding algorithm [31,32] with the tracks assigned to the vertex as inputs, and the associated missing transverse momentum, taken as the negative vector sum of the p T of those jets. More details are given in section 9.4.1 of ref. [33].
Photons are identified as ECAL energy clusters not linked to the extrapolation of any charged particle trajectory to the ECAL, but are not used in this analysis. Electrons are identified as a primary charged particle accompanied by potentially many ECAL energy clusters [34]; such clusters are matched to the extrapolation of this track to the ECAL and to possible bremsstrahlung photons emitted along the way through the tracker material. Muons are identified as a track in the central tracker consistent with either a track or several hits in the muon system, in association with an energy deficit in the calorimeters. Charged hadrons are identified as charged particle tracks neither identified as electrons, nor as muons. Finally, neutral hadrons are identified as HCAL energy clusters not linked to any charged-hadron trajectory, or as ECAL and HCAL energy excesses with respect to the expected charged-hadron energy deposit. The energy of photons is directly obtained from the ECAL measurement, corrected for zero-suppression effects. The energy of electrons is -4 -determined from a combination of the track momentum at the main interaction vertex, the corresponding ECAL cluster energy, and the energy sum of all bremsstrahlung photons attached to the track. The energy of muons is obtained from the corresponding track momentum. The energy of charged hadrons is determined from a combination of the track momentum and the corresponding ECAL and HCAL energy, corrected for zero-suppression effects and for the response function of the calorimeters to hadronic showers. Finally, the energy of neutral hadrons is obtained from the corresponding corrected ECAL and HCAL energy.

Electrons and muons
In this analysis, leptons [35] coming from the primary vertex play a prominent role amongst all reconstructed event objects because of the very distinct trilepton (3ℓ) signature of the signal process. Prompt signal leptons are defined as the light leptons (electrons or muons) from the decays of particles in the signal processes, such as those coming from W and Z boson and τ lepton decays. Leptons originating from hadrons, primarily b hadron decays, are referred to as nonprompt leptons.
Electrons are reconstructed as described in section 4.1; candidates are further required to have |η| < 2.5, to be within the tracking acceptance, and to have p T > 7 GeV. The identification is performed using a multivariate discriminant with inputs related to the shower shape and to the tracking and track-cluster matching. Additional identification criteria are applied for electrons with p T > 30 GeV to mimic the identification applied at trigger level described in section 5; this ensures consistency between the measurement region and application region of the misidentification rate estimate.
Muon candidates are reconstructed as described in section 4.1 by combining the information from both the silicon tracker and the muon spectrometer in a global fit [36]. Candidates are identified by checking the quality of the geometrical matching between the tracker and the muon system measurements. Only candidates within the muon system acceptance |η| < 2.4 and with p T > 5 GeV are considered.
The energy scale of the leptons is corrected to account for mismeasurements in the tracker and muon systems, and in the ECAL. For both electrons and muons, the average difference between the corrected and uncorrected energies is zero; however, a spread is induced in the lepton p T of about 1% that is assigned as a systematic uncertainty in the energy of each lepton.
In order to improve the rejection of pileup and misreconstructed tracks and, more importantly, to reject background leptons from b hadron decays, loose selections are applied to variables related to the track impact parameter, as described in refs. [37,38].
The charged leptons produced in decays of heavy particles, such as W and Z bosons, are typically spatially isolated from the hadronic activity in the event, whereas the leptons produced in the decays of hadrons or misidentified leptons are usually spatially embedded in jets. For high-energy W and Z bosons the decay products tend to be collimated (a boosted system) and this distinction based on a simple definition of isolation is not effective anymore.
-5 -Therefore, the PF-based isolation definition used in the √ s = 8 TeV analysis [15], which included all the photons and the neutral and charged hadrons in a cone of ∆R = √ (η ℓ − η i ) 2 + (φ ℓ − φ i ) 2 < 0.3 (where i indicates the hadrons and ℓ the lepton) around the leptons, is improved [39,40] by using a p T -dependent cone size given by the formula: ∆R(p T (ℓ)) = 10 GeV min max(p T (ℓ), 50 GeV), 200 GeV . (4.1) The discrimination between prompt leptons and nonprompt leptons is improved by exploiting the differences in isolation-related variables and in impact-parameter-related variables between the two categories of leptons (prompt and nonprompt). An identification algorithm, based on a multivariate analysis (MVA) using boosted decision trees (BDTs), is trained to discriminate signal leptons (from W and Z decays) from background leptons (mostly b hadron decays). The resulting classifier is referred to as the lepton MVA discriminator, and was trained using a sample of ttZ events: signal leptons originate from leptonic ttZ decays, and background leptons originate mostly from b hadron decays. Further details on the lepton MVA discriminator can be found in refs. [37,38]. The efficiencies have a high dependence on the lepton p T and η; typical values for electrons are 3-7% misidentification efficiency and 20/40/80/90% identification efficiency for low p T electrons in the endcap, low p T electrons in the barrel, high p T electrons in the endcap, and high p T electrons in the barrel, respectively. For muons, typical values are 2-10% misidentification efficiency and 80-100% identification efficiencies where higher values correspond to higher p T muons. Throughout the analysis, leptons passing a high threshold on the lepton MVA discriminator are referred to as tight leptons.

Jets
Jets are reconstructed by clustering PF candidates using the anti-k T algorithm [31,32] with a distance parameter of 0.4. The jet momentum is determined as the vector sum of all particle momenta in the jet, and is estimated from simulation to be within 5-10% of the true momentum over the whole p T spectrum and detector acceptance [41]. Charged hadrons not originating from the primary vertex are subtracted from the PF candidates considered in the clustering; this procedure is referred to as charged-hadron subtraction. Jet energy corrections are derived from simulation to bring the measured response of jets to that of particle level jets on average, and are applied to the energy of the jet as a function of the jet p T and η. In situ measurements of the momentum balance in dijet, photon+jet, Z +jet, and multijet events are used to correct for any residual differences in jet energy scale between data and simulation [42]. The jet energy resolution amounts typically to 15% at 10 GeV, 8% at 100 GeV, and 4% at 1 TeV. Additional selection criteria are applied to each jet to remove jets potentially dominated by anomalous contributions from various subdetector components or reconstruction failures [43]. Jets with a minimum p T > 30 GeV are required to be separated from any lepton candidate passing the minimal lepton selection by selecting ∆R = The combined secondary vertex (CSV) b tagging algorithm [44] is used to identify jets that are likely to originate from the hadronization of bottom quarks (referred to as b jets). This algorithm combines both secondary vertex information and track impact parameter information together in a likelihood discriminant with output values ranging from zero to one. A jet is tagged as a b jet if the CSV discriminator output exceeds a threshold value, referred to as the working point. This analysis uses the medium working point corresponding to requiring that CSV > 0.8. This working point gives approximately 70% efficiency for tagging b jets and 1.5% efficiency for mistakenly tagging jets coming from light quarks or gluons [44]. Jets that pass the medium CSV working point and have a minimum p T > 30 GeV are defined in this analysis as b jets.
Corrections accounting for differences in the b tagging performance between data and simulation are derived by applying weights dependent on the jet p T , η, b tagging discriminator, and flavour to each simulated jet [44]. The jet flavour is defined as the flavour of the object originating the jet, which is known in simulation. The weights are derived from tt and Z+jets events. The per-event weight is defined as the product of the per-jet weights, including the weights of the jets overlapping with leptons. Uncertainties in the weights are propagated throughout the analysis as systematic uncertainties.

Missing transverse momentum
The missing transverse momentum vector is computed as the negative vector p T sum of all PF objects identified in the event. The magnitude of this vector is referred to as p miss T . The jet energy corrections, introduced previously, are propagated to the estimation of p miss T .

Event selection
Events with contributions from beam halo processes or calorimeter noise are rejected using dedicated filters [45,46]. The remaining events are required to pass one of several triggers involving either a single loosely isolated light lepton or a pair of them with any flavour composition. For the single-lepton cases, the p T threshold is 27 (24) GeV for electrons (muons). The lower p T thresholds for the same-flavour dilepton triggers are 23 (17) GeV for the leading and 12 (8) GeV for the subleading electron (muon). The cross-flavour triggers require a leading lepton p T of 23 GeV and a subleading lepton p T of 8 GeV.
The baseline selection is defined by the presence of at least three tight leptons with at least one opposite-sign same-flavour (OSSF) pair. To exploit the specific kinematic properties of the process, each of the three leading leptons is tentatively assigned to its most likely parent boson. The first stage of the algorithm assigns the OSSF pair of leptons with an invariant mass closest to the Z boson mass [47], m Z , to the Z boson. These two leptons are denoted as ℓ Z1 (leading) and ℓ Z2 (subleading), ranked by p T . The remaining lepton is assigned to the W boson and is denoted as ℓ W . The performance of the algorithm is studied by using simulated events and comparing the assigned parent particle with MC generator level truth; this algorithm properly assigns the leptons in about 95% of cases.
The baseline selection includes additional requirements on the p T of each lepton: p T (ℓ Z1 ) > 25 GeV, p T (ℓ Z2 ) > 10 GeV, and p T (ℓ W ) > 25 GeV. The total efficiency of Table 1. Requirements for the definition of the signal region of the analysis and the three different regions designed to estimate the main background sources.
the set of triggers used to record the data is measured to be close to 99% with respect to this baseline selection; this is because all the trigger paths can be triggered by more than one object, yielding a very high efficiency for the combined set of triggers even if the efficiency of the individual triggers is lower. The baseline selection is split, on the basis of the flavour composition of the leptonic triplet, into four categories denoted as: eee, eeµ, eµµ, and µµµ. The signal region (SR) is defined by applying to the baseline selection additional requirements that are designed to increase the purity of the region by reducing specific background contributions. Consistency with the Z boson mass peak is enforced by requiring the invariant mass of the two leptons assigned to the Z boson to be close to m Z , |M (ℓ Z1 ℓ Z2 ) − m Z | < 15 GeV. This requirement greatly reduces the contribution from nonresonant multilepton production processes such as tt production. A requirement that p miss T > 30 GeV is found to greatly reduce the Z+jets background contribution; in the following, residual events are included in the contribution labelled nonprompt. A large reduction in the number of events that include both a tt pair and a Z boson is obtained by rejecting events that contain one or more b-tagged jets. The invariant mass of the trilepton system M (ℓ Z1 ℓ Z2 ℓ W ) is required to exceed 100 GeV. Finally, contributions from tetraleptonic decays in ZZ events are reduced by rejecting any event with an additional fourth lepton that passes a looser lepton selection. Generator-level requirements for the signal, designed to avoid infrared divergences, are matched at reconstruction level by vetoing events that do not contain a lepton pair passing a minimum invariant mass requirement of M (ℓℓ ′ ) > 4 GeV. This requirement also has the desirable effect of reducing the contribution from low-mass resonance processes. The distribution of the key observables used in the definition of the signal region after the signal extraction fit is displayed in figure 2.
Multiple control regions (CRs) are defined to cross-check or estimate the different background processes. Each of them follows the same selection as the signal region, except that individual specific selection criteria are inverted in order to increase the fraction of the targeted background process in the region. A summary of the three orthogonal control regions used in the analysis is presented in table 1. The control regions are labelled according to the expected dominant background process in each region. A detailed explanation of their use is given in the next section.
-8 -   For each distribution all the signal region requirements are applied except the requirement relating to the particular observable so that the effect of the requirement on that observable can be easily seen. The last bin contains the overflow. Vertical bars on the data points include the statistical uncertainty and shaded bands over the prediction include the contributions of the different sources of uncertainty at their values after the signal extraction fit.

Background estimation
The background contributions fall in two categories, depending on the origin of the finalstate leptons. Prompt background sources consist of the SM processes where the leptons originate in the decay of an SM boson or τ lepton; the nonprompt backgrounds consist of SM processes where the leptons originate in the decay of b hadrons. The nonprompt background contributions are heavily dominated by Z+jets production, with additional contributions from dileptonic tt decays. The total contribution of these processes to the signal region is estimated using the tight-to-loose method described in detail in ref. [40]. The probability for a loose lepton to pass the tight criteria is measured in a single-lepton+jets signal region enriched in nonprompt leptons. For each specific selection, an application region is defined starting from the same requirements and additionally requesting that at least one of the leptons passes the loose selection but fails the tight selection. Depending on the p T , |η|, and multiplicity of the failing leptons, the extrapolation from the control region to the application region is derived for each event as a transfer factor, based on the previously measured probability. The contamination of the application region due to the prompt contribution is estimated from simulation and its effect is subtracted from the total nonprompt estimation in the selection, using the same transfer factors. Uncertainties in the determination of the nonprompt contribution are estimated with simulated events by comparing the prediction of the method and the one derived directly from simulation; they are found to be dominated by the statistical uncertainty due to the limited amount of simulated events, and estimated to be about 30%. An additional source of systematic uncertainty is estimated to range between 5 and 30% from the differences observed amongst a number of methods used for the subtraction of the prompt background processes in the signal region.
The leading SM prompt background comes from the tetraleptonic decay of ZZ pairs when one of the produced leptons is too soft or does not pass the quality requirements of the identification selection. Our estimation of the contribution of this process is based on MC simulation. To validate the behaviour of this simulation we use a dedicated sideband region (CR-ZZ) that requires exactly four leptons in the final state; the resulting selection is dominated by ZZ production, therefore no additional ZZ-specific constraint is applied. To illustrate the behaviour of this associated control region, the key observables used in the different measurements of the analysis are shown in figure 3. As a numerical cross-check, we estimate the possible variations over the simulated prediction in the four flavour-dependent categories. For each category, we subtract the predicted non-ZZ yields from the observed data and divide the result by the expected ZZ contribution. Statistical and normalization uncertainties are propagated to this measurement. The numerical values are consistent with unity for all categories and for the whole region, the value of the data minus the background divided by the predicted ZZ yield is q ZZ = 0.99 ± 0.09.
Top quark enriched prompt processes are dominated by ttZ and tZq production, where the Z boson and one of the top quarks decay leptonically. A procedure similar to the one used for the estimation of the ZZ background is performed in CR-top region. The key observables of the analysis in this sideband region are shown in figure 4. The estimation -10 -procedure results in good agreement across the different flavour categories; the global quotient of data minus background over the predicted ttZ plus tZq yields is consistent with unity, q ttZ+tZq = 1.09 ± 0.20.
The last major background that contributes to our search is the production of asymmetrical final-state photon conversions. The production of Zγ events makes up 99% of this contribution. The lepton assignment algorithm tends to match the electron originating from the photon to the W boson so the contribution in the eee and eµµ categories is highly enhanced. The procedure used for the prompt contributions is used to validate the behaviour of the simulated conversion processes in a region denoted as CR-conv and defined in table 1.
Good agreement is found in the eµµ and eee categories, where sufficient statistical power is available. The normalization of the X+γ background is estimated from the difference between the data and the other backgrounds, divided by the X+γ (X=tt, V, t) SM prediction; the result is q X+γ = 1.11 ± 0.14, consistent with unity. Validation plots for key observables used in the analysis are shown in figure 5.
Additional minor background contributions include the leptonic decays of multiboson production processes, dominated by VH and VVV production where V is either the Z or the W boson and H is the SM Higgs boson. Their contribution is estimated from simulation.

Systematic uncertainties
The major sources of systematic uncertainty can be grouped into three different categories: normalization uncertainties that are assigned to each of the background processes individually; global uncertainties related to the definition and energy measurement of the different physical objects, affecting both the background and signal acceptances; and a global uncertainty, correlated across all processes, that accounts for a possible mismeasurement of the total integrated luminosity.
As stated in section 6, the contribution from prompt SM processes is estimated using MC samples and validated in appropriate control regions. The uncertainties in the normalization of such processes are taken from experimental measurements performed at a centre-of-mass energy of √ s = 13 TeV, and correspond to assigning flat uncertainties of 7, 15, and 35% to the contributions of the ZZ, ttV, and tZq background processes respectively [48][49][50]. The uncertainty in the normalization of the photon conversion background contribution is obtained from the observations in the dedicated control region, and estimated to be about 20%. The normalization uncertainties applied to the minor contributions of the multiboson production are estimated to be about 25% for the VH process and 50% for the VVV ones. The nonprompt background estimation includes two different sources of systematic uncertainties. First, a 30% normalization uncertainty is applied to account for the observed variations in the performance of the method when applied to MC simulations. Second, a p T -and η-dependent uncertainty that ranges between 5 and 30% is applied to account for the differences observed amongst different W/Z background subtraction procedures considered for the tight-to-loose method.
-11 -       -14 -Lepton identification and isolation introduce a sizeable uncertainty in the final measurement. Lepton efficiencies are computed using the tag-and-probe technique [11,34,51]. Since electron and muon identification efficiencies are computed separately, the uncertainty in their estimate is split by flavour and evaluated separately. The largest effects are in the eee category for the electron efficiency (about 5%) and in the µµµ category for the muon efficiency (about 3%). The uncertainty in the energy scale of the leptons is estimated to produce a variation of 1% in their p T ; the reconstructed muon p T is computed with a different method for high-p T muons (above 200 GeV), thus a conservative 5% uncertainty is assigned to each high-p T muon. The uncertainty in the lepton energy scale is assigned to each lepton -separately for electrons and muons -and propagated to the yields, with effects smaller than 1% in most cases.
A total trigger efficiency uncertainty is applied across all channels and processes to account for the differences observed between data and MC samples. Two different sources are considered for the estimation of this uncertainty. First, trigger efficiencies are measured in data and simulation samples, with the difference between the two being assigned as a systematic component of the trigger efficiency uncertainty. Second, the effect of limited statistical power in the data measurement is computed using Clopper-Pearson intervals [52], which is an estimation method that yields intervals in the physical region using an estimation that is statistically robust even when the efficiency is close to its extreme valuesin this case the value 1-, and added quadratically to the first source. A final asymmetric flat uncertainty of −1.8 and +1.4% is applied.
The efficiency of the b tag veto is also corrected by comparing the measurements in data and simulation and propagated to each of the events. Separate uncertainty sources are considered for the b jet identification efficiency and the misidentification of light-flavour jets as b-tagged jets, with effects of up to 1.6 and 0.7% in the final signal acceptance, respectively.
Each of the reconstructed jets has an associated energy scale uncertainty of 2-10% depending on its p T and η. The final measurement is sensitive to this kind of variation through the changes in acceptance that arise in the p miss T estimations and the b tag veto. The effect on the final signal acceptance amounts to about 1%.
The pileup modelling uncertainty is evaluated by varying the inelastic cross section up and down by 5% and propagating the effect to the final signal region, resulting in an uncertainty of up to 1.2%.
A fiducial region is defined by imposing requirements that mimic the lepton kinematic characteristics in the signal region. The acceptance A is defined as the fraction of events in the total phase space that pass the requirements of the fiducial region. The efficiency ǫ is estimated as a transfer factor from the fiducial region to the signal region. Both acceptance and efficiency are estimated using generator-level information; details on the fiducial region, the acceptance, and the efficiency are provided in section 8. Two sources of theoretical uncertainty in A and ǫ are considered. Effects due to factorization (µ F ) and renormalization (µ R ) scale choices are evaluated with powheg by varying the scales up and down independently by a factor of two around the nominal value µ 0 = (m Z + m W )/2, under the constraint 0.5 < µ F /µ R < 2.0. The envelope of the set of variations is assigned as -15 -  a systematic uncertainty on the yields. Parametric (PDF +α S ) uncertainties are estimated using the PDF4LHC prescription [53] with the NNPDF3.0 set [25].
Finally, a 2.5% correlated normalization uncertainty is applied to all signal and background processes to account for the variations in the measurement of the total integrated luminosity [54].

Inclusive measurement
The inclusive WZ production cross section is measured by performing a simultaneous maximum likelihood fit to the total yields in the four flavour categories of the signal region, as presented in figure 6. The normalization of the WZ signal process is modelled via a parameter representing a multiplicative factor for the total NLO production cross section; the parameter is referred to as signal strength r WZ and is a free parameter in the fit.
The contributions from the background processes are allowed to vary around the predicted yields, according to the systematic contributions described in section 7. The systematic contributions are modelled in the likelihood as nuisance parameters with log-normal priors. The expected and observed yields for the processes involved in each of the flavour categories can be seen in table 2. The final contribution of the different sources of uncertainty to the measurement is described in table 3.
-16 -A fiducial region is defined by imposing requirements that mimic the lepton kinematic characteristics in the signal region. We require three light leptons located inside the detector acceptance, |η ℓ | < 2.5(2.4) for electrons (muons), with at least one OSSF pair. Electrons and muons from W/Z→τ +X→ ℓ+X decays are included in this selection. These leptons are assigned to the W and Z bosons using the algorithm described in section 5. Minimum transverse momenta requirements of p T (ℓ Z1 ) > 25 GeV, p T (ℓ Z2 ) > 10 GeV, and p T (ℓ W ) > 25 GeV are applied. We also apply the two additional criteria M (ℓ Z1 ℓ Z2 ℓ W ) > 100 GeV and |M (ℓ Z1 ℓ Z2 ) − m Z | < 15 GeV. The total yields in the signal region for the expected background, N bkg , and observed data, N obs , are used to obtain the fiducial cross section of the WZ process through the expression: where the efficiency ǫ is estimated as a transfer factor from the fiducial region to the signal region using MC truth and the integrated luminosity L amounts to 35.9 fb −1 . Scale and PDF uncertainties are considered in the computed efficiency and are propagated to the final cross section measurement. Table 4 summarizes the efficiencies and their uncertainties. Final state generator-level leptons are dressed by adding to their momenta those of generator-level photons within a cone of ∆R(ℓ, γ) < 0.1. The efficiency is estimated from simulation for each of the flavour channels separately, and for the inclusive case, as the ratio of expected reconstructed events in the signal region to the number of generated trilepton events in the fiducial region. The statistical uncertainty in the measurement of the efficiency, which originates from the limited number of simulated events, is below 1% and is added quadratically to the total sum of statistical uncertainties in the measurement. Theoretical uncertainties in the cross section measurements arise from renormalization and factorization scale and PDF uncertainties and are added quadratically to the experimental uncertainties. The results are presented in table 5 and can be compared to the NLO prediction from powheg + pythia of σ powheg fid = 227.6 +8.8 −7.3 (scale) ± 3.2 (PDF) fb; the NLO prediction is disfavoured by this measurement.
The phase space used for the computation of the total cross section is defined by having three generated light leptons that pass the requirement 60 GeV < m OSSF Z < 120 GeV, where m OSSF Z is the mass closest to the Z boson mass among those computed with all possible OSSF lepton pairs. Light leptons originating from tau decay are included in the definition of the total region selection. The extrapolation to the total associated production cross section of WZ bosons is computed as: where the leptonic branching ratios of the W and Z bosons, that pass the requirements of the fiducial region and is estimated using generator-level information. The same procedure used in the fiducial measurement is applied to estimate the effect of theoretical uncertainties and the limited number of simulated events used in the measurement. The results obtained for each flavour category are listed in table 6. The combined measurement is defined as the measurement obtained from a simultaneous fit to the four categories; the resulting value is: which can be compared to theoretical predictions at parton level [55] using MATRIX [55] at NLO, σ NLO (pp → WZ) = 45.09 +4.9% −3.9% pb, and next-to-next-to-leading order (NNLO) [56], σ NNLO (pp → WZ) = 49.98 +2.2% −2.0% , in perturbative QCD, as well as the prediction obtained with powheg + pythia at NLO QCD, of σ NLO Pow = 42.5 +1.6 −1.4 (scale) ± 0.6 (PDF) pb. Uncertainties in the theoretical values are derived from scale variations.

Charge-dependent measurements
The signal process is further divided depending on the charge of the W boson in order to compute the W + Z and W − Z production cross sections and their ratio; the value obtained for the ratio is then compared with theoretical predictions. The procedure described in the previous section is applied separately for the two categories classified according to the charge of the lepton associated with the W boson. The results are: Theoretical 0.9 0.9 0.9 0.9 0.9 Table 3. Summary of the total postfit impact of each uncertainty source on the uncertainty in the signal strength measurement, for the four flavour categories and their combination. Theoretical uncertainties are only included in the signal acceptance during the extrapolation to the total phase space, so they are not included in the likelihood fit. The values are percentages and correspond to half the difference between the up and down variation of each systematic uncertainty component.  63.4 +2.6 −2.6 (stat) +0.6 −0.5 (theo) +3.5 −3.2 (syst) ± 1.9 (lumi) µµµ 67.1 +2.1 −2.0 (stat) +0.6 −0.5 (theo) +3.3 −3.0 (syst) ± 1.9 (lumi) Combined 257.5 +5.3 −5.0 (stat) +2.3 −2.0 (theo) +12.8 −11.6 (syst) ± 7.4 (lumi)  The ratio between the charge-dependent production cross sections is calculated. Statistical uncertainties are treated as completely uncorrelated between the two values, while the other sources of uncertainty are considered completely correlated in their propagation to the ratio. The final effect is that most of the systematic uncertainties show a similar behaviour in both cases so they are greatly reduced when computing the ratio. The value obtained for the ratio is: 9 Differential measurement The differential WZ cross sections are measured in the full volume of the detector as a function of three observables. To better model the data, in the definition of such observables leptons are dressed in simulation by adding to their momenta those of all generator-level photons within a cone of ∆R(ℓ, γ) < 0.1. The first observable is the p T of the Z boson, defined as the transverse sum of the momenta of the two final-state leptons assigned to the Z boson decay. The second observable -20 - is the p T of the leading jet, which represents a probe for the boost of the WZ system recoiling against initial-state radiation. The generated leading jet is defined using the anti-k T algorithm with a cone radius of 0.4, and by requiring a spatial separation of ∆R > 0.5 from the leptons coming from the WZ decay. The third observable is the M (WZ) variable, defined as the invariant mass of the system composed of the three leptons and the p miss T . A general formula for the definition of the variable is: where p(ℓ i ) is the measured four-momentum of each lepton. The four-momentum of the neutrino is defined in the (mass, p T , η, φ) base as p(ν) = (0, p T (p miss T ), 0, φ(p miss T )). Slightly different choices (solving the W → ℓν system for η(ν) or setting the neutrino four-momentum to zero) were tested as well, giving similar results.
The reconstructed quantities are defined using the objects described in section 4, and the pair of tight leptons most likely to come from the Z decay, as well as the tight lepton most likely to come from the W boson decay, are selected using the algorithm described in section 5.
For these three measurements, events must pass the selection used for the inclusive cross section measurement, which is described in section 5. The resulting reconstructed (reco) level distributions are shown in figure 8. The differential WZ cross section is measured in the signal region of the inclusive measurement, here referred to as the inclusive final state, and in four exclusive categories -21 -corresponding to a classification by lepton flavour (eee, eeµ, eµµ, and µµµ), referred to as exclusive final states.
The reconstructed and generated distributions are assumed to differ by the effects of the detector response. This response can be modelled using a two-dimensional matrix that summarizes the bin migration effects induced by the detector on the target observables. Response matrices are obtained in the signal region for the inclusive and exclusive categories, using the powheg and MadGraph5 amc@nlo NLO generators. These matrices are shown for the inclusive selection in figure 9, where the bin contents are normalized to the expected NLO yield for the integrated luminosity analyzed in this paper. The binning scheme is chosen such that for all the matrices the width of the diagonal bins in each dimension are larger than the standard deviation of the average of the bin content across the orthogonal axis.
The process of inverting the detector response matrix is known as unfolding [57], and several techniques are available in the literature for solving the problem [58], although in many cases it may be argued that the best strategy would be to perform any comparison in the reconstructed space. In the following, the space populated by reconstructed events (the reco-level distributions) is denoted as folded space (folded distributions), while the generator-level space (distributions) is denoted as unfolded space (unfolded distributions).
The unfolding procedure consists of performing a least-squares fit with optional Tikhonov regularization [59,60], as implemented in the TUnfold software package [61]. The unfolding problem, and the least-squares fit used to solve it, are modelled according to: Here y is the vector of observed yields, A is the response matrix, and x is the unfolded result. L 1 models the least-squares minimization, where V yy is the covariance matrix of y, with elements V ij defined by the correlation coefficients obtained by rescaling each covariance e ij by the variances e ii and e jj , V ij = e ij /e ii e jj . The regularization is described by L 2 , which reduces the fluctuations in x -induced by the statistical fluctuations of yvia the regularization conditions defined in the matrix L. The strength of the regularization is described by the parameter τ , and a bias vector f b x 0 defines the reference with respect to which large deviations are suppressed. An optional area constraint governs whether the normalization of the unfolded result is bound to the total yield in the folded space, as modelled by L 3 .
To evaluate the performance of the algorithm, the data counts y are substituted with a pseudodata distribution obtained by sampling from the signal plus background MC dis-  tributions. This distribution is then unfolded and folded back; the resulting distribution agrees with the MC truth in the unfolded space and the original sampled distribution, respectively, within uncertainties.
The default configuration for the unfolding performed in this paper is as follows. The powheg generator is used to model the response matrix and the area constraint is applied; such constraint accounts for the difference between the expected yields, from the NLO predictions, and the observed yields, which were shown by the inclusive measurement to be more compatible with the NNLO predictions. The bias vector is the generator-level distribution rescaled to the NNLO prediction by a bias scale of 1.13. By default, no regularization is performed. These settings are chosen following a series of checks using the pseudodata distributions, to evaluate the effect of the area constraint, the bias scale and vector, and the regularization scheme.
- 24 -In particular, the effect of regularization has been checked by applying Tikhonov regularization to the curvature of the unfolded distribution x. The best value for the regularization parameter τ is chosen using the well-established L-curve method [62]. The regularization process is applied for each of the variables and in no case is there an appreciable gain. No regularization is thus applied to obtain the final result. Figure 10 shows the results in the inclusive final state for the Z boson p T distribution (top left), leading jet p T distribution (top right), and mass of the WZ system (bottom). Good agreement is found between the unfolded data distribution and the MC predictions at particle level, and is quantified by χ 2 /N DOF values given in the plot legends. Results in the four different flavour channels are compatible. The results for the differential cross section in the inclusive and exclusive final states are expressed as a fraction of the total cross section and tabulated in tables 7, 8, and 9 for the Z boson p T , tables 10 and 11 for the leading jet p T , and table 12 for the mass of the WZ system. The total cross section is constrained by the aforementioned area constraint. The bottom line test [63] is performed, in which goodness of fit tests are performed in the folded and in the unfolded space to ensure that the agreement between the data and the model does not become worse after unfolding. The purpose is to check that the unfolding procedure is not enhancing the ability to reject incorrect models. The test shows a substantial agreement, giving further confidence in the unfolding procedure.
The results are derived using all the systematic uncertainties described in section 7, including their effect on the response matrix. In addition, a systematic uncertainty due to the unfolding procedure is defined as the difference between the nominal result and the result obtained by unfolding the nominal shape using an alternative response matrix. Such alternative matrix is modelled using the MadGraph5 amc@nlo generator. The effect of such uncertainty on the result is smaller than the effect of statistical fluctuation and of the background subtraction, and is included in the tables together with all the other sources of uncertainty within the other syst category.

Differential measurement split per W boson charge
The differential WZ cross section is computed as a function of the same observables as in section 9 and categorized according to the sign of the charge of the lepton associated with the W boson. Additionally, the differential cross section is computed as a function of the momentum of the lepton that is assigned to the W boson using the procedure outlined in section 5.
The charge of the W boson is estimated using as a proxy the charge of the lepton associated to the W boson. Results are shown here for the inclusive final state, but similar results have been obtained in the four exclusive categories (µµµ, eµµ, eeµ, and eee).
Results for the leading jet p T are shown in figure 11, results for the Z boson p T are shown in figure 12, results for the mass of the WZ system are shown in figure 13, and results for the W boson p T are shown in figure 14. The overall description of the data by the simulation is good. The agreement is quantified by χ 2 /N DOF values that are given in the plot legends. As in the case of the measurement not split by charge, the total uncertainty is dominated by the statistical and background subtraction uncertainties. The -25 -  Figure 10. Differential distributions for the Z boson p T (top left), leading jet p T (top right), and mass of the WZ system (bottom). Data distributions are unfolded at the dressed leptons level and compared with the powheg, MadGraph5 amc@nlo NLO generators, and pythia predictions, as described in the text. The red band around the powheg prediction represents the theory uncertainty in it; the effect on the unfolded data of this uncertainty, through the unfolding matrix, is included in the shaded bands described in the legend.
remaining uncertainties include the one due to the unfolding procedure and are grouped into the other syst. category.

Confidence regions for anomalous triple gauge couplings
The WZ production process is sensitive to the presence of BSM physics through the presence of deviations from the SM predictions of the coupling constants between the SM vector bosons. Because of the dominant SM production modes, the process is expected to be particularly influenced by TGCs of the W and Z bosons. Such couplings are called anomalous when they assume values different from the SM predictions. The total set of al--26 -  Figure 11. Differential distributions for W + (left) and W − (right), in the full SR. The leading jet transverse momentum is unfolded at the dressed leptons level, as described in the text. The red band around the powheg prediction represents the theory uncertainty in it. The effect on the unfolded data of this uncertainty, through the unfolding matrix, is included in the shaded bands described in the legend.  Figure 12. Differential distributions for W + (left) and W − (right), in the full SR. The transverse momentum of the Z boson is unfolded at the dressed leptons level, as described in the text. The red band around the powheg prediction represents the theory uncertainty in it; the effect on the unfolded data of this uncertainty, through the unfolding matrix, is included in the shaded bands described in the legend.  Figure 13. Differential distributions for W + (left) and W − (right), in the full SR. The mass of the WZ system data distribution is unfolded at the dressed leptons level, as described in the text. The red band around the powheg prediction represents the theory uncertainty in it; the effect on the unfolded data of this uncertainty, through the unfolding matrix, is included in the shaded bands described in the legend.  Figure 14. Differential distributions for W + (left) and W − (right), in the full SR. The W boson transverse momentum is unfolded at the dressed leptons level, as described in the text. The red band around the powheg prediction represents the theory uncertainty in it; the effect on the unfolded data of this uncertainty, through the unfolding matrix, is included in the shaded bands described in the legend.
lowed operators of dimension six can be summarized in three independent parameters [64]. Usually the choice of a basis for these parameters is based on the effective field theory (EFT) approach, where the anomalous coupling Lagrangian can be written as: where W ± µν , B µν are the field strengths associated to the SM electroweak bosons and H is the SM Higgs field. The parameters representing different aTGC effects are noted as {c W , c WWW , c b }. Values predicted by the SM are c W = c WWW = c b = 0. The typical energy scale at which BSM physics are dominant is represented by Λ 2 and it is usually absorbed in the definition of the aTGC parameters.
The behaviour of the SM prediction and those of different configurations of anomalous couplings values are compared in figure 15 for two different observables that aim to recon--29 -
struct the mass of a hypothetical BSM particle decaying to a WZ pair. The predictions corresponding to four different anomalous couplings are drawn for comparison to outline the behaviour of the most asymmetric one (c WWW ). The M (WZ) variable, defined in section 9, is chosen to determine confidence regions for each of the anomalous parameters considered. A different behaviour as a function of this variable is expected at high energy values in the presence of anomalous couplings, because of the nature of the proper anomalous terms, which include the momenta of the bosons through the field strength terms.
For each of the bins presented in figure 15 (left), a three-dimensional quadratic fit is performed to the predicted yields of the anomalous couplings in a grid of simulated points in order to extrapolate the prediction to the continuous space of parameter values. A binned likelihood function is built with the signal yields for each bin depending on the values of -30 -
Events/bin  Figure 15. Distributions of discriminant observables in the anomalous triple gauge couplings searches, before the fit used to determine confidence regions on the couplings. The invariant mass of the three lepton and missing transverse momentum system (left) and the transverse mass of the same configuration (right). The dashed lines represent the total yields expected from the sum of the SM processes, with the total WZ yields for different values of the associated anomalous coupling (AC) parameters. The SM prediction for the WZ process is obtained from the aTGC simulated sample with the AC parameters set to 0.
The procedure we described includes both the interference term between the SM amplitude and the BSM one, and for the square of the dimension-6 contribution. If the quadratic term used to build the statistical model is suppressed in the fit, the resulting confidence intervals include the interference term between the SM amplitude and the BSM one only, neglecting the square of the dimension-6 contributions. The results corresponding to this approximation are tabulated in table 14.
Restricting the effect of the anomalous couplings to a given range in the invariant mass of the diboson system can be used to impose unitarity in the aTGC models. While no direct computation of the invariant mass is possible in the leptonic decay of the WZ channel, we use the M (WZ) variable as a reasonable substitute. We compute the confidence interval for each parameter based on multiple cutoff values of the M (WZ) value to obtain the results shown in figure 17.
No anomalous effect has been observed, and the confidence regions obtained represent a significant improvement with respect to previous searches performed by the ATLAS [65] and CMS [15,16] Figure 16. Two-dimensional confidence regions for each of the possible combinations of the considered aTGC parameters. The contours of the expected confidence regions for 68% and 95% confidence level are presented in each case. The parameters considered in each plot are c W -c WWW (top), c W -c b (middle) and c WWW -c b (bottom).
-35 -    Table 14. Expected and observed one-dimensional confidence intervals (CI) at 95% confidence level for each of the considered EFT parameters, accounting only for the interference term between the SM amplitude and the BSM one. The one-dimensional intervals for each parameter are computed fixing the other two parameters to zero, the SM value.

Summary
The production process pp → WZ is studied in the trilepton final state at √ s = 13 TeV, using the full 2016 data set with a total integrated luminosity of 35.9 fb −1 collected with the CMS detector. Fiducial results are obtained in each of the flavour categories (eee, eeµ, eµµ, and µµµ) and in the combined category, and are extrapolated to the total WZ production cross section for 60 < m OSSF Z < 120 GeV. The combined measurement yields a cross section of σ tot (pp → WZ) = 48.09 +1.00 −0.96 (stat) +0.44 −0.37 (theo) +2.39 −2.17 (syst) ± 1.39 (lumi) pb, for a total uncertainty of +2.98 and −2.78 pb. The result is in good agreement with the MATRIX nextto-next-to-leading-order (NNLO) prediction [55,56], of σ NNLO (pp → WZ) = 49.98 +2.2% −2.0% pb. This result supersedes the result from the CMS Collaboration using data corresponding to a smaller integrated luminosity of 2.3 fb −1 [14]. A measurement in the fiducial region yields a value of σ fid (pp → WZ) = 257.5 +5.3 −5.0 (stat) +2.3 −2.0 (theo) +12.8 −11.6 (syst)±7.4 (lumi) fb, pointing to an excess over the powheg next-to-leading-order cross section σ powheg fid = 227.6 +9.4 −8.0 fb. The cross sections are also measured independently for the two possible values of the W boson charge, yielding a ratio of A +− WZ = σ tot (pp → W + Z)/σ tot (pp → W − Z) = 1.48 ± 0.06, which is compatible within uncertainties with the powheg + pythia prediction of 1.43 +0.06 −0.05 . Similar results are obtained when splitting by flavour category. All the measurements of this paper are compatible with the SM when the appropriate order of theoretical calculations is considered.
Differential cross sections are measured as a function of the transverse momentum of the Z boson, of the transverse momentum of the leading jet, and of an estimate of the mass -37 -of the WZ system; results are compared with predictions from the powheg and Mad-Graph5 amc@nlo generators. Differential cross sections as a function of the transverse momentum of the leading jet are also measured for each sign of the W boson charge. Confidence intervals for anomalous triple gauge boson couplings are extracted for each of the possible one-and two-dimensional combinations of the anomalous couplings parameters, using the M (WZ) variable in a maximum likelihood fit. The confidence intervals obtained represent the most stringent results on the anomalous WWZ triple gauge coupling to date.
[4] ATLAS collaboration, Measurement of the W ± Z production cross section and limits on anomalous triple gauge couplings in proton-proton collisions at √ s = 7 TeV with the ATLAS [50] CMS collaboration, Measurement of the cross section for top quark pair production in association with a W or Z boson in proton-proton collisions at √ s = 13 TeV, JHEP 08 -42 -