Measurement of the Drell-Yan forward-backward asymmetry at high dilepton masses in proton-proton collisions at $\sqrt{s} =$ 13 TeV

A measurement of the forward-backward asymmetry of pairs of oppositely charged leptons (dimuons and dielectrons) produced by the Drell-Yan process in proton-proton collisions is presented. The data sample corresponds to an integrated luminosity of 138 fb$^{-1}$ collected with the CMS detector at the LHC at a center-of-mass energy of 13 TeV. The asymmetry is measured as a function of lepton pair mass for masses larger than 170 GeV and compared with standard model predictions. An inclusive measurement across both channels and the full mass range yields an asymmetry of 0.612 $\pm$ 0.005 (stat) $\pm$ 0.007 (syst). As a test of lepton flavor universality, the difference between the dimuon and dielectron asymmetries is measured as well. No statistically significant deviations from standard model predictions are observed. The measurements are used to set limits on the presence of additional gauge bosons. For a Z' boson in the sequential standard model the observed (expected) 95% confidence level lower limit on the Z' mass is 4.4 (3.7) TeV.


Introduction
Drell-Yan (DY) production of pairs of same-flavor, oppositely charged leptons ( + − ) in protonproton (pp) collisions occurs via the s-channel exchange of Z/γ * bosons. At leading order (LO) in quantum chromodynamics (QCD), the Z/γ * bosons are produced via quark-antiquark (qq) annihilation. The presence of both vector and axial couplings of gauge bosons to leptons results in a forward-backward asymmetry, A FB , in the angular distribution of the final-state lepton with respect to the initial-state quark. Deviations from the standard model (SM) predictions of A FB can result from the presence of an additional neutral gauge boson (Z ) [1][2][3][4][5][6][7], quark-lepton compositeness [8], large extra dimensions [9], vector-like fermions [10], certain dark matter candidates [11], or leptoquarks [12]. Hints of a violation of lepton flavor universality in several measurements recently reported by the LHCb Collaboration [13][14][15] have sparked interest in models for physics beyond the SM that could explain these effects. Many of these models include heavy neutral gauge bosons or leptoquarks with non-flavor-universal couplings [16,17], which could produce signatures in high-mass dilepton distributions [18]. As compared to measurements of differential cross sections at high dilepton masses, measurements of A FB have a reduced dependence on systematic uncertainties related to the reconstruction and identification of high-momentum leptons. Hence, precision measurements of A FB can provide stringent tests of the electroweak sector of the SM and of lepton flavor universality. The main results of this paper are measurements of A FB as a function of dilepton mass for both muons and electrons in the high-mass region (>170 GeV), which is particularly sensitive to potential contributions from new physics.
The asymmetry is defined in terms of the angle θ between the negatively charged final-state lepton and the initial-state quark (meant here in contrast to the antiquark) in the + − centerof-mass frame, where σ F (σ B ) is the total cross section for forward (backward) events, defined by cos θ > 0 (<0). The sign of θ is defined so that cos θ = 1 events are those in which the negatively charged final-state lepton is traveling in the same direction as the incident quark. The A FB value can also be directly related to the parton-level differential cross section. The latter can be written as [19,20]: dσ d cos θ ∝ 3 8 1 + cos 2 θ + A 0 2 1 − 3 cos 2 θ + A 4 cos θ , where A 0 and A 4 are the standard dimensionless constants parameterizing the angular distribution of the DY process [19,20]. The terms that are functions of cos θ with even parity do not contribute to A FB , leading to the relation: This form of the angular distribution is general for any s-channel processes mediated by a spin-1 boson and is thus applicable in the case of interference from additional heavy vector bosons. The angular coefficients A 0 and A 4 vary as functions of the mass (m), transverse momentum (p T ), and rapidity (y) of the dilepton system. As the dilepton transverse momentum approaches zero, A 0 vanishes. It is nonzero for finite Z/γ * p T caused by processes in which Z/γ * bosons are produced in association with additional jets. Thus, measurements of A 0 probe higher-order corrections in perturbative QCD. Additional results of this paper are the measurements of A 0 as a function of the dilepton mass in the same high-mass region (>170 GeV).
Electroweak interference between the photon and the Z boson leads to negative values of A FB with large absolute value for masses below the Z pole (m < 80 GeV), and leads to large and positive values of A FB above the Z pole (m > 110 GeV). Near the Z boson mass peak, A FB reflects pure Z exchange and is close to zero because of the small value of the charged-lepton vector coupling to Z bosons. In the high-mass region, above the Z boson peak, the SM value of A FB is approximately constant with a value of ≈0.6. The A 0 coefficient is expected to be ≈0.06 at 170 GeV and should decrease at higher masses.
For the case of a new heavy Z , off-shell interference can produce deviations at masses significantly lower than the Z mass. The deviation from the SM A FB is insensitive to the width of the Z [6,7]. This measurement thus offers a complementary approach to previous searches for new physics in the dilepton channel that have searched for a peak in the invariant mass distribution caused by the resonant production of a new particle [21,22]. Additionally, because the constraining power of this technique is based on interference rather than direct production, its sensitivity to higher mass scales is not limited by the center-of-mass energy and will continuously improve with the increased statistical precision with the addition of future LHC data.
When the dilepton system has nonzero p T , the exact directions of the incident partons are unknown since they are no longer collinear with the proton beams. To minimize the impact of this effect on the asymmetry measurement, the Collins-Soper rest frame [23] is used. In this frame, θ * is defined as the angle between the negatively charged lepton and the axis that bisects the angle between the incident parton directions. Approximating the leptons as massless, cos θ * can be computed from lab frame variables as: with where E i is the energy and p z,i the longitudinal momentum of the lepton (i = 1) and antilepton (i = 2). The sign of cos θ * is defined with respect to the direction of the incident quark, which is unknown for collisions in a pp collider. Typically, the quarks in the collision will carry a larger momentum fraction than the antiquarks, since only the quarks are valence partons of protons. Thus, usually the lepton pair will have longitudinal momentum along the direction of the incident quark. Therefore, instead of using the initial quark direction, one can define the positive axis to be the longitudinal direction of the lepton pair. We denote the angular variable defined in this way as: where p z is the longitudinal momentum of the dilepton system. The presence of events where the quark direction does not match the lepton pair direction dilutes the asymmetry observed in cos θ R as compared with the underlying asymmetry in cos θ * . The fraction of events for which the quark direction is the same as the longitudinal momentum of the lepton pair increases with the absolute value of the dilepton rapidity. At an invariant mass of 500 GeV, this fraction is ≈60% at |y| = 0.2 and ≈95% at |y| = 2.  [26,27]. The results presented in this paper at √ s = 13 TeV takes a new approach to measuring A FB . Rather than counting forward and backward events, A FB is extracted by fitting a set of templates constructed to represent the different terms in Eq. (2) to the measured angular distribution. These templates are constructed from Monte Carlo (MC) simulations, which include the dilution effect. The A FB measured by the template fitting method corrects for the dilution effect and thus determines the asymmetry in cos θ * , whereas the A FB measured by the traditional counting method does not correct for the dilution and determines the asymmetry in cos θ R . Additionally, the template-based measurement of A FB optimally combines the information from the full fiducial dilepton rapidity range into a single measurement, rather than being reported differentially in rapidity as is done in the counting method. This means that the A FB values from this measurement are not directly comparable with the previous CMS results. The previous ATLAS publication utilized the counting method but also included additional results unfolding the dilution effect, that are comparable with this measurement. This template-based measurement is a maximum-likelihood estimate of A FB and leads to a ≈20% smaller uncertainty than a simple counting-based measurement [28]. However it relies more heavily on parton distribution functions, and cannot be as easily reinterpreted using new models as a counting-based measurement of the diluted A FB .

Analysis strategy
The observed distribution of the reconstructed scattering angle, cos θ R , abbreviated as c R , can be expressed as a convolution of the cos θ * , abbreviated c * , distribution defined in Eq. (2), where C is a normalization constant, R is a "reconstruction function" that incorporates detector resolution and the dilution effect, and ε is an efficiency function. It is important to point out that convolutions are linear operations, which implies that the resulting reconstructed distribution can be expressed as a sum of convolutions of each of the terms in Eq. (2).
This linearity allows the fitting function to be represented by a set of parameter-independent templates corresponding to the different terms in Eq. (2). By fitting the observed data distribution to combinations of these signal templates and additional background templates, one can extract A FB and A 0 . The DY signal templates, representing the reconstructed angular distribution corresponding to each of the terms in the true angular distribution of Eq. (2), are constructed by reweighting MC simulated events. The details of the signal template construction are discussed in Section 7.
The templates for the dominant backgrounds are extracted from MC simulation and validated in a control region of eµ events. Templates for additional backgrounds that are not well modeled by MC are constructed based on control samples in data.
Because the form of Eq. (2) is general for any s-channel spin-1 process, the measured values of A FB can be used to set limits on the existence of heavy Z bosons.
To search for signs of lepton flavor universality violation, the difference between A FB and A 0 in the muon and electron channels can be measured directly. The parameters of interest in these measurements are ∆A FB = A FB,µµ − A FB,ee and ∆A 0 = A 0,µµ − A 0,ee . The direct measurement of ∆A FB is less sensitive to systematic uncertainties common to the muon and electron channels than the measurement of A FB in the individual channels, and thus has reduced systematic uncertainty.

Fiducial corrections and measurement interpretation
In this analysis, A FB and A 0 are measured as functions of the dilepton mass, but A FB and A 0 also depend on the dilepton p T and rapidity. A single A 0 parameter and a single A FB parameter are measured in each mass bin, representing effective values, integrated over p T and rapidity. However, because the acceptance of dilepton events in the CMS detector is not independent of the dilepton p T and rapidity, the effective A 0 or A FB of recorded events is not the same as the effective value in the full phase space. The effect was studied in MC simulations and it was found that differences between the fiducial and full phase space values of A 0 were insignificant, but the differences in A FB could be as large as 2% in the lower mass bins. Using these MC simulations, a correction factor was derived to convert the fiducial A FB directly measured in the data to that of the full phase space. This correction factor is then applied to the result of the template fit so that the final reported value is a measurement of the A FB in the full phase space. Uncertainties in the MC simulation are propagated through the evaluation of this correction factor, and are treated as a source of systematic uncertainty in the final measurement This template fitting technique automatically accounts for all other resolution, dilution, migration, and acceptance effects since they are modeled in the simulation. Uncertainties in the simulation of these effects can be accounted for using variations of the corresponding templates, as discussed in Section 8. The final result can thus be interpreted as a measurement of the partonic A FB in different bins of the reconstructed lepton pair mass. Because the mass bins used are large and the SM A FB is essentially independent of the dilepton mass in the high-mass region of our measurement, the measurement is not unfolded to parton-level mass bins.

The CMS detector and physics objects
The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter (HCAL), each composed of a barrel and two endcap sections. Forward calorimeters extend the pseudorapidity (η) coverage provided by the barrel and endcap detectors. Muons are detected in gas-ionization chambers embedded in the steel flux-return yoke outside the solenoid. A more detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, is reported in Ref. [29].
Events of interest are selected using a two-tiered trigger system. The first level, composed of custom hardware processors, uses information from the calorimeters and muon detectors to select events at a rate of around 100 kHz within a fixed latency of about 4 µs [30]. The second level, known as the high-level trigger, consists of a farm of processors running a version of the full event reconstruction software optimized for fast processing, and reduces the event rate to around 1 kHz before data storage [31].
The particle-flow algorithm [32] aims to reconstruct and identify each individual particle in an event, with an optimized combination of information from the various elements of the CMS detector. The energy of photons is obtained from the ECAL measurement. The energy of electrons is determined from a combination of the electron momentum at the primary interaction vertex as determined by the tracker, the energy of the corresponding ECAL cluster, and the energy sum of all bremsstrahlung photons spatially compatible with originating from the elec-tron track. The energy of muons is obtained from the curvature of the corresponding track. The energy of charged hadrons is determined from a combination of their momentum measured in the tracker and the matching ECAL and HCAL energy deposits, corrected for the response function of the calorimeters to hadronic showers. Finally, the energy of neutral hadrons is obtained from the corresponding corrected ECAL and HCAL energies. The candidate vertex with the largest value of summed physics-object p 2 T is assigned to be the primary pp interaction vertex. The physics objects are the jets, clustered using the anti-k T jet finding algorithm [33,34] with the tracks assigned to candidate vertices as inputs.
Muons are measured in the range |η| < 2.4, with detection planes made using three technologies: drift tubes, cathode strip chambers, and resistive-plate chambers. The single-muon trigger efficiency exceeds 90% over the full η range, and the efficiency to reconstruct and identify muons is greater than 96%. Matching muons to tracks measured in the silicon tracker results in a relative p T resolution of 1% in the barrel and 3% in the endcaps for muons with p T up to 100 GeV, and of better than 7% for muons in the barrel with p T up to 1 TeV [35].
The single-electron trigger efficiency is approximately 80% over the full η range, and the efficiency to reconstruct and identify electrons is greater than 65% for electrons with p T > 20 GeV. The momentum resolution for electrons with p T ≈ 45 GeV from Z → ee decays ranges from 1.7-4.5%. It is generally better in the barrel region than in the endcaps, and also depends on the bremsstrahlung energy emitted by the electron as it traverses the material in front of the ECAL [36].
Jets are clustered from the particle-flow candidates in an event using the anti-k T jet finding algorithm with a distance parameter of 0.4. Jet momentum is determined as the vectorial sum of all particle momenta in the jet, and is found from simulation to be, on average, within 5-10% of the true momentum over the whole p T spectrum and detector acceptance. Additional pp interactions within the same or nearby bunch crossings ("pileup") can contribute additional tracks and calorimetric energy depositions, increasing the apparent jet momentum. To mitigate this effect, tracks identified as originating from pileup vertices are discarded and an offset correction is applied to correct for remaining contributions [37]. Jet energy corrections are derived from simulation studies so that the average measured energy of jets becomes identical to that of particle level jets [38]. In situ measurements of the momentum balance in dijet, photon+jet, Z+jet, and multijet events are used to determine any residual differences between the jet energy scale in data and in simulation, and appropriate corrections are made [38]. The missing transverse momentum vector ( p miss T ) is defined as the negative vector p T sum of all the particle-flow candidates in an event, and its magnitude is denoted as p miss   [48,49] are used for all three data-taking periods.

Data and Monte Carlo samples
Other processes that can give a final state with two oppositely charged same-flavor leptons are diboson production (WW, WZ, ZZ), photon-induced dilepton production (γγ → ), top quark pair production (tt), and single top quark production in association with a W boson (tW). The tt and tW backgrounds are generated at NLO using POWHEG v2.0 [50][51][52][53], and interfaced to PYTHIA with the CUETP8M2T4 [54] (CP5) tune for the 2016 (2017-2018) data-taking period. Background samples of ZZ → , ZZ → νν, and WW → ν ν are generated at NLO with POWHEG interfaced to PYTHIA. Background samples of WZ → qq and ZZ → qq are generated at NLO with aMC@NLO interfaced to PYTHIA. The WW, WZ, and ZZ samples corresponding to the 2016 (2017-2018) data-taking period are interfaced with PYTHIA using the CUETP8M1 (CP5) tune. The tt, tW, and diboson backgrounds corresponding to the 2016 (2017-2018) data-taking period use the NNPDF 3.0 (3.1) PDFs. The photon-induced background, γγ → , is simulated using the CEPGEN [55] implementation of LPAIR [56,57], interfaced to PYTHIA v6.429 [58], and using the default proton structure function parameterization of Suri-Yennie [59]. This contribution is split into three parts because the interaction at each proton vertex can be elastic or inelastic.
The cross sections of WZ and ZZ diboson samples are normalized to the NLO predictions calculated with MCFM 6.6 [60], whereas the cross sections of the WW samples are normalized to the next-to-NLO (NNLO) predictions [61]. The total cross section of the tt process is normalized to the prediction with NNLO accuracy in QCD and next-to-next-to-leading-logarithmic accuracy for the soft gluon radiation resummation calculated with TOP++ 2.0 [62].
The detector response for all MC samples is simulated using a detailed description of the CMS detector based on GEANT4 [63]. The pileup distribution in simulation is weighted to match the one observed in data.

Event selection
Events are required to have two leptons of the same flavor and opposite charges. The dimuon and dielectron events are selected by single-muon and single-electron triggers, respectively. The thresholds of these triggers are different for the different data-taking years, and the leading muon or electron in the event is required to have a p T above the trigger threshold. In the analysis, the leading muon p T requirement for the years 2016/2017/2018 is 26/29/26 GeV and for electrons it is 29/38/35 GeV. The subleading lepton is required to have p T > 15 GeV. All muons are required to be within the acceptance of the muon system (|η| < 2.4) and all electrons must be within |η| < 2.5, excluding the barrel-endcap transition region of the ECAL (1.44 < |η| < 1.57). Additionally, to remove cosmic-ray-induced events, the azimuthal angle (φ) between the two muons is required to differ from π by more than 5 mrad.
Each reconstructed muon is required to pass identification criteria that are based on the number of hits observed in the tracker, the response of the muon detectors, and a set of matching criteria between muon track parameters, as measured by the inner tracker and muon detectors. To suppress nonprompt muons coming from heavy-flavor decays, both muons must be isolated from other particles in a cone of size ∆R = 0.3, where ∆R = (∆η) 2 + (∆φ) 2 refers to the distance from the muon to a given track. More details on the muon identification and reconstruction used in this analysis can be found in Refs. [35,64].
The reconstructed electron candidates are required to pass identification criteria that are based on electromagnetic shower shape variables. Electrons originating from photon conversions are suppressed by requiring that the candidates have at most one missing inner tracker hit and are not consistent with being part of a conversion pair. Electrons are also required to be isolated from other particles within a cone of size ∆R = 0.3. The electron isolation criteria are based on the ratio of the electron p T to the sum of energy deposits associated with the photons as well as with the charged and neutral hadrons reconstructed by the particle-flow algorithm. More details on the electron reconstruction and identification criteria used in this analysis are described in Ref. [36].
To suppress backgrounds that contain the decays of top quarks, two additional requirements are applied. First, events are required to have p miss T < 100 GeV. Second, it is required that neither of the two highest p T jets in the event with |η| < 2.4 are identified as a bottom quark jet (b tagged). Jets originating from decays of bottom quarks are identified using an algorithm that combines lifetime information from tracks and secondary vertices [65]. A working point is used that has a 68% efficiency of correctly identifying a bottom quark jet and a 1% probability of misidentifying a light-flavor quark or gluon jet as a bottom quark jet.

Backgrounds
At large lepton pair masses, the dominant background comes from fully leptonic decays of tt events. There are also backgrounds from tW events, diboson processes and ττ leptonic decays. All of these backgrounds are well modeled in simulation, and the estimated event yields are validated in data in a control region of eµ events. Multijet and W+jets events where one or more jets are incorrectly identified as leptons (denoted as "MisID" events) are also a source of background. This background is larger for the ee channel than for the µµ channel. A technique based on control samples in data is employed to estimate this background. There are also backgrounds coming from t-channel photon-induced dilepton production, which are modeled with MC simulation as well.
The MisID background is estimated from data using the "misidentification rate" method. Descriptions of this method are reported in Refs. [66,67]. The misidentification rate is defined as the probability of a jet, having been reconstructed as a lepton candidate, to pass the lepton selection requirements. This rate is measured in a sample with two leptons coming from a Z boson decay and an additional, potentially misidentified, lepton. These two leptons are required to pass the lepton identification requirements and have an invariant mass within 7 GeV of the Z boson mass [68]. The presence of the third lepton candidate is used as a probe to measure the misidentification rate. These lepton candidates are required to pass a set of identification and isolation requirements less stringent than the full selection requirements of the analysis. The MisID background can then be estimated from a sample of data events with two lepton candidates where at least one of the candidates fails the full selection requirements. Events from this sample are assigned weights based on the expected misidentification probability of the failing lepton candidates. Contamination of this sample by lepton pairs from DY and other prompt processes is subtracted using MC simulation. These reweighted events are used to estimate the yield and shape of the MisID background.
This method of background estimation is validated in a control region. This control region has all the same selection criteria and covers the same dilepton mass range (m > 170 GeV) as the signal region except the lepton pairs are required to have same-sign rather than opposite-sign charges. A large fraction of the events in this control region stem from misidentified jets. In the ee channel, there is also a significant contribution to this sample from opposite-sign DY events where one of the electrons has had its charge incorrectly assigned. The rate of this misassignment is not modeled well in simulation, and so a correction to the MC is derived in the mass window of the Z peak (70 < m < 110 GeV) and applied. The MisID background estimate as well as MC estimates of other backgrounds are compared with the observed yield of same-sign events. A cos θ R -dependent correction to the MisID background estimate is derived using the ratio of the MisID estimate to the number of observed same-sign events minus other backgrounds. The uncertainty in this correction is calculated as the quadratic sum of the statistical uncertainties from the limited sample size, uncertainty in the amount of DY pairs reconstructed as same-sign pairs, as well as systematic uncertainty reflecting possible differences in shape between same-sign and opposite-sign MisID estimates.
Because the ττ, tW, tt, and diboson backgrounds all have decays to eµ pairs as well, the MC simulation of these processes can be validated in data. The eµ control region used has all the same selection criteria as the signal region, except that events are required to have one muon and one electron rather than a pair of the same flavor. The muon is required to have p T > 26/29/26 GeV for the years 2016/2017/2018 and pass the muon trigger used in the main analysis, and the electron is required to have p T > 15 GeV. There are also MisID events in this eµ control region coming from QCD multijet and W+jets backgrounds. These are estimated using the misidentification rate technique previously described. The dilepton invariant mass and cos θ R distributions of eµ events are shown in Fig. 1. Good agreement is observed between the simulated and observed yield of eµ events across the entire mass range.  Because diboson events are produced via electroweak processes, they are expected to have a small forward-backward asymmetry in their lepton pairs; tt events are also known to have an asymmetry, but it is too small to be detected in its lepton pairs alone [69]. The QCD and W+jets backgrounds should have no asymmetry. Based on MC and control sample estimates, we predict an overall forward-backward asymmetry of ≈0.01 in the sample of eµ events. Using the definition in Eq. (1), the A FB of the eµ sample is found to be 0.012 ± 0.003, consistent with this expectation.
Although the overall normalizations and asymmetries of the MC eµ estimates are observed to be consistent with data, there are discrepancies between the predicted and observed shapes of the cos θ R distribution in certain ranges of dilepton mass. To address these discrepancies, a cos θ R correction is derived from the eµ sample in data, based on the ratio of the observed and predicted cos θ R distributions. Because the asymmetry is modeled well, this correction is derived symmetrically in cos θ R using four bins of |cos θ R |. To account for the changing cos θ R shapes as a function of the dilepton mass, the correction is derived separately in different mass bins. Corrections are derived in five mass bins (170, 200, 250, 320, 510, 3000 GeV) with edges matching those used in the A FB measurement (defined in Section 7), combining the highest mass bins because of a limited event count. These corrections are applied to modify the shapes of the templates used to model the corresponding backgrounds in the signal region. They change the estimated cos θ R shape of these backgrounds and introduce uncertainties in these shapes at the level of by ≈ 5 ± 4% and ≈ 15 ± 10% in the lowest and highest mass bins, respectively. Figure 2 shows a comparison between measured µµ and ee events and our estimates in the signal region after all corrections have been applied. Good agreement is observed between the simulated and observed amount of µµ and ee events.

Template construction
From the MC sample of DY events, templates for each of the three terms in Eq. (2) need to be constructed. To avoid any of the templates having a negative yield in any of the bins, two reparameterizations are performed. The angular distribution is rewritten in a slightly different form than Eq. (2): This form is equivalent to Eq. (2) for α = 2A 0 /(2 − A 0 ). The term corresponding to NLO QCD production (1 − c 2 * ) is now strictly positive. The three templates, f S , f α , and f A , can now be constructed to represent the 1 + c 2 * , 1 − c 2 * , and c * terms in Eq. (7). These templates are constructed by reweighting MC events using generatorlevel quantities but are binned in the reconstructed variables. Note that the generator-level c * and the reconstructed c R can have different signs. To minimize the impact of statistical fluctuations in our MC simulations, each event is used twice in the template construction. Each event is used once with +c * and once with −c * , and each use is given half weight to keep the normalization unchanged. The following reweighting factors are used to construct the f S , f α , and f A templates respectively: The α values in the denominator are extracted from fits to generator level distributions of our MC simulation events.   The left plot shows the µµ channel and the right plot the ee channel. The black points with error bars represent the data and their statistical uncertainties, whereas the combined signal and background expectation is shown as stacked histograms. The hatched band shows the systematic uncertainty in the expected signal and background yield. The sources of this uncertainty are discussed in Section 8. The lower panels show the ratio of the data to the expectation. The gray bands represents the total uncertainty in the predicted yield.
Then a histogram of CMS data events, f data , could be parameterized by: where the normalization factor N(α) = 3/[4(2 + α)] has been introduced for convenience and f j bkg represents templates for the different backgrounds.
To avoid the negative values of the antisymmetric template ( f A ), two positive-valued linear combinations of f A and f S , are constructed as: Then CMS data in each mass bin can be fit by the following parameterization: where the asymmetry A FB and α are allowed to float. Within each mass bin, the two-dimensional (2D) templates are binned in c R and lepton pair rapidity |y|. Because of the acceptance effects, there are very few events with high rapidity and large |c R |. Therefore, a different c R binning is adopted at high rapidity. There are four bins in |y| with bin edges at: 0.0, 0.6, 1.0, 1.5, and 2.4. For |y| < 1.0, there are eight uniform c R bins from -1.0 to 1.0. For |y| > 1.0, the pairs of bins with |c R | > 0.5 are merged leaving six c R bins in this region. Additionally, there are few events with large rapidity at high mass so for the highest two mass bins (700-1000 and >1000 GeV) the two rapidity bins with |y| > 1.0 are merged leaving three |y| bins in this region.
There are several templates for various background processes. Because of their similar shapes and the small magnitude of the tW background, the top quark related background processes (tt, tW) are combined into a single background template. Diboson backgrounds (WW, ZZ, WZ) are combined into a single background template as well. The γγ → process is a separate background template and constitutes 1-4% of the total cross section in the mass range of the measurement. The MisID estimate is also included as a separate background template. A Z/γ * → ττ template is included as well. To minimize statistical fluctuations, the templates for the top and MisID backgrounds are symmetrized in cos θ R . The templates for the diboson, Z/γ * → ττ, and γγ → processes are not symmetrized because these processes have perceptible asymmetries.
Differential measurements of A FB and A 0 are performed by fitting each mass bin independently, with a set of nuisance parameters uncorrelated with those in other mass bins. The results of an inclusive measurement in which all mass bins (m > 170 GeV) are fit simultaneously is also reported. Within each mass bin all three years are fit simultaneously as separate categories. Two sets of fits are performed to extract A FB and A 0 : one in which the A FB and A 0 parameters in the ee channel and µµ channels are allowed to float independently, and one in which they are fit with common A FB and A 0 parameters. An additional set of fits is performed to extract ∆A FB and ∆A 0 . In these fits, the A FB and A 0 parameters in the ee channel are used as a reference and the A FB and A 0 parameters in the µµ channel are constructed in relation to them using the definitions of ∆A FB and ∆A 0 . The A FB and A 0 values in the ee channel are allowed to freely float in the measurements of ∆A FB and ∆A 0 . These measurements of ∆A FB and ∆A 0 are performed separately for each mass bin, as well as inclusively across all mass bins.

Systematic uncertainties
Systematic uncertainties in the normalization and shape of templates arise from a variety of sources and are defined through nuisance parameters in the likelihood. For systematic uncertainties that can change the shape of a template, shifted templates are constructed by varying the source of the systematic uncertainty up and down within its uncertainty. Shape uncertainties are incorporated into the likelihood by interpolation between the nominal and shifted templates, constrained with a Gaussian prior. The interpolation is calculated with a sixth-order polynomial for shifts smaller than one Gaussian σ, and with a linear function for shifts beyond one Gaussian σ. Normalization uncertainties are included using log-normal priors.
PDFs: variations in PDFs can change the dilution factor and kinematic distributions of the MC simulation samples. The 100 NNPDF set replicas are converted to 60 Hessian eigenvector variations [70]. For each of these 60 variations an "up" template is constructed using the variation's weight. A "down" template is then constructed that symmetrizes the deviation from the nominal template. Each variation is treated as a separate shape uncertainty yielding 60 nuisance parameters associated with PDFs in the fit.
MisID background estimate: the uncertainty in the normalization of the MisID background and the uncertainty in the shape of the MisID cos θ R distribution are considered separately. Their normalizations are assigned a 50% uncertainty based on closure studies of the misidentification rate technique performed in simulation. The uncertainties in the shape corrections come from the statistical uncertainties of the same-sign validation and systematic uncertainties reflecting the shape differences between the same-sign and opposite-sign MisID estimates.
Lepton efficiencies: scale factors are derived and applied to the simulated MC events to account for the differences in the detector performance between CMS data and the MC simulation. The efficiencies for the trigger, lepton identification, and lepton isolation are measured as functions of lepton p T and η using the "tag-and-probe" [35, 36] method for both data and simulation. For muons, separate uncertainties are included for the muon trigger, identification, and isolation efficiencies, with the uncertainty in reconstruction efficiency being negligible. For electrons, isolation is part of the identification requirements rather than a separate selection, so separate uncertainties are included for the electron trigger, reconstruction, and identification/isolation efficiencies. The efficiency uncertainties also include the effects of timing issues in the ECAL endcaps and muon chambers that caused inefficiencies in the first-level trigger in the range of 0-3% for the different data-taking periods. For each of these uncertainties, up and down templates are constructed from MC simulations using the up and down values from the uncertainty in the scale factors.
Lepton scale corrections: the muon momentum and electron energy scales can be affected by detector alignment and imperfect calibration. Such issues are corrected by additional energy and momentum scale corrections applied to both simulation and data. A bias in the muon momentum reconstruction can occur because of the differences in the tracker alignment between CMS data and simulation, as well as a residual magnetic field mismodeling. The muon momentum scale corrections are applied using the procedure described in Refs. [64,71]. The electron energy deposits, as measured in the ECAL, are subject to a set of corrections involving information from both the ECAL and tracker [36]. Separate up and down templates are constructed for the electron smearing, and muon and electron scales based on the uncertainty of these corrections.
Cross sections: a systematic uncertainty is attributed to the normalization of signal and background samples estimated by MC event generators. Based on NNLO calculations using the FEWZ v3.1 simulation code [72], a 3% uncertainty is assigned to the DY cross section. Based on NLO calculations, a 5% uncertainty is attributed to the backgrounds from top quark decays [62], and a 4% uncertainty in the diboson backgrounds [61,73,74]. A 6% uncertainty is attributed to the normalization of the γγ → background, and it has been verified that this value covers the differences of samples generated using Suri-Yennie structure functions and the LuxQED [75,76] photon PDFs.
Statistical uncertainties in templates: we treat the uncertainty due to the finite number of MC events, as well as the finite number of data events used to build the MisID templates, using a modified "Barlow-Beeston-like" approach [77], which adds a Gaussian uncertainty reflecting the number of events accumulated in each template bin. However, template bins at ± cos θ R have partially correlated uncertainties because of events being used multiple times in the construction of signal, MisID background, and top quark background templates, but not reused in the γγ → , ττ, and diboson background templates. To account for this correlation, additional constraint terms are added in the fit. These additional terms constrain the differences in the Barlow-Beeston-like nuisance parameters in correlated bins with Gaussian constraints. For each channel and mass bin, the correlation coefficient between bins at ± cos θ R is computed and the standard deviation of the Gaussian function chosen conservatively based on the maximum amount of uncorrelated uncertainty in each mass bin. Based on the computed correlations, a variance of 0.1 is used for the Gaussian constraints on the template bins in the first four mass bins, and a variance of 0.6 for the Gaussian constraints on template bins in the highest three mass bins. Renormalization + factorization scale and strong coupling: the renormalization (µ R ) and factorization scales (µ F ), and strong coupling (α S ) used in MC generation can also have an effect on the shape and normalization of the templates. The µ R and µ F uncertainties are included by reweighting simulated events to match alternative scenarios where µ R and µ F are scaled up and down by a factor of two, both independently and simultaneously. The unphysical combinations where µ R and µ F are scaled in opposite directions are not included. Three nuisance parameters, describing µ R , µ F and combined µ R and µ F scale variations are included in the fit and are taken to be uncorrelated with respect to each other. Simulated events are also reweighted to account for variations of α S and used to construct up and down templates. A central value of α S = 0.118 and variations of ± 0.0015 are used [78].
Background shape corrections: as discussed in Section 6, a shape correction based on eµ data events is applied to MC simulation derived backgrounds from top quarks and diboson events. The uncertainties in this shape correction are modeled with four independent nuisance parameters. Each parameter allows the values of the shape correction in a particular |cos θ R | bin to vary up and down within its statistical uncertainty. DY p T correction: it is known that the DY dilepton p T spectrum is difficult to model at low p T [79,80]. Because acceptance is correlated with p T , mismodeling of the p T spectrum can result in mismodeling of the DY contribution. For this reason, the DY MC p T spectrum in each mass bin is corrected based on data. For each mass bin used in the measurement, the normalized DY MC simulation p T distribution is compared with the normalized data distribution, less the contributions from backgrounds. A correction factor is then derived based on the ratio of these two distributions. When building the templates used in the measurement, events are reweighted with this additional factor so that the events used to build the templates match the p T spectrum of data events. To model the uncertainty in this correction seven separate nuisance parameters are used, one for each p T bin. For each parameter the correction in a particular p T bin is varied up and down within its statistical uncertainty to produce up and down templates. The total uncertainty in these corrections leads to an uncertainty in the signal templates of roughly 1%.
Pileup: to account for the uncertainty originating from the differences in the measured and simulated pileup distribution, the total inelastic cross section is varied by ± 4.6% [81], and the reweighting factor applied to MC simulated samples is recomputed. Up and down MC simulation templates are constructed based on the up and down variations of the reweighting factor.
b tag veto efficiency: scale factors dependent on jet flavor, p T , and η are applied to simulated events to correct for differences in b tagging efficiency and mistag rates between data and simulation [65]. Up and down templates are constructed based on the uncertainty in these efficiencies. As discussed in Section 2, there is also an uncertainty in the final measurement of A FB based on the extrapolation from the fiducial region to the full phase space. This uncertainty is added as an additional source of systematic uncertainty when the correction is applied after the template fit.
The contributions from different sources of systematic uncertainty are evaluated by redoing the fit while different groups of nuisances are fixed to their nominal values and taking the quadrature difference between the resulting uncertainty and the full uncertainty. A comparison of the magnitude of the different systematic uncertainties on the A FB and ∆A FB measurements is shown in Table 1. The single largest source of systematic uncertainty in the measurement of A FB are the PDFs, but it is a negligible source of uncertainty in the measurement of ∆A FB . The dominant sources of systematic uncertainty are common to the electron and muon channels and thus their combination reduces the statistical uncertainty of the measurement but does not significantly reduce the systematic uncertainty.

Results
For all relevant results presented in this section, the best-fit value of the parameter and an approximate 68% CL confidence interval are extracted following the procedure described in Section 3.2 of [82]. The results for the template fits to data to extract A FB in different mass bins are presented in Table 2 and shown graphically in Fig. 3; the results for A 0 are presented in Table 3 and shown graphically in Fig. 4. The results for the extraction of ∆A FB and ∆A 0 in different mass bins and inclusively are presented in Table 4. The results for ∆A FB are also shown graphically in Fig. 5. The resulting exclusion limits are shown in Fig. 6. The contribution of γγ → events as compared to DY events in each mass bin is shown in Table 5. Comparisons of the data and the postfit predictions are shown in Figs. 7-9. 0.592 ± 0.012 ± 0.010 0.635 ± 0.015 ± 0.011 0.608 ± 0.009 ± 0.010 0.608 ± 0.006 250-320 0.558 ± 0.014 ± 0.009 0.610 ± 0.018 ± 0.010 0.578 ± 0.011 ± 0.009 0.603 ± 0.007 320-510 0.598 ± 0.014 ± 0.009 0.583 ± 0.018 ± 0.009 0.592 ± 0.011 ± 0.008 0.603 ± 0.005 510-700 0.610 ± 0.027 ± 0.008 0.624 ± 0.033 ± 0.009 0.616 ± 0.021 ± 0.008 0.604 ± 0.004 700-1000 0.617 ± 0.042 ± 0.009 0.563 ± 0.048 ± 0.008 0.594 ± 0.032 ± 0.008 0.606 ± 0.002 >1000 0.595 ± 0.070 ± 0.011 0.694 ± 0.076 ± 0.014 0.638 ± 0.052 ± 0.011 0.610 ± 0.002 Inclusive, Mass >170 0.602 ± 0.006 ± 0.007 0.628 ± 0.008 ± 0.007 0.612 ± 0.005 ± 0.007 0.608 ± 0.006 Comb./SM Figure 3: Measurement of the DY forward-backward asymmetry as a function the dilepton mass compared with the MC predictions. The green line is the predicted value for A FB from the aMC@NLO simulation and the shaded green region its uncertainty. The red, blue, and black points and error bars represent the dimuon, dielectron, and combined measurements, respectively. Error bars on the measurements include both statistical and systematic components. The bottom panel shows the ratio between the combined measurement and the aMC@NLO prediction. In the bottom panel, the vertical error bars represent the uncertainty in the combined measurement and the shaded green band the uncertainty in the aMC@NLO prediction. Table 3: Results for the measurement of A 0 from the maximum likelihood fit to data in different dilepton mass bins in the different channels as well as an inclusive measurement across all mass bins. The first and second uncertainties listed with each measurement are statistical and systematic, respectively. The last columns lists the predictions of A 0 and associated uncertainties from aMC@NLO. To help in the interpretation of these results, we also list the average dilepton p T of the data events in each mass bin.  Figure 4: Measurement of the DY forward-backward asymmetry as a function the dilepton mass compared with the MC predictions. The green line is the predicted value for A FB from the aMC@NLO simulation and the shaded green region its uncertainty. The red, blue, and black points and error bars represent the dimuon, dielectron, and combined measurements, respectively. Error bars on the measurements include both statistical and systematic components. Table 4: Results for the measurement of ∆A FB and ∆A 0 between the muon and electron channels from the maximum likelihood fit to data in different dilepton mass bins as well as an inclusive measurement across all mass bins. The first and second uncertainties listed with each measurement are statistical and systematic, respectively.

Mass (GeV)
170-200 200-250 250-320 320-510 510-700 700-1000 > 1000 γγ → fraction 1.8% 2.1% 2.5% 2.8% 3.3% 3.7% 4.1% In the combined measurement of A FB , no statistically significant deviations from the SM predictions are observed. A small difference is found between the muon and electron A FB 's, with ∆A FB found to be consistently below zero in the lowest three mass bins, as well as in the inclusive measurement. Based on a likelihood scan, the inclusive measurement ∆A FB differs from zero at the level of 2.4 standard deviations.
Given that the existence of new gauge bosons would change the asymmetry well below their resonance masses, the measurement of A FB can be used to constrain the existence of heavy Z bosons. The constraining power is model dependent, because the interference between the Z boson and Z/γ * depends on the couplings of the Z boson. To give an example of how these measurements can be used to constrain models with Z bosons, constraints are derived for the sequential standard model (SSM) [83]. The SSM contains a Z boson with the same coupling strength as the SM Z boson, up to an overall normalization factor in the left-handed coupling, κ L . In order to set 95% confidence level (CL) upper limits on properties of the Z boson, a hypothesis test between the SSM and SM is performed based on a comparison of the predicted values of A FB in each model and our measurements. The test statistic used is the difference in χ 2 between the two models. Predictions for A FB in the SSM Z are derived from MC estimates using aMC@NLO [84]. Only the measurements of A FB in the three highest mass bins are used for the χ 2 calculation. It was checked that the interference from an off-shell several-TeV SSM Z produces negligible effects in the lower mass bins and their inclusion would not improve the limit. Figure 6 shows the expected and observed 95% CL exclusion limits on SSM Z bosons. For κ L = 1, which corresponds to a Z boson with exactly the same couplings as the SM Z boson, a Z with m Z < 4.4 TeV is excluded at 95% CL, whereas the expected limit was 3.7 TeV.
Currently these limits are not as strong as those from direct dilepton resonance searches [21,22], which report limits at 4.90 TeV and 5.15 by ATLAS and CMS, respectively. However, the A FBbased approach is sensitive to wide-width resonances, which may be missed in a search for narrow resonances [7]. Additionally, the sensitivity of the A FB -based approach will continue to improve with additional data, with the sensitivity to heavy mass scales expected to scale as the fourth root of the increase in integrated luminosity. This means limits from the A FBbased approach should surpass the projected sensitivity of resonance searches [85] during the High-Luminosity upgrade of the LHC [86].
The full tabulated results are provided in HEPData [87].  Figure 6: Exclusion limits at 95% CL on the coupling parameter κ L for a Z in the sequential standard model as a function of the Z mass. The expected (observed) limit is shown by the dashed (solid) line. The inner and outer shaded areas around the expected limits show the 68% (green) and 95% (yellow) CL intervals, respectively. The dashed blue line shows κ L = 1 which corresponds to a Z with exactly the same couplings as the SM Z boson.      Data / fit

Summary
The CMS detector at the LHC has been used to measure the Drell-Yan forward-backward asymmetry (A FB ) and the angular coefficient A 0 as functions of dilepton mass for muon and electron pairs with invariant mass above 170 GeV. The measurement is performed using protonproton collision data collected in 2016-2018 at √ s = 13 TeV with an integrated luminosity of 138 fb −1 using a template fitting approach. The combined dimuon and dielectron A FB measurements show good agreement with the standard model predictions across the full mass range. An inclusive measurement across the full mass range yields an A FB of 0.612 ± 0.005 (stat) ± 0.007 (syst) and an A 0 of 0.047 ± 0.005 (stat) ± 0.013 (syst). As a test of lepton flavor universality, the difference between the dimuon and dielectron A FB s is measured and found to agree with zero to within 2.4 standard deviations. Using the combined A FB measurements, limits are set on the existence of additional gauge bosons. For a Z boson in the canonical sequential standard model the observed (expected) 95% confidence level lower limit on the Z mass is 4.4 TeV (3.7 TeV).

Acknowledgments
We congratulate our colleagues in the CERN accelerator departments for the excellent performance of the LHC and thank the technical and administrative staffs at CERN and at other CMS institutes for their contributions to the success of the CMS effort. In addition, we gratefully acknowledge the computing centers and personnel of the Worldwide LHC Computing Grid and other centers for delivering so effectively the computing infrastructure essential to our analyses. Finally, we acknowledge the enduring support for the construction and operation of the LHC, the CMS detector, and the supporting computing infrastructure provided by the   [31] CMS Collaboration, "The CMS trigger system", JINST 12 (2017) P01020, doi:10.1088/1748-0221/12/01/P01020, arXiv:1609.02366.