Measurement of the $\mathrm{Z}/\gamma^{*} \to \tau\tau$ cross section in pp collisions at $\sqrt{s} = $ 13 TeV and validation of $\tau$ lepton analysis techniques

A measurement is presented of the $\mathrm{Z}/\gamma^{*} \to \tau\tau$ cross section in pp collisions at $\sqrt{s} = $ 13 TeV, using data recorded by the CMS experiment at the LHC, corresponding to an integrated luminosity of 2.3 fb$^{-1}$. The product of the inclusive cross section and branching fraction is measured to be $\sigma(\mathrm{pp} \to \mathrm{Z}/\gamma^{*}\text{+X}) \, \mathcal{B}(\mathrm{Z}/\gamma^{*} \to \tau\tau) = $ 1848 $\pm$ 12 (stat) $\pm$ 67 (syst+lumi) pb, in agreement with the standard model expectation, computed at next-to-next-to-leading order accuracy in perturbative quantum chromodynamics. The measurement is used to validate new analysis techniques relevant for future measurements of $\tau$ lepton production. The measurement also provides the reconstruction efficiency and energy scale for $\tau$ decays to hadrons+$\nu_{\tau}$ final states, determined with respective relative uncertainties of 2.2% and 0.9%.

With a lifetime of 2.9 × 10⁻¹³ s, the τ lepton usually decays before reaching the innermost detector. Approximately two thirds of τ leptons decay into a hadronic system and a τ neutrino. Constrained by the τ lepton mass of 1.777 GeV, the hadronic system is characterized by low particle multiplicities, typically consisting of either one or three charged pions or kaons, and up to two neutral pions. The hadrons produced in τ decays therefore also tend to be highly collimated. The τ lepton decays into an electron or muon and two neutrinos with a probability of 35%. We denote the electron and muon produced in τ → eνν and τ → µνν decays by τ e and τ µ , to distinguish them from prompt electrons and muons, respectively. The hadronic system produced in a τ → hadrons + ν τ decay is denoted by the symbol τ h .
The Drell-Yan (DY) [36] production of τ lepton pairs (qq → Z/γ * → ττ) is interesting for several reasons. First, the process Z/γ * → ττ represents a reference signal to study the efficiency to reconstruct and identify τ h , as well as to measure the τ h energy scale. Moreover, Z/γ * → ττ production constitutes the dominant irreducible background to analyses of SM H → ττ events, and to searches for new resonances decaying to τ lepton pairs. The cross section for DY production exceeds the one for SM H production by about two orders of magnitude. Signals from new resonances are expected to be even rarer. It is therefore important to control the rate for Z/γ * → ττ production, as well as its distribution in kinematic observables, with a precision reaching the sub-percent level. In addition, the reducible backgrounds relevant for the study of Z/γ * → ττ are also relevant for studies of SM H production and for searches for new resonances.
This paper reports a precision measurement of the inclusive pp → Z/γ * +X → ττ+X cross section. The measurement demonstrates that Z/γ * → ττ production is well understood, and provides ways to validate techniques relevant in future analyses of τ lepton production. Most notably, a method based on control samples in data is introduced for determining background contributions arising from the misidentification of quark or gluon jets as τ h . Measurements of the τ h identification (ID) efficiency and of the τ h energy scale [37] are obtained as byproducts of the analysis.
The cross section for DY production of τ lepton pairs was previously measured by the CMS and ATLAS experiments in proton-proton (pp) collisions at √s = 7 TeV at the LHC [38,39], and in proton-antiproton collisions at √s = 1.96 TeV by the CDF and D0 experiments at the Fermilab Tevatron [40-42]. In this study, we present the pp → Z/γ * +X → ττ+X cross section measured at √s = 13 TeV, using data recorded by the CMS experiment, corresponding to an integrated luminosity of 2.3 fb⁻¹. Events are selected in the τ e τ h , τ µ τ h , τ h τ h , τ e τ µ , and τ µ τ µ decay channels. The τ e τ e channel is not considered in this analysis, as it was studied previously in the context of the SM H → ττ analysis, and found to be the least sensitive of these channels [43]. The pp → Z/γ * +X → ττ+X cross section is obtained through a simultaneous fit of τ lepton pair mass distributions in all decay channels.
The paper is organized as follows. The CMS detector is described briefly in Section 2. Section 3 describes the data and the Monte Carlo (MC) simulations used in the analysis. The reconstruction of electrons, muons, τ h , and jets, along with various kinematic quantities, is described in

Event reconstruction
The information provided by all CMS subdetectors is employed in a particle-flow (PF) algorithm [69] to identify and reconstruct individual particles in the event, namely muons, electrons, photons, and charged and neutral hadrons. These particles are then used to reconstruct jets, τ h candidates, and the vector imbalance in transverse momentum in the event, referred to as p miss T , as well as to quantify the isolation of leptons.
Electrons are reconstructed using an algorithm [70] that matches trajectories in the silicon tracker to energy depositions in the ECAL. Trajectories of electron candidates are reconstructed using a dedicated algorithm that accounts for the emission of bremsstrahlung photons. The energy loss due to bremsstrahlung is determined by searching for energy depositions in the ECAL located in directions tangential to the track. A multivariate (MVA) approach based on boosted decision trees (BDT) [71] is employed to distinguish electrons from hadrons that mimic electron signatures. Observables that quantify the quality of the electron track, the compactness of the electron cluster in directions transverse and longitudinal relative to the electron motion, and the matching of the track momentum and direction to the sum and positions of energy depositions in the ECAL are used as inputs to the BDT. The BDT is trained on samples of genuine and false electrons, produced in MC simulation. Additional requirements are applied to remove electrons originating from photon conversions.
The identification of muons is based on linking track segments reconstructed in the silicon tracking detector and in the muon system [72]. The matching is done both by starting from a track in the muon system and starting from a track in the inner detector. When a link is established, the track parameters are refitted using the combination of hits in the inner and outer detectors, and the reconstructed trajectory is referred to as a global muon track. Quality criteria are applied on the multiplicity of hits, the number of matched segments, and the quality of the fit to a global muon track, the latter being quantified through a χ 2 criterion.
Electrons and muons in signal events are expected to be isolated, while leptons from heavy flavour (charm and bottom quark) decays, as well as from in-flight decays of pions and kaons, are often reconstructed within jets. Isolated leptons are distinguished from leptons in jets through a sum, denoted by the symbol I ℓ , of the scalar p T values of additional charged particles, neutral hadrons, and photons reconstructed using the PF algorithm within a cone in η and azimuth φ (in radians) of size $\Delta R = \sqrt{(\Delta\eta)^{2} + (\Delta\phi)^{2}} = 0.3$, centred around the lepton direction. Neutral hadrons and photons within the innermost region of the cone, ∆R < 0.01, are excluded from the isolation sum for muons to prevent the footprint of the muon in ECAL and HCAL from causing the muon to fail isolation criteria. When computing the isolation of electrons reconstructed in the ECAL endcap region, we exclude photons within ∆R < 0.08 and charged particles within ∆R < 0.015 of the direction of the electron, to avoid counting photons emitted in bremsstrahlung and tracks originating from the conversion of such photons. As the amount of material that electrons traverse in the barrel region before reaching the ECAL is smaller, the resulting probability for bremsstrahlung emission and photon conversion is sufficiently reduced so as not to require exclusion of particles in the innermost cone from the isolation sum. Efficiency loss due to pileup is kept minimal by considering only charged particles originating from the lepton production vertex ("charged from PV"). The contribution from the neutral component of pileup to the isolation of the lepton is taken into account by means of ∆β corrections [69], which enter the computation of the isolation I ℓ as follows:
$$ I_{\ell} = \sum_{\text{charged from PV}} p_{\mathrm{T}} + \max\left\{ 0,\ \sum_{\text{neutrals}} p_{\mathrm{T}} - \Delta\beta \right\}, \tag{1} $$
where ℓ corresponds to either e or µ, and the sums extend over, respectively, the charged particles that originate from the lepton production vertex and the neutral particles. The "max" function represents taking the larger of the two values within the brackets. The ∆β corrections are computed by summing the scalar p T of charged particles that are within a cone of size ∆R = 0.3 around the lepton direction, but do not originate from the lepton production vertex ("charged from PU"), and scaling that sum by a factor of one-half:
$$ \Delta\beta = 0.5 \sum_{\text{charged from PU}} p_{\mathrm{T}}. \tag{2} $$
The factor of 0.5 approximates the phenomenological ratio of neutral-to-charged hadron production in the hadronization of inelastic pp collisions.
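The ∆β-corrected isolation described above can be sketched in a few lines of Python. This is a minimal illustration, not the CMS implementation: the function names and the dictionary layout of the particle records (`pt`, `eta`, `phi`, `kind`, `from_pv`) are assumptions made for this example, and the inner-cone vetoes for muons and endcap electrons are deliberately left out.

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance sqrt(d_eta^2 + d_phi^2), with d_phi wrapped into [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)

def delta_beta_isolation(lepton, particles, cone=0.3):
    """Isolation sum in GeV: charged-from-PV pT plus pileup-corrected neutral pT.

    Hypothetical record layout: each particle is a dict with keys
    "pt", "eta", "phi", "kind" ("charged" or "neutral") and "from_pv" (bool).
    "neutral" stands for neutral hadrons and photons together.
    """
    charged_pv = 0.0  # charged particles from the lepton production vertex
    charged_pu = 0.0  # charged particles from pileup vertices
    neutral = 0.0     # neutral hadrons and photons
    for p in particles:
        if delta_r(lepton["eta"], lepton["phi"], p["eta"], p["phi"]) >= cone:
            continue  # outside the isolation cone
        if p["kind"] == "neutral":
            neutral += p["pt"]
        elif p["from_pv"]:
            charged_pv += p["pt"]
        else:
            charged_pu += p["pt"]
    delta_beta = 0.5 * charged_pu  # estimate of the neutral pileup contribution
    return charged_pv + max(0.0, neutral - delta_beta)
```

A lepton would then pass, for example, the requirement of the τ e τ h and τ µ τ h channels if the returned sum is below 0.10 times its p T . The `max` term prevents an over-subtraction of pileup from driving the neutral contribution negative.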
Collision vertices are reconstructed using a deterministic annealing algorithm [73,74], with the reconstructed vertex position required to be compatible with the location of the LHC beam in the x-y plane. The primary collision vertex (PV) is taken to be the vertex that has the maximum $\sum p_{\mathrm{T}}^{2}$ of tracks associated to it. Electrons, muons, and τ h candidates are required to be compatible with originating from the PV.
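The PV choice reduces to an argmax over the reconstructed vertices. The short sketch below assumes a hypothetical vertex record with a `track_pts` list of track p T values in GeV; the vertex reconstruction itself (deterministic annealing) is not reproduced here.

```python
def select_primary_vertex(vertices):
    """Return the vertex with the largest sum of squared track pT.

    Hypothetical record layout: each vertex is a dict with a "track_pts"
    list holding the pT (GeV) of the tracks associated to it.
    """
    return max(vertices, key=lambda v: sum(pt * pt for pt in v["track_pts"]))

# Illustrative numbers only: one hard track outweighs several soft ones,
# because the tracks enter the sum quadratically.
example = [
    {"id": 0, "track_pts": [10.0, 10.0, 10.0]},  # sum pT^2 = 300
    {"id": 1, "track_pts": [25.0, 1.0]},         # sum pT^2 = 626
]
chosen = select_primary_vertex(example)
```

Squaring the track p T favours vertices with a few energetic tracks, as expected from a hard scattering, over pileup vertices with many soft tracks.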
Hadronic τ decays are reconstructed using the "hadrons+strips" (HPS) algorithm [37], which is used to separate the individual decay modes of the τ into τ − → h − ν τ , τ − → h − π 0 ν τ , τ − → h − π 0 π 0 ν τ , and τ − → h − h + h − ν τ , where h ± denotes either a charged pion or kaon (the decay modes of τ + are assumed to be identical to their partner τ − modes through charge conjugation invariance). The τ h candidates are constructed by combining the charged PF hadrons with neutral pions. The neutral pions are reconstructed by clustering the PF photons within rectangular strips, narrow in η but wide in φ, to account for the non-negligible probability for photons produced in π 0 → γγ decays to convert into electron-positron pairs when traversing the all-silicon tracking detector of CMS, and for the broadening of energy depositions in the ECAL that occurs when this happens. For the same reason, electrons and positrons reconstructed through the PF algorithm are considered, in addition to photons, in the reconstruction of the neutral pions. The momentum of the τ h candidate is taken as the vector sum over the momenta of the charged hadrons and neutral pions used in reconstructing the τ h decay mode, assuming the pion-mass hypothesis. We do not use the strips of fixed 0.05 × 0.20 size in the η-φ plane, used in previous analyses [5-7, 9-13, 18, 21-23, 29-31, 33, 34, 38, 43], but an improved version of the strip reconstruction developed during the √s = 13 TeV run. In the improved version, the size of the strip is adjusted as a function of p T , taking into consideration that the bending of charged particles in the magnetic field increases inversely with p T . More details on strip reconstruction and validation of the algorithm with data are given in Ref. [75]. The main handle for distinguishing τ h from the large background of quark and gluon jets relies on the use of tight isolation requirements.
The sums of scalar p T values from photons and from charged particles originating from the PV within a cone of ∆R = 0.5 centred around the τ h direction are used as input to an MVA-based τ h ID discriminant. The set of input variables is complemented by the scalar p T sum of charged particles not originating from the PV, by the τ h decay mode, and by observables that are sensitive to the lifetime of the τ. The transverse impact parameter of the "leading" (highest p T ) track of each τ h candidate relative to the PV is used for τ h candidates reconstructed in any decay mode. For τ h candidates reconstructed in the τ − → h − h + h − ν τ decay mode, a fit of the three tracks to a common secondary vertex (SV) is attempted, and the distance between SV and PV is used as additional input to the MVA. The MVA is trained on genuine τ h and jets generated in MC simulation. Four working points (WP), referred to as barely, minimally, moderately, and tightly constrained, are defined through changes made in the selections on the MVA output. The thresholds are adjusted as functions of the p T of the τ h candidate, such that the τ h identification efficiency for each WP is independent of p T . The moderate and tight WP used to select events in different channels provide efficiencies of 55 and 45%, and misidentification rates for jets of typically 1 and 0.5%, depending on the p T of the jet [75]. Additional discriminants are employed to separate τ h from electrons and muons. The separation of τ h from electrons is performed via another MVA-based discriminant [75] that utilizes input observables that quantify the matching between the sum of energy depositions in the ECAL and the momentum of the leading track of the τ h candidate, as well as variables that distinguish electromagnetic from hadronic showers. The cutoff-based discriminant described in Ref. [37] is used to separate τ h from muons. It is based on matching the leading track of the τ h candidate with energy depositions in the ECAL and HCAL, as well as with track segments in the muon detectors.
Jets within the range |η| < 4.7 are reconstructed using the anti-k t algorithm [76,77] with a distance parameter R = 0.4. Reconstructed jets are required not to overlap with identified electrons, muons, or τ h candidates within ∆R < 0.5, and to pass a set of minimal identification criteria that aim to reject jets arising from calorimeter noise [78]. The energy of reconstructed jets is calibrated as a function of jet p T and η [79]. Average energy density corrections calculated using the FASTJET algorithm [80,81] are applied to compensate for pileup effects. Jets originating from the hadronization of b quarks are identified using the "combined secondary vertex" (CSV) algorithm [82], which exploits observables related to the long lifetime of b hadrons and the higher particle multiplicity and mass of b jets compared to light-quark and gluon jets.
The vector p miss T , with its magnitude referred to as p miss T , is reconstructed using an MVA regression algorithm [83]. To reduce the impact of pileup on the resolution in p miss T [...] inputs are combined with a probabilistic model for leptonic and hadronic τ decays to estimate the momenta of the neutrinos produced in these decays. The algorithm achieves a resolution in m ττ of ≈15% relative to the mass of the τ lepton pairs at the generator level.

Event selection
The events selected in the τ e τ h , τ µ τ h , τ h τ h , τ e τ µ , and τ µ τ µ channels are recorded by combining single-electron and single-muon triggers, triggers that are based on the presence of two τ h candidates in the event, and triggers based on the presence of both an electron and a muon.
The τ e τ h and τ µ τ h channels utilize single-electron and single-muon triggers with p T thresholds of 23 and 18 GeV, respectively. Selected events are required to contain an electron of p T > 24 GeV or a muon of p T > 19 GeV, both with |η| < 2.1, and a τ h candidate with p T > 20 GeV and |η| < 2.3. The electron or muon is required to pass an isolation requirement of I ℓ < 0.10 p ℓ T , computed according to Eq. (1). The τ h candidate is required to pass the moderate WP of the MVA-based τ h ID discriminant, and to have a charge opposite to that of the electron or muon. The τ h candidate is further required to pass a tight or minimal requirement on the discriminant that separates hadronic τ decays from electrons, and a minimal or tight selection on the discriminant that separates τ h from muons, depending on whether the event is selected in the τ e τ h or τ µ τ h channel, respectively. Background arising from W+jets and tt production is reduced by requiring the transverse mass of electron or muon and p miss T to satisfy m T < 40 GeV. The transverse mass is defined by:
$$ m_{\mathrm{T}} = \sqrt{2\, p_{\mathrm{T}}^{\ell}\, p_{\mathrm{T}}^{\mathrm{miss}} \left(1 - \cos\Delta\phi\right)}, \tag{3} $$
where the symbol ℓ refers to the electron or muon, and ∆φ denotes the angle in the transverse plane between the lepton momentum and the p miss T vector. Events containing additional electrons with p T > 10 GeV and |η| < 2.5, or muons with p T > 10 GeV and |η| < 2.4, passing minimal identification and isolation criteria, are rejected to reduce backgrounds from Z/γ * → ee and µµ events, and from diboson production.
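As a worked illustration of the transverse mass requirement, the sketch below computes m T from the lepton and p miss T magnitudes and azimuthal angles, using the standard definition m T = sqrt(2 p T(lepton) p T(miss) (1 − cos ∆φ)). The function and argument names are hypothetical and chosen only for this example.

```python
import math

def transverse_mass(lep_pt, lep_phi, met_pt, met_phi):
    """Transverse mass (GeV) of a lepton and the missing transverse momentum."""
    dphi = lep_phi - met_phi  # cos() is periodic, so no explicit wrapping is needed
    # max() guards against tiny negative values from floating-point rounding
    return math.sqrt(max(0.0, 2.0 * lep_pt * met_pt * (1.0 - math.cos(dphi))))

def passes_mt_requirement(lep_pt, lep_phi, met_pt, met_phi, max_mt=40.0):
    """True if the event satisfies m_T < max_mt (40 GeV in the text above)."""
    return transverse_mass(lep_pt, lep_phi, met_pt, met_phi) < max_mt

# Illustrative numbers only: a 30 GeV lepton with 30 GeV of missing pT.
mt_back_to_back = transverse_mass(30.0, 0.0, 30.0, math.pi)  # 60 GeV
mt_collinear = transverse_mass(30.0, 0.0, 30.0, 0.0)         # 0 GeV
```

The two limiting cases show why the requirement suppresses W+jets events: in W → ℓν decays the lepton and neutrino tend to be back-to-back in the transverse plane, giving large m T , whereas in τ decays the neutrinos are nearly collinear with the visible lepton, giving small m T .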
A trigger based on the presence of two τ h candidates is used to record events in the τ h τ h channel. The trigger selects events containing two isolated calorimeter energy deposits at the L1 trigger stage, which are subsequently required to pass a simplified version of the PF-based offline τ h reconstruction at the high-level trigger stage. The latter applies additional isolation criteria. The p T threshold for both τ h candidates is 35 GeV. The trigger efficiency increases with p T of the τ h , because different algorithms are used to reconstruct the p T at the L1 trigger stage and in the offline reconstruction. The trigger reaches an efficiency plateau of ≈80% for events in which both τ h candidates have p T > 60 GeV. Selected events are required to contain two τ h candidates with p T > 40 GeV and |η| < 2.1 that have opposite charge and satisfy the tight WP of the MVA-based τ h ID discriminant, as well as the minimal criteria on the discriminants used to separate τ h from electrons and muons. Events containing electrons with p T > 10 GeV and |η| < 2.5 or muons with p T > 10 GeV and |η| < 2.4, passing minimal identification and isolation criteria, are rejected to avoid overlap with the τ e τ h and τ µ τ h channels.
Events in the τ e τ µ channel are recorded with the triggers based on the presence of an electron and a muon. The acceptance for the Z/γ * → ττ signal is increased by using two complementary triggers. The first trigger selects events that contain an electron with p T > 12 GeV and a muon with p T > 17 GeV, while events containing an electron with p T > 17 GeV and a muon with p T > 8 GeV are recorded through the second trigger. The offline event selection demands the presence of an electron with p T > 13 GeV and |η| < 2.5, in conjunction with a muon of p T > 10 GeV and |η| < 2.4. Either the electron or the muon is required to pass a threshold of p T > 18 GeV, to ensure that at least one of the two triggers is fully efficient. Electrons and muons are further required to satisfy isolation criteria of I ℓ < 0.15 p ℓ T , and to have opposite charge. Background from tt production is reduced through a cutoff on a topological discriminant [86] based on the projections:
$$ P_{\zeta}^{\mathrm{miss}} = \vec{p}_{\mathrm{T}}^{\,\mathrm{miss}} \cdot \hat{\zeta} \quad \text{and} \quad P_{\zeta}^{\mathrm{vis}} = \left( \vec{p}_{\mathrm{T}}^{\,\mathrm{e}} + \vec{p}_{\mathrm{T}}^{\,\mu} \right) \cdot \hat{\zeta}, \tag{4} $$
where the symbol $\hat{\zeta}$ denotes a unit vector in the direction of the bisector of the electron and muon p T vectors. The discriminant takes advantage of the fact that the angle between the neutrinos and the visible τ lepton decay products is typically small, causing the p miss T vector in signal events to point in the direction of the visible τ decay products, which is often not true for tt background. Selected events are required to satisfy the condition P miss ζ − 0.85 P vis ζ > −20 GeV. The reconstruction of the projections P miss ζ and P vis ζ is illustrated in Fig. 1. The figure also shows the distribution in the observable P miss ζ − 0.85 P vis ζ for events selected in the τ e τ µ channel before that condition is applied.
Figure 1: Construction of the projections P miss ζ and P vis ζ , and the distribution in the observable P miss ζ − 0.85 P vis ζ (in GeV) for events selected in the τ e τ µ channel, before imposing the condition P miss ζ − 0.85 P vis ζ > −20 GeV. Also indicated is the separation of the background into its main components. The sum of background contributions from W+jets, single top quark, and diboson production is referred to as "electroweak" background. The symbols p ν(e) T and p ν(µ) T refer to the vectorial sum of transverse momenta of the two neutrinos produced in the respective τ → eνν and τ → µνν decays.
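The topological discriminant of the τ e τ µ channel can be sketched as follows, assuming that P miss ζ is the projection of the p miss T vector, and P vis ζ the projection of the vector sum of the electron and muon transverse momenta, onto the bisector of the two lepton directions. Function and argument names are hypothetical; the coefficient 0.85 and the −20 GeV threshold are the values quoted in the text.

```python
import math

def pzeta_discriminant(e_pt, e_phi, mu_pt, mu_phi, met_pt, met_phi, alpha=0.85):
    """Return P_zeta^miss - alpha * P_zeta^vis in GeV."""
    # Unit vectors along the electron and muon directions in the transverse plane
    ex, ey = math.cos(e_phi), math.sin(e_phi)
    mx, my = math.cos(mu_phi), math.sin(mu_phi)
    # The sum of two unit vectors points along the bisector of the angle between them
    zx, zy = ex + mx, ey + my
    norm = math.hypot(zx, zy)
    if norm < 1e-12:
        raise ValueError("leptons are exactly back-to-back: bisector is undefined")
    zx, zy = zx / norm, zy / norm
    # Projections of the visible and missing transverse momenta onto the bisector
    p_vis = (e_pt * ex + mu_pt * mx) * zx + (e_pt * ey + mu_pt * my) * zy
    p_miss = met_pt * (math.cos(met_phi) * zx + math.sin(met_phi) * zy)
    return p_miss - alpha * p_vis

def passes_pzeta_requirement(value, threshold=-20.0):
    """True if the discriminant exceeds the threshold (-20 GeV in the text above)."""
    return value > threshold

# Illustrative numbers only: leptons at phi = +0.5 and -0.5, bisector along phi = 0.
d_small_met = pzeta_discriminant(30.0, 0.5, 20.0, -0.5, 10.0, 0.0)
d_large_met = pzeta_discriminant(30.0, 0.5, 20.0, -0.5, 40.0, 0.0)
```

In this toy configuration the visible projection is 50 cos(0.5) ≈ 43.9 GeV, so the event fails with 10 GeV of missing p T along the bisector and passes with 40 GeV.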
The events selected in the τ µ τ µ channel are recorded using a single-muon trigger with a p T threshold of 18 GeV. The two muons are required to be within the acceptance of |η| < 2.4, and to have opposite charge. The muons of higher and lower p T are required to satisfy the conditions of p T > 20 and > 10 GeV, respectively. Both muons are required to pass an isolation criterion of I µ < 0.15 p µ T . The large background arising from DY production of µ pairs is reduced by requiring the mass of the two muons to satisfy m µµ < 80 GeV, and through the application of a cutoff on the output of a BDT trained to separate the Z/γ * → ττ signal from the Z/γ * → µµ background. The following observables are used as BDT inputs: the ratio of the p T of the dimuon system to the scalar p T sum of the two muons (p µµ T / ∑ p µ T ), the pseudorapidity of the dimuon system (η µµ ), the p miss T , the topological discriminant P ζ , computed according to Eq. (4), and the azimuthal angle between the muon of positive charge and the p miss T vector, denoted by the symbol ∆φ(µ + , p miss T ). The angle between the muon of negative charge and the p miss T vector, ∆φ(µ − , p miss T ), is not used as BDT input, as it is strongly anticorrelated with ∆φ(µ + , p miss T ).
We refer to the events passing the selection criteria detailed in this Section as belonging to the "signal region" (SR) of the analysis.

Background estimation
The accuracy of the background estimate is improved by determining from data the contributions from the main backgrounds, as well as from backgrounds that are difficult to model through MC simulation. In particular, the background from multijet production falls into the latter category. In the τ e τ h , τ µ τ h , and τ h τ h channels, the dominant background is from events in which a quark or gluon jet is misidentified as τ h . The estimation of background from these "false" τ h sources is discussed in Section 6.1. This background arises predominantly from multijet production in the τ h τ h channel, and from W+jets events as well as multijet production in the τ e τ h and τ µ τ h channels. A small additional background contribution in the τ e τ h , τ µ τ h , and τ h τ h channels arises from tt events with quark or gluon jets misidentified as τ h . The multijet background is also relevant in the τ e τ µ and τ µ τ µ channels. The estimation of the multijet background in these channels is described in Section 6.2. In the τ e τ µ and τ µ τ µ channels, the contribution to the SR from backgrounds with misidentified leptons, other than multijet production, is small and is not distinguished from background contributions with genuine leptons. Significant background contributions arise from tt production in the τ e τ µ channel and from the DY production of muon pairs in the τ µ τ µ channel. The normalization of the tt background in the τ e τ µ and τ µ τ µ channels is determined from data, using a control region that contains events with one electron, one muon, and one or more b-tagged jets. Details of the procedure are given in Section 6.3. The tt normalization factor obtained from this control region is also applied to the tt background events selected in the τ e τ h , τ µ τ h , and τ h τ h channels, in which the reconstructed τ h is either due to a genuine τ h or due to the misidentification of an electron or muon.
The background rate from Z/γ * → ee and Z/γ * → µµ production is determined from the data through a maximum-likelihood (ML) fit of the m ττ distributions in the SR, described in Section 8. The contributions of minor backgrounds from single top quark and diboson production, as well as a small contribution from W+jets background in the τ e τ µ and τ µ τ µ channels, are obtained from MC simulation. The sum of these minor backgrounds is referred to as "electroweak" background. A Higgs boson with a mass of m H = 125 GeV, produced at the rate and with branching fractions predicted in the SM, is considered as background; this contribution is found to be negligible.
The background estimates are summarized in Table 1. The quoted uncertainties represent the quadratic sum of statistical and systematic sources.
In preparation for future analyses of τ lepton production, the validity of the background-estimation procedures described in this section is further tested in event categories that are relevant to the SM H → ττ analysis, as well as in searches for new physical phenomena. Event categories based on jet multiplicity, p T of the τ lepton pair, and on the multiplicity of b jets in the event are used in H → ττ analyses performed by CMS in the context of the SM [43] and of its minimal supersymmetric extension [9-11], as well as in the context of searches for Higgs boson pair production [87]. The validation of the background-estimation procedures in these event categories is detailed in the Appendix.

Table 1: Expected number of background events in the τ e τ h , τ µ τ h , τ h τ h , τ e τ µ , and τ µ τ µ channels in data, corresponding to an integrated luminosity of 2.3 fb⁻¹. The uncertainties are rounded to two significant digits, except when they are < 10, in which case they are rounded to one significant digit, and the event yields are rounded to match the precision in the uncertainties.

Estimation of false-τ h background in τ e τ h , τ µ τ h , and τ h τ h channels
The background arising from events in which a quark or gluon jet is misidentified as τ h in the τ e τ h , τ µ τ h , and τ h τ h channels is estimated via the "fake factor" (F F ) method. The method is based on selecting events that pass altered τ h ID criteria, and weighting the events through suitably chosen extrapolation factors (the F F ). The events passing the altered τ h ID criteria are referred to as belonging to the "application region" (AR) of the F F method. Except for modifying the τ h ID criteria, the same selections are applied to events in the AR and in the SR. The F F are measured in dedicated control regions in data. These are referred to as "determination regions" (DR) of the F F method, and are chosen such that they neither overlap with the SR nor with the AR.
The F F are determined in bins of decay mode and p T of the τ h candidate, and as a function of jet multiplicity. In each such bin, the F F is given by the ratio:
$$ F_{\mathrm{F}} = \frac{N_{\text{nominal}}}{N_{\text{altered}}}, \tag{5} $$
where N nominal corresponds to the number of events with τ h candidates that pass the nominal WP of the MVA-based τ h ID discriminant in a given channel, and N altered is the number of events with τ h candidates that satisfy the altered τ h ID criteria. To satisfy the altered τ h ID criteria, τ h candidates must satisfy the barely constrained WP, but fail the nominal WP. The multiplicity of jets that is used to parametrize the F F is denoted by N jet , and is defined by the jets that satisfy the conditions p T > 20 GeV and |η| < 4.7, and do not overlap with τ h candidates passing the barely constrained WP of the MVA-based τ h ID discriminant, nor with electrons or muons within ∆R < 0.5. In each bin, the contributions from processes with genuine τ h , and with electrons or muons misidentified as τ h , are estimated through MC simulation, and subtracted from the numerator as well as from the denominator in Eq. (5).
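The per-bin computation, including the subtraction of simulated contributions from the numerator and the denominator, can be written as the short sketch below. The function name, argument names, and all yields are hypothetical and serve only to illustrate the arithmetic for a single bin of decay mode, τ h p T , and jet multiplicity.

```python
def fake_factor(n_nominal_data, n_altered_data, n_nominal_mc=0.0, n_altered_mc=0.0):
    """Fake factor for one bin: N_nominal / N_altered after MC subtraction.

    n_*_data : event counts observed in the determination region
    n_*_mc   : simulated yields with genuine tau_h, or with electrons or
               muons misidentified as tau_h, to be subtracted
    """
    numerator = n_nominal_data - n_nominal_mc
    denominator = n_altered_data - n_altered_mc
    if denominator <= 0.0:
        raise ValueError("no events left in the altered-ID sample after subtraction")
    return numerator / denominator

# Made-up yields for a single bin: (120 - 20) / (1000 - 50) = 100 / 950
ff_example = fake_factor(120.0, 1000.0, n_nominal_mc=20.0, n_altered_mc=50.0)
```

In practice one such value would be stored per bin and per determination region, for instance in a dictionary keyed by (decay mode, p T bin, N jet ); that bookkeeping is omitted here.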
As the probabilities for jets to be misidentified as τ h depend on the τ h ID criteria, and the latter differ in different channels, the F F are measured separately in each one of them. Moreover, the misidentification rates differ for multijet, W+jets, and tt events, necessitating a measurement of the F F in the DR enriched in contributions from multijet, W+jets, and tt backgrounds. The relative fractions of multijet, W+jets, and tt background processes in the AR, denoted by R p , are determined through a fit to the distribution in m T , and are used to weight the F F determined in the DR when computing the estimate of the false-τ h background in the SR. The procedure is illustrated in Fig. 2.

Figure 2: Illustration of the F F method, with panels labelled "Application Region", "Signal Region", and "Fit in Application Region". The symbol F p F indicates that the F F depend on the background process p, where p refers to either multijet, W+jets, or tt background. The contribution of the Z/γ * → ττ signal in the AR is subtracted, based on MC simulation. The fractions R p are determined by a fit of the m T distribution in the AR (bottom right), described in more detail in Section 6.1.2. The fraction R 1 includes a small contribution from DY events in which the reconstructed τ h is due to the misidentification of a quark or a gluon jet.
The τ h ID criteria applied in the AR are identical to the τ h ID criteria applied in the denominator of Eq. (5). More specifically, the criteria on p T and η, as well as the requirements on the discriminators that distinguish τ h from electrons and muons, are the same as in the SR. The τ h candidates selected in the τ e τ h and τ µ τ h channels are required to pass the barely constrained, but fail the moderately constrained WP of the MVA-based τ h ID discriminant. In the τ h τ h channel, one of the two τ h candidates must pass the tight WP, while the other τ h candidate is required to pass the barely constrained, but fail the tight WP, precluding overlap of the AR with the SR.
The DR enriched in contributions from multijet, W+jets, and tt backgrounds contain specific mixtures of gluon, light-quark (u, d, s), and heavy-flavour (c, b) quark jets, with different probabilities for misidentification as τ h , as illustrated for simulated events in Fig. 3. The misidentification rates are shown for jets passing p T > 20 GeV and |η| < 2.3, and for jets satisfying in addition the barely constrained WP of the MVA-based τ h ID discriminant. In general, the misidentification rates are higher for quark jets than for gluon jets, as the former typically have lower particle multiplicity and are more collimated than the latter, thereby increasing their probability to be misidentified as τ h . As can be seen in the figure, the requirement for jets to pass minimal τ h selection criteria significantly reduces the flavour dependence of the misidentification rates. This in turn lowers the systematic uncertainty that arises from the limited knowledge of the flavour composition in the AR. Residual flavour dependence of the F F is taken into account by measuring separate sets of F F in each DR, and determining the relative fraction R p of multijet, W+jets, and tt backgrounds in the AR of the respective channel. Given the F F and the fractions R p , the estimate of the background from misidentified τ h in the SR is obtained by applying the weights
$$ w = \sum_{p} R_{p}\, F_{\mathrm{F}}^{\,p} \tag{6} $$
to events selected in the AR, where the sum extends over the above three background processes p. The F F refer, as usual, to Eq. (5). The symbol F p F indicates that, in addition to their dependence on τ h decay mode, τ h candidate p T , and jet multiplicity, the F F depend on the background process p, where the superscript p refers to either multijet, W+jets, or tt background. In the τ h τ h channel, the F p F is determined by the decay mode and p T of the τ h candidate that passes the barely constrained, but fails the tight WP of the MVA-based τ h ID discriminant.
The τ h candidate that passes the tight WP does not enter the computation of the weight w. The underlying assumption in the F F method is that the ratio of the number of events from background process p in the SR to the number of events from the same background in the AR is equal to the ratio N nominal /N altered that is measured in the background-specific DR.
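The per-event weight can be illustrated with a minimal sketch, assuming it takes the form w = Σ_p R_p F_F^p implied by the description above. The function name and all numerical values are hypothetical and serve only to show the arithmetic; they are not taken from the analysis.

```python
def fake_factor_weight(fractions, fake_factors):
    """Return w = sum_p R_p * F_F^p for one application-region event.

    fractions:    {process: R_p}, expected to sum to unity
    fake_factors: {process: F_F^p}, evaluated for this event's tau_h candidate
    """
    if set(fractions) != set(fake_factors):
        raise ValueError("fractions and fake factors must cover the same processes")
    if abs(sum(fractions.values()) - 1.0) > 1e-6:
        raise ValueError("fractions R_p must sum to unity")
    return sum(fractions[p] * fake_factors[p] for p in fractions)


# Illustrative numbers only (assumed, not from the paper).
R = {"multijet": 0.5, "wjets": 0.4, "ttbar": 0.1}
FF = {"multijet": 0.20, "wjets": 0.25, "ttbar": 0.30}
w = fake_factor_weight(R, FF)
```

In a real application both R p and F p F would be looked up per event, since they depend on m T and on the τ h decay mode, p T , and jet multiplicity.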
The measurement of the F F is detailed in Section 6.1.1, while the fractions R p are discussed in Section 6.1.2. The estimate of the false-τ h background obtained from the F F method is validated in control regions devoid of Z/γ * → ττ signal. The result of this validation is presented in Section 6.1.3.

Measurement of F F
The F F are measured in DR chosen such that one particular background process is enhanced in each DR. The selection criteria applied in the DR are similar to those applied in the SR. In the following, we describe only the differences relative to the SR.
In the τ e τ h and τ µ τ h channels, three different DR are used to measure the F F for multijet, W+jets, and tt backgrounds. The DR dominated by multijet background contains events in which the charges of the τ h candidate and of the light lepton candidates are the same, and the electron or muon satisfies a modified isolation criterion of 0.05 < I /p T < 0.15. Depending on whether the τ h candidate passes or fails the moderate WP of the MVA-based τ h ID discriminant, the event contributes either to the numerator or to the denominator of Eq. (5). The DR dominated by W+jets background is defined by modifying the requirement for the transverse mass of lepton and p miss T to m T > 70 GeV. The contamination arising from tt background is reduced by vetoing events containing jets that pass the b tagging criteria described in Section 4. A common tt DR is used for the τ e τ h and τ µ τ h channels. The events are required to contain an electron, a muon, at least one τ h candidate, and pass triggers based on the presence of an electron or a muon. The offline event selection demands that the electron satisfy the conditions p T > 13 GeV and |η| < 2.5, the muon p T > 10 GeV and |η| < 2.4, and that both pass an isolation criterion of I < 0.10 p T . The event is furthermore required to contain at least one jet that passes the b tagging criteria described in Section 4. In case events contain multiple τ h candidates, the candidate used for the F F measurement is chosen at random.
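The counting behind such a measurement can be sketched as follows, assuming that the F F in a given bin is the ratio of the number of τ h candidates passing the tighter WP to the number passing only the looser WP, in line with the ratio N nominal /N altered mentioned earlier. The bin labels and the helper name are hypothetical.

```python
from collections import defaultdict


def measure_fake_factors(candidates):
    """Compute F_F = N_pass / N_fail per bin.

    candidates: iterable of (bin_key, passes_tighter_wp) for tau_h candidates
    that already satisfy the loosest WP. bin_key could encode decay mode,
    pT bin, and jet multiplicity (an assumed binning for illustration).
    Bins without any failing candidate are skipped to avoid division by zero.
    """
    n_pass = defaultdict(int)
    n_fail = defaultdict(int)
    for bin_key, passes in candidates:
        if passes:
            n_pass[bin_key] += 1   # contributes to the numerator
        else:
            n_fail[bin_key] += 1   # contributes to the denominator
    return {k: n_pass.get(k, 0) / n_fail[k] for k in n_fail}


# Toy input (assumed): three one-prong candidates, one three-prong candidate.
toy = [
    ("1prong", True),
    ("1prong", False),
    ("1prong", False),
    ("3prong", False),
]
ff = measure_fake_factors(toy)
```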
In the τ h τ h channel, a single DR is used, which selects a high purity sample of multijet events, the dominant background in this channel. The multijet DR is identical to the SR of the τ h τ h channel, except that the two τ h candidates are required to have the same rather than opposite charge. One of the jets is chosen to be the "tag" jet, and required to pass the tight WP of the MVA-based τ h ID discriminant, while the measurement of the F F is performed on the other jet, referred to as the "probe" jet. The tag jet is chosen at random. The W+jets and tt backgrounds are small in the τ h τ h channel, making it difficult to define a DR that is dominated by these backgrounds, or that provides sufficient statistical information for the F F measurement. The F F in the multijet DR of the τ h τ h channel are therefore used to weight all events selected in the AR of the τ h τ h channel. Differences in the F F between W+jets, tt, and multijet events are accounted for by adding a systematic uncertainty of 30% on the part of the background from misidentified τ h expected from the contribution of W+jets and tt background processes. This contribution is estimated using MC simulation, and the magnitude of the systematic uncertainty is motivated by the difference found in the F F measured in multijet, W+jets, and tt DR in the τ e τ h and τ µ τ h channels.
The F F determined in the various DR are shown in Figs. 4 and 5. The decay modes τ − → h − ν τ , τ − → h − π 0 ν τ , and τ − → h − π 0 π 0 ν τ are referred to as "one-prong" decays and the mode τ − → h − h + h − ν τ as "three-prong" decays. The measured F F are corrected for differences in the τ h misidentification rates between DR and AR. The magnitude of these relative corrections is ≈10%, as discussed below.
For the multijet DR in the τ e τ h and τ µ τ h channels, correlations between the F F and the charge of the electron or muon and the τ h candidate, and between F F and the isolation of the electron or muon, are studied in data and taken into account as follows. A correction for the extrapolation from events in which the charges of lepton and τ h candidate have the same sign (SS) to events in which they have opposite sign (OS) is obtained by comparing F F in the SS and OS events containing electrons or muons that pass an inverted isolation criterion of 0.1 < I /p T < 0.2. The dependence of the F F on the isolation of the electron or muon is studied using an event sample selected with no isolation condition applied to the lepton. The results of this study are used to extrapolate the F F obtained in the multijet DR (0.05 < I /p T < 0.15) to the SR (I < 0.10 p T ).
For the DR dominated by W+jets background in the τ e τ h and τ µ τ h channels, closure tests of the F F method reveal a dependence of the F F on m T , which is not accounted for by the chosen parametrization of the F F as functions of jet multiplicity, τ h decay mode, and p T . The dependence on m T is studied using simulated W+jets events, and used to extrapolate the F F measured in the W+jets DR (m T > 70 GeV) to the SR (m T < 40 GeV).
In the τ h τ h channel, the F F determined in the multijet DR are corrected for a dependence of the F F on the relative charge of the two τ h candidates. This is studied in events in which the tag jet (the jet on which the FF measurement is not performed) fails the tight WP of the MVA-based τ h ID discriminant. The difference between the F F in OS and SS events defines this correction.

Determination of R p
In the τ e τ h and τ µ τ h channels, the relative fractions R p of multijet, W+jets, and tt backgrounds in the AR are determined through a fit to the distribution in m T . The distribution in m T ("template") used to represent the multijet background in the fit is obtained from a sample of events selected in data, in which the τ h candidate and the electron or muon have the same charge, and where at least one of the leptons satisfies a modified isolation criterion of 0.05 < I /p T < 0.15. The contributions from other backgrounds to this control region are subtracted, based on MC simulation. The distributions representing the other backgrounds in the fit are also taken from simulation. The templates for tt, diboson, and DY events are split into three components: events in which the reconstructed τ h is due to a genuine τ h , events in which the τ h is due to the misidentification of an electron or muon, and events in which a quark or gluon jet is misidentified as τ h . The normalization of each component is determined independently in the fit. The relative fractions of the Z/γ * → ττ signal and all individual background processes are left unconstrained in the fit. Finally, the fractions R p are parametrized as a function of m T and are normalized such that the contribution of all processes p in which the reconstructed τ h is due to a misidentified jet sums to unity, ∑ p R p = 1.
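The normalization condition ∑ p R p = 1 can be illustrated with a short sketch for a single m T bin. The fitted yields below are invented for illustration, and the function name is hypothetical.

```python
def jet_fake_fractions(yields):
    """Normalize fitted jet -> tau_h yields in one m_T bin to fractions R_p.

    yields: {process: fitted yield of events with a jet misidentified as tau_h}
    Returns {process: R_p} with sum_p R_p = 1.
    """
    total = sum(yields.values())
    if total <= 0:
        raise ValueError("total jet -> tau_h yield must be positive")
    return {p: y / total for p, y in yields.items()}


# Assumed post-fit yields in one m_T bin (not from the paper).
fractions = jet_fake_fractions({"multijet": 30.0, "wjets": 60.0, "ttbar": 10.0})
```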
In the τ h τ h channel, the AR is dominated by multijet background. The contributions from the Z/γ * → ττ signal and all background processes, except multijet production, are small and taken from simulation. The fraction of multijet background in the AR is determined by subtracting the sum of all processes modelled in the MC simulation from the data in the AR, without performing a fit in this channel.
A small fraction of events in the AR of the τ e τ h , τ µ τ h , and τ h τ h channels arises from DY events in which quark or gluon jets are misidentified as τ h candidates. These events are treated as background and are included in the false-τ h estimate using the F F method. As the analysed data do not provide a way of determining F F in DY events with sufficient statistical accuracy, the F F   measured in W+jets events are used instead for the fraction of DY events with jets misidentified as τ h in the τ e τ h and τ µ τ h channels. The validity of this procedure is justified by studies of F F in simulated W+jets and DY events, which indicate that the flavour composition of jets and the F F are very similar in these events. In the τ h τ h channel, the F F measured in multijet events are used and the systematic uncertainty on the DY background with misidentified τ h is increased by 30%.

Validation of the false-τ h background estimate in control regions
The modelling of the background from jets misidentified as τ h in the τ e τ h , τ µ τ h , and τ h τ h channels through the F F method is validated by comparing the background estimates obtained in this method to the data in control regions containing events with SS eτ h , µτ h , and τ h τ h pairs. A dedicated set of F F , without corrections for the extrapolation from OS to SS events, is determined for this validation. The selection of events in the multijet DR is also altered in this validation, to avoid overlap with the AR. The distributions in m ττ in events containing SS eτ h , µτ h , and τ h τ h pairs are shown in Fig. 6. The data are compared to the sum of false-τ h background and other backgrounds. The contribution of other backgrounds, in which the reconstructed τ h is due either to a genuine τ h or to the misidentification of an electron or muon, is obtained from the MC simulation. The event yield of the Z/γ * → ττ signal in these control regions is small. The normalization of individual backgrounds and of the Z/γ * → ττ signal is determined through a fit to the distributions in m ττ in which the rate of each background is allowed to vary within its estimated systematic uncertainty. The good agreement observed between the data and the background prediction in the control regions of all three channels confirms the validity of false-τ h background estimates obtained through the F F method.

Estimation of multijet background in τ e τ µ and τ µ τ µ channels
The contributions from multijet background in the SR of the τ e τ µ or τ µ τ µ channels are estimated using control regions containing events with an electron and muon or two muons of same charge, respectively. An estimate for the contribution from multijet events in the SR is obtained by scaling the yield of the multijet background in the SS control region by a suitably chosen extrapolation factor, defined by the ratio of eµ or µµ pairs with opposite charge to those with same charge. The ratio is measured in events in which at least one lepton passes an inverted isolation criterion of I > 0.15 p T . We refer to this event sample as an isolation sideband region (SB). The requirement I > 0.15 p T ensures that the SB does not overlap with the SR. A complication arises from the fact that the ratio of OS to SS pairs depends on the lepton kinematics and the isolation criterion used in the SB. The nominal OS/SS ratio is measured in an isolation sideband (SB1) defined by requiring both leptons to satisfy a relaxed isolation criterion of I < 0.60 p T , with at least one lepton passing the condition I > 0.15 p T . The systematic uncertainty in the OS/SS ratio that arises from the choice of the upper limit on I applied in SB1 is estimated by taking the difference between the OS/SS ratio computed in SB1 and the ratio computed in a different isolation sideband region (SB2). The latter is defined by requiring at least one lepton to pass the condition I > 0.60 p T , without setting an upper limit on I in the SB2 region. The criteria to select events in the isolation sidebands are optimized to ensure high statistical accuracy in the measurement of the OS/SS extrapolation factor and at the same time the minimization of differences in lepton kinematic distributions between the SR and the SB.
In both isolation sidebands, the OS/SS ratio is measured as a function of the p T of the two leptons ℓ and ℓ′ and of their separation $\Delta R(\ell, \ell') = \sqrt{(\eta_{\ell} - \eta_{\ell'})^{2} + (\phi_{\ell} - \phi_{\ell'})^{2}}$ in the η-φ plane. The contributions to the SS control region, as well as to SB1 and SB2, from backgrounds other than multijet production are subtracted, based on results from MC simulation.
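A minimal sketch of the two ingredients described above: the lepton separation in the η-φ plane and the extrapolation of the SS yield to the SR. Wrapping the azimuthal difference into [−π, π] is the usual convention and is an assumption here, as is the use of a single OS/SS ratio rather than one binned in lepton p T and ∆R. All names and numbers are illustrative.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Separation in the eta-phi plane, with the phi difference wrapped to [-pi, pi]."""
    dphi = math.remainder(phi1 - phi2, 2.0 * math.pi)
    return math.hypot(eta1 - eta2, dphi)


def multijet_in_sr(n_ss_data, n_ss_other_mc, os_ss_ratio):
    """Multijet estimate in the SR.

    The non-multijet contribution expected from simulation is subtracted from
    the SS data yield, and the remainder is scaled by the OS/SS ratio measured
    in the isolation sideband.
    """
    return (n_ss_data - n_ss_other_mc) * os_ss_ratio


# Toy numbers (assumed, not from the paper).
separation = delta_r(0.0, 3.0, 0.0, -3.0)
estimate = multijet_in_sr(n_ss_data=120.0, n_ss_other_mc=20.0, os_ss_ratio=1.5)
```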

Estimation of tt background
While the m ττ distribution for tt background is obtained from MC simulation, the event yield in the tt background in the SR is determined from data, using a control region dominated by tt background. Events in the tt control region are required to satisfy selection criteria that are similar to the requirements for the SR of the τ e τ µ channel, described in Section 5. The main differences are that the cutoff on P miss ζ − 0.85 P vis ζ is inverted to P miss ζ − 0.85 P vis ζ < −40 GeV, and a condition p miss T > 80 GeV is added to the event selection in the tt control region. The tt event yield observed in the control region is a 1.01 ± 0.07 multiple of the expectation from the MC simulation. The ratio of the tt event yield measured in data to the MC prediction is applied as a scale factor to simulated tt events, to correct the tt background yield in the τ e τ µ and τ µ τ µ channels, as well as to correct the part of the tt background in the τ e τ h , τ µ τ h , and τ h τ h channels that is either due to genuine τ h or due to the misidentification of an electron or muon as τ h . The latter is not included in the background estimate obtained through the F F method, but modelled in the MC simulation.
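As a small worked example of how the scale factor is applied, the sketch below scales a simulated tt yield by 1.01 ± 0.07 and propagates only the scale-factor uncertainty. The simulated yield of 1000 events is an arbitrary assumption.

```python
def scale_ttbar_yield(n_mc, sf=1.01, sf_unc=0.07):
    """Return the corrected tt yield and its uncertainty from the scale factor alone."""
    return sf * n_mc, sf_unc * n_mc


# Arbitrary simulated yield, chosen only to make the arithmetic easy to follow.
scaled, unc = scale_ttbar_yield(1000.0)
relative_unc = unc / scaled  # about 0.069, i.e. roughly 7%
```

The resulting relative uncertainty of about 7% matches the accuracy quoted for the tt background normalization in the discussion of systematic uncertainties.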

Systematic uncertainties
Imprecisely measured or imperfectly simulated effects can alter the normalization and distribution of the m ττ mass spectrum in Z/γ * → ττ signal or background processes. These systematic uncertainties can be categorized into theory-related and experimental sources. The latter can be further subdivided into those associated with the reconstruction of physical objects of interest and with estimated backgrounds. The uncertainties related to the reconstruction of physical objects apply to the Z/γ * → ττ signal and to backgrounds modelled in the MC simulation. The main background contributions are determined from data, as described in Section 6, and are largely unaffected by the accuracy achieved in modelling data in the MC simulation.
The main experimental uncertainties are related to the reconstruction and identification of electrons, muons, and τ h , as follows. The efficiency to reconstruct and identify τ h and the energy scale of τ h (τ h ES) are measured using Z/γ * → ττ → τ µ τ h events. The efficiency is measured by comparing the number of Z/γ * → ττ → τ µ τ h events with τ h candidates passing and failing the τ h ID criteria, and the τ h ES by comparing the distributions in the τ h candidate mass, as well as in the visible mass of the muon and τ h system, in data and in MC simulation [75]. The two quantities are measured with respective uncertainties of ≈6 and ≈1%. The events selected for the τ h ID efficiency and τ h ES measurements overlap with the events in the τ µ τ h channel. We account for the overlap by assigning a 3% uncertainty to τ h ES. A 3% change in the τ h ES affects the acceptance in Z/γ * → ττ signal by 3, 3, and 17% in the τ e τ h , τ µ τ h , and τ h τ h channels, respectively. The impact on the signal acceptance and on the distribution in m ττ is illustrated in Fig. 7. It has been checked that the overlap and the choice in the τ h ES uncertainty have little impact on the final results. The ML fit performed to measure the Z/γ * → ττ cross section, described in Section 8, reduces the uncertainties in the τ h ID efficiency and in the τ h ES to 2.2 and 0.9%, respectively. The efficiency of the τ h trigger used in the τ h τ h channel is measured in Z/γ * → ττ → τ µ τ h events with an uncertainty of ≈4.5% per τ h . The measurement is detailed in Ref. [88].
Electron and muon reconstruction, identification, isolation, and trigger efficiencies are measured using Z/γ * → ee and Z/γ * → µµ events via the "tag-and-probe" method [89] at an accuracy of 2%. The energy scales for electrons and muons (e ES and µ ES) are calibrated using J/ψ → ℓℓ, Υ → ℓℓ, and Z/γ * → ℓℓ events (with ℓ referring to e and µ), and have an uncertainty of 1%. The e ES and µ ES uncertainties affect the acceptance in the Z/γ * → ττ signal in the τ e τ h , τ µ τ h , τ e τ µ , and τ µ τ µ channels by less than 1%. The p miss T response and resolution are known within uncertainties of a few percent from studies performed in Z/γ * → µµ, Z/γ * → ee, and γ+jets events [90]. The impact of these uncertainties on the acceptance in the Z/γ * → ττ signal is small, amounting to less than 1%. In the τ e τ h and τ µ τ h channels, the impact arises from the m T < 40 GeV selection criterion. In the τ e τ µ and τ µ τ µ channels, the impact is due to the P miss ζ − 0.85 P vis ζ > −20 GeV requirement and the use of p miss T and P ζ as input variables in the BDT that separates the Z/γ * → ττ signal from the Z/γ * → µµ background, respectively. The effect of uncertainties related to the modelling of the p miss T on the distribution in m ττ is small.
The uncertainty in the integrated luminosity is 2.3% [91].
The backgrounds determined from data are also subject to uncertainties that alter the normalization and distribution ("shape") of the m ττ mass spectrum. Background yields and their associated uncertainties are given in Table 1. The uncertainties in the backgrounds arising from the misidentification of quark and gluon jets as τ h candidates in the τ e τ h , τ µ τ h , and τ h τ h channels are obtained by changing the F F values as well as the relative fractions R p of multijet, W+jets, and tt backgrounds within their uncertainties. The resulting uncertainties in the m ττ distribution in the τ e τ h , τ µ τ h , and τ h τ h channels are illustrated in Fig. 8. The uncertainties in the size of the false-τ h backgrounds are 8, 6, and 16% in the τ e τ h , τ µ τ h , and τ h τ h channels, respectively. In the τ e τ µ and τ µ τ µ channels, the uncertainty in the size of the multijet background is ≈20%. The magnitude of the tt background is known to an accuracy of 7%. The uncertainty in the distribution of the tt background is estimated by changing the weights applied to the tt MC sample, to improve the modelling of the top quark p T distribution (described in Section 3), between no reweighting and the reweighting applied twice.
The uncertainties in the yields of single top quark and diboson backgrounds, modelled using MC simulation, are each ≈15%. Besides constituting the dominant background in the τ µ τ µ channel, the DY production of electron and muon pairs is a relevant background in the τ e τ h and τ µ τ h channels, respectively, because of the small but non-negligible rate at which electrons and muons are misidentified as τ h . The probability for electrons and muons to pass the tight-electron or tight-muon removal criteria applied, respectively, in the τ e τ h and τ µ τ h channels is measured in Z/γ * → ee and in Z/γ * → µµ events. The misidentification rates depend on η. For electrons in the ECAL barrel and endcap regions, the misidentification rates are at respective levels of 0.2 and 0.1%, with accuracies of 13 and 29% [75]. The misidentification rate for muons lies between less than one and several tenths of a percent, and is known to within an uncertainty of 30%. The contribution from W+jets background in the τ e τ µ and τ µ τ µ channels is modelled using MC simulation, and is known to an accuracy of 15%. The production of SM Higgs bosons is assigned an uncertainty of 30%, reflecting the present experimental uncertainty in the H → ττ rate measured at √ s = 13 TeV [14].
The theoretical uncertainty in the product of signal acceptance and efficiency for the Z/γ * → ττ signal is ≈2% in the τ e τ h , τ µ τ h , τ e τ µ , and τ µ τ µ channels, and 6% in the τ h τ h channel. The quoted uncertainties include the effect of missing higher-order terms in the perturbative expansion for the calculated cross section, estimated through independent changes in the renormalization and factorization scales by factors of 2 and 1/2 relative to their nominal equal values [92,93], uncertainties in the NNPDF3.0 set of PDF, estimated following the recommendations given in Ref. [94], and the uncertainties in the modelling of parton showers (PS) and the underlying event (UE). The theoretical uncertainty is larger in the τ h τ h channel, as the acceptance depends crucially on the modelling of the p T distribution of the Z boson, which is also affected by the missing higher-order terms in the calculation.
The systematic uncertainties are summarized in Table 2. The table also quantifies the impact that each systematic uncertainty has on the measurement of the Z/γ * → ττ cross section, defined as the percent change in the measured cross section when individual sources are changed by one standard deviation relative to their nominal values. The impacts are computed for the values of nuisance parameters obtained in the ML fit used to extract the signal (described in Section 8).
The uncertainties in the integrated luminosity, in the cross section for DY production of electron and muon pairs, and in the electron, muon, and τ h reconstruction and identification efficiencies have the greatest impact on the results.
The impact of the uncertainty on the integrated luminosity amounts to 1.9%. This is smaller than the 2.3% uncertainty in the integrated luminosity measurement, because of correlations of the nuisance parameter representing the integrated luminosity with other nuisance parameters. When the integrated luminosity changes by 2.3%, the ML fit readjusts the nuisance parameters that represent the rates for background processes obtained from MC simulation, as well as identification and trigger efficiencies for e, µ, and τ h , such that the measured Z/γ * → ττ cross section changes by only 1.9%. The uncertainty in the integrated luminosity is not constrained in the ML fit.
Table 2: Effect of experimental and theoretical uncertainties in the measurement of the Z/γ * → ττ cross section. The sources of systematic uncertainty are specified in the leftmost column, and apply to the processes given in the second column. The relative changes in the acceptance A for the Z/γ * → ττ signal, and in the yield from background processes, that correspond to a one standard deviation change in a given source of uncertainty are given in the third column. The range in this column represents the range in signal acceptance or background yield across all decay channels and background processes. The impact that each change produces is quantified by its effect on the measured Z/γ * → ττ cross section, given in the rightmost column.
The impact of the uncertainty in the production rate of Z/γ * → ee and Z/γ * → µµ background processes amounts to 1.8%. The impact is sizeable, because of the small statistical uncertainty in the Z/γ * → µµ background in the τ µ τ µ channel, which, in the absence of uncertainties in the Z/γ * → µµ production rate, would constrain the efficiency for muon reconstruction and identification, as well as the integrated luminosity.
The impact of uncertainties in the efficiencies to reconstruct and identify electrons and muons amounts to 1.5 and 1.6%, respectively. Their impact is considerable, because these uncertainties are not reduced greatly in the ML fit, as they affect all channels, except the τ h τ h channel, in a similar way.
The impact of the uncertainty in the efficiency to reconstruct and identify τ h is of similar size, amounting to 1.5%, even though the uncertainty in the τ h ID efficiency is significantly larger than the uncertainties in the electron and muon ID efficiencies. This is because the simultaneous fit to the m ττ distributions in all five channels significantly reduces the uncertainties in the τ h ID efficiency and the τ h ES, thereby diminishing the impact that these uncertainties have on the Z/γ * → ττ cross section. When the Z/γ * → ττ cross section is measured in the individual τ e τ h , τ µ τ h , and τ h τ h channels, the impact of the uncertainty on the τ h ID efficiency increases to 6, 6, and 10%, respectively.
The uncertainty in τ h ES becomes relevant for the τ h τ h channel when the Z/γ * → ττ cross section is measured in this channel alone, and amounts to 9%. In the τ e τ h and τ µ τ h channels, the impact of the τ h ES uncertainty amounts to less than 1%, even when the Z/γ * → ττ cross section is measured just in these channels.

Signal extraction
The cross section σ(pp → Z/γ * +X) B(Z/γ * → ττ) for DY production of τ pairs is obtained through a simultaneous ML fit to the observed m ττ distributions in the five decay channels: τ e τ h , τ µ τ h , τ h τ h , τ e τ µ , and τ µ τ µ . The likelihood function L (data | ξ, Θ) depends on the value of the cross section, denoted by the symbol ξ, which defines the parameter of interest (POI) in the fit, and it also depends on the values of nuisance parameters θ k that represent the systematic uncertainties discussed in Section 7:
$$\mathcal{L}(\text{data} \mid \xi, \Theta) = \prod_{i} \mathcal{P}(n_{i} \mid \xi, \Theta) \, \prod_{k} \rho(\tilde{\theta}_{k} \mid \theta_{k}). \qquad (7)$$
The index i refers to individual bins of the m ττ distribution in each of the five final states. The set of all nuisance parameters θ k is denoted by the symbol Θ. Correlations among decay channels as well as between the Z/γ * → ττ signal and background processes are taken into account through relationships among channels, processes, and nuisance parameters in the ML fit. The probability to observe n i events in a given bin i, when ν i (ξ, Θ) events are expected in that bin, is given by the Poisson distribution:
$$\mathcal{P}(n_{i} \mid \xi, \Theta) = \frac{\nu_{i}(\xi, \Theta)^{n_{i}}}{n_{i}!} \, \exp\left(-\nu_{i}(\xi, \Theta)\right).$$
The number of events expected in each bin corresponds to the sum of the number of signal (ν S i ) and background (ν B i ) events:
$$\nu_{i}(\xi, \Theta) = \nu_{i}^{S}(\xi, \Theta) + \nu_{i}^{B}(\Theta).$$
The estimate of the number of background events is obtained as described in Section 6. The number of signal events is proportional to ξ, with the coefficient of proportionality depending on the signal acceptance and on the signal selection efficiency, with both obtained from MC simulation.
The function $\rho(\tilde{\theta}_{k} \mid \theta_{k})$ represents the probability to observe a value $\tilde{\theta}_{k}$ in an auxiliary measurement of the nuisance parameter, given that the true value is θ k . The nuisance parameters are treated via the frequentist paradigm, as described in Refs. [95,96]. Systematic uncertainties that affect only the normalization, but not the distribution in m ττ , are represented by Gamma probability density functions if they are statistical in origin, e.g. corresponding to the number of events observed in a control region, and otherwise by log-normal probability density functions. Systematic uncertainties that affect the distribution in m ττ are incorporated into the ML fit via the technique detailed in Ref. [97], and represented by Gaussian probability density functions. Nuisance parameters representing systematic uncertainties of the latter type can also affect the normalization of the Z/γ * → ττ signal or of its backgrounds. The nuisance parameters corresponding to the cross sections for DY production of electron and muon pairs are left unconstrained in the fit.
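The structure of such a likelihood can be sketched in a few lines, assuming a product of Poisson terms over bins multiplied by a constraint term for each nuisance parameter. The sketch below is a deliberately reduced toy: it has one signal-strength parameter xi, one log-normal nuisance parameter theta that scales the background by kappa**theta, and a unit Gaussian constraint on theta. The names, the value of kappa, and the bin contents are assumptions for illustration and do not reproduce the fit model of the analysis.

```python
import math


def poisson_log_prob(n, nu):
    """ln P(n | nu) for a Poisson distribution with mean nu > 0."""
    return n * math.log(nu) - nu - math.lgamma(n + 1)


def neg_log_likelihood(xi, theta, observed, signal, background, kappa=1.10):
    """-ln L, up to an additive constant, for a toy binned model.

    Expected yield per bin: nu_i = xi * s_i + b_i * kappa**theta.
    The term 0.5 * theta**2 is the Gaussian constraint on theta, which
    corresponds to a log-normal uncertainty on the background normalization.
    """
    nll = 0.5 * theta ** 2
    for n, s, b in zip(observed, signal, background):
        nu = xi * s + b * kappa ** theta
        nll -= poisson_log_prob(n, nu)
    return nll


# Toy inputs (assumed): observed counts equal the expectation at xi = 1, theta = 0.
observed = [10, 20]
signal = [5.0, 10.0]
background = [5.0, 10.0]
nll_best = neg_log_likelihood(1.0, 0.0, observed, signal, background)
```

Because the toy data equal the nominal expectation, the negative log-likelihood is smallest at xi = 1 and theta = 0; moving either parameter away from these values increases it.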
The best fit value $\hat{\xi}$ of the POI is the value that maximizes the likelihood L (data | ξ, Θ) in Eq. (7). A 68% confidence interval (CI) on the POI is obtained using the profile likelihood ratio (PLR) [95,96,98]:
$$\lambda(\xi) = \frac{\mathcal{L}(\text{data} \mid \xi, \hat{\Theta}_{\xi})}{\mathcal{L}(\text{data} \mid \hat{\xi}, \hat{\Theta})}.$$
The symbol $\hat{\Theta}_{\xi}$ denotes the values of nuisance parameters that maximize the likelihood for a given value of ξ. The combination of $\hat{\xi}$ and $\hat{\Theta}$ corresponds to the values of ξ and Θ for which the likelihood function reaches its maximum. The 68% CI is defined by the values of ξ for which −2 ln λ (ξ) increases by one unit relative to its minimum. To quantify the effects from individual statistical uncertainties, the uncertainty in the integrated luminosity, and other systematic uncertainties, we ignore one source of uncertainty at a time, and recompute the 68% CI. The nuisance parameters θ k corresponding to uncertainties that are ignored are fixed at the values $\hat{\theta}_{k}$ that yield the best fit to the data. The square root of the quadratic difference between the CI computed with all sources of uncertainty in the fit and the CI computed with a given source ignored provides the estimate of the uncertainty in the POI resulting from that single source. The procedure is illustrated in Fig. 9 for the combined fit of all five final states.
Correlations among different sources of uncertainty are estimated through this procedure.
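The interval construction and the quadratic difference can be sketched as follows, assuming a scan of −2 ln λ on an ordered grid of ξ values. The parabolic toy curve, the grid, and the function names are assumptions for illustration; only the last line uses numbers from the paper, namely the systematic and luminosity components quoted in the Summary.

```python
import math


def interval_68(xs, qs):
    """Find where q = -2 ln(lambda) rises by one unit above its minimum.

    xs must be sorted; crossings are located by linear interpolation
    between neighbouring scan points.
    """
    q_min = min(qs)
    crossings = []
    for i in range(len(xs) - 1):
        d0 = qs[i] - q_min - 1.0
        d1 = qs[i + 1] - q_min - 1.0
        if d0 == 0.0:
            crossings.append(xs[i])
        elif d0 * d1 < 0.0:
            crossings.append(xs[i] + (xs[i + 1] - xs[i]) * d0 / (d0 - d1))
    if qs[-1] - q_min - 1.0 == 0.0:
        crossings.append(xs[-1])
    if len(crossings) < 2:
        raise ValueError("scan does not bracket the 68% interval")
    return crossings[0], crossings[-1]


def quadratic_difference(total, reduced):
    """Uncertainty attributed to a source: sqrt(total^2 - reduced^2)."""
    return math.sqrt(total ** 2 - reduced ** 2)


# Toy parabolic scan in arbitrary units: minimum at 1.0, width 0.1 (assumed).
xs = [0.7 + 0.001 * i for i in range(601)]
qs = [((x - 1.0) / 0.1) ** 2 for x in xs]
lo, hi = interval_68(xs, qs)

# Cross-check with the Summary: 57 pb (syst) and 35 pb (lumi) in quadrature.
syst_plus_lumi = math.hypot(57.0, 35.0)
```

The quadrature sum evaluates to about 66.9 pb, consistent with the combined systematic and luminosity uncertainty of 67 pb quoted in the abstract.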
The cross section for DY production of τ pairs is quoted within the mass window 60 < m true ττ < 120 GeV. The contribution from Z/γ * → ττ events that pass the selection criteria described in Section 5, but have a mass outside of this window is at the level of a few percent in the τ e τ h , τ µ τ h , τ e τ µ , and τ µ τ µ channels. In the τ h τ h channel, this contribution from outside of the mass window is ≈40%. The reason for this large fraction is the high p T threshold on the τ h candidates required in the trigger. The Z/γ * → ττ events that have two τ h with p T > 40 GeV contain either a Z boson of high p T or a τ lepton pair above the mass of the Z boson. Only a small fraction of signal events pass either of these two conditions, which leads to the smallest event yield from the Z/γ * → ττ signal in the τ h τ h channel (as shown in Table 3), and to the largest fraction of signal events containing a τ lepton pair of mass outside of the 60 < m true ττ < 120 GeV window. The PLR depends on the τ h ID efficiency and on the τ h ES through its dependence on the corresponding two nuisance parameters. The τ h ID efficiency and τ h ES are determined by promoting these nuisance parameters to the role of POI. The cross section for DY production of τ pairs, the τ h ID efficiency, and the τ h ES are left unconstrained in the fit, and the PLR is minimized as a function of all three parameters.
Figure 9: Dependence of −2 ln λ (ξ) on the cross section ξ for DY production of τ pairs. The PLR is computed for the simultaneous ML fit to the observed m ττ distributions in the τ e τ h , τ µ τ h , τ h τ h , τ e τ µ , and τ µ τ µ channels. The dashed, dash-dotted, and solid curves correspond to situations when just the statistical uncertainties are used in the fit, when the uncertainty in integrated luminosity is also included, and when all uncertainties are included in the fit. The values of nuisance parameters, corresponding to uncertainties that are ignored, are fixed at the values that yield the best fit to the data. The horizontal line represents the value of −2 ln λ (ξ) that is used to determine the 68% CI on ξ.

Results
The yields expected in Z/γ * → ττ signal and in background contributions from the ML fit to the m ττ distributions in the different decay channels are given in Table 3. The cross sections are displayed in Table 4, and the distributions in m ττ for the selected events are shown in Figs. 10 and 11.
The total uncertainty in the cross section is decomposed into statistical contributions, uncertainty in the integrated luminosity of the data, and other systematic uncertainties, as described in Section 8. The measured values are compatible with each other. The largest deviation, amounting to a little more than one standard deviation, is observed in the τ h τ h channel. A deviation of this magnitude is expected. We proceed to a simultaneous fit of the m ττ distributions in the five final states. The value of the cross section obtained from the combined fit is:
$$\sigma(\mathrm{pp} \to \mathrm{Z}/\gamma^{*}\text{+X}) \, \mathcal{B}(\mathrm{Z}/\gamma^{*} \to \tau\tau) = 1848 \pm 12\,\text{(stat)} \pm 57\,\text{(syst)} \pm 35\,\text{(lumi)}\ \text{pb}.$$
The result is compatible with the prediction of $1845^{+12}_{-6}$ (scale) ± 33 (PDF) pb, computed at NNLO accuracy [60] using the NNPDF3.0 PDF. The results are illustrated in Fig. 12. The inner and outer error bars represent, respectively, the statistical uncertainties, and the quadratic sum of the uncertainties in the statistical, systematic, and integrated-luminosity components. The uncertainty in σ(pp → Z/γ * +X) B(Z/γ * → ττ) arising from the uncertainty in the integrated luminosity is smaller than the uncertainty in the integrated luminosity, for the reasons discussed in Section 7.
Table 3: Yields expected in Z/γ * → ττ signal events and backgrounds in the τ e τ h , τ µ τ h , τ h τ h , τ e τ µ , and τ µ τ µ channels, obtained from the ML fit described in Section 8. The uncertainties are rounded to two significant digits, except when they are < 10, in which case they are rounded to one significant digit, and the event yields are rounded to match the precision in the uncertainties. The analysed data correspond to an integrated luminosity of 2.3 fb −1 .
Two-dimensional projections of −2 ln λ (ξ), obtained when the τ h ID efficiency and τ h ES are left unconstrained in the fit, are shown in Fig. 13. Measured values of the τ h ID efficiency and of τ h ES are quoted as scale factors (SF) relative to their MC expectation. The values of σ(pp → Z/γ * +X) B(Z/γ * → ττ), τ h ID efficiency, and τ h ES that minimize −2 ln λ (ξ), yielding the best fit to the data, are indicated by a cross. Contours for which −2 ln λ (ξ) exceeds its minimum value by 2.30 and 6.18 units, corresponding to coverage probabilities of 68% and 95% in the two-dimensional parameter plane, are also shown. The 68% CIs for the τ h ID efficiency and τ h ES are obtained as the values of the respective parameter for which −2 ln λ (ξ) increases by one unit relative to its minimum. The measured SF for the τ h ID efficiency and for τ h ES amount to 0.979 ± 0.022 and 0.986 ± 0.009, respectively. Both SF are compatible with unity, indicating

Summary
The cross section for inclusive Drell-Yan production of τ pairs has been measured using pp collisions recorded by the CMS experiment at √ s = 13 TeV at the LHC. The analysed data correspond to an integrated luminosity of 2.3 fb −1 . The signal yield was determined in a global fit to the mass distributions in five ττ decay channels: τ e τ h , τ µ τ h , τ h τ h , τ e τ µ , and τ µ τ µ . The measured cross section times branching fraction σ(pp → Z/γ * +X) B(Z/γ * → ττ) = 1848 ± 12 (stat) ± 57 (syst) ± 35 (lumi) pb is in agreement with the standard model expectation, computed at next-to-next-to-leading order accuracy in perturbation theory. As a byproduct of the global fit, the efficiency for reconstructing and identifying the decays of τ leptons to hadrons (τ → hadrons + ν τ ), as well as the τ h energy scale, have been determined. The results from data agree with Monte Carlo simulation within the uncertainties of the measurement, amounting to 2.2% relative uncertainty in the τ h identification efficiency, and 0.9% in the energy scale.

Acknowledgments
We congratulate our colleagues in the CERN accelerator departments for the excellent performance of the LHC and thank the technical and administrative staffs at CERN and at other CMS institutes for their contributions to the success of the CMS effort. In addition, we gratefully acknowledge the computing centres and personnel of the Worldwide LHC Computing Grid for delivering so effectively the computing infrastructure essential to our analyses. Finally, we acknowledge the enduring support for the construction and operation of the LHC and the CMS detector provided by the following funding agencies:

A Validation of background model in event categories
The validity of the background estimation described in Section 6 is checked in event categories that are relevant for the SM H → ττ analysis as well as in searches for new physics.
Event categories based on jet multiplicity, p T of the τ lepton pair, and on the multiplicity of b jets are defined by the conditions given in Table 5.

Category | Selection
0-jet | No jets¹ and no b jets²
1-jet, low Z boson p T | At least one jet¹, no b jets², p Z T < 50 GeV, excluding events selected in 2-jet VBF category
1-jet, medium Z boson p T | At least one jet¹, no b jets², 50 < p Z T < 100 GeV, excluding events selected in 2-jet VBF category
1-jet, high Z boson p T | At least one jet¹, no b jets², p Z T > 100 GeV, excluding events selected in 2-jet VBF category
2-jet VBF | At least one pair of jets¹ satisfying m jj > 500 GeV and ∆η jj > 3.5, no b jets²
1 b jet | Exactly one b jet²
2 b jet | Exactly two b jets²
¹ With p T > 30 GeV and |η| < 4.7
² With p T > 20 GeV, |η| < 2.4, and identified by the CSV algorithm as originating from the hadronization of b quarks

The transverse momentum of the Z boson (p Z T ) is reconstructed by adding the momentum vectors from the visible τ decay products and the reconstructed p miss T in the transverse plane. The observables m jj and ∆η jj are used to select signal events produced through the fusion of virtual vector bosons (VBF) in the SM H → ττ analysis, and refer, respectively, to the mass and to the separation in pseudorapidity of the two jets of highest p T in events containing two or more jets.
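The reconstruction of p Z T and the categorization of Table 5 can be sketched as follows. This is an illustration only, not the analysis code: the function names and inputs are hypothetical, jets and b jets are assumed to be already counted with the thresholds given in the footnotes of Table 5, m jj and ∆η jj are assumed to be precomputed for the two jets of highest p T , and the treatment of events lying exactly on the 50 and 100 GeV boundaries is an assumption, since the table does not specify it.

```python
import math

def pt_z(visible_pxpy, met_pxpy):
    """p_T of the Z candidate: transverse vector sum of visible tau decay products and pTmiss."""
    px = sum(p[0] for p in visible_pxpy) + met_pxpy[0]
    py = sum(p[1] for p in visible_pxpy) + met_pxpy[1]
    return math.hypot(px, py)

def categorize(n_jets, n_bjets, pt_z_gev, m_jj=None, deta_jj=None):
    """Return the Table 5 category name, or None if the event fits no category."""
    if n_bjets == 1:
        return "1 b jet"
    if n_bjets == 2:
        return "2 b jet"
    if n_bjets > 2:
        return None  # Table 5 defines no category for more than two b jets
    if n_jets == 0:
        return "0-jet"
    # VBF takes precedence: the 1-jet categories exclude events selected as 2-jet VBF.
    if (n_jets >= 2 and m_jj is not None and deta_jj is not None
            and m_jj > 500.0 and deta_jj > 3.5):
        return "2-jet VBF"
    # Assumption: boundary values are assigned to the higher p_T category.
    if pt_z_gev < 50.0:
        return "1-jet, low Z boson pT"
    if pt_z_gev < 100.0:
        return "1-jet, medium Z boson pT"
    return "1-jet, high Z boson pT"

if __name__ == "__main__":
    pz = pt_z([(30.0, 0.0), (0.0, 25.0)], (0.0, 15.0))
    print(pz, categorize(1, 0, pz))
```

Checking the b jet multiplicity first reflects that every other category requires the absence of b jets.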
Background contributions arising from Z/γ * → ee, Z/γ * → µµ, W+jets, tt, single top quark, and diboson production to the event categories defined in Table 5 in the τ e τ h , τ µ τ h , τ h τ h , and τ e τ µ channels are estimated as described above. The fractions R p of multijet, W+jets, DY, and tt backgrounds used in Eq. (6) are calculated separately for each of the event categories.
The contribution of Z/γ * → ττ is determined from data, using Z/γ * → µµ events. Events passing the single-muon trigger are selected by the presence of two muons of opposite charge passing tight identification and isolation criteria. At least one of the muons is required to have p T > 20 GeV and |η| < 2.1, while the other muon is required to satisfy the conditions p T > 10 GeV and |η| < 2.4. The number of Z/γ * → µµ candidate events selected in the different categories in data is compared to the MC expectation for Z/γ * → µµ production, and their ratio is used as a scale factor to correct the MC expectation for the Z/γ * → ττ event yield in that category. The expected contribution of background processes, obtained from MC simulation, is subtracted from the data before taking the ratio.

The selection criteria applied to the muon p T and η in Z/γ * → µµ events, and to the p T and η of the visible τ decay products in Z/γ * → ττ events, are known to cause a bias in the p Z T distribution. The latter is correlated with the multiplicity of jets. The bias must be corrected, as its magnitude is very different for Z/γ * → µµ and Z/γ * → ττ events. The bias is emulated by replacing the muons reconstructed in Z/γ * → µµ candidate events with generator-level τ leptons. The τ leptons are decayed using TAUOLA++ 1.1.4 [99,100], and effects of τ lepton polarization in the decays are modelled through weights computed with the TAUSPINNER [101] program. A sample of 1000 random τ lepton decays is generated for each Z/γ * → µµ candidate event, and the weights computed in TAUSPINNER are recorded for each decay. The ratio of the sum of the weights for decays in which the visible products of both τ leptons pass the selection criteria on p T and η, to the sum of all weights computed for the 1000 decays, is applied as an event weight to the Z/γ * → µµ candidate. This weight corrects for the difference between Z/γ * → µµ and Z/γ * → ττ events in the bias of p Z T caused by the selection criteria on p T and η.
The procedure is validated through MC simulation.
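The two ratios described above can be sketched as follows. The function names are hypothetical, the spin weights stand in for the values returned by TAUSPINNER (which is not called here), and the way the event weights enter the data and MC yields before the ratio is taken is an assumption of this illustration.

```python
def acceptance_weight(spin_weights, both_pass):
    """Event weight for one Z -> mumu candidate.

    spin_weights: one weight per simulated tau-pair decay (1000 per event in the text).
    both_pass:    True where the visible products of both taus pass the pT and eta cuts.
    """
    total = sum(spin_weights)
    if total <= 0.0:
        return 0.0  # guard against an empty or degenerate sample
    passed = sum(w for w, ok in zip(spin_weights, both_pass) if ok)
    return passed / total

def category_scale_factor(n_data, n_bkg_mc, n_zmumu_mc):
    """Data/MC ratio for Z -> mumu in one category, after background subtraction."""
    return (n_data - n_bkg_mc) / n_zmumu_mc

if __name__ == "__main__":
    # Toy numbers, for illustration only.
    w = acceptance_weight([1.0, 0.5, 1.5, 1.0], [True, False, True, False])
    sf = category_scale_factor(n_data=1100.0, n_bkg_mc=50.0, n_zmumu_mc=1000.0)
    print(w, sf)
```

The corrected Z/γ * → ττ yield in a category is then the MC expectation for that category multiplied by the scale factor.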
The contributions of background processes that are modelled in the MC simulation to the different categories are affected by uncertainties in the jet energy scale and resolution. The energy scale of jets is measured using the p T balance of jets with Z bosons and photons in Z/γ * → ee, Z/γ * → µµ, and γ+jets events, and the p T balance between jets in dijet events, as described in Ref. [79]. The uncertainty in the jet energy scale is a few percent and depends on p T and η. The impact of jet energy scale and resolution uncertainties on the yields of background processes is evaluated by varying the jet energy scale and resolution within their uncertainties, redetermining the multiplicity of jets and b jets, and reapplying the event categorization conditions given in Table 5.
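A minimal sketch of the jet energy scale variation is given below. It uses a single flat relative uncertainty as a simplifying assumption, whereas the actual uncertainty depends on p T and η, and it covers only the jet multiplicity, not the resolution variation or the b jet multiplicity. The function names are illustrative.

```python
def count_jets(jet_pts, jet_etas, scale=1.0, pt_min=30.0, abs_eta_max=4.7):
    """Number of jets passing the Table 5 thresholds after scaling their pT."""
    return sum(
        1
        for pt, eta in zip(jet_pts, jet_etas)
        if pt * scale > pt_min and abs(eta) < abs_eta_max
    )

def jes_variations(jet_pts, jet_etas, rel_unc):
    """Jet multiplicity for the nominal, up-shifted, and down-shifted energy scale."""
    return {
        "nominal": count_jets(jet_pts, jet_etas),
        "up": count_jets(jet_pts, jet_etas, scale=1.0 + rel_unc),
        "down": count_jets(jet_pts, jet_etas, scale=1.0 - rel_unc),
    }

if __name__ == "__main__":
    # Toy event: two central jets near the 30 GeV threshold and one jet outside the eta range.
    print(jes_variations([31.0, 29.0, 45.0], [0.5, 1.0, 5.0], rel_unc=0.05))
```

Events whose jet multiplicity changes under the shift migrate between categories, which is how the variation propagates to the background yields.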
Distributions in m ττ for events selected in different event categories are shown for the τ e τ h , τ µ τ h , τ h τ h , τ e τ µ , and τ µ τ µ channels in Figs. 16 to 23.
The distributions expected for the Z/γ * → ττ signal and for backgrounds are shown for the values of nuisance parameters obtained from the ML fit described in Section 8. The ML fit is performed independently for each category. The m ττ distributions are shown within the range 50 < m ττ < 250 GeV, indicating good agreement with background expectations over that mass range. A similar level of agreement between the data and the background prediction is observed in the τ e τ h , τ h τ h , and τ µ τ µ channels.
The agreement confirms the reliability of the F F method to estimate the reducible backgrounds in the τ e τ h , τ µ τ h , and τ h τ h channels in future H → ττ analyses. It also shows that the Z/γ * → ττ contribution to event categories based on jet and b jet multiplicities and on the p T of the τ lepton pair can be modelled using Z/γ * → µµ data, without the so-called "embedding" technique [43,102] used previously to model the Z/γ * → ττ background in H → ττ analyses of ATLAS and CMS. Observed