1 Introduction

Final states with \(\mathrm {\tau }\) leptons are important experimental signatures at the CERN LHC. In particular, the recently reported observation of decays of standard model (SM) Higgs bosons (\(\text {H} \)) [1,2,3] into pairs of \(\mathrm {\tau }\) leptons [4] suggests additional searches in the context of new charged [5,6,7,8] and neutral [9,10,11,12,13,14,15,16,17] Higgs bosons, lepton-flavor violation [18,19,20], supersymmetry [21,22,23,24,25,26,27,28], leptoquarks [29, 30], extra spatial dimensions [31, 32], and massive gauge bosons [33,34,35].

With a lifetime of \(2.9 \times 10^{-13}\hbox { s}\), the \(\mathrm {\tau }\) lepton usually decays before reaching the innermost detector. Approximately two thirds of \(\mathrm {\tau }\) leptons decay into a hadronic system and a \(\mathrm {\tau }\) neutrino. Constrained by the \(\mathrm {\tau }\) lepton mass of \(1.777\hbox { GeV}\), the hadronic system is characterized by low particle multiplicities, typically consisting of either one or three charged pions or kaons, and up to two neutral pions. The hadrons produced in \(\mathrm {\tau }\) decays therefore also tend to be highly collimated. The \(\mathrm {\tau }\) lepton decays into an electron or muon and two neutrinos with a probability of \(35\%\). We denote the electron and muon produced in \(\mathrm {\tau }\rightarrow \mathrm {e}\nu \nu \) and \(\mathrm {\tau }\rightarrow \mathrm {\mu }\nu \nu \) decays by \(\mathrm {\tau }_{\mathrm {e}} \) and \(\mathrm {\tau }_{\mathrm {\mu }} \), to distinguish them from prompt electrons and muons, respectively. The hadronic system produced in a \(\mathrm {\tau }\rightarrow \text{ hadrons } + \nu _{\mathrm {\tau }}\) decay is denoted by the symbol \(\tau _\mathrm {h} \).
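As a rough worked estimate of why the \(\mathrm {\tau }\) lepton decays before reaching the detector, the mean decay length can be computed from the lifetime and mass quoted above. The boost used here is an illustrative assumption, not a number from this analysis: it corresponds to a \(\mathrm {\tau }\) lepton from the decay of a \(\mathrm {Z}\) boson at rest, with energy \(E \approx m_{\mathrm {Z}}/2 \approx 45.6\hbox { GeV}\) and \(\beta \approx 1\).

$$\begin{aligned} c\tau&\approx (3.0 \times 10^{8}\hbox { m/s}) \times (2.9 \times 10^{-13}\hbox { s}) \approx 87\,\upmu \hbox {m}, \\ \gamma&= \frac{E}{m_{\mathrm {\tau }}} \approx \frac{45.6\hbox { GeV}}{1.777\hbox { GeV}} \approx 25.7, \\ L&= \beta \gamma \, c\tau \approx 25.7 \times 87\,\upmu \hbox {m} \approx 2.2\hbox { mm}. \end{aligned}$$

A mean flight distance of a few millimetres is short compared with the radii of the innermost tracking layers, which are of the order of centimetres.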

The Drell–Yan (DY) [36] production of \(\mathrm {\tau }\) lepton pairs (\(\mathrm {q}\bar{\mathrm {q}}\rightarrow \mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau }\)) is interesting for several reasons. First, the process \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau }\) represents a reference signal to study the efficiency to reconstruct and identify \(\tau _\mathrm {h} \), as well as to measure the \(\tau _\mathrm {h} \) energy scale. Moreover, \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau }\) production constitutes the dominant irreducible background to analyses of SM \(\text {H} \rightarrow \mathrm {\tau }\mathrm {\tau }\) events, and to searches for new resonances decaying to \(\mathrm {\tau }\) lepton pairs. The cross section for DY production exceeds the one for SM \(\text {H} \) production by about two orders of magnitude. Signals from new resonances are expected to be even more rare. It is therefore important to control with a precision reaching the sub-percent level the rate for \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau }\) production, as well as its distribution in kinematic observables. In addition, the reducible backgrounds relevant for the study of \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau }\) are also relevant for studies of SM \(\text {H} \) production and to searches for new resonances.

This paper reports a precision measurement of the inclusive \(\mathrm {p}\mathrm {p}\rightarrow \mathrm {Z}/\gamma ^{*} \text {+X} \rightarrow \mathrm {\tau }\mathrm {\tau }\text {+X}\) cross section. The measurement demonstrates that \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau }\) production is well understood, and provides ways to validate techniques relevant in future analyses of \(\mathrm {\tau }\) lepton production. Most notably, a method based on control samples in data is introduced for determining background contributions arising from the misidentification of quark or gluon jets as \(\tau _\mathrm {h} \). Measurements of the \(\tau _\mathrm {h} \) identification (ID) efficiency and of the \(\tau _\mathrm {h} \) energy scale [37] are obtained as byproducts of the analysis.

The cross section for DY production of \(\mathrm {\tau }\) lepton pairs was previously measured by the CMS and ATLAS experiments in proton-proton (\(\mathrm {p}\mathrm {p}\)) collisions at \(\sqrt{s} = 7\hbox { TeV}\) at the LHC [38, 39], and in proton–antiproton collisions at \(\sqrt{s} = 1.96\hbox { TeV}\) by the CDF and D0 experiments at the Fermilab Tevatron [40,41,42]. In this study, we present the \(\mathrm {p}\mathrm {p}\rightarrow \mathrm {Z}/\gamma ^{*} \text {+X} \rightarrow \mathrm {\tau }\mathrm {\tau }\text {+X}\) cross section measured at \(\sqrt{s} = 13\hbox { TeV}\), using data recorded by the CMS experiment, corresponding to an integrated luminosity of \(2.3\hbox { fb}^{-1}\). Events are selected in the \(\mathrm {\tau }_{\mathrm {e}} \tau _\mathrm {h} \), \(\mathrm {\tau }_{\mathrm {\mu }} \tau _\mathrm {h} \), \(\tau _\mathrm {h} \tau _\mathrm {h} \), \(\mathrm {\tau }_{\mathrm {e}} \mathrm {\tau }_{\mathrm {\mu }} \), and \(\mathrm {\tau }_{\mathrm {\mu }} \mathrm {\tau }_{\mathrm {\mu }} \) decay channels. The \(\mathrm {\tau }_{\mathrm {e}} \mathrm {\tau }_{\mathrm {e}} \) channel is not considered in this analysis, as it was studied previously in the context of the SM \(\text {H} \rightarrow \mathrm {\tau }\mathrm {\tau }\) analysis, and found to be the least sensitive of these channels [43]. The \(\mathrm {p}\mathrm {p}\rightarrow \mathrm {Z}/\gamma ^{*} \text {+X} \rightarrow \mathrm {\tau }\mathrm {\tau }\text {+X}\) cross section is obtained through a simultaneous fit of \(\mathrm {\tau }\) lepton pair mass distributions in all decay channels.

The paper is organized as follows. The CMS detector is described briefly in Sect. 2. Section 3 describes the data and the Monte Carlo (MC) simulations used in the analysis. The reconstruction of electrons, muons, \(\tau _\mathrm {h} \), and jets, along with various kinematic quantities, is described in Sect. 4. Section 5 details the selection of events in the different decay channels, followed in Sect. 6 by a description of the procedures used to estimate background contributions. The systematic uncertainties relevant for the measurement of the \(\mathrm {p}\mathrm {p}\rightarrow \mathrm {Z}/\gamma ^{*} \text {+X} \rightarrow \mathrm {\tau }\mathrm {\tau }\text {+X}\) cross section are described in Sect. 7, and the extraction of the signal is given in Sect. 8. The results are presented in Sect. 9, and the paper concludes with a summary in Sect. 10.

2 The CMS detector

The central feature of the CMS apparatus is a superconducting solenoid of \(6\hbox { m}\) internal diameter, providing a magnetic field of \(3.8\hbox { T}\). A silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter (HCAL), each composed of a barrel and two endcap sections, are positioned within the solenoid volume. The silicon tracker measures charged particles within the pseudorapidity range \(|\eta |< 2.5\). Trajectories of isolated muons with \(p_{\mathrm {T}} = 100\hbox { GeV}\), emitted at \(|\eta | < 1.4\), are reconstructed with an efficiency close to 100% and resolutions of 2.8% in \(p_{\mathrm {T}} \), and with uncertainties of 10 and \(30\,\upmu \hbox {m}\) in their respective transverse and longitudinal impact parameters relative to their points of origin [44]. The ECAL is a fine-grained hermetic calorimeter with quasi-projective geometry, segmented in the barrel region of \(|\eta | < 1.48\), as well as in the two endcaps that extend up to \(|\eta | < 3.0\). Similarly, the HCAL barrel and endcaps cover the region \(|\eta | < 3.0\). Forward calorimeters extend the coverage up to \(|\eta | < 5.0\). Muons are measured and identified in the range \(|\eta |< 2.4\) using gas-ionization detectors embedded in the steel flux-return yoke outside the solenoid. A two-level trigger system is used to reduce the rate of recorded events to a level suitable for data acquisition and storage. The first level (L1) of the CMS trigger system, composed of specialized hardware processors, uses information from the calorimeters and muon detectors to select the most interesting events in a fixed time interval of less than \(4\,\upmu \hbox {s}\). The high-level trigger processor farm decreases the event rate from around \(100\hbox { kHz}\) to less than \(1\hbox { kHz}\) before storage and subsequent analysis. 
Details of the CMS detector and its performance, together with a definition of the coordinate system and kinematic variables, can be found in Ref. [45].

3 Data and Monte Carlo simulation

The data were recorded in \(\mathrm {p}\mathrm {p}\) collisions at \(25\hbox { ns}\) bunch spacing and are required to satisfy standard data quality criteria. The analysed data correspond to an integrated luminosity of \(2.3\hbox { fb}^{-1}\).

The \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau }\) signal and the \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {e}\mathrm {e}\), \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\mu }\mathrm {\mu }\), \(\mathrm {W}\)+jets, \(\mathrm {t}\overline{\mathrm {t}}\), single top quark, and diboson (\(\mathrm {W}\mathrm {W}\), \(\mathrm {W}\mathrm {Z}\), and \(\mathrm {Z}\mathrm {Z}\)) background processes are modelled through samples of MC simulated events. Background contributions arising from multijet production via quantum chromodynamic interactions are determined from data. The \(\mathrm {Z}/\gamma ^{*} \rightarrow \ell \ell \) (where \(\ell \) refers to \(\mathrm {e}\), \(\mathrm {\mu }\), or \(\mathrm {\tau }\)) and \(\mathrm {W}\)+jets events are generated using leading-order (LO) matrix elements (ME) in quantum chromodynamics, implemented in the program MadGraph5_aMC@NLO 2.2.2 [46], and \(\mathrm {t}\overline{\mathrm {t}}\) and single top quark events are generated using the next-to-leading order (NLO) program POWHEG v2 [47,48,49,50,51]. The diboson events are modelled using the NLO ME program implemented in MadGraph5_aMC@NLO. The background events are complemented with SM \(\text {H} \rightarrow \mathrm {\tau }\mathrm {\tau }\) events, generated for an \(\text {H} \) mass of \(m_{\text {H}} = 125\,\text {GeV} \), using the implementation of the gluon-gluon and vector boson fusion processes in POWHEG [52, 53]. All events are generated using the NNPDF3.0 [54,55,56] set of parton distribution functions (PDF). Parton showers and parton hadronization are modelled using PYTHIA 8.212 [57] and the CUETP8M1 underlying-event tune [58], which is based on the Monash tune [59]. The decays of \(\mathrm {\tau }\) leptons, including polarization effects, are modelled through PYTHIA.
The \(\mathrm {Z}/\gamma ^{*} \rightarrow \ell \ell \), \(\mathrm {W}\)+jets, and \(\mathrm {t}\overline{\mathrm {t}}\) events are normalized to cross sections computed at next-to-next-to-leading order (NNLO) accuracy [60, 61]. A reweighting is applied to MC-generated \(\mathrm {t}\overline{\mathrm {t}}\) and \(\mathrm {Z}/\gamma ^{*} \rightarrow \ell \ell \) events to improve the respective modelling of the \(p_{\mathrm {T}} \) spectrum of the top quarks [62, 63] and the dilepton mass and \(p_{\mathrm {T}} \) spectra relative to data. The weights applied to simulated \(\mathrm {Z}/\gamma ^{*} \rightarrow \ell \ell \) events are obtained from studies of the distributions in dilepton mass and \(p_{\mathrm {T}} \) in \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\mu }\mathrm {\mu }\) events. The cross sections for single top quark [64,65,66] and diboson [67] production are computed at NLO accuracy.
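The normalization of a simulated sample to a given cross section can be sketched as a per-event weight. This is a minimal illustration, not the analysis code: the function name, the cross section of 1000 pb, and the number of generated events are hypothetical placeholders, and only the integrated luminosity of \(2.3\hbox { fb}^{-1}\) is taken from the text.

```python
def mc_event_weight(cross_section_pb, lumi_fb_inv, n_generated):
    """Per-event weight that scales a simulated sample to the data luminosity.

    Expected yield = cross section x integrated luminosity, so each of the
    n_generated simulated events carries an equal share of that yield.
    """
    if n_generated <= 0:
        raise ValueError("n_generated must be positive")
    lumi_pb_inv = lumi_fb_inv * 1000.0  # 1 fb^-1 = 1000 pb^-1
    return cross_section_pb * lumi_pb_inv / n_generated


# Hypothetical sample: 1000 pb cross section, one million generated events.
weight = mc_event_weight(cross_section_pb=1000.0, lumi_fb_inv=2.3,
                         n_generated=1_000_000)
print(f"weight per simulated event: {weight:.3f}")
```

Any additional reweighting, such as the top quark \(p_{\mathrm {T}} \) or dilepton mass and \(p_{\mathrm {T}} \) corrections described above, would multiply this base weight event by event.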

Minimum bias events generated with PYTHIA are overlaid on all simulated events to account for the presence of additional inelastic \(\mathrm {p}\mathrm {p}\) interactions, referred to as pileup (PU), which take place in the same, previous, or subsequent bunch crossings as the hard-scattering interaction. The pileup distribution in simulated events matches that in data, amounting to, on average, \({\approx }\,12\) inelastic \(\mathrm {p}\mathrm {p}\) interactions per bunch crossing. All generated events are passed through a detailed simulation of the CMS apparatus, based on Geant4 [68], and reconstructed using the same version of the CMS reconstruction software as used for data.

4 Event reconstruction

The information provided by all CMS subdetectors is employed in a particle-flow (PF) algorithm [69] to identify and reconstruct individual particles in the event, namely muons, electrons, photons, and charged and neutral hadrons. These particles are then used to reconstruct jets, \(\tau _\mathrm {h} \) candidates, and the vector imbalance in transverse momentum in the event, referred to as \({\vec p}_{\mathrm {T}}^{\ \text {miss}} \), as well as to quantify the isolation of leptons.

Electrons are reconstructed using an algorithm [70] that matches trajectories in the silicon tracker to energy depositions in the ECAL. Trajectories of electron candidates are reconstructed using a dedicated algorithm that accounts for the emission of bremsstrahlung photons. The energy loss due to bremsstrahlung is determined by searching for energy depositions in the ECAL from photons emitted tangentially to the track. A multivariate (MVA) approach based on boosted decision trees (BDT) [71] is employed to distinguish electrons from hadrons that mimic electron signatures. Observables that quantify the quality of the electron track, the compactness of the electron cluster in directions transverse and longitudinal relative to the electron motion, and the matching of the track momentum and direction to the sum and positions of energy depositions in the ECAL are used as inputs to the BDT. The BDT is trained on samples of genuine and false electrons, produced in MC simulation. Additional requirements are applied to remove electrons originating from photon conversions.

The identification of muons is based on linking track segments reconstructed in the silicon tracking detector and in the muon system [72]. The matching is done both by starting from a track in the muon system and starting from a track in the inner detector. When a link is established, the track parameters are refitted using the combination of hits in the inner and outer detectors, and the reconstructed trajectory is referred to as a global muon track. Quality criteria are applied on the multiplicity of hits, the number of matched segments, and the quality of the fit to a global muon track, the latter being quantified through a \(\chi ^{2}\) criterion.

Electrons and muons in signal events are expected to be isolated, while leptons from heavy flavour (charm and bottom quark) decays, as well as from in-flight decays of pions and kaons, are often reconstructed within jets. Isolated leptons are distinguished from leptons in jets through a sum, denoted by the symbol \(I_{\ell }\), of the scalar \(p_{\mathrm {T}} \) values of additional charged particles, neutral hadrons, and photons reconstructed using the PF algorithm within a cone in \(\eta \) and azimuth \(\phi \) (in radians) of size \(\varDelta R = \sqrt{\smash [b]{(\varDelta \eta )^{2} + (\varDelta \phi )^{2}}} = 0.3\), centred around the lepton direction. Neutral hadrons and photons within the innermost region of the cone, \(\varDelta R < 0.01\), are excluded from the isolation sum for muons to prevent the footprint of the muon in ECAL and HCAL from causing the muon to fail isolation criteria. When computing the isolation of electrons reconstructed in the ECAL endcap region, we exclude photons within \(\varDelta R < 0.08\) and charged particles within \(\varDelta R < 0.015\) of the direction of the electron, to avoid counting photons emitted in bremsstrahlung and tracks originating from the conversion of such photons. As the amount of material that electrons traverse in the barrel region before reaching the ECAL is smaller, the resulting probability for bremsstrahlung emission and photon conversion is sufficiently reduced so as not to require exclusion of particles in the innermost cone from the isolation sum. Efficiency loss due to pileup is kept minimal by considering only charged particles originating from the lepton production vertex (“charged from PV”). The contribution from the neutral component of pileup to the isolation of the lepton is taken into account by means of \(\varDelta \beta \) corrections [69], which enter the computation of the isolation \(I_{\ell }\), as follows:

$$\begin{aligned} I_{\ell } = \sum _{\begin{array}{c} \text {charged} \\ \text {from PV} \end{array}} p_{\mathrm {T}} + \text {max} \left\{ 0, \sum _{\text {neutrals}} p_{\mathrm {T}}- \varDelta \beta \right\} , \end{aligned}$$
(1)

where \(\ell \) corresponds to either \(\mathrm {e}\) or \(\mathrm {\mu }\), and the sums extend over, respectively, the charged particles that originate from the lepton production vertex and the neutral particles. The “\(\text {max}\)” function returns the larger of the two values within the braces. The \(\varDelta \beta \) corrections are computed by summing the scalar \(p_{\mathrm {T}} \) of charged particles that are within a cone of size \(\varDelta R = 0.3\) around the lepton direction but do not originate from the lepton production vertex (“charged from PU”), and scaling that sum by a factor of one-half:

$$\begin{aligned} \varDelta \beta = 0.5 \, \sum _{\begin{array}{c} \text {charged} \\ \text {from PU} \end{array}} p_{\mathrm {T}}. \end{aligned}$$
(2)

The factor of 0.5 approximates the phenomenological ratio of neutral-to-charged hadron production in the hadronization of inelastic \(\mathrm {p}\mathrm {p}\) collisions.
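Equations (1) and (2) can be sketched in a few lines of code. This is a simplified illustration under stated assumptions: the `PFParticle` record and the function names are hypothetical, and the inner veto cones for muons and for endcap electrons described above are deliberately omitted.

```python
import math
from dataclasses import dataclass


@dataclass
class PFParticle:
    pt: float       # transverse momentum in GeV
    eta: float      # pseudorapidity
    phi: float      # azimuth in radians
    charged: bool   # True for charged particles, False for neutrals
    from_pv: bool   # True if associated with the lepton production vertex


def delta_r(eta1, phi1, eta2, phi2):
    """Distance in the eta-phi plane, with delta-phi wrapped into [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def lepton_isolation(lep_eta, lep_phi, particles, cone=0.3):
    """Isolation sum of Eq. (1) with the delta-beta correction of Eq. (2)."""
    charged_pv = charged_pu = neutral = 0.0
    for p in particles:
        if delta_r(lep_eta, lep_phi, p.eta, p.phi) >= cone:
            continue  # outside the isolation cone
        if p.charged and p.from_pv:
            charged_pv += p.pt
        elif p.charged:
            charged_pu += p.pt
        else:
            neutral += p.pt
    delta_beta = 0.5 * charged_pu  # Eq. (2)
    return charged_pv + max(0.0, neutral - delta_beta)  # Eq. (1)
```

The particle list passed in is assumed to exclude the lepton itself. The `max` with zero prevents an overestimated pileup subtraction from producing a negative neutral contribution.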

Collision vertices are reconstructed using a deterministic annealing algorithm [73, 74], with the reconstructed vertex position required to be compatible with the location of the LHC beam in the xy plane. The primary collision vertex (PV) is taken to be the vertex that has the maximum \(\sum p_{\mathrm {T}} ^{2}\) of tracks associated to it. Electrons, muons, and \(\tau _\mathrm {h} \) candidates are required to be compatible with originating from the PV.
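The choice of the primary vertex described above can be illustrated as follows. This is a toy sketch: representing each vertex as a plain list of track \(p_{\mathrm {T}} \) values and the function name are assumptions made for illustration only.

```python
def primary_vertex_index(vertices):
    """Return the index of the vertex with the largest sum of track pT^2.

    `vertices` is a list in which each entry is a list of the pT values
    (in GeV) of the tracks associated with that vertex.
    """
    if not vertices:
        raise ValueError("at least one vertex is required")
    return max(range(len(vertices)),
               key=lambda i: sum(pt * pt for pt in vertices[i]))


# Six soft tracks, one hard track, two medium tracks (toy values in GeV).
toy_vertices = [[1.0] * 6, [3.0], [2.0, 2.0]]
print(primary_vertex_index(toy_vertices))
```

Squaring the \(p_{\mathrm {T}} \) favours vertices with a few hard tracks over pileup vertices with many soft tracks. In the toy input, the first vertex has the largest scalar \(p_{\mathrm {T}} \) sum but the second has the largest \(\sum p_{\mathrm {T}} ^{2}\).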

Hadronic \(\mathrm {\tau }\) decays are reconstructed using the “hadrons+strips” (HPS) algorithm [37], which separates the individual decay modes of the \(\mathrm {\tau }\) into \(\mathrm {\tau }^{-} \rightarrow \mathrm {h}^{-}\nu _{\mathrm {\tau }}\), \(\mathrm {\tau }^{-} \rightarrow \mathrm {h}^{-}\mathrm {\pi ^0}\nu _{\mathrm {\tau }}\), \(\mathrm {\tau }^{-} \rightarrow \mathrm {h}^{-}\mathrm {\pi ^0}\mathrm {\pi ^0}\nu _{\mathrm {\tau }}\), and \(\mathrm {\tau }^{-} \rightarrow \mathrm {h}^{-}\mathrm {h}^{+}\mathrm {h}^{-}\nu _{\mathrm {\tau }}\), where \(\mathrm {h}^{\pm }\) denotes either a charged pion or kaon (the decay modes of \(\mathrm {\tau }^{+}\) are assumed to be identical to their partner \(\mathrm {\tau }^{-}\) modes through charge conjugation invariance). The \(\tau _\mathrm {h} \) candidates are constructed by combining the charged PF hadrons with neutral pions. The neutral pions are reconstructed by clustering the PF photons within rectangular strips that are narrow in the \(\eta \) direction but wide in the \(\phi \) direction. This shape accounts for the non-negligible probability for photons produced in \(\mathrm {\pi ^0}\rightarrow \gamma \gamma \) decays to convert into electron-positron pairs when traversing the all-silicon tracking detector of CMS, and for the resulting broadening of energy depositions in the ECAL. For the same reason, electrons and positrons reconstructed through the PF algorithm are considered in the reconstruction of the neutral pions in addition to photons. The momentum of the \(\tau _\mathrm {h} \) candidate is taken as the vector sum over the momenta of the charged hadrons and neutral pions used in reconstructing the \(\tau _\mathrm {h} \) decay mode, assuming the pion-mass hypothesis.
We do not use the strips of \(0.20 \times 0.05\) size in the \(\eta \)–\(\phi \) plane, used in previous analyses [5,6,7, 9,10,11,12,13, 18, 21,22,23, 29,30,31, 33, 34, 38, 43], but an improved version of the strip reconstruction developed during the \(\sqrt{s} = 13\hbox { TeV}\) run. In the improved version, the size of the strip is adjusted as a function of \(p_{\mathrm {T}} \), to account for the bending of charged particles in the magnetic field, which increases as \(p_{\mathrm {T}} \) decreases. More details on strip reconstruction and validation of the algorithm with data are given in Ref. [75]. The main handle for distinguishing \(\tau _\mathrm {h} \) from the large background of quark and gluon jets relies on the use of tight isolation requirements. The sums of scalar \(p_{\mathrm {T}} \) values from photons and from charged particles originating from the PV within a cone of \(\varDelta R = 0.5\) centred around the \(\tau _\mathrm {h} \) direction are used as input to an MVA-based \(\tau _\mathrm {h} \) ID discriminant. The set of input variables is complemented by the scalar \(p_{\mathrm {T}} \) sum of charged particles not originating from the PV, by the \(\tau _\mathrm {h} \) decay mode, and by observables that are sensitive to the lifetime of the \(\mathrm {\tau }\). The transverse impact parameter of the “leading” (highest \(p_{\mathrm {T}} \)) track of each \(\tau _\mathrm {h} \) candidate relative to the PV is used for \(\tau _\mathrm {h} \) candidates reconstructed in any decay mode. For \(\tau _\mathrm {h} \) candidates reconstructed in the \(\mathrm {\tau }^{-} \rightarrow \mathrm {h}^{-}\mathrm {h}^{+}\mathrm {h}^{-}\nu _{\mathrm {\tau }}\) decay mode, a fit of the three tracks to a common secondary vertex (SV) is attempted, and the distance between SV and PV is used as additional input to the MVA. The MVA is trained on genuine \(\tau _\mathrm {h} \) and jets generated in MC simulation.
Four working points (WP), referred to as barely, minimally, moderately, and tightly constrained, are defined through changes made in the selections on the MVA output. The thresholds are adjusted as functions of the \(p_{\mathrm {T}} \) of the \(\tau _\mathrm {h} \) candidate, such that the \(\tau _\mathrm {h} \) identification efficiency for each WP is independent of \(p_{\mathrm {T}} \). The moderate and tight WP used to select events in different channels provide efficiencies of 55 and \(45\%\), and misidentification rates for jets of typically 1 and \(0.5\%\), depending on the \(p_{\mathrm {T}} \) of the jet [75]. Additional discriminants are employed to separate \(\tau _\mathrm {h} \) from electrons and muons. The separation of \(\tau _\mathrm {h} \) from electrons is performed via another MVA-based discriminant [75] that utilizes input observables that quantify the matching between the sum of energy depositions in the ECAL and the momentum of the leading track of the \(\tau _\mathrm {h} \) candidate, as well as variables that distinguish electromagnetic from hadronic showers. The cutoff-based discriminant described in Ref. [37] is used to separate \(\tau _\mathrm {h} \) from muons. It is based on matching the leading track of the \(\tau _\mathrm {h} \) candidate with energy depositions in the ECAL and HCAL, as well as with track segments in the muon detectors.

Jets within the range \(\vert \eta \vert < 4.7\) are reconstructed using the anti-\(\text {k}_{\text {t}}\) algorithm [76, 77] with a distance parameter \(R = 0.4\). Reconstructed jets are required not to overlap with identified electrons, muons, or \(\tau _\mathrm {h} \) candidates within \(\varDelta R < 0.5\), and to pass a set of minimal identification criteria that aim to reject jets arising from calorimeter noise [78]. The energy of reconstructed jets is calibrated as a function of jet \(p_{\mathrm {T}} \) and \(\eta \) [79]. Average energy density corrections calculated using the FastJet algorithm [80, 81] are applied to compensate for pileup effects. Jets originating from the hadronization of \(\mathrm {b} \) quarks are identified using the “combined secondary vertex” (CSV) algorithm [82], which exploits observables related to the long lifetime of \(\mathrm {b} \) hadrons and the higher particle multiplicity and mass of \(\mathrm {b} \) jets compared to light-quark and gluon jets.
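The acceptance and overlap requirements on jets can be sketched as a simple filter. The function names and the representation of jets and leptons as `(eta, phi)` tuples are assumptions made for this illustration. The noise-rejection criteria, energy calibration, and pileup corrections are not modelled.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Distance in the eta-phi plane, with delta-phi wrapped into [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def clean_jets(jets, leptons, min_dr=0.5, max_abs_eta=4.7):
    """Keep jets inside the acceptance that do not overlap with any lepton.

    `jets` and `leptons` are lists of (eta, phi) tuples; `leptons` stands for
    the identified electrons, muons, and hadronic tau candidates.
    """
    return [
        (eta, phi)
        for eta, phi in jets
        if abs(eta) < max_abs_eta
        and all(delta_r(eta, phi, l_eta, l_phi) >= min_dr
                for l_eta, l_phi in leptons)
    ]
```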

The vector \({\vec p}_{\mathrm {T}}^{\ \text {miss}} \), with its magnitude referred to as \(E_{\mathrm {T}}^{\text {miss}} \), is reconstructed using an MVA regression algorithm [83]. To reduce the impact of pileup on the resolution in \(E_{\mathrm {T}}^{\text {miss}} \), the algorithm utilizes the fact that pileup produces jets predominantly of low \(p_{\mathrm {T}} \), while leptons and high-\(p_{\mathrm {T}} \) jets are almost exclusively produced through hard scattering processes.

The \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau }\) signal is distinguished from backgrounds by means of the mass of the \(\mathrm {\tau }\) lepton pair. The mass, denoted by the symbol \(m_{\mathrm {\tau }\mathrm {\tau }}\), is reconstructed using the SVfit algorithm [84]. The algorithm is based on a likelihood approach and uses as inputs the measured momenta of the visible decay products of both \(\mathrm {\tau }\) leptons, the reconstructed \(E_{\mathrm {T}}^{\text {miss}} \), and an event-by-event estimate of the \(E_{\mathrm {T}}^{\text {miss}} \) resolution. The latter is computed as described in Refs. [83, 85]. The inputs are combined with a probabilistic model for leptonic and hadronic \(\mathrm {\tau }\) decays to estimate the momenta of the neutrinos produced in these decays. The algorithm achieves a resolution in \(m_{\mathrm {\tau }\mathrm {\tau }}\) of \({\approx }\,15\%\) relative to the mass of the \(\mathrm {\tau }\) lepton pairs at the generator level.

5 Event selection

The events selected in the \(\mathrm {\tau }_{\mathrm {e}} \tau _\mathrm {h} \), \(\mathrm {\tau }_{\mathrm {\mu }} \tau _\mathrm {h} \), \(\tau _\mathrm {h} \tau _\mathrm {h} \), \(\mathrm {\tau }_{\mathrm {e}} \mathrm {\tau }_{\mathrm {\mu }} \), and \(\mathrm {\tau }_{\mathrm {\mu }} \mathrm {\tau }_{\mathrm {\mu }} \) channels are recorded by combining single-electron and single-muon triggers, triggers that are based on the presence of two \(\tau _\mathrm {h} \) candidates in the event, and triggers based on the presence of both an electron and a muon.

The \(\mathrm {\tau }_{\mathrm {e}} \tau _\mathrm {h} \) and \(\mathrm {\tau }_{\mathrm {\mu }} \tau _\mathrm {h} \) channels utilize single-electron and -muon triggers with \(p_{\mathrm {T}} \) thresholds of 23 and \(18\,\text {GeV} \), respectively. Selected events are required to contain an electron of \(p_{\mathrm {T}} > 24\,\text {GeV} \) or a muon of \(p_{\mathrm {T}} > 19\,\text {GeV} \), both with \(|\eta |< 2.1\), and a \(\tau _\mathrm {h} \) candidate with \(p_{\mathrm {T}} > 20\,\text {GeV} \) and \(|\eta |< 2.3\). The electron or muon is required to pass an isolation requirement of \(I_{\ell } < 0.10 \, p_{\mathrm {T}}^{\,\ell } \), computed according to Eq. (1). The \(\tau _\mathrm {h} \) candidate is required to pass the moderate WP of the MVA-based \(\tau _\mathrm {h} \) ID discriminant, and to have a charge opposite to that of the electron or muon. The \(\tau _\mathrm {h} \) candidate is further required to pass a tight (minimal) requirement on the discriminant that separates hadronic \(\mathrm {\tau }\) decays from electrons, and a minimal (tight) selection on the discriminant that separates \(\tau _\mathrm {h} \) from muons, in the \(\mathrm {\tau }_{\mathrm {e}} \tau _\mathrm {h} \) (\(\mathrm {\tau }_{\mathrm {\mu }} \tau _\mathrm {h} \)) channel. Background arising from \(\mathrm {W}\)+jets and \(\mathrm {t}\overline{\mathrm {t}}\) production is reduced by requiring the transverse mass of electron or muon and \({\vec p}_{\mathrm {T}}^{\ \text {miss}} \) to satisfy \(m_{\mathrm {T}} < 40\,\text {GeV} \). The transverse mass is defined by:

$$\begin{aligned} m_{\mathrm {T}} = \sqrt{2 \, p_{\mathrm {T}}^{\,\ell } \, E_{\mathrm {T}}^{\text {miss}} \, \left( 1 - \cos \varDelta \phi \right) }, \end{aligned}$$
(3)

where the symbol \(\ell \) refers to the electron or muon, and \(\varDelta \phi \) denotes the angle in the transverse plane between the lepton momentum and the \({\vec p}_{\mathrm {T}}^{\ \text {miss}} \) vector. Events containing additional electrons with \(p_{\mathrm {T}} > 10\,\text {GeV} \) and \(|\eta |< 2.5\), or muons with \(p_{\mathrm {T}} > 10\,\text {GeV} \) and \(|\eta |< 2.4\), passing minimal identification and isolation criteria, are rejected to reduce backgrounds from \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {e}\mathrm {e}\) and \(\mathrm {\mu }\mathrm {\mu }\) events, and from diboson production.
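The transverse mass of Eq. (3) and the associated requirement of \(m_{\mathrm {T}} < 40\,\text {GeV} \) can be written as a short sketch. The function names are hypothetical, and the inputs are assumed to be in GeV and radians.

```python
import math


def transverse_mass(lep_pt, met, dphi):
    """Transverse mass of Eq. (3).

    lep_pt: lepton pT in GeV; met: magnitude of the missing transverse
    momentum in GeV; dphi: angle in the transverse plane between the two.
    """
    return math.sqrt(2.0 * lep_pt * met * (1.0 - math.cos(dphi)))


def passes_mt_requirement(lep_pt, met, dphi, max_mt=40.0):
    """True if the event satisfies mT < max_mt (40 GeV in the text)."""
    return transverse_mass(lep_pt, met, dphi) < max_mt
```

For a lepton and \({\vec p}_{\mathrm {T}}^{\ \text {miss}} \) that are back-to-back, as is typical of \(\mathrm {W}\) boson decays, \(m_{\mathrm {T}} \) reaches its maximum of \(2\sqrt{\smash [b]{p_{\mathrm {T}}^{\,\ell } \, E_{\mathrm {T}}^{\text {miss}}}}\), whereas it vanishes when the two are collinear.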

A trigger based on the presence of two \(\tau _\mathrm {h} \) candidates is used to record events in the \(\tau _\mathrm {h} \tau _\mathrm {h} \) channel. The trigger selects events containing two isolated calorimeter energy deposits at the L1 trigger stage, which are subsequently required to pass a simplified version of the PF-based offline \(\tau _\mathrm {h} \) reconstruction at the high-level trigger stage. The latter applies additional isolation criteria. The \(p_{\mathrm {T}} \) threshold for both \(\tau _\mathrm {h} \) candidates is \(35\,\text {GeV} \). The trigger efficiency increases with \(p_{\mathrm {T}} \) of the \(\tau _\mathrm {h} \), because different algorithms are used to reconstruct the \(p_{\mathrm {T}} \) at the L1 trigger stage and in the offline reconstruction. The trigger reaches an efficiency plateau of \({\approx }\,80\%\) for events in which both \(\tau _\mathrm {h} \) candidates have \(p_{\mathrm {T}} > 60\,\text {GeV} \). Selected events are required to contain two \(\tau _\mathrm {h} \) candidates with \(p_{\mathrm {T}} > 40\,\text {GeV} \) and \(|\eta |< 2.1\) that have opposite charge and satisfy the tight WP of the MVA-based \(\tau _\mathrm {h} \) ID discriminant, as well as the minimal criteria on the discriminants used to separate \(\tau _\mathrm {h} \) from electrons and muons. Events containing electrons with \(p_{\mathrm {T}} > 10\,\text {GeV} \) and \(|\eta |< 2.5\) or muons with \(p_{\mathrm {T}} > 10\,\text {GeV} \) and \(|\eta |< 2.4\), passing minimal identification and isolation criteria, are rejected to avoid overlap with the \(\mathrm {\tau }_{\mathrm {e}} \tau _\mathrm {h} \) and \(\mathrm {\tau }_{\mathrm {\mu }} \tau _\mathrm {h} \) channels.

Events in the \(\mathrm {\tau }_{\mathrm {e}} \mathrm {\tau }_{\mathrm {\mu }} \) channel are recorded with the triggers based on the presence of an electron and a muon. The acceptance for the \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau }\) signal is increased by using two complementary triggers. The first trigger selects events that contain an electron with \(p_{\mathrm {T}} > 12\,\text {GeV} \) and a muon with \(p_{\mathrm {T}} > 17\,\text {GeV} \), while events containing an electron with \(p_{\mathrm {T}} > 17\,\text {GeV} \) and a muon with \(p_{\mathrm {T}} > 8\,\text {GeV} \) are recorded through the second trigger. The offline event selection demands the presence of an electron with \(p_{\mathrm {T}} > 13\,\text {GeV} \) and \(\vert \eta \vert < 2.5\), in conjunction with a muon of \(p_{\mathrm {T}} > 10\,\text {GeV} \) and \(|\eta |< 2.4\). Either the electron or the muon is required to pass a threshold of \(p_{\mathrm {T}} > 18\,\text {GeV} \), to ensure that at least one of the two triggers is fully efficient. Electrons and muons are further required to satisfy isolation criteria of \(I_{\ell } < 0.15 \, p_{\mathrm {T}}^{\,\ell } \), and to have opposite charge. Background from \(\mathrm {t}\overline{\mathrm {t}}\) production is reduced through a cutoff on a topological discriminant [86] based on the projections:

$$\begin{aligned} P_{\zeta }^{\,\text {miss}} = {\vec p}_{\mathrm {T}}^{\ \text {miss}} \cdot \hat{\zeta } \qquad \text{ and } \qquad P_{\zeta }^{\,\text {vis}} = \left( {\vec p}_{\mathrm {T}}^{\ \mathrm {e}} + {\vec p}_{\mathrm {T}}^{\ \mu } \right) \cdot \hat{\zeta }, \end{aligned}$$
(4)

where the symbol \(\hat{\zeta }\) denotes a unit vector in the direction of the bisector of the electron and muon \({\vec p}_{\mathrm {T}} \) vectors. The discriminator takes advantage of the fact that the angle between the neutrinos and the visible \(\mathrm {\tau }\) lepton decay products is typically small, causing the \({\vec p}_{\mathrm {T}}^{\ \text {miss}} \) vector in signal events to point in the direction of the visible \(\mathrm {\tau }\) decay products, which is often not true for \(\mathrm {t}\overline{\mathrm {t}}\) background. Selected events are required to satisfy the condition \(P_{\zeta }^{\,\text {miss}}- 0.85 \, P_{\zeta }^{\,\text {vis}} > -\,20\,\text {GeV} \). The reconstruction of the projections \(P_{\zeta }^{\,\text {miss}} \) and \(P_{\zeta }^{\,\text {vis}} \) is illustrated in Fig. 1. The figure also shows the distribution in the observable \(P_{\zeta }^{\,\text {miss}}- 0.85 \, P_{\zeta }^{\,\text {vis}} \) for events selected in the \(\mathrm {\tau }_{\mathrm {e}} \mathrm {\tau }_{\mathrm {\mu }} \) channel before that condition is applied.
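The projections of Eq. (4) and the selection on \(P_{\zeta }^{\,\text {miss}}- 0.85 \, P_{\zeta }^{\,\text {vis}} \) can be sketched as follows. The function names are hypothetical, and the bisector is taken as the normalized sum of the unit vectors along the electron and muon transverse momenta, which is one straightforward way to construct it.

```python
import math


def pzeta_projections(e_pt, e_phi, mu_pt, mu_phi, met, met_phi):
    """Return (P_zeta_miss, P_zeta_vis) of Eq. (4); momenta in GeV, angles in rad."""
    ex, ey = math.cos(e_phi), math.sin(e_phi)      # unit vector along the electron
    mx, my = math.cos(mu_phi), math.sin(mu_phi)    # unit vector along the muon
    zx, zy = ex + mx, ey + my                      # un-normalized bisector
    norm = math.hypot(zx, zy)
    if norm < 1e-12:
        raise ValueError("bisector undefined for back-to-back leptons")
    zx, zy = zx / norm, zy / norm
    p_vis = (e_pt * ex + mu_pt * mx) * zx + (e_pt * ey + mu_pt * my) * zy
    p_miss = met * (math.cos(met_phi) * zx + math.sin(met_phi) * zy)
    return p_miss, p_vis


def passes_pzeta(e_pt, e_phi, mu_pt, mu_phi, met, met_phi,
                 alpha=0.85, threshold=-20.0):
    """True if P_zeta_miss - alpha * P_zeta_vis > threshold (values from the text)."""
    p_miss, p_vis = pzeta_projections(e_pt, e_phi, mu_pt, mu_phi, met, met_phi)
    return p_miss - alpha * p_vis > threshold
```

An event in which \({\vec p}_{\mathrm {T}}^{\ \text {miss}} \) points along the bisector of the two leptons yields a large positive \(P_{\zeta }^{\,\text {miss}} \) and passes, while one in which it points in the opposite direction is rejected.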

Fig. 1

(Left) Construction of the projections \(P_{\zeta }^{\,\text {miss}} \) and \(P_{\zeta }^{\,\text {vis}} \), and (right) the distribution in the observable \(P_{\zeta }^{\,\text {miss}}- 0.85 \, P_{\zeta }^{\,\text {vis}} \) for events selected in the \(\mathrm {\tau }_{\mathrm {e}} \mathrm {\tau }_{\mathrm {\mu }} \) channel, before imposing the condition \(P_{\zeta }^{\,\text {miss}}- 0.85 \, P_{\zeta }^{\,\text {vis}} > -\,20\,\text {GeV} \). Also indicated is the separation of the background into its main components. The sum of background contributions from \(\mathrm {W}\)+jets, single top quark, and diboson production is referred to as “electroweak” background. The symbols \({\vec p}_{\mathrm {T}} ^{\ \nu (\mathrm {e})}\) and \({\vec p}_{\mathrm {T}} ^{\ \nu (\mathrm {\mu })}\) refer to the vectorial sum of transverse momenta of the two neutrinos produced in the respective \(\mathrm {\tau }\rightarrow \mathrm {e}\nu \nu \) and \(\mathrm {\tau }\rightarrow \mathrm {\mu }\nu \nu \) decays

The events selected in the \(\mathrm {\tau }_{\mathrm {\mu }} \mathrm {\tau }_{\mathrm {\mu }} \) channel are recorded using a single-muon trigger with a \(p_{\mathrm {T}} \) threshold of \(18\,\text {GeV} \). The two muons are required to be within the acceptance of \(\vert \eta \vert < 2.4\), and to have opposite charge. The muons of higher and lower \(p_{\mathrm {T}} \) are required to satisfy the conditions of \(p_{\mathrm {T}} > 20\) and \(> 10\,\text {GeV} \), respectively. Both muons are required to pass an isolation criterion of \(I_{\mathrm {\mu }} < 0.15 \, p_{\mathrm {T}} ^{\,\mathrm {\mu }}\). The large background arising from DY production of \(\mathrm {\mu }\) pairs is reduced by requiring the mass of the two muons to satisfy \(m_{\mathrm {\mu }\mathrm {\mu }} < 80\,\text {GeV} \), and through the application of a cutoff on the output of a BDT trained to separate the \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau }\) signal from the \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\mu }\mathrm {\mu }\) background. The following observables are used as BDT inputs: the ratio of the \(p_{\mathrm {T}} \) of the dimuon system to the scalar \(p_{\mathrm {T}} \) sum of the two muons (\(p_{\mathrm {T}} ^{\,\mathrm {\mu }\mathrm {\mu }} / \sum p_{\mathrm {T}} ^{\,\mathrm {\mu }}\)), the pseudorapidity of the dimuon system (\(\eta _{\mathrm {\mu }\mathrm {\mu }}\)), the \(E_{\mathrm {T}}^{\text {miss}} \), the topological discriminant \(P_{\zeta }\), computed according to Eq. (4), and the azimuthal angle between the muon of positive charge and the \({\vec p}_{\mathrm {T}}^{\ \text {miss}} \) vector, denoted by the symbol \(\varDelta \phi (\mathrm {\mu }^{+}, {\vec p}_{\mathrm {T}}^{\ \text {miss}})\). 
The angle between the muon of negative charge and the \({\vec p}_{\mathrm {T}}^{\ \text {miss}} \) vector, \(\varDelta \phi (\mathrm {\mu }^{-}, {\vec p}_{\mathrm {T}}^{\ \text {miss}})\), is not used as BDT input, as it is strongly anticorrelated with \(\varDelta \phi (\mathrm {\mu }^{+}, {\vec p}_{\mathrm {T}}^{\ \text {miss}})\).
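
The kinematic part of the \(\mathrm {\tau }_{\mathrm {\mu }} \mathrm {\tau }_{\mathrm {\mu }} \) selection (everything except the trigger and the BDT requirement) can be sketched in Python as follows. The `Muon` container and the function names are hypothetical, and the dimuon mass is computed in the massless-muon approximation, which is an assumption of this sketch rather than a statement about the analysis.

```python
import math
from dataclasses import dataclass

@dataclass
class Muon:
    pt: float      # transverse momentum in GeV
    eta: float     # pseudorapidity
    phi: float     # azimuthal angle in radians
    charge: int    # +1 or -1
    iso: float     # isolation sum I_mu in GeV

def dimuon_mass(m1, m2):
    """Invariant mass of two muons, neglecting the muon mass (assumption)."""
    m_sq = 2.0 * m1.pt * m2.pt * (
        math.cosh(m1.eta - m2.eta) - math.cos(m1.phi - m2.phi)
    )
    return math.sqrt(max(0.0, m_sq))

def passes_mumu_preselection(m1, m2):
    """Offline cuts quoted in the text, without trigger and BDT requirements."""
    lead, sub = (m1, m2) if m1.pt >= m2.pt else (m2, m1)
    return (
        abs(lead.eta) < 2.4 and abs(sub.eta) < 2.4      # acceptance
        and lead.charge * sub.charge < 0                # opposite charge
        and lead.pt > 20.0 and sub.pt > 10.0            # pT thresholds
        and lead.iso < 0.15 * lead.pt                   # isolation, leading muon
        and sub.iso < 0.15 * sub.pt                     # isolation, subleading muon
        and dimuon_mass(lead, sub) < 80.0               # suppress Z -> mu mu
    )
```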

We refer to the events passing the selection criteria detailed in this Section as belonging to the “signal region” (SR) of the analysis.

6 Background estimation

The accuracy of the background estimate is improved by determining from data the contributions from the main backgrounds, as well as from backgrounds that are difficult to model through MC simulation. In particular, the background from multijet production falls into the latter category. In the \(\mathrm {\tau }_{\mathrm {e}} \tau _\mathrm {h} \), \(\mathrm {\tau }_{\mathrm {\mu }} \tau _\mathrm {h} \), and \(\tau _\mathrm {h} \tau _\mathrm {h} \) channels, the dominant background is from events in which a quark or gluon jet is misidentified as \(\tau _\mathrm {h} \). The estimation of background from these “false” \(\tau _\mathrm {h} \) sources is discussed in Sect. 6.1. It arises predominantly from multijet production in the \(\tau _\mathrm {h} \tau _\mathrm {h} \) channel, and from both \(\mathrm {W}\)+jets and multijet production in the \(\mathrm {\tau }_{\mathrm {e}} \tau _\mathrm {h} \) and \(\mathrm {\tau }_{\mathrm {\mu }} \tau _\mathrm {h} \) channels. A small additional background contribution in the \(\mathrm {\tau }_{\mathrm {e}} \tau _\mathrm {h} \), \(\mathrm {\tau }_{\mathrm {\mu }} \tau _\mathrm {h} \), and \(\tau _\mathrm {h} \tau _\mathrm {h} \) channels arises from \(\mathrm {t}\overline{\mathrm {t}}\) events with quark or gluon jets misidentified as \(\tau _\mathrm {h} \). The multijet background is also relevant in the \(\mathrm {\tau }_{\mathrm {e}} \mathrm {\tau }_{\mathrm {\mu }} \) and \(\mathrm {\tau }_{\mathrm {\mu }} \mathrm {\tau }_{\mathrm {\mu }} \) channels. The estimation of the multijet background in these channels is described in Sect. 6.2. In the \(\mathrm {\tau }_{\mathrm {e}} \mathrm {\tau }_{\mathrm {\mu }} \) and \(\mathrm {\tau }_{\mathrm {\mu }} \mathrm {\tau }_{\mathrm {\mu }} \) channels, the contribution to the SR from backgrounds with misidentified leptons, other than multijet production, is small and is not distinguished from background contributions with genuine leptons.
Significant background contributions arise from \(\mathrm {t}\overline{\mathrm {t}}\) production in the \(\mathrm {\tau }_{\mathrm {e}} \mathrm {\tau }_{\mathrm {\mu }} \) channel and from the DY production of muon pairs in the \(\mathrm {\tau }_{\mathrm {\mu }} \mathrm {\tau }_{\mathrm {\mu }} \) channel. The normalization of the \(\mathrm {t}\overline{\mathrm {t}}\) background in the \(\mathrm {\tau }_{\mathrm {e}} \mathrm {\tau }_{\mathrm {\mu }} \) and \(\mathrm {\tau }_{\mathrm {\mu }} \mathrm {\tau }_{\mathrm {\mu }} \) channels is determined from data, using a control region that contains events with one electron, one muon, and one or more \(\mathrm {b} \)-tagged jets. Details of the procedure are given in Sect. 6.3. The \(\mathrm {t}\overline{\mathrm {t}}\) normalization factor obtained from this control region is also applied to the \(\mathrm {t}\overline{\mathrm {t}}\) background events selected in the \(\mathrm {\tau }_{\mathrm {e}} \tau _\mathrm {h} \), \(\mathrm {\tau }_{\mathrm {\mu }} \tau _\mathrm {h} \), and \(\tau _\mathrm {h} \tau _\mathrm {h} \) channels, in which the reconstructed \(\tau _\mathrm {h} \) is either due to a genuine \(\tau _\mathrm {h} \) or due to the misidentification of an electron or muon. The background rate from \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {e}\mathrm {e}\) and \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\mu }\mathrm {\mu }\) production is determined from the data through a maximum-likelihood (ML) fit of the \(m_{\mathrm {\tau }\mathrm {\tau }}\) distributions in the SR, described in Sect. 8. The contributions of minor backgrounds from single top quark and diboson production, as well as a small contribution from \(\mathrm {W}\)+jets background in the \(\mathrm {\tau }_{\mathrm {e}} \mathrm {\tau }_{\mathrm {\mu }} \) and \(\mathrm {\tau }_{\mathrm {\mu }} \mathrm {\tau }_{\mathrm {\mu }} \) channels, are obtained from MC simulation. 
The sum of these minor backgrounds is referred to as “electroweak” background. A Higgs boson with a mass of \(m_{\text {H}} = 125\,\text {GeV} \), produced at the rate and with branching fractions predicted in the SM, is considered as background. Nevertheless, this contribution is found to be negligible.

The background estimates are summarized in Table 1. The quoted uncertainties represent the quadratic sum of statistical and systematic sources.

Table 1 Expected number of background events in the \(\mathrm {\tau }_{\mathrm {e}} \tau _\mathrm {h} \), \(\mathrm {\tau }_{\mathrm {\mu }} \tau _\mathrm {h} \), \(\tau _\mathrm {h} \tau _\mathrm {h} \), \(\mathrm {\tau }_{\mathrm {e}} \mathrm {\tau }_{\mathrm {\mu }} \), and \(\mathrm {\tau }_{\mathrm {\mu }} \mathrm {\tau }_{\mathrm {\mu }} \) channels in data, corresponding to an integrated luminosity of \(2.3\hbox { fb}^{-1}\). The uncertainties are rounded to two significant digits, except when they are \(< 10\), in which case they are rounded to one significant digit, and the event yields are rounded to match the precision in the uncertainties

In preparation for future analyses of \(\mathrm {\tau }\) lepton production, the validity of the background-estimation procedures described in this section is further tested in event categories that are relevant to the SM \(\text {H} \rightarrow \mathrm {\tau }\mathrm {\tau }\) analysis, as well as in searches for new physical phenomena. Event categories based on jet multiplicity, \(p_{\mathrm {T}} \) of the \(\mathrm {\tau }\) lepton pair, and on the multiplicity of \(\mathrm {b} \) jets in the event are used in \(\text {H} \rightarrow \mathrm {\tau }\mathrm {\tau }\) analyses performed by CMS in the context of the SM [43] and of its minimal supersymmetric extension [9,10,11], as well as in the context of searches for Higgs boson pair production [87]. The validation of the background-estimation procedures in these event categories is detailed in the Appendix.

6.1 Estimation of false-\(\tau _\mathrm {h} \) background in \(\mathrm {\tau }_{\mathrm {e}} \tau _\mathrm {h} \), \(\mathrm {\tau }_{\mathrm {\mu }} \tau _\mathrm {h} \), and \(\tau _\mathrm {h} \tau _\mathrm {h} \) channels

The background arising from events in which a quark or gluon jet is misidentified as \(\tau _\mathrm {h} \) in the \(\mathrm {\tau }_{\mathrm {e}} \tau _\mathrm {h} \), \(\mathrm {\tau }_{\mathrm {\mu }} \tau _\mathrm {h} \), and \(\tau _\mathrm {h} \tau _\mathrm {h} \) channels is estimated via the “fake factor” (\(F_\text {F}\)) method. The method is based on selecting events that pass altered \(\tau _\mathrm {h} \) ID criteria, and weighting the events through suitably chosen extrapolation factors (the \(F_\text {F}\)). The events passing the altered \(\tau _\mathrm {h} \) ID criteria are referred to as belonging to the “application region” (AR) of the \(F_\text {F}\) method. Except for modifying the \(\tau _\mathrm {h} \) ID criteria, the same selections are applied to events in the AR and in the SR. The \(F_\text {F}\) are measured in dedicated control regions in data. These are referred to as “determination regions” (DR) of the \(F_\text {F}\) method, and are chosen such that they overlap with neither the SR nor the AR.

The \(F_\text {F}\) are determined in bins of decay mode and \(p_{\mathrm {T}} \) of the \(\tau _\mathrm {h} \) candidate, and as a function of jet multiplicity. In each such bin, the \(F_\text {F}\) is given by the ratio:

$$\begin{aligned} F_\text {F}= \frac{N_{\text {nominal}}}{N_{\text {altered}}}, \end{aligned}$$
(5)

where \(N_{\text {nominal}}\) corresponds to the number of events with \(\tau _\mathrm {h} \) candidates that pass the nominal WP of the MVA-based \(\tau _\mathrm {h} \) ID discriminant in a given channel, and \(N_{\text {altered}}\) is the number of events with \(\tau _\mathrm {h} \) candidates that satisfy the altered \(\tau _\mathrm {h} \) ID criteria. To satisfy the altered \(\tau _\mathrm {h} \) ID criteria, \(\tau _\mathrm {h} \) candidates must satisfy the barely constrained WP, but fail the nominal WP. The multiplicity of jets that is used to parametrize the \(F_\text {F}\) is denoted by \(N_{\text {jet}}\), and is defined by the jets that satisfy the conditions \(p_{\mathrm {T}} > 20\,\text {GeV} \) and \(\vert \eta \vert < 4.7\), and do not overlap with \(\tau _\mathrm {h} \) candidates passing the barely constrained WP of the MVA-based \(\tau _\mathrm {h} \) ID discriminant, nor with electrons or muons within \(\varDelta R < 0.5\). In each bin, the contributions from processes with genuine \(\tau _\mathrm {h} \), and with electrons or muons misidentified as \(\tau _\mathrm {h} \), are estimated through MC simulation, and subtracted from the numerator as well as from the denominator in Eq. (5).
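
As a minimal sketch of Eq. (5) in a single bin, including the subtraction of the simulated contributions from the numerator and denominator, one may write the following Python function. The function and argument names are hypothetical, and statistical uncertainties are not propagated.

```python
def fake_factor(n_nominal_data, n_altered_data,
                n_nominal_mc=0.0, n_altered_mc=0.0):
    """F_F in one bin of tau_h decay mode, tau_h pT, and jet multiplicity.

    The *_mc arguments are the simulated yields of events with genuine
    tau_h, or with electrons or muons misidentified as tau_h, which are
    subtracted from the data before forming the ratio.
    """
    numerator = n_nominal_data - n_nominal_mc
    denominator = n_altered_data - n_altered_mc
    if denominator <= 0.0:
        raise ValueError("no jet -> tau_h events left in the denominator")
    return numerator / denominator
```

For example, with 120 data events passing the nominal WP, 1000 passing the altered criteria, and simulated contributions of 20 and 200 events, respectively, the sketch returns \(100/800 = 0.125\).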

As the probabilities for jets to be misidentified as \(\tau _\mathrm {h} \) depend on the \(\tau _\mathrm {h} \) ID criteria, and the latter differ in different channels, the \(F_\text {F}\) are measured separately in each one of them. Moreover, the misidentification rates differ for multijet, \(\mathrm {W}\)+jets, and \(\mathrm {t}\overline{\mathrm {t}}\) events, necessitating a measurement of the \(F_\text {F}\) in the DR enriched in contributions from multijet, \(\mathrm {W}\)+jets, and \(\mathrm {t}\overline{\mathrm {t}}\) backgrounds. The relative fractions of multijet, \(\mathrm {W}\)+jets, and \(\mathrm {t}\overline{\mathrm {t}}\) background processes in the AR, denoted by \(R_{\text {p}}\), are determined through a fit to the distribution in \(m_{\mathrm {T}} \), and are used to weight the \(F_\text {F}\) determined in the DR when computing the estimate of the false-\(\tau _\mathrm {h} \) background in the SR. The procedure is illustrated in Fig. 2.

Fig. 2

Schematic illustration of the \(F_\text {F}\) method, used to estimate the false-\(\tau _\mathrm {h} \) background in the \(\mathrm {\tau }_{\mathrm {e}} \tau _\mathrm {h} \), \(\mathrm {\tau }_{\mathrm {\mu }} \tau _\mathrm {h} \), and \(\tau _\mathrm {h} \tau _\mathrm {h} \) channels. An event sample enriched in multijet, \(\mathrm {W}\)+jets, and \(\mathrm {t}\overline{\mathrm {t}}\) backgrounds is selected in the AR (top left). The weights w, given by the product of the \(F_\text {F}\) measured in the DR (top right) and the relative fractions \(R_{\text {p}}\) of different background processes \(\text {p}\) in the AR, are applied to the events selected in the AR to yield the estimate of the false-\(\tau _\mathrm {h} \) background in the SR (bottom left). The superscript \(\text {p}\) on the symbol \(F_\text {F}^{\text {p}}\) indicates that the \(F_\text {F}\) depend on the background process \(\text {p}\), where \(\text {p}\) refers to either multijet, \(\mathrm {W}\)+jets, or \(\mathrm {t}\overline{\mathrm {t}}\) background. The contribution of the \(\mathrm {Z}/\gamma ^{*} \rightarrow \tau \tau \) signal in the AR is subtracted, based on MC simulation. The fractions \(R_{\text {p}}\) are determined by a fit of the \(m_{\mathrm {T}} \) distribution in the AR (bottom right), described in more detail in Sect. 6.1.2. The fraction \(R_{1}\) includes a small contribution from DY events in which the reconstructed \(\tau _\mathrm {h} \) is due to the misidentification of a quark or a gluon jet

Fig. 3

Probabilities for gluon and quark jets, of different flavour in simulated multijet events, to pass the moderate WP of the MVA-based \(\tau _\mathrm {h} \) ID discriminant, as a function of jet \(p_{\mathrm {T}} \), for jets passing \(p_{\mathrm {T}} > 20\,\text {GeV} \) and \(\vert \eta \vert < 2.3\) (left), and for jets passing in addition the barely constrained WP of the MVA-based \(\tau _\mathrm {h} \) ID discriminant (right)

The \(\tau _\mathrm {h} \) ID criteria applied in the AR are identical to the \(\tau _\mathrm {h} \) ID criteria applied in the denominator of Eq. (5). More specifically, the criteria on \(p_{\mathrm {T}} \) and \(\eta \), as well as the requirements on the discriminators that distinguish \(\tau _\mathrm {h} \) from electrons and muons, are the same as in the SR. The \(\tau _\mathrm {h} \) candidates selected in the \(\mathrm {\tau }_{\mathrm {e}} \tau _\mathrm {h} \) and \(\mathrm {\tau }_{\mathrm {\mu }} \tau _\mathrm {h} \) channels are required to pass the barely constrained, but fail the moderately constrained WP of the MVA-based \(\tau _\mathrm {h} \) ID discriminant. In the \(\tau _\mathrm {h} \tau _\mathrm {h} \) channel, one of the two \(\tau _\mathrm {h} \) candidates must pass the tight WP, while the other \(\tau _\mathrm {h} \) candidate is required to pass the barely constrained, but fail the tight WP, precluding overlap of the AR with the SR.
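
The assignment of events to the SR or the AR according to the \(\tau _\mathrm {h} \) ID working points can be sketched in Python as below. The function names are hypothetical. The sketch assumes that the working points are nested (a candidate passing a tighter WP also passes the barely constrained one) and that the SR of the \(\tau _\mathrm {h} \tau _\mathrm {h} \) channel requires both candidates to pass the tight WP, which is inferred from the statement that the AR definition precludes overlap with the SR.

```python
from typing import Optional, Tuple

def classify_lepton_tau(passes_barely: bool,
                        passes_moderate: bool) -> Optional[str]:
    """Region of an event in the tau_e tau_h or tau_mu tau_h channel."""
    if passes_moderate:
        return "SR"   # nominal (moderate) WP satisfied
    if passes_barely:
        return "AR"   # passes barely constrained WP, fails moderate WP
    return None       # candidate rejected

def classify_ditau(cand1: Tuple[bool, bool],
                   cand2: Tuple[bool, bool]) -> Optional[str]:
    """Region of a tau_h tau_h event.

    Each candidate is a pair (passes_barely, passes_tight).
    """
    tight1, tight2 = cand1[1], cand2[1]
    if tight1 and tight2:
        return "SR"
    if tight1 != tight2:
        # Exactly one candidate is tight; the other must at least pass
        # the barely constrained WP for the event to enter the AR.
        other = cand2 if tight1 else cand1
        return "AR" if other[0] else None
    return None
```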

The DR enriched in contributions from multijet, \(\mathrm {W}\)+jets, and \(\mathrm {t}\overline{\mathrm {t}}\) backgrounds contain specific mixtures of gluon, light-quark (\(\mathrm {u}\), \(\mathrm {d}\), \(\mathrm {s}\)), and heavy-flavour (\(\mathrm {c}\), \(\mathrm {b}\)) quark jets, with different probabilities for misidentification as \(\tau _\mathrm {h} \), as illustrated for simulated events in Fig. 3. The misidentification rates are shown for jets passing \(p_{\mathrm {T}} > 20\,\text {GeV} \) and \(\vert \eta \vert < 2.3\), and for jets satisfying in addition the barely constrained WP of the MVA-based \(\tau _\mathrm {h} \) ID discriminant. In general, the misidentification rates are higher for quark jets than for gluon jets, as the former typically have lower particle multiplicity and are more collimated than the latter, thereby increasing their probability to be misidentified as \(\tau _\mathrm {h} \). As can be seen in the figure, the requirement for jets to pass minimal \(\tau _\mathrm {h} \) selection criteria significantly reduces the flavour dependence of the misidentification rates. This in turn lowers the systematic uncertainty that arises from the limited knowledge of the flavour composition in the AR. Residual flavour dependence of the \(F_\text {F}\) is taken into account by measuring separate sets of \(F_\text {F}\) in each DR, and determining the relative fraction \(R_{\text {p}}\) of multijet, \(\mathrm {W}\)+jets, and \(\mathrm {t}\overline{\mathrm {t}}\) backgrounds in the AR of the respective channel. Given the \(F_\text {F}\) and the fractions \(R_{\text {p}}\), the estimate of the background from misidentified \(\tau _\mathrm {h} \) in the SR is obtained by applying the weights

$$\begin{aligned} w = \sum _{\text {p}} \, R_{\text {p}} \, F_\text {F}^{\text {p}} \end{aligned}$$
(6)

to events selected in the AR, where the sum extends over the above three background processes \(\text {p}\). The \(F_\text {F}\) are defined as in Eq. (5). The symbol \(F_\text {F}^{\text {p}}\) indicates that, in addition to their dependence on \(\tau _\mathrm {h} \) decay mode, \(\tau _\mathrm {h} \) candidate \(p_{\mathrm {T}} \), and jet multiplicity, the \(F_\text {F}\) depend on the background process \(\text {p}\), where the superscript \(\text {p}\) refers to either multijet, \(\mathrm {W}\)+jets, or \(\mathrm {t}\overline{\mathrm {t}}\) background. In the \(\tau _\mathrm {h} \tau _\mathrm {h} \) channel, the \(F_\text {F}^{\text {p}}\) is determined by the decay mode and \(p_{\mathrm {T}} \) of the \(\tau _\mathrm {h} \) candidate that passes the barely constrained, but fails the tight WP of the MVA-based \(\tau _\mathrm {h} \) ID discriminant. The \(\tau _\mathrm {h} \) candidate that passes the tight WP does not enter the computation of the weight w.
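
The weight of Eq. (6), and its use to turn AR events into an SR estimate, can be sketched in Python as follows. The process labels, the function names, and the per-event layout (one dictionary of fractions and one of \(F_\text {F}\) values per AR event, already looked up in the appropriate bin) are assumptions of this illustration. The subtraction of simulated events with genuine \(\tau _\mathrm {h} \) in the AR is not included.

```python
PROCESSES = ("multijet", "wjets", "ttbar")

def ff_weight(fractions, fake_factors):
    """Weight w = sum over processes p of R_p * F_F^p for one AR event."""
    return sum(fractions[p] * fake_factors[p] for p in PROCESSES)

def false_tau_estimate(ar_events):
    """Sum of weights over AR events, i.e. the estimated SR yield.

    ar_events is an iterable of (fractions, fake_factors) pairs.
    """
    return sum(ff_weight(r, f) for r, f in ar_events)
```

In the \(\tau _\mathrm {h} \tau _\mathrm {h} \) channel, where only the multijet \(F_\text {F}\) is measured, the same sketch applies with the multijet value used for every process.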

The underlying assumption in the \(F_\text {F}\) method is that the ratio of the number of events from background process \(\text {p}\) in the SR to the number of events from the same background in the AR is equal to the ratio \(N_{\text {nominal}}/N_{\text {altered}}\) that is measured in the background-specific DR.
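
Written out per process, this assumption and its consequence for the weights of Eq. (6) can be sketched as follows. The symbols \(N_{\text {SR}}^{\text {p}}\) and \(N_{\text {AR}}^{\text {p}}\), denoting the numbers of events with a jet misidentified as \(\tau _\mathrm {h} \) from background process \(\text {p}\) in the SR and in the AR within one bin of the \(F_\text {F}\) parametrization, are introduced here for illustration only, and \(R_{\text {p}}\) is taken to be \(N_{\text {AR}}^{\text {p}} / N_{\text {AR}}\) with \(N_{\text {AR}} = \sum _{\text {p}} N_{\text {AR}}^{\text {p}}\):

$$\begin{aligned} \frac{N_{\text {SR}}^{\text {p}}}{N_{\text {AR}}^{\text {p}}} = F_\text {F}^{\text {p}} \quad \Longrightarrow \quad \sum _{\text {p}} N_{\text {SR}}^{\text {p}} = \sum _{\text {p}} F_\text {F}^{\text {p}} \, N_{\text {AR}}^{\text {p}} = \Big ( \sum _{\text {p}} R_{\text {p}} \, F_\text {F}^{\text {p}} \Big ) \, N_{\text {AR}} = w \, N_{\text {AR}}. \end{aligned}$$

Under these assumptions, weighting each AR event by w reproduces the summed false-\(\tau _\mathrm {h} \) yield expected in the SR.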

The measurement of the \(F_\text {F}\) is detailed in Sect. 6.1.1, while the fractions \(R_{\text {p}}\) are discussed in Sect. 6.1.2. The estimate of the false-\(\tau _\mathrm {h} \) background obtained from the \(F_\text {F}\) method is validated in control regions devoid of \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau }\) signal. The result of this validation is presented in Sect. 6.1.3.

6.1.1 Measurement of \(F_\text {F}\)

The \(F_\text {F}\) are measured in DR chosen such that one particular background process is enhanced in each DR. The selection criteria applied in the DR are similar to those applied in the SR. In the following, we describe only the differences relative to the SR.

In the \(\mathrm {\tau }_{\mathrm {e}} \tau _\mathrm {h} \) and \(\mathrm {\tau }_{\mathrm {\mu }} \tau _\mathrm {h} \) channels, three different DR are used to measure the \(F_\text {F}\) for multijet, \(\mathrm {W}\)+jets, and \(\mathrm {t}\overline{\mathrm {t}}\) backgrounds. The DR dominated by multijet background contains events in which the charges of the \(\tau _\mathrm {h} \) candidate and of the light lepton candidates are the same, and the electron or muon satisfies a modified isolation criterion of \(0.05< I_{\ell }/p_{\mathrm {T}}^{\,\ell } < 0.15\). Depending on whether the \(\tau _\mathrm {h} \) candidate passes or fails the moderate WP of the MVA-based \(\tau _\mathrm {h} \) ID discriminant, the event contributes either to the numerator or to the denominator of Eq. (5). The DR dominated by \(\mathrm {W}\)+jets background is defined by modifying the requirement for the transverse mass of lepton and \({\vec p}_{\mathrm {T}}^{\ \text {miss}} \) to \(m_{\mathrm {T}} > 70\,\text {GeV} \). The contamination arising from \(\mathrm {t}\overline{\mathrm {t}}\) background is reduced by vetoing events containing jets that pass the \(\mathrm {b} \) tagging criteria described in Sect. 4. A common \(\mathrm {t}\overline{\mathrm {t}}\) DR is used for the \(\mathrm {\tau }_{\mathrm {e}} \tau _\mathrm {h} \) and \(\mathrm {\tau }_{\mathrm {\mu }} \tau _\mathrm {h} \) channels. The events are required to contain an electron, a muon, at least one \(\tau _\mathrm {h} \) candidate, and pass triggers based on the presence of an electron or a muon. The offline event selection demands that the electron satisfy the conditions \(p_{\mathrm {T}} > 13\,\text {GeV} \) and \(\vert \eta \vert < 2.5\), the muon \(p_{\mathrm {T}} > 10\,\text {GeV} \) and \(|\eta |< 2.4\), and that both pass an isolation criterion of \(I_{\ell } < 0.10 \, p_{\mathrm {T}}^{\,\ell } \). 
The event is furthermore required to contain at least one jet that passes the \(\mathrm {b} \) tagging criteria described in Sect. 4. In case events contain multiple \(\tau _\mathrm {h} \) candidates, the candidate used for the \(F_\text {F}\) measurement is chosen at random.

In the \(\tau _\mathrm {h} \tau _\mathrm {h} \) channel, a single DR is used, which selects a high purity sample of multijet events, the dominant background in this channel. The multijet DR is identical to the SR of the \(\tau _\mathrm {h} \tau _\mathrm {h} \) channel, except that the two \(\tau _\mathrm {h} \) candidates are required to have the same rather than opposite charge. One of the jets is chosen to be the “tag” jet, and required to pass the tight WP of the MVA-based \(\tau _\mathrm {h} \) ID discriminant, while the measurement of the \(F_\text {F}\) is performed on the other jet, referred to as the “probe” jet. The tag jet is chosen at random. The \(\mathrm {W}\)+jets and \(\mathrm {t}\overline{\mathrm {t}}\) backgrounds are small in the \(\tau _\mathrm {h} \tau _\mathrm {h} \) channel, making it difficult to define a DR that is dominated by these backgrounds, or that provides sufficient statistical information for the \(F_\text {F}\) measurement. The \(F_\text {F}\) in the multijet DR of the \(\tau _\mathrm {h} \tau _\mathrm {h} \) channel are therefore used to weight all events selected in the AR of the \(\tau _\mathrm {h} \tau _\mathrm {h} \) channel. Differences in the \(F_\text {F}\) between \(\mathrm {W}\)+jets, \(\mathrm {t}\overline{\mathrm {t}}\), and multijet events are accounted for by adding a systematic uncertainty of \(30\%\) on the part of the background from misidentified \(\tau _\mathrm {h} \) expected from the contribution of \(\mathrm {W}\)+jets and \(\mathrm {t}\overline{\mathrm {t}}\) background processes. This contribution is estimated using MC simulation, and the magnitude of the systematic uncertainty is motivated by the difference found in the \(F_\text {F}\) measured in multijet, \(\mathrm {W}\)+jets, and \(\mathrm {t}\overline{\mathrm {t}}\) DR in the \(\mathrm {\tau }_{\mathrm {e}} \tau _\mathrm {h} \) and \(\mathrm {\tau }_{\mathrm {\mu }} \tau _\mathrm {h} \) channels.

The \(F_\text {F}\) determined in the various DR are shown in Figs. 4 and 5. The decay modes \(\mathrm {\tau }^{-} \rightarrow \mathrm {h}^{-}\nu _{\mathrm {\tau }}\), \(\mathrm {\tau }^{-} \rightarrow \mathrm {h}^{-}\mathrm {\pi ^0}\nu _{\mathrm {\tau }}\), and \(\mathrm {\tau }^{-} \rightarrow \mathrm {h}^{-}\mathrm {\pi ^0}\mathrm {\pi ^0}\nu _{\mathrm {\tau }}\) are referred to as “one-prong” decays and the mode \(\mathrm {\tau }^{-} \rightarrow \mathrm {h}^{-}\mathrm {h}^{+}\mathrm {h}^{-}\nu _{\mathrm {\tau }}\) as “three-prong” decays. The measured \(F_\text {F}\) are corrected for differences in the \(\tau _\mathrm {h} \) misidentification rates between DR and AR. The magnitude of these relative corrections is \({\approx }\,10\%\), as discussed below.

Fig. 4

The \(F_\text {F}\) values measured in multijet events in the \(\mathrm {\tau }_{\mathrm {e}} \tau _\mathrm {h} \) (upper), \(\mathrm {\tau }_{\mathrm {\mu }} \tau _\mathrm {h} \) (center), and \(\tau _\mathrm {h} \tau _\mathrm {h} \) (lower) channels, presented in bins of jet multiplicity and \(\tau _\mathrm {h} \) decay mode, as a function of \(\tau _\mathrm {h} \) \(p_{\mathrm {T}} \). The abscissae of the points are offset to distinguish the points with different jet multiplicities

Fig. 5

The \(F_\text {F}\) values measured in \(\mathrm {W}\)+jets events in the \(\mathrm {\tau }_{\mathrm {e}} \tau _\mathrm {h} \) (upper) and \(\mathrm {\tau }_{\mathrm {\mu }} \tau _\mathrm {h} \) (center) channels and in \(\mathrm {t}\overline{\mathrm {t}}\) events (lower), presented in bins of jet multiplicity and \(\tau _\mathrm {h} \) decay mode, as a function of \(\tau _\mathrm {h} \) \(p_{\mathrm {T}} \). A common \(\mathrm {t}\overline{\mathrm {t}}\) DR is used for the \(\mathrm {\tau }_{\mathrm {e}} \tau _\mathrm {h} \) and \(\mathrm {\tau }_{\mathrm {\mu }} \tau _\mathrm {h} \) channels. The abscissae of the points are offset to distinguish the points with different jet multiplicities

For the multijet DR in the \(\mathrm {\tau }_{\mathrm {e}} \tau _\mathrm {h} \) and \(\mathrm {\tau }_{\mathrm {\mu }} \tau _\mathrm {h} \) channels, correlations between the \(F_\text {F}\) and the charge of the electron or muon and the \(\tau _\mathrm {h} \) candidate, and between \(F_\text {F}\) and the isolation of the electron or muon, are studied in data and taken into account as follows. A correction for the extrapolation from events in which the charges of lepton and \(\tau _\mathrm {h} \) candidate have the same sign (SS) to events in which they have opposite sign (OS) is obtained by comparing \(F_\text {F}\) in the SS and OS events containing electrons or muons that pass an inverted isolation criterion of \(0.1< I_{\ell }/p_{\mathrm {T}}^{\,\ell } < 0.2\). The dependence of the \(F_\text {F}\) on the isolation of the electron or muon is studied using an event sample selected with no isolation condition applied to the lepton. The results of this study are used to extrapolate the \(F_\text {F}\) obtained in the multijet DR (\(0.05< I_{\ell }/p_{\mathrm {T}}^{\,\ell } < 0.15\)) to the SR (\(I_{\ell } < 0.10 \, p_{\mathrm {T}}^{\,\ell } \)).

For the DR dominated by \(\mathrm {W}\)+jets background in the \(\mathrm {\tau }_{\mathrm {e}} \tau _\mathrm {h} \) and \(\mathrm {\tau }_{\mathrm {\mu }} \tau _\mathrm {h} \) channels, closure tests of the \(F_\text {F}\) method reveal a dependence of the \(F_\text {F}\) on \(m_{\mathrm {T}} \), which is not accounted for by the chosen parametrization of the \(F_\text {F}\) as functions of jet multiplicity, \(\tau _\mathrm {h} \) decay mode, and \(p_{\mathrm {T}} \). The dependence on \(m_{\mathrm {T}} \) is studied using simulated \(\mathrm {W}\)+jets events, and used to extrapolate the \(F_\text {F}\) measured in the \(\mathrm {W}\)+jets DR (\(m_{\mathrm {T}} > 70\,\text {GeV} \)) to the SR (\(m_{\mathrm {T}} < 40\,\text {GeV} \)).

In the \(\tau _\mathrm {h} \tau _\mathrm {h} \) channel, the \(F_\text {F}\) determined in the multijet DR are corrected for a dependence of the \(F_\text {F}\) on the relative charge of the two \(\tau _\mathrm {h} \) candidates. This is studied in events in which the tag jet (the jet on which the \(F_\text {F}\) measurement is not performed) fails the tight WP of the MVA-based \(\tau _\mathrm {h} \) ID discriminant. The difference between the \(F_\text {F}\) in OS and SS events defines this correction.

6.1.2 Determination of \(R_{\text {p}}\)

In the \(\mathrm {\tau }_{\mathrm {e}} \tau _\mathrm {h} \) and \(\mathrm {\tau }_{\mathrm {\mu }} \tau _\mathrm {h} \) channels, the relative fractions \(R_{\text {p}}\) of multijet, \(\mathrm {W}\)+jets, and \(\mathrm {t}\overline{\mathrm {t}}\) backgrounds in the AR are determined through a fit to the distribution in \(m_{\mathrm {T}} \). The distribution in \(m_{\mathrm {T}} \) (“template”) used to represent the multijet background in the fit is obtained from a sample of events selected in data, in which the \(\tau _\mathrm {h} \) candidate and the electron or muon have the same charge, and where at least one of the leptons satisfies a modified isolation criterion of \(0.05< I_{\ell }/p_{\mathrm {T}}^{\,\ell } < 0.15\). The contributions from other backgrounds to this control region are subtracted, based on MC simulation. The distributions representing the other backgrounds in the fit are also taken from simulation. The templates for \(\mathrm {t}\overline{\mathrm {t}}\), diboson, and DY events are split into three components: events in which the reconstructed \(\tau _\mathrm {h} \) is due to a genuine \(\tau _\mathrm {h} \), events in which the \(\tau _\mathrm {h} \) is due to the misidentification of an electron or muon, and events in which a quark or gluon jet is misidentified as \(\tau _\mathrm {h} \). The normalization of each component is determined independently in the fit. The relative fractions of the \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau }\) signal and all individual background processes are left unconstrained in the fit. Finally, the fractions \(R_{\text {p}}\) are parametrized as a function of \(m_{\mathrm {T}} \) and are normalized such that the contribution of all processes \(\text {p}\) in which the reconstructed \(\tau _\mathrm {h} \) is due to a misidentified jet sums to unity, \(\sum _{\text {p}} \, R_{\text {p}} = 1\).
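
The final normalization step can be sketched in Python as below. The fit itself is not reproduced: the function name is hypothetical, and the input is assumed to be the post-fit yields, in a given \(m_{\mathrm {T}} \) bin, of the components in which the reconstructed \(\tau _\mathrm {h} \) is a misidentified jet.

```python
def normalize_fractions(jet_fake_yields):
    """Convert per-process jet -> tau_h yields into fractions R_p.

    jet_fake_yields maps a process label to its fitted yield in one
    m_T bin of the AR. The returned fractions sum to unity.
    """
    total = sum(jet_fake_yields.values())
    if total <= 0.0:
        raise ValueError("no jet -> tau_h events in this m_T bin")
    return {p: y / total for p, y in jet_fake_yields.items()}
```

For example, yields of 300, 600, and 100 events for multijet, \(\mathrm {W}\)+jets, and \(\mathrm {t}\overline{\mathrm {t}}\) production give fractions of 0.3, 0.6, and 0.1.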

In the \(\tau _\mathrm {h} \tau _\mathrm {h} \) channel, the AR is dominated by multijet background. The contributions from the \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau }\) signal and all background processes, except multijet production, are small and taken from simulation. The fraction of multijet background in the AR is determined by subtracting the sum of all processes modelled in the MC simulation from the data in the AR, without performing a fit in this channel.
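
The subtraction used in the \(\tau _\mathrm {h} \tau _\mathrm {h} \) channel can be sketched in Python as follows. The function name and process labels are hypothetical, and clamping a negative difference to zero is a choice made for this illustration, not something stated in the text.

```python
def multijet_fraction(n_data, mc_yields):
    """Fraction of multijet events in the AR, from data minus simulation.

    n_data is the number of data events in the AR; mc_yields maps each
    simulated process (signal and non-multijet backgrounds) to its
    expected yield in the AR.
    """
    if n_data <= 0.0:
        raise ValueError("no data events in the AR")
    n_mc = sum(mc_yields.values())
    # Assumption of this sketch: a negative difference is set to zero.
    n_multijet = max(0.0, n_data - n_mc)
    return n_multijet / n_data
```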

A small fraction of events in the AR of the \(\mathrm {\tau }_{\mathrm {e}} \tau _\mathrm {h} \), \(\mathrm {\tau }_{\mathrm {\mu }} \tau _\mathrm {h} \), and \(\tau _\mathrm {h} \tau _\mathrm {h} \) channels arises from DY events in which quark or gluon jets are misidentified as \(\tau _\mathrm {h} \) candidates. These events are treated as background and are included in the false-\(\tau _\mathrm {h} \) estimate using the \(F_\text {F}\) method. As the analysed data do not provide a way of determining \(F_\text {F}\) in DY events with sufficient statistical accuracy, the \(F_\text {F}\) measured in \(\mathrm {W}\)+jets events are used instead for the fraction of DY events with jets misidentified as \(\tau _\mathrm {h} \) in the \(\mathrm {\tau }_{\mathrm {e}} \tau _\mathrm {h} \) and \(\mathrm {\tau }_{\mathrm {\mu }} \tau _\mathrm {h} \) channels. This procedure is justified by studies of \(F_\text {F}\) in simulated \(\mathrm {W}\)+jets and DY events, which indicate that the flavour composition of jets and the \(F_\text {F}\) are very similar in these events. In the \(\tau _\mathrm {h} \tau _\mathrm {h} \) channel, the \(F_\text {F}\) measured in multijet events are used and the systematic uncertainty on the DY background with misidentified \(\tau _\mathrm {h} \) is increased by \(30\%\).

6.1.3 Validation of the false-\(\tau _\mathrm {h} \) background estimate in control regions

The modelling of the background from jets misidentified as \(\tau _\mathrm {h} \) in the \(\mathrm {\tau }_{\mathrm {e}} \tau _\mathrm {h} \), \(\mathrm {\tau }_{\mathrm {\mu }} \tau _\mathrm {h} \), and \(\tau _\mathrm {h} \tau _\mathrm {h} \) channels through the \(F_\text {F}\) method is validated by comparing the background estimates obtained in this method to the data in control regions containing events with SS \(\mathrm {e}\tau _\mathrm {h} \), \(\mathrm {\mu }\tau _\mathrm {h} \), and \(\tau _\mathrm {h} \tau _\mathrm {h} \) pairs. A dedicated set of \(F_\text {F}\), without corrections for the extrapolation from OS to SS events, is determined for this validation. The selection of events in the multijet DR is also altered in this validation, to avoid overlap with the AR. The distributions in \(m_{\mathrm {\tau }\mathrm {\tau }}\) in events containing SS \(\mathrm {e}\tau _\mathrm {h} \), \(\mathrm {\mu }\tau _\mathrm {h} \), and \(\tau _\mathrm {h} \tau _\mathrm {h} \) pairs are shown in Fig. 6. The data are compared to the sum of false-\(\tau _\mathrm {h} \) background and other backgrounds. The contribution of other backgrounds, in which the reconstructed \(\tau _\mathrm {h} \) is due either to a genuine \(\tau _\mathrm {h} \) or to the misidentification of an electron or muon, is obtained from the MC simulation. The event yield of the \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau }\) signal in these control regions is small. The normalization of individual backgrounds and of the \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau }\) signal is determined through a fit to the distributions in \(m_{\mathrm {\tau }\mathrm {\tau }}\) in which the rate of each background is allowed to vary within its estimated systematic uncertainty. 
The good agreement observed between the data and the background prediction in the control regions of all three channels confirms the validity of false-\(\tau _\mathrm {h} \) background estimates obtained through the \(F_\text {F}\) method.

Fig. 6

Distributions in \(m_{\mathrm {\tau }\mathrm {\tau }}\) for SS events containing (upper left) \(\mathrm {e}\tau _\mathrm {h} \), (upper right) \(\mathrm {\mu }\tau _\mathrm {h} \), and (lower) \(\tau _\mathrm {h} \tau _\mathrm {h} \) pairs, compared to expected background contributions

6.2 Estimation of multijet background in \(\mathrm {\tau }_{\mathrm {e}} \mathrm {\tau }_{\mathrm {\mu }} \) and \(\mathrm {\tau }_{\mathrm {\mu }} \mathrm {\tau }_{\mathrm {\mu }} \) channels

The contributions from multijet background in the SR of the \(\mathrm {\tau }_{\mathrm {e}} \mathrm {\tau }_{\mathrm {\mu }} \) or \(\mathrm {\tau }_{\mathrm {\mu }} \mathrm {\tau }_{\mathrm {\mu }} \) channels are estimated using control regions containing events with an electron and muon or two muons of same charge, respectively. An estimate for the contribution from multijet events in the SR is obtained by scaling the yield of the multijet background in the SS control region by a suitably chosen extrapolation factor, defined by the ratio of \(\mathrm {e}\mathrm {\mu }\) or \(\mathrm {\mu }\mathrm {\mu }\) pairs with opposite charge to those with same charge. The ratio is measured in events in which at least one lepton passes an inverted isolation criterion of \(I_{\ell } > 0.15 \, p_{\mathrm {T}}^{\,\ell } \). We refer to this event sample as an isolation sideband region (SB). The requirement \(I_{\ell } > 0.15 \, p_{\mathrm {T}}^{\,\ell } \) ensures that the SB does not overlap with the SR. A complication arises from the fact that the ratio of OS to SS pairs depends on the lepton kinematics and the isolation criterion used in the SB. The nominal OS/SS ratio is measured in an isolation sideband (SB1) defined by requiring both leptons to satisfy a relaxed isolation criterion of \(I_{\ell } < 0.60 \, p_{\mathrm {T}}^{\,\ell } \), with at least one lepton passing the condition \(I_{\ell } > 0.15 \, p_{\mathrm {T}}^{\,\ell } \). The systematic uncertainty in the OS/SS ratio that arises from the choice of the upper limit on \(I_{\ell }\) applied in SB1 is estimated by taking the difference between the OS/SS ratio computed in SB1 and the ratio computed in a different isolation sideband region (SB2). The latter is defined by requiring at least one lepton to pass the condition \(I_{\ell } > 0.60 \, p_{\mathrm {T}}^{\,\ell } \), without setting an upper limit on \(I_{\ell }\) in the SB2 region. 
The criteria to select events in the isolation sidebands are optimized to ensure high statistical accuracy in the measurement of the OS/SS extrapolation factor while minimizing differences in lepton kinematic distributions between the SR and the SB. In both isolation sidebands, the OS/SS ratio is measured as a function of the \(p_{\mathrm {T}} \) of the two leptons \(\ell \) and \(\ell '\) and of their separation \(\varDelta R(\ell ,\ell ') = \sqrt{(\eta _{\ell } - \eta _{\ell '})^{2} + (\phi _{\ell } - \phi _{\ell '})^{2}}\) in the \(\eta \)-\(\phi \) plane. The contributions to the SS control region, as well as to SB1 and SB2, from backgrounds other than multijet production are subtracted, based on results from MC simulation.
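
The extrapolation from the SS control region to the SR can be sketched in a few lines of Python. This is a simplified illustration for a single kinematic bin, whereas the measurement described above is binned in lepton \(p_{\mathrm {T}} \) and \(\varDelta R\). All function names and event counts are hypothetical and chosen only to make the arithmetic transparent.

```python
def os_ss_ratio(n_os_data, n_ss_data, n_os_other, n_ss_other):
    """OS/SS ratio of multijet events in a sideband.

    The simulated non-multijet backgrounds are subtracted from the data
    in both the OS and the SS samples before taking the ratio.
    """
    numerator = n_os_data - n_os_other
    denominator = n_ss_data - n_ss_other
    if denominator <= 0:
        raise ValueError("the SS multijet yield must be positive")
    return numerator / denominator


def multijet_yield_in_sr(n_ss_data, n_ss_other, ratio):
    """Scale the background-subtracted SS control-region yield by the OS/SS ratio."""
    return (n_ss_data - n_ss_other) * ratio


# Hypothetical event counts (illustrative only).
ratio_sb1 = os_ss_ratio(2300.0, 1100.0, 300.0, 100.0)  # nominal sideband SB1
ratio_sb2 = os_ss_ratio(4500.0, 2100.0, 300.0, 100.0)  # alternative sideband SB2
ratio_syst = abs(ratio_sb1 - ratio_sb2)                # systematic uncertainty in the ratio
multijet_sr = multijet_yield_in_sr(260.0, 60.0, ratio_sb1)
print(ratio_sb1, ratio_sb2, ratio_syst, multijet_sr)
```

Taking the absolute difference between the SB1 and SB2 ratios as the uncertainty follows the procedure described above; how this uncertainty is propagated to the final multijet yield is not specified in this sketch.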

6.3 Estimation of \(\mathrm {t}\overline{\mathrm {t}}\) background

While the \(m_{\mathrm {\tau }\mathrm {\tau }}\) distribution for \(\mathrm {t}\overline{\mathrm {t}}\) background is obtained from MC simulation, the event yield in the \(\mathrm {t}\overline{\mathrm {t}}\) background in the SR is determined from data, using a control region dominated by \(\mathrm {t}\overline{\mathrm {t}}\) background. Events in the \(\mathrm {t}\overline{\mathrm {t}}\) control region are required to satisfy selection criteria that are similar to the requirements for the SR of the \(\mathrm {\tau }_{\mathrm {e}} \mathrm {\tau }_{\mathrm {\mu }} \) channel, described in Sect. 5. The main differences are that the cutoff on \(P_{\zeta }^{\,\text {miss}}- 0.85 \, P_{\zeta }^{\,\text {vis}} \) is inverted to \(P_{\zeta }^{\,\text {miss}}- 0.85 \, P_{\zeta }^{\,\text {vis}} < -40\,\text {GeV} \), and a condition \(E_{\mathrm {T}}^{\text {miss}} > 80\,\text {GeV} \) is added to the event selection in the \(\mathrm {t}\overline{\mathrm {t}}\) control region. The \(\mathrm {t}\overline{\mathrm {t}}\) event yield observed in the control region is a \(1.01 \pm 0.07\) multiple of the expectation from the MC simulation. The ratio of the \(\mathrm {t}\overline{\mathrm {t}}\) event yield measured in data to the MC prediction is applied as a scale factor to simulated \(\mathrm {t}\overline{\mathrm {t}}\) events, to correct the \(\mathrm {t}\overline{\mathrm {t}}\) background yield in the \(\mathrm {\tau }_{\mathrm {e}} \mathrm {\tau }_{\mathrm {\mu }} \) and \(\mathrm {\tau }_{\mathrm {\mu }} \mathrm {\tau }_{\mathrm {\mu }} \) channels, as well as to correct the part of the \(\mathrm {t}\overline{\mathrm {t}}\) background in the \(\mathrm {\tau }_{\mathrm {e}} \tau _\mathrm {h} \), \(\mathrm {\tau }_{\mathrm {\mu }} \tau _\mathrm {h} \), and \(\tau _\mathrm {h} \tau _\mathrm {h} \) channels that is either due to genuine \(\tau _\mathrm {h} \) or due to the misidentification of an electron or muon as \(\tau _\mathrm {h} \). 
The latter is not included in the background estimate obtained through the \(F_\text {F}\) method, but modelled in the MC simulation.
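
As an illustration of the scale-factor definition, the following Python sketch computes the ratio of the \(\mathrm {t}\overline{\mathrm {t}}\) yield in data to the simulated yield. It assumes that the \(\mathrm {t}\overline{\mathrm {t}}\) yield in data is obtained by subtracting the other simulated backgrounds from the observed count, and it propagates only the Poisson uncertainty of the data. The function names and yields are hypothetical and do not correspond to the numbers of the analysis.

```python
import math


def ttbar_scale_factor(n_data, n_other_mc, n_ttbar_mc):
    """Ratio of the ttbar yield in data (after subtracting other backgrounds) to the MC prediction."""
    if n_ttbar_mc <= 0:
        raise ValueError("the simulated ttbar yield must be positive")
    return (n_data - n_other_mc) / n_ttbar_mc


def scale_factor_stat_uncertainty(n_data, n_ttbar_mc):
    """Poisson uncertainty of the data count propagated to the scale factor (no systematic sources)."""
    return math.sqrt(n_data) / n_ttbar_mc


# Hypothetical control-region yields (illustrative only).
scale_factor = ttbar_scale_factor(n_data=1600.0, n_other_mc=100.0, n_ttbar_mc=1500.0)
stat_unc = scale_factor_stat_uncertainty(n_data=1600.0, n_ttbar_mc=1500.0)
print(scale_factor, stat_unc)
```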

7 Systematic uncertainties

Imprecisely measured or imperfectly simulated effects can alter the normalization and distribution of the \(m_{\mathrm {\tau }\mathrm {\tau }}\) mass spectrum in \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau }\) signal or background processes. These systematic uncertainties can be categorized into theory-related and experimental sources. The latter can be further subdivided into those associated with the reconstruction of physical objects of interest and with estimated backgrounds. The uncertainties related to the reconstruction of physical objects apply to the \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau }\) signal and to backgrounds modelled in the MC simulation. The main background contributions are determined from data, as described in Sect. 6, and are largely unaffected by the accuracy achieved in modelling data in the MC simulation.

The main experimental uncertainties are related to the reconstruction and identification of electrons, muons, and \(\tau _\mathrm {h} \), as follows. The efficiency to reconstruct and identify \(\tau _\mathrm {h} \) and the energy scale of \(\tau _\mathrm {h} \) (\(\tau _\mathrm {h} \) ES) are measured using \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau }\rightarrow \mathrm {\tau }_{\mathrm {\mu }} \tau _\mathrm {h} \) events. The former is obtained by comparing the number of \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau }\rightarrow \mathrm {\tau }_{\mathrm {\mu }} \tau _\mathrm {h} \) events with \(\tau _\mathrm {h} \) candidates passing and failing the \(\tau _\mathrm {h} \) ID criteria, and the latter by comparing the distributions in the \(\tau _\mathrm {h} \) candidate mass, as well as in the visible mass of the muon and \(\tau _\mathrm {h} \) system, in data and in MC simulation [75]. The two quantities are measured with respective uncertainties of \({\approx }\,6\) and \({\approx }\,1\%\). The events selected for the \(\tau _\mathrm {h} \) ID efficiency and \(\tau _\mathrm {h} \) ES measurements overlap with the events in the \(\mathrm {\tau }_{\mathrm {\mu }} \tau _\mathrm {h} \) channel. We account for the overlap by assigning a \(3\%\) uncertainty to \(\tau _\mathrm {h} \) ES. A \(3\%\) change in the \(\tau _\mathrm {h} \) ES affects the acceptance in \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau }\) signal by 3, 3, and \(17\%\) in the \(\mathrm {\tau }_{\mathrm {e}} \tau _\mathrm {h} \), \(\mathrm {\tau }_{\mathrm {\mu }} \tau _\mathrm {h} \), and \(\tau _\mathrm {h} \tau _\mathrm {h} \) channels, respectively. The impact on the signal acceptance and on the distribution in \(m_{\mathrm {\tau }\mathrm {\tau }}\) is illustrated in Fig. 7. It has been checked that the overlap and the choice of the \(\tau _\mathrm {h} \) ES uncertainty have little impact on the final results. 
The ML fit performed to measure the \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau }\) cross section, described in Sect. 8, reduces the uncertainties in the \(\tau _\mathrm {h} \) ID efficiency and in the \(\tau _\mathrm {h} \) ES to 2.2 and \(0.9\%\), respectively. The efficiency of the \(\tau _\mathrm {h} \) trigger used in the \(\tau _\mathrm {h} \tau _\mathrm {h} \) channel is measured in \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau }\rightarrow \mathrm {\tau }_{\mathrm {\mu }} \tau _\mathrm {h} \) events with an uncertainty of \({\approx }\,4.5\%\) per \(\tau _\mathrm {h} \). The measurement is detailed in Ref. [88].

Fig. 7

Distributions expected in \(m_{\mathrm {\tau }\mathrm {\tau }}\) for \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau }\) signal events in the (left) \(\mathrm {\tau }_{\mathrm {e}} \tau _\mathrm {h} \), (center) \(\mathrm {\tau }_{\mathrm {\mu }} \tau _\mathrm {h} \), and (right) \(\tau _\mathrm {h} \tau _\mathrm {h} \) channels for the nominal value of the \(\tau _\mathrm {h} \) ES, and after implementing a \(3\%\) systematic shift

Electron and muon reconstruction, identification, isolation, and trigger efficiencies are measured using \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {e}\mathrm {e}\) and \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\mu }\mathrm {\mu }\) events via the “tag-and-probe” method [89] at an accuracy of \(2\%\). The energy scales for electrons and muons (\(\mathrm {e}\) ES and \(\mathrm {\mu }\) ES) are calibrated using \(\mathrm {J /\psi (1S)}\rightarrow \ell \ell \), \(\Upsilon \rightarrow \ell \ell \), and \(\mathrm {Z}/\gamma ^{*} \rightarrow \ell \ell \) events (with \(\ell \) referring to \(\mathrm {e}\) and \(\mathrm {\mu }\)), and have an uncertainty of \(1\%\). The \(\mathrm {e}\) ES and \(\mathrm {\mu }\) ES uncertainties affect the acceptance in the \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau }\) signal in the \(\mathrm {\tau }_{\mathrm {e}} \tau _\mathrm {h} \), \(\mathrm {\tau }_{\mathrm {\mu }} \tau _\mathrm {h} \), \(\mathrm {\tau }_{\mathrm {e}} \mathrm {\tau }_{\mathrm {\mu }} \), and \(\mathrm {\tau }_{\mathrm {\mu }} \mathrm {\tau }_{\mathrm {\mu }} \) channels by less than \(1\%\).

The \(E_{\mathrm {T}}^{\text {miss}} \) response and resolution are known within uncertainties of a few percent from studies performed in \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\mu }\mathrm {\mu }\), \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {e}\mathrm {e}\), and \(\gamma \)+jets events [90]. The impact of these uncertainties on the acceptance in the \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau }\) signal is small, amounting to less than \(1\%\). In the \(\mathrm {\tau }_{\mathrm {e}} \tau _\mathrm {h} \) and \(\mathrm {\tau }_{\mathrm {\mu }} \tau _\mathrm {h} \) channels, the impact arises from the \(m_{\mathrm {T}} < 40\,\text {GeV} \) selection criterion. In the \(\mathrm {\tau }_{\mathrm {e}} \mathrm {\tau }_{\mathrm {\mu }} \) and \(\mathrm {\tau }_{\mathrm {\mu }} \mathrm {\tau }_{\mathrm {\mu }} \) channels, the impact is due to the \(P_{\zeta }^{\,\text {miss}}- 0.85 \, P_{\zeta }^{\,\text {vis}} > -\,20\,\text {GeV} \) requirement and the use of \(E_{\mathrm {T}}^{\text {miss}} \) and \(P_{\zeta }\) as input variables in the BDT that separates the \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau }\) signal from the \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\mu }\mathrm {\mu }\) background, respectively. The effect of uncertainties related to the modelling of the \(E_{\mathrm {T}}^{\text {miss}} \) on the distribution in \(m_{\mathrm {\tau }\mathrm {\tau }}\) is small.

The uncertainty in the integrated luminosity is \(2.3\%\) [91].

The backgrounds determined from data are also subject to uncertainties that alter the normalization and distribution (“shape”) of the \(m_{\mathrm {\tau }\mathrm {\tau }}\) mass spectrum. Background yields and their associated uncertainties are given in Table 1. The uncertainties in the backgrounds arising from the misidentification of quark and gluon jets as \(\tau _\mathrm {h} \) candidates in the \(\mathrm {\tau }_{\mathrm {e}} \tau _\mathrm {h} \), \(\mathrm {\tau }_{\mathrm {\mu }} \tau _\mathrm {h} \), and \(\tau _\mathrm {h} \tau _\mathrm {h} \) channels are obtained by changing the \(F_\text {F}\) values as well as the relative fractions \(R_{\text {p}}\) of multijet, \(\mathrm {W}\)+jets, and \(\mathrm {t}\overline{\mathrm {t}}\) backgrounds within their uncertainties. The resulting uncertainties in the \(m_{\mathrm {\tau }\mathrm {\tau }}\) distribution in the \(\mathrm {\tau }_{\mathrm {e}} \tau _\mathrm {h} \), \(\mathrm {\tau }_{\mathrm {\mu }} \tau _\mathrm {h} \), and \(\tau _\mathrm {h} \tau _\mathrm {h} \) channels are illustrated in Fig. 8. The uncertainties in the size of the false-\(\tau _\mathrm {h} \) backgrounds are 8, 6, and \(16\%\) in the \(\mathrm {\tau }_{\mathrm {e}} \tau _\mathrm {h} \), \(\mathrm {\tau }_{\mathrm {\mu }} \tau _\mathrm {h} \), and \(\tau _\mathrm {h} \tau _\mathrm {h} \) channels, respectively. In the \(\mathrm {\tau }_{\mathrm {e}} \mathrm {\tau }_{\mathrm {\mu }} \) and \(\mathrm {\tau }_{\mathrm {\mu }} \mathrm {\tau }_{\mathrm {\mu }} \) channels, the uncertainty in the size of the multijet background is \({\approx }\,20\%\). The magnitude of the \(\mathrm {t}\overline{\mathrm {t}}\) background is known to an accuracy of \(7\%\). The uncertainty in the distribution of the \(\mathrm {t}\overline{\mathrm {t}}\) background is estimated by varying the weights that are applied to the \(\mathrm {t}\overline{\mathrm {t}}\) MC sample to improve the modelling of the top quark \(p_{\mathrm {T}} \) distribution (described in Sect. 3) between no reweighting and the reweighting applied twice.

Fig. 8

Distributions in \(m_{\mathrm {\tau }\mathrm {\tau }}\) expected for the background arising from quark or gluon jets misidentified as \(\tau _\mathrm {h} \) in the (left) \(\mathrm {\tau }_{\mathrm {e}} \tau _\mathrm {h} \), (center) \(\mathrm {\tau }_{\mathrm {\mu }} \tau _\mathrm {h} \), and (right) \(\tau _\mathrm {h} \tau _\mathrm {h} \) channels, and the systematic uncertainty in the false-\(\tau _\mathrm {h} \) background estimate. The grey shaded band represents the quadratic sum of all systematic uncertainties related to the \(F_\text {F}\) method: uncertainties in the \(F_\text {F}\) measured in the multijet, \(\mathrm {W}\)+jets, and \(\mathrm {t}\overline{\mathrm {t}}\) DR; uncertainties in the relative fractions of multijet, \(\mathrm {W}\)+jets, and \(\mathrm {t}\overline{\mathrm {t}}\) backgrounds in the AR; and uncertainties in the non-closure corrections (described in Sect. 6.1)

The uncertainties in the yields of single top quark and diboson backgrounds, modelled using MC simulation, are each \({\approx }\,15\%\). Besides constituting the dominant background in the \(\mathrm {\tau }_{\mathrm {\mu }} \mathrm {\tau }_{\mathrm {\mu }} \) channel, the DY production of electron and muon pairs are relevant backgrounds in, respectively, the decay channels \(\mathrm {\tau }_{\mathrm {e}} \tau _\mathrm {h} \) and \(\mathrm {\tau }_{\mathrm {\mu }} \tau _\mathrm {h} \), because of the small but non-negligible rate at which electrons and muons are misidentified as \(\tau _\mathrm {h} \). The probability for electrons and muons to pass the tight-electron or tight-muon removal criteria applied, respectively, in the \(\mathrm {\tau }_{\mathrm {e}} \tau _\mathrm {h} \) and \(\mathrm {\tau }_{\mathrm {\mu }} \tau _\mathrm {h} \) channels is measured in \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {e}\mathrm {e}\) and in \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\mu }\mathrm {\mu }\) events. The misidentification rates depend on \(\eta \). For electrons in the ECAL barrel and endcap regions, the misidentifications are at respective levels of 0.2 and \(0.1\%\), with accuracies of 13 and \(29\%\) [75]. The misidentification rate for muons lies between less than one and several tenths of a percent, and is known to within an uncertainty of \(30\%\). The contribution from \(\mathrm {W}\)+jets background in the \(\mathrm {\tau }_{\mathrm {e}} \mathrm {\tau }_{\mathrm {\mu }} \) and \(\mathrm {\tau }_{\mathrm {\mu }} \mathrm {\tau }_{\mathrm {\mu }} \) channels is modelled using MC simulation, and is known to an accuracy of \(15\%\). The production of SM Higgs bosons is assigned an uncertainty of \(30\%\), reflecting the present experimental uncertainty in the \(\text {H} \rightarrow \mathrm {\tau }\mathrm {\tau }\) rate measured at \(\sqrt{s} = 13\hbox { TeV}\) [14].

The theoretical uncertainty in the product of signal acceptance and efficiency for the \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau }\) signal is \({\approx }\,2\%\) in the \(\mathrm {\tau }_{\mathrm {e}} \tau _\mathrm {h} \), \(\mathrm {\tau }_{\mathrm {\mu }} \tau _\mathrm {h} \), \(\mathrm {\tau }_{\mathrm {e}} \mathrm {\tau }_{\mathrm {\mu }} \), and \(\mathrm {\tau }_{\mathrm {\mu }} \mathrm {\tau }_{\mathrm {\mu }} \) channels, and \(6\%\) in the \(\tau _\mathrm {h} \tau _\mathrm {h} \) channel. The quoted uncertainties include the effect of missing higher-order terms in the perturbative expansion for the calculated cross section, estimated through independent changes in the renormalization and factorization scales by factors of 2 and 1 / 2 relative to their nominal equal values [92, 93], uncertainties in the NNPDF3.0 set of PDF, estimated following the recommendations given in Ref. [94], and the uncertainties in the modelling of parton showers (PS) and the underlying event (UE). The theoretical uncertainty is larger in the \(\tau _\mathrm {h} \tau _\mathrm {h} \) channel, as the acceptance depends crucially on the modelling of the \(p_{\mathrm {T}} \) distribution of the \(\mathrm {Z}\) boson, which is also affected by the missing higher-order terms in the calculation.

The systematic uncertainties are summarized in Table 2. The table also quantifies the impact that each systematic uncertainty has on the measurement of the \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau }\) cross section, defined as the percent change in the measured cross section when individual sources are changed by one standard deviation relative to their nominal values. The impacts are computed for the values of nuisance parameters obtained in the ML fit used to extract the signal (described in Sect. 8).

Table 2 Effect of experimental and theoretical uncertainties in the measurement of the \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau }\) cross section. The sources of systematic uncertainty are specified in the leftmost column, and apply to the processes given in the second column. The relative changes in the acceptance \(\mathcal {A}\) for the \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau }\) signal, and in the yield from background processes, that correspond to a one standard deviation change in a given source of uncertainty are given in the third column. The range in this column represents the range in signal acceptance or background yield across all decay channels and background processes. The impact that each change produces is quantified by its effect on the measured \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau }\) cross section, given in the rightmost column

The uncertainties in the integrated luminosity, in the cross section for DY production of electron and muon pairs, and in the electron, muon, and \(\tau _\mathrm {h} \) reconstruction and identification efficiencies have the greatest impact on the results.

The impact of the uncertainty on the integrated luminosity amounts to \(1.9\%\). This is smaller than the \(2.3\%\) uncertainty in the integrated luminosity measurement, because of correlations of the nuisance parameter representing the integrated luminosity with other nuisance parameters. When the integrated luminosity changes by \(2.3\%\), the ML fit readjusts the nuisance parameters that represent the rates for background processes obtained from MC simulation, as well as identification and trigger efficiencies for \(\mathrm {e}\), \(\mathrm {\mu }\), and \(\tau _\mathrm {h} \), such that the measured \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau }\) cross section changes by only \(1.9\%\). The uncertainty in the integrated luminosity is not constrained in the ML fit.

The impact of the uncertainty in the production rate of \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {e}\mathrm {e}\) and \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\mu }\mathrm {\mu }\) background processes amounts to \(1.8\%\). The impact is sizeable, because of the small statistical uncertainty in the \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\mu }\mathrm {\mu }\) background in the \(\mathrm {\tau }_{\mathrm {\mu }} \mathrm {\tau }_{\mathrm {\mu }} \) channel, which, in the absence of uncertainties in the \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\mu }\mathrm {\mu }\) production rate, would constrain the efficiency for muon reconstruction and identification, as well as the integrated luminosity.

The impact of uncertainties in the efficiencies to reconstruct and identify electrons and muons amounts to 1.5 and \(1.6\%\), respectively. Their impact is considerable, because these uncertainties are not reduced greatly in the ML fit, as they affect all channels, except the \(\tau _\mathrm {h} \tau _\mathrm {h} \) channel, in a similar way.

The impact of the uncertainty in the efficiency to reconstruct and identify \(\tau _\mathrm {h} \) is of similar size, amounting to \(1.5\%\), even though the uncertainty in the \(\tau _\mathrm {h} \) ID efficiency is significantly larger than the uncertainties in the electron and muon ID efficiencies. This is because the simultaneous fit to the \(m_{\mathrm {\tau }\mathrm {\tau }}\) distributions in all five channels reduces the uncertainties in the \(\tau _\mathrm {h} \) ID efficiency and the \(\tau _\mathrm {h} \) ES significantly, thereby diminishing the impact that these uncertainties have on the \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau }\) cross section. When the \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau }\) cross section is measured in the individual \(\mathrm {\tau }_{\mathrm {e}} \tau _\mathrm {h} \), \(\mathrm {\tau }_{\mathrm {\mu }} \tau _\mathrm {h} \), and \(\tau _\mathrm {h} \tau _\mathrm {h} \) channels, the impact of the uncertainty on the \(\tau _\mathrm {h} \) ID efficiency increases to 6, 6, and \(10\%\), respectively.

The uncertainty in \(\tau _\mathrm {h} \) ES becomes relevant for the \(\tau _\mathrm {h} \tau _\mathrm {h} \) channel when the \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau }\) cross section is measured in this channel alone, and amounts to \(9\%\). In the \(\mathrm {\tau }_{\mathrm {e}} \tau _\mathrm {h} \) and \(\mathrm {\tau }_{\mathrm {\mu }} \tau _\mathrm {h} \) channels, the impact of the \(\tau _\mathrm {h} \) ES uncertainty amounts to less than \(1\%\), even when the \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau }\) cross section is measured just in these channels.

8 Signal extraction

The cross section \(\sigma (\mathrm {p}\mathrm {p}\rightarrow \mathrm {Z}/\gamma ^{*} \text {+X}) \, \mathcal {B}(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau })\) for DY production of \(\mathrm {\tau }\) pairs is obtained through a simultaneous ML fit to the observed \(m_{\mathrm {\tau }\mathrm {\tau }}\) distributions in the five decay channels: \(\mathrm {\tau }_{\mathrm {e}} \tau _\mathrm {h} \), \(\mathrm {\tau }_{\mathrm {\mu }} \tau _\mathrm {h} \), \(\tau _\mathrm {h} \tau _\mathrm {h} \), \(\mathrm {\tau }_{\mathrm {e}} \mathrm {\tau }_{\mathrm {\mu }} \), and \(\mathrm {\tau }_{\mathrm {\mu }} \mathrm {\tau }_{\mathrm {\mu }} \). The likelihood function \(\mathcal {L}\left( \text {data} \, \vert \, \xi , \varTheta \right) \) depends on the value of the cross section, denoted by the symbol \(\xi \), which defines the parameter of interest (POI) in the fit, and it also depends on the values of nuisance parameters \(\theta _{k}\) that represent the systematic uncertainties discussed in Sect. 7:

$$\begin{aligned} \mathcal {L}\left( \text {data} \, \vert \, \xi , \varTheta \right) = \prod _{i} \, {\mathcal {P}}\left( n_{i} \vert \xi , \varTheta \right) \, \prod _{k} \, \rho \left( \tilde{\theta }_{k} \vert \theta _{k}\right) . \end{aligned}$$
(7)

The index i refers to individual bins of the \(m_{\mathrm {\tau }\mathrm {\tau }}\) distribution in each of the five final states. The set of all nuisance parameters \(\theta _{k}\) is denoted by the symbol \(\varTheta \). Correlations among decay channels as well as between the \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau }\) signal and background processes are taken into account through relationships among channels, processes, and nuisance parameters in the ML fit. The probability to observe \(n_{i}\) events in a given bin i, when \(\nu _{i}(\xi , \varTheta )\) events are expected in that bin, is given by the Poisson distribution:

$$\begin{aligned} {\mathcal {P}}\left( n_{i} \vert \xi , \varTheta \right) = \frac{\left( \nu _{i}(\xi , \varTheta )\right) ^{n_{i}}}{n_{i}!} \, \exp \left( -\nu _{i}(\xi , \varTheta ) \right) . \end{aligned}$$
(8)

The number of events expected in each bin corresponds to the sum of the number of signal (\(\nu _{i}^{\text {S}}\)) and background (\(\nu _{i}^{\text {B}}\)) events: \(\nu _{i}(\xi , \varTheta ) = \nu _{i}^{\text {S}}(\xi , \varTheta ) + \nu _{i}^{\text {B}}(\varTheta )\). The estimate of the number of background events is obtained as described in Sect. 6. The number of signal events is proportional to \(\xi \), with the coefficient of proportionality depending on the signal acceptance and on the signal selection efficiency, with both obtained from MC simulation.
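
The Poisson part of the likelihood in Eqs. (7) and (8) can be written down in a short Python sketch. For simplicity the sketch keeps all nuisance parameters fixed, so that the expected yield in each bin depends only on \(\xi \). The function names, the per-bin signal coefficients, and the event counts are hypothetical.

```python
import math


def poisson_log_prob(n, nu):
    """Logarithm of the Poisson probability of Eq. (8): n*ln(nu) - nu - ln(n!)."""
    if nu <= 0:
        raise ValueError("the expected yield must be positive")
    return n * math.log(nu) - nu - math.lgamma(n + 1)


def expected_yields(xi, signal_per_unit_xi, background):
    """nu_i = nu_i^S + nu_i^B, with the signal yield proportional to xi."""
    return [xi * s + b for s, b in zip(signal_per_unit_xi, background)]


def negative_log_likelihood(xi, observed, signal_per_unit_xi, background):
    """Negative logarithm of the product of Poisson terms over all bins."""
    nus = expected_yields(xi, signal_per_unit_xi, background)
    return -sum(poisson_log_prob(n, nu) for n, nu in zip(observed, nus))


# Hypothetical three-bin example (illustrative only).
signal_per_unit_xi = [2.0, 5.0, 1.0]  # acceptance times efficiency times luminosity, per bin
background = [10.0, 8.0, 12.0]
observed = [30, 58, 22]               # equal to the expectation for xi = 10

nll_values = {xi: negative_log_likelihood(xi, observed, signal_per_unit_xi, background)
              for xi in (9.0, 10.0, 11.0)}
print(nll_values)
```

In this toy example the observed counts coincide with the expectation for \(\xi = 10\), so the negative log-likelihood is smallest there among the three values scanned.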

The function \(\rho \left( \tilde{\theta }_{k} \vert \theta _{k}\right) \) represents the probability to observe a value \(\tilde{\theta }_{k}\) in an auxiliary measurement of the nuisance parameter, given that the true value is \(\theta _{k}\). The nuisance parameters are treated via the frequentist paradigm, as described in Refs. [95, 96]. Systematic uncertainties that affect only the normalization, but not the distribution in \(m_{\mathrm {\tau }\mathrm {\tau }}\), are represented by Gamma probability density functions if they are statistical in origin, e.g. corresponding to the number of events observed in a control region, and otherwise by log-normal probability density functions. Systematic uncertainties that affect the distribution in \(m_{\mathrm {\tau }\mathrm {\tau }}\) are incorporated into the ML fit via the technique detailed in Ref. [97], and represented by Gaussian probability density functions. Nuisance parameters representing systematic uncertainties of the latter type can also affect the normalization of the \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau }\) signal or of its backgrounds. The nuisance parameters corresponding to the cross sections for DY production of electron and muon pairs are left unconstrained in the fit.
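
A log-normal normalization uncertainty is often implemented by scaling a yield with \(\kappa ^{\theta }\), where \(\theta \) follows a unit Gaussian and \(\kappa \) encodes the relative uncertainty. The Python sketch below assumes this common parametrization, which is not spelled out in the text above. The function names and numbers are hypothetical.

```python
def lognormal_scaled_yield(nominal, kappa, theta):
    """Yield after shifting a log-normal nuisance parameter by theta standard deviations.

    Assumption: yield = nominal * kappa**theta, with theta distributed as a unit Gaussian.
    """
    if kappa <= 0:
        raise ValueError("kappa must be positive")
    return nominal * kappa ** theta


def gaussian_penalty(theta):
    """Contribution of the unit-Gaussian constraint to -ln(L), up to an additive constant."""
    return 0.5 * theta ** 2


# Hypothetical background with a 7% normalization uncertainty (kappa = 1.07).
nominal_yield = 100.0
shifted_up = lognormal_scaled_yield(nominal_yield, 1.07, +1.0)
shifted_down = lognormal_scaled_yield(nominal_yield, 1.07, -1.0)
print(shifted_up, shifted_down, gaussian_penalty(1.0))
```

One advantage of this form is that the scaled yield stays positive for any value of \(\theta \), unlike a linear Gaussian shift.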

Fig. 9

Dependence of \(-\,2 \ln \lambda \left( \xi \right) \) on the cross section \(\xi \) for DY production of \(\mathrm {\tau }\) pairs. The PLR is computed for the simultaneous ML fit to the observed \(m_{\mathrm {\tau }\mathrm {\tau }}\) distributions in the \(\mathrm {\tau }_{\mathrm {e}} \tau _\mathrm {h} \), \(\mathrm {\tau }_{\mathrm {\mu }} \tau _\mathrm {h} \), \(\tau _\mathrm {h} \tau _\mathrm {h} \), \(\mathrm {\tau }_{\mathrm {e}} \mathrm {\tau }_{\mathrm {\mu }} \), and \(\mathrm {\tau }_{\mathrm {\mu }} \mathrm {\tau }_{\mathrm {\mu }} \) channels. The dashed, dash-dotted, and solid curves correspond to situations when just the statistical uncertainties are used in the fit, when the uncertainty in integrated luminosity is also included, and when all uncertainties are included in the fit. The values of nuisance parameters, corresponding to uncertainties that are ignored, are fixed at the values that yield the best fit to the data. The horizontal line represents the value of \(-\,2 \ln \lambda \left( \xi \right) \) that is used to determine the \(68\%\) CI on \(\xi \)

Table 3 Yields expected in \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau }\) signal events and backgrounds in the \(\mathrm {\tau }_{\mathrm {e}} \tau _\mathrm {h} \), \(\mathrm {\tau }_{\mathrm {\mu }} \tau _\mathrm {h} \), \(\tau _\mathrm {h} \tau _\mathrm {h} \), \(\mathrm {\tau }_{\mathrm {e}} \mathrm {\tau }_{\mathrm {\mu }} \), and \(\mathrm {\tau }_{\mathrm {\mu }} \mathrm {\tau }_{\mathrm {\mu }} \) channels, obtained from the ML fit described in Sect. 8. The uncertainties are rounded to two significant digits, except when they are \(< 10\), in which case they are rounded to one significant digit, and the event yields are rounded to match the precision in the uncertainties. The analysed data correspond to an integrated luminosity of \(2.3~\mathrm {fb}^{-1}\)

The best fit value \(\hat{\xi }\) of the POI is the value that maximizes the likelihood \(\mathcal {L}\left( \text {data} \, \vert \, \xi , \varTheta \right) \) in Eq. (7). A \(68\%\) confidence interval (CI) on the POI is obtained using the profile likelihood ratio (PLR) [95, 96, 98]:

$$\begin{aligned} \lambda \left( \xi \right) = \frac{\mathcal {L}\left( \text {data} \, \vert \, \xi , \hat{\varTheta }_{\xi } \right) }{\mathcal {L}\left( \text {data} \, \vert \, \hat{\xi }, \hat{\varTheta } \right) }. \end{aligned}$$
(9)

The symbol \(\hat{\varTheta }_{\xi }\) denotes the values of nuisance parameters that maximize the likelihood for a given value of \(\xi \). The combination of \(\hat{\xi }\) and \(\hat{\varTheta }\) corresponds to the values of \(\xi \) and \(\varTheta \) for which the likelihood function reaches its maximum. The \(68\%\) CI is defined by the values of \(\xi \) for which \(-\,2 \ln \lambda \left( \xi \right) \) increases by one unit relative to its minimum. To quantify separately the effects of the statistical uncertainties, the uncertainty in the integrated luminosity, and the other systematic uncertainties, we ignore one source of uncertainty at a time and recompute the \(68\%\) CI. The nuisance parameters \(\theta _{k}\) corresponding to uncertainties that are ignored are fixed at the values \(\hat{\theta }_{k}\) that yield the best fit to the data. The square root of the quadratic difference between the width of the CI computed with all sources of uncertainty included in the fit and the width computed with a given source ignored provides the estimate of the uncertainty in the POI resulting from that source. The procedure is illustrated in Fig. 9 for the combined fit of all five final states. Correlations among different sources of uncertainty are estimated through this procedure.
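The interval construction and the quadrature subtraction can be sketched with a toy model. The sketch below is an illustration under stated assumptions, not the likelihood of Eq. (7): it uses a single Gaussian measurement with one constrained nuisance parameter. The function names (`nll`, `profiled_q`, `crossing`, `half_width`), the central value, the resolution, and the 3% systematic effect are hypothetical choices made for the example.

```python
import math

# Toy inputs (assumptions, not the analysis likelihood of Eq. (7)).
N_OBS = 1848.0      # hypothetical observed value of the measurement
SIGMA_STAT = 12.0   # hypothetical statistical resolution
DELTA = 0.03        # hypothetical 3% effect of a single nuisance parameter


def nll(xi, theta):
    """-ln L of the toy: Gaussian measurement times unit-Gaussian constraint."""
    predicted = xi * (1.0 + DELTA * theta)
    return 0.5 * ((N_OBS - predicted) / SIGMA_STAT) ** 2 + 0.5 * theta ** 2


def profiled_q(xi, fixed_theta=None):
    """-2 ln(lambda(xi)); theta is profiled unless a fixed value is given."""
    if fixed_theta is not None:
        return 2.0 * nll(xi, fixed_theta)
    lo, hi = -10.0, 10.0
    for _ in range(200):  # ternary search; nll is convex in theta
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if nll(xi, m1) < nll(xi, m2):
            hi = m2
        else:
            lo = m1
    # The global minimum of nll is 0, reached at (xi, theta) = (N_OBS, 0).
    return 2.0 * nll(xi, 0.5 * (lo + hi))


def crossing(side, fixed_theta=None, target=1.0):
    """Bisect on one side of the minimum for the xi where -2 ln(lambda) = target."""
    inner, outer = N_OBS, N_OBS + side * 300.0
    for _ in range(100):
        mid = 0.5 * (inner + outer)
        if profiled_q(mid, fixed_theta) < target:
            inner = mid
        else:
            outer = mid
    return 0.5 * (inner + outer)


def half_width(fixed_theta=None):
    """Half of the width of the 68% CI."""
    return 0.5 * (crossing(+1, fixed_theta) - crossing(-1, fixed_theta))


total = half_width()                      # all uncertainties included
stat_only = half_width(fixed_theta=0.0)   # nuisance fixed at its best-fit value
syst = math.sqrt(total ** 2 - stat_only ** 2)
print(f"total {total:.1f}, stat {stat_only:.1f}, syst {syst:.1f}")
```

In this toy, fixing the nuisance parameter at its best-fit value reduces the interval to the statistical resolution, and the quadrature difference recovers a systematic component close to `DELTA * N_OBS`.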

Table 4 Cross section \(\sigma (\mathrm {p}\mathrm {p}\rightarrow \mathrm {Z}/\gamma ^{*} \text {+X}) \, \mathcal {B}(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau })\) measured in individual final states
Fig. 10

Distributions in \(m_{\mathrm {\tau }\mathrm {\tau }}\) for events selected in the (upper left) \(\mathrm {\tau }_{\mathrm {e}} \tau _\mathrm {h} \), (upper right) \(\mathrm {\tau }_{\mathrm {\mu }} \tau _\mathrm {h} \), and (lower) \(\tau _\mathrm {h} \tau _\mathrm {h} \) channels. Signal and background contributions are shown for values of nuisance parameters obtained in the ML fit to the data

Fig. 11

Distributions in \(m_{\mathrm {\tau }\mathrm {\tau }}\) for events selected in the (left) \(\mathrm {\tau }_{\mathrm {e}} \mathrm {\tau }_{\mathrm {\mu }} \) and (right) \(\mathrm {\tau }_{\mathrm {\mu }} \mathrm {\tau }_{\mathrm {\mu }} \) channels. Signal and background contributions are shown for the values of nuisance parameters obtained in the ML fit to the data

The cross section for DY production of \(\mathrm {\tau }\) pairs is quoted within the mass window \(60< m_{\mathrm {\tau }\mathrm {\tau }}^{\text {true}} < 120\,\text {GeV} \). The contribution from \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau }\) events that pass the selection criteria described in Sect. 5, but have a mass outside of this window, is at the level of a few percent in the \(\mathrm {\tau }_{\mathrm {e}} \tau _\mathrm {h} \), \(\mathrm {\tau }_{\mathrm {\mu }} \tau _\mathrm {h} \), \(\mathrm {\tau }_{\mathrm {e}} \mathrm {\tau }_{\mathrm {\mu }} \), and \(\mathrm {\tau }_{\mathrm {\mu }} \mathrm {\tau }_{\mathrm {\mu }} \) channels. In the \(\tau _\mathrm {h} \tau _\mathrm {h} \) channel, the contribution from outside of the mass window is \({\approx }\,40\%\). This fraction is large because of the high \(p_{\mathrm {T}} \) threshold on the \(\tau _\mathrm {h} \) candidates required in the trigger. The \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau }\) events that have two \(\tau _\mathrm {h} \) with \(p_{\mathrm {T}} > 40\,\text {GeV} \) contain either a \(\mathrm {Z}\) boson of high \(p_{\mathrm {T}} \) or a \(\mathrm {\tau }\) lepton pair above the mass of the \(\mathrm {Z}\) boson. Only a small fraction of signal events satisfy either of these two conditions, which leads to the smallest event yield from the \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau }\) signal in the \(\tau _\mathrm {h} \tau _\mathrm {h} \) channel (as shown in Table 3), and to the largest fraction of signal events containing a \(\mathrm {\tau }\) lepton pair of mass outside of the \(60< m_{\mathrm {\tau }\mathrm {\tau }}^{\text {true}} < 120\,\text {GeV} \) window.

The PLR depends on the \(\tau _\mathrm {h} \) ID efficiency and on the \(\tau _\mathrm {h} \) ES through its dependence on the corresponding two nuisance parameters. The \(\tau _\mathrm {h} \) ID efficiency and \(\tau _\mathrm {h} \) ES are determined by promoting these nuisance parameters to the role of POI. The cross section for DY production of \(\mathrm {\tau }\) pairs, the \(\tau _\mathrm {h} \) ID efficiency, and the \(\tau _\mathrm {h} \) ES are left unconstrained in the fit, and \(-\,2 \ln \lambda \left( \xi \right) \) is minimized as a function of all three parameters.

9 Results

The yields expected in \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau }\) signal and in background contributions from the ML fit to the \(m_{\mathrm {\tau }\mathrm {\tau }}\) distributions in the different decay channels are given in Table 3. The cross sections are displayed in Table 4, and the distributions in \(m_{\mathrm {\tau }\mathrm {\tau }}\) for the selected events are shown in Figs. 10 and 11.

The total uncertainty in the cross section is decomposed into statistical contributions, uncertainty in the integrated luminosity of the data, and other systematic uncertainties, as described in Sect. 8. The measured values are compatible with each other. The largest deviation, amounting to a little more than one standard deviation, is observed in the \(\tau _\mathrm {h} \tau _\mathrm {h} \) channel. A deviation of this magnitude in one of five measurements is consistent with a statistical fluctuation. We proceed to a simultaneous fit of the \(m_{\mathrm {\tau }\mathrm {\tau }}\) distributions in the five final states. The value of the cross section obtained from the combined fit is:

$$\begin{aligned}&\sigma (\mathrm {p}\mathrm {p}\rightarrow \mathrm {Z}/\gamma ^{*} \text {+X}) \, \mathcal {B}(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau }) \nonumber \\&\quad =1848 \pm 12\,\text {(stat)} \pm 57\,\text {(syst)} \pm 35\,\text {(lumi)} \hbox { pb}. \end{aligned}$$
(10)

The result is compatible with the prediction of \(1845^{+12}_{-6}\text { (scale)} \pm 33\text { (PDF)}\) pb, computed at NNLO accuracy [60] using the NNPDF3.0 PDF. The results are illustrated in Fig. 12. The inner error bars represent the statistical uncertainties, and the outer error bars represent the quadratic sum of the statistical, systematic, and integrated-luminosity uncertainties. In relative terms, the uncertainty in \(\sigma (\mathrm {p}\mathrm {p}\rightarrow \mathrm {Z}/\gamma ^{*} \text {+X}) \, \mathcal {B}(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau })\) arising from the uncertainty in the integrated luminosity is smaller than the uncertainty in the integrated luminosity itself, for the reasons discussed in Sect. 7.
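As a numerical cross-check of the quantities quoted in Eq. (10) and in the prediction, the quadratic sum of the experimental uncertainties and the size of the difference between measurement and prediction can be evaluated directly. The short sketch below uses only the numbers given in the text. Symmetrizing the scale uncertainty by taking its larger variation, and treating the experimental and theoretical uncertainties as uncorrelated, are simplifying assumptions of this example.

```python
import math

# Measured value and uncertainties from Eq. (10), in pb.
sigma_meas = 1848.0
stat, syst, lumi = 12.0, 57.0, 35.0

# Prediction and uncertainties quoted in the text, in pb.
sigma_pred = 1845.0
scale_up, scale_down, pdf = 12.0, 6.0, 33.0

# Outer error bar: quadratic sum of the three experimental components.
total_exp = math.sqrt(stat ** 2 + syst ** 2 + lumi ** 2)

# Relative size of the luminosity component of the cross section uncertainty.
lumi_rel = lumi / sigma_meas

# Theory uncertainty, using the larger scale variation (simplifying assumption).
total_th = math.sqrt(max(scale_up, scale_down) ** 2 + pdf ** 2)

# Difference between measurement and prediction in units of the combined
# uncertainty, assuming uncorrelated experimental and theoretical uncertainties.
pull = (sigma_meas - sigma_pred) / math.sqrt(total_exp ** 2 + total_th ** 2)

print(f"total experimental uncertainty: {total_exp:.1f} pb")
print(f"relative luminosity component:  {100 * lumi_rel:.1f}%")
print(f"pull with respect to prediction: {pull:.2f}")
```

The difference of 3 pb between the measured and predicted values is small compared with either the experimental or the theoretical uncertainty.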

As a side note, the values of the nuisance parameters that correspond to the cross sections in the \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {e}\mathrm {e}\) and \(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\mu }\mathrm {\mu }\) backgrounds, obtained from the simultaneous fit to the \(m_{\mathrm {\tau }\mathrm {\tau }}\) distributions in the five final states in data, are also compatible with the expected values.

Fig. 12

The inclusive cross section \(\sigma (\mathrm {p}\mathrm {p}\rightarrow \mathrm {Z}/\gamma ^{*} \text {+X}) \, \mathcal {B}(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau })\) measured in individual channels, and in the combination of all final states, compared to the theoretical prediction [60]

Fig. 13

Likelihood contours for the joint parameter estimation of (upper left) \(\sigma (\mathrm {p}\mathrm {p}\rightarrow \mathrm {Z}/\gamma ^{*} \text {+X}) \, \mathcal {B}(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau })\) and the \(\tau _\mathrm {h} \) ID efficiency, (upper right) \(\sigma (\mathrm {p}\mathrm {p}\rightarrow \mathrm {Z}/\gamma ^{*} \text {+X}) \, \mathcal {B}(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau })\) and \(\tau _\mathrm {h} \) ES, and (lower) the \(\tau _\mathrm {h} \) ES and the \(\tau _\mathrm {h} \) ID efficiency, at 68 and \(95\%\) confidence level (CL). The values of the \(\tau _\mathrm {h} \) ID efficiency and of \(\tau _\mathrm {h} \) ES are quoted in terms of scale factors (SF) relative to their standard model, MC expectation

Two-dimensional projections of \(-\,2 \ln \lambda \left( \xi \right) \), obtained when the \(\tau _\mathrm {h} \) ID efficiency and \(\tau _\mathrm {h} \) ES are left unconstrained in the fit, are shown in Fig. 13. Measured values of the \(\tau _\mathrm {h} \) ID efficiency and of \(\tau _\mathrm {h} \) ES are quoted as scale factors (SF) relative to their MC expectation. The values of \(\sigma (\mathrm {p}\mathrm {p}\rightarrow \mathrm {Z}/\gamma ^{*} \text {+X}) \, \mathcal {B}(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau })\), \(\tau _\mathrm {h} \) ID efficiency, and \(\tau _\mathrm {h} \) ES that minimize \(-\,2 \ln \lambda \left( \xi \right) \), yielding the best fit to the data, are indicated by a cross. Contours for which \(-\,2 \ln \lambda \left( \xi \right) \) exceeds its minimum value by 2.30 and 6.18 units, corresponding to coverage probabilities of 68 and \(95\%\) in the two-dimensional parameter plane, are also shown. The \(68\%\) CIs for the \(\tau _\mathrm {h} \) ID efficiency and \(\tau _\mathrm {h} \) ES are obtained as the values of the respective parameter for which \(-\,2 \ln \lambda \left( \xi \right) \) increases by one unit relative to its minimum. The measured SF for the \(\tau _\mathrm {h} \) ID efficiency and for \(\tau _\mathrm {h} \) ES amount to \(0.979 \pm 0.022\) and \(0.986 \pm 0.009\), respectively. Both SF are compatible with unity, indicating that the measured values of the \(\tau _\mathrm {h} \) ID efficiency and of the \(\tau _\mathrm {h} \) ES are in agreement with the MC expectation. The expected \(\tau _\mathrm {h} \) ID efficiency in the LHC data is documented in Ref. [75].
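The thresholds on \(-\,2 \ln \lambda \) used above follow from the chi-square distribution in the asymptotic approximation, with one degree of freedom for a single parameter and two for a contour in a parameter plane. The sketch below, whose function names are illustrative, evaluates the corresponding coverage probabilities in closed form. It shows that a threshold of 2.30 gives about 68.3% and that 6.18 gives about 95.45%, which the text rounds to \(95\%\); an exact \(95\%\) contour for two parameters would sit at about 5.99. The last lines express the distance of the two quoted scale factors from unity in units of their uncertainty.

```python
import math


def coverage_1d(delta_q):
    """Chi-square CDF for 1 degree of freedom, evaluated at delta_q."""
    return math.erf(math.sqrt(delta_q / 2.0))


def coverage_2d(delta_q):
    """Chi-square CDF for 2 degrees of freedom, evaluated at delta_q."""
    return 1.0 - math.exp(-delta_q / 2.0)


def threshold_2d(prob):
    """Increase in -2 ln(lambda) giving coverage `prob` for 2 parameters."""
    return -2.0 * math.log(1.0 - prob)


def pull_from_unity(sf, unc):
    """Distance of a scale factor from 1 in units of its uncertainty."""
    return (sf - 1.0) / unc


print(f"1 parameter,  delta = 1.00: {coverage_1d(1.00):.4f}")
print(f"2 parameters, delta = 2.30: {coverage_2d(2.30):.4f}")
print(f"2 parameters, delta = 6.18: {coverage_2d(6.18):.4f}")
print(f"2 parameters, exact 95% threshold: {threshold_2d(0.95):.2f}")

# Scale factors quoted in the text.
print(f"ID efficiency SF pull: {pull_from_unity(0.979, 0.022):.2f}")
print(f"ES SF pull:            {pull_from_unity(0.986, 0.009):.2f}")
```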

10 Summary

The cross section for inclusive Drell–Yan production of \(\mathrm {\tau }\) pairs has been measured using \(\mathrm {p}\mathrm {p}\) collisions recorded by the CMS experiment at \(\sqrt{s} = 13\hbox { TeV}\) at the LHC. The analysed data correspond to an integrated luminosity of \(2.3~\mathrm {fb}^{-1}\). The signal yield was determined in a global fit to the mass distributions in five \(\mathrm {\tau }\mathrm {\tau }\) decay channels: \(\mathrm {\tau }_{\mathrm {e}} \tau _\mathrm {h} \), \(\mathrm {\tau }_{\mathrm {\mu }} \tau _\mathrm {h} \), \(\tau _\mathrm {h} \tau _\mathrm {h} \), \(\mathrm {\tau }_{\mathrm {e}} \mathrm {\tau }_{\mathrm {\mu }} \), and \(\mathrm {\tau }_{\mathrm {\mu }} \mathrm {\tau }_{\mathrm {\mu }} \). The measured cross section times branching fraction \(\sigma (\mathrm {p}\mathrm {p}\rightarrow \mathrm {Z}/\gamma ^{*} \text {+X}) \, \mathcal {B}(\mathrm {Z}/\gamma ^{*} \rightarrow \mathrm {\tau }\mathrm {\tau }) = 1848 \pm 12\,\text {(stat)} \pm 57\,\text {(syst)} \pm 35\,\text {(lumi)} \hbox { pb}\) is in agreement with the standard model expectation, computed at next-to-next-to-leading order accuracy in perturbation theory. As a byproduct of the global fit, the efficiency for reconstructing and identifying the decays of \(\mathrm {\tau }\) leptons to hadrons (\(\mathrm {\tau }\rightarrow \text{ hadrons } + \nu _{\mathrm {\tau }}\)) and the \(\tau _\mathrm {h} \) energy scale have been determined. The results from data agree with Monte Carlo simulation within the uncertainties of the measurement, which amount to a relative uncertainty of \(2.2\%\) in the \(\tau _\mathrm {h} \) identification efficiency and of \(0.9\%\) in the energy scale.