Evidence for the 125 GeV Higgs boson decaying to a pair of τ leptons

Abstract
A search for a standard model Higgs boson decaying into a pair of τ leptons is performed using events recorded by the CMS experiment at the LHC in 2011 and 2012. The dataset corresponds to an integrated luminosity of 4.9 fb^−1 at a centre-of-mass energy of 7 TeV and 19.7 fb^−1 at 8 TeV. Each τ lepton decays hadronically or leptonically to an electron or a muon, leading to six different final states for the τ-lepton pair, all considered in this analysis. An excess of events is observed over the expected background contributions, with a local significance larger than 3 standard deviations for m_H values between 115 and 130 GeV. The best fit of the observed H → ττ signal cross section times branching fraction for m_H = 125 GeV is 0.78 ± 0.27 times the standard model expectation. These observations constitute evidence for the 125 GeV Higgs boson decaying to a pair of τ leptons.



Introduction
Elucidating the mechanism of electroweak symmetry breaking, through which the W and Z bosons become massive, is an important goal of the Large Hadron Collider (LHC) physics programme. In the standard model (SM) [1,2], electroweak symmetry breaking is achieved via the Brout-Englert-Higgs mechanism [3-8], which also predicts the existence of a scalar Higgs boson. On July 4, 2012, the discovery of a new boson with a mass around 125 GeV was announced at CERN by the ATLAS and CMS Collaborations [9-11]. The excess was most significant in the ZZ, γγ, and WW decay modes. The spin and CP properties of the new boson are compatible with those of the SM Higgs boson [12,13]. In the SM, the masses of the fermions are generated via the Yukawa couplings between the Higgs field and the fermionic fields. The measurement of these couplings is essential for identifying this boson as the SM Higgs boson. The ττ decay mode is the most promising because of the large event rate expected in the SM compared to the other leptonic decay modes and the smaller contribution from background events with respect to the bb̄ decay mode.
Searches for SM Higgs bosons decaying to a τ-lepton pair have been performed at the LEP, Tevatron, and LHC colliders. The collaborations at LEP have searched for associated ZH production and found no significant excess of events over the background expectation [14-17]. Dedicated searches in the ττ final state have been carried out at the Tevatron and at the LHC, placing upper limits on the Higgs boson production cross section times branching fraction, denoted as σ × B, at the 95% confidence level (CL). Using pp̄ collisions at √s = 1.96 TeV, the CDF Collaboration excluded values larger than 16 times (σ × B) [...] 125 GeV.
This paper reports on the results of a search for a SM Higgs boson using final states with a pair of τ leptons in proton-proton collisions at √s = 7 and 8 TeV at the LHC. We use the entire dataset collected in 2011 and 2012 by the CMS experiment, corresponding to an integrated luminosity of 4.9 fb^−1 at a centre-of-mass energy of 7 TeV and 19.7 fb^−1 at 8 TeV.
The paper is organized as follows. An overview of the analysis strategy is given in section 2, while the CMS detector, the event reconstruction, and the Monte Carlo (MC) simulation are described in section 3. The event selection is summarized in section 4, followed by the description of the reconstruction of the τ-lepton pair invariant mass in section 5 and the categorization of events in section 6. The background estimation is based on data control regions whenever possible and is explained in section 7. Finally, systematic uncertainties are summarized in section 8 and the results are presented in section 9.

Analysis overview
Throughout this paper, the symbol τ_h denotes the reconstructed hadronic decay of a τ lepton. The τ_h candidates are reconstructed in decay modes with one or three charged particles (see section 3). The symbol ℓ refers to an electron or a muon, and the symbol L to any kind of reconstructed charged lepton, namely electron, muon, or τ_h.
The main Higgs boson production mechanisms, shown in figure 1, lead to final states with a different number of charged leptons. For Higgs boson production through gluon-gluon fusion and vector boson fusion (VBF), final states with H → ττ decays contain only two charged leptons, defining the LL′ channels. All six τ-pair final states are studied: LL′ = µτ_h, eτ_h, τ_hτ_h, eµ, µµ, and ee. Sensitivity to the associated production with a W or a Z boson is achieved by requiring one or two additional electrons or muons compatible with leptonic decays of the W or Z boson. The four most sensitive final states are retained in the ℓ + Lτ_h channels aiming at the associated production with a W boson, ℓ + Lτ_h = µ + µτ_h, e + µτ_h/µ + eτ_h, µ + τ_hτ_h, and e + τ_hτ_h. In the ℓℓ + LL′ channels that target the associated production with a Z boson decaying to ℓℓ, the τ-pair final states µτ_h, eτ_h, eµ, and τ_hτ_h are considered, leading to eight channels in total. The ee and µµ τ-pair final states are excluded because the corresponding events are already used in the search for H → ZZ → 4ℓ [11].
To maximize the sensitivity of the analysis in the LL′ channels, events are classified in categories according to the number of jets in the final state, excluding the jets corresponding to the L and L′ leptons. The events are further classified according to a number of kinematic quantities that exhibit different distributions for signal and background events (see section 6). In particular, the contribution of the VBF production process is enhanced for events with two or more jets by requiring a large rapidity gap between the two jets with the highest transverse momentum. For the remaining events with at least one jet, requiring a large p_T of the reconstructed Higgs boson candidate increases the sensitivity to Higgs boson production through gluon-gluon fusion. A complete listing of all lepton final states and event categories is given in appendix B.
With the exception of the ℓ + Lτ_h, ee, and µµ channels, the signal is extracted from the distribution of the invariant mass of the τ-lepton pair, m_ττ, calculated from the L and L′ four-momenta and the missing transverse energy vector. In the ℓ + Lτ_h channels, the signal extraction is instead based on the invariant mass, m_vis, of the visible Lτ_h decay products because the missing transverse energy does not entirely arise from the neutrinos produced in the decay of the two τ leptons. In the ee and µµ channels, a discriminating variable combining a number of kinematic quantities and other observables, including m_ττ, is used.
The background composition depends on the channel and, in particular, on the number of electrons and muons in the final state. The Drell-Yan production of a Z boson decaying into a pair of τ leptons constitutes the main irreducible background in all LL′ channels. Another source of background with the same leptonic final state is the production of top-quark pairs (tt̄), which is most important in the eµ channel. Reducible background contributions include QCD multijet production, which is particularly relevant in the τ_hτ_h channel, and W(→ ℓν) + jets production with a jet misidentified as a τ_h in the ℓτ_h channels. In the ℓ + Lτ_h and ℓℓ + LL′ channels, diboson production is the largest irreducible background.
While the signal contribution is expected to be a pure sample of H → ττ decays in many of the channels considered, there is a significant contribution from H → WW decays in the ℓ + ℓ′τ_h and the ℓℓ + LL′ channels, and, most importantly, in the two-jet event samples of the eµ, ee, and µµ channels. The contribution from H → WW decays is treated as a background in the search for H → ττ decays. Given the discovery of a SM-like Higgs boson with a mass near 125 GeV, this contribution is taken from the expectation for a SM Higgs boson with m_H = 125 GeV. On the other hand, the presence of an H → WW contribution provides additional sensitivity to the coupling of the Higgs boson to vector bosons. Therefore, the H → WW contribution is treated as a signal process for the measurement of the fermionic and the bosonic couplings of the Higgs boson.

The CMS experiment
The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Within the volume of the superconducting solenoid are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter, and a brass/scintillator hadron calorimeter. The coverage of these calorimeters is complemented by extensive forward calorimetry. Muons are detected in gas-ionization chambers embedded in the steel flux return yoke outside the solenoid. The first level of the CMS trigger system (L1), composed of custom hardware processors, uses information from the calorimeters and muon detectors to select the most interesting events in a fixed time interval of less than 4 µs. The high-level trigger (HLT) processor farm further decreases the event rate from around 100 kHz to around 300 Hz, before data storage. A more detailed description of the CMS detector can be found in ref. [22].
The CMS experiment uses a right-handed coordinate system, with the origin at the nominal interaction point, the x axis pointing to the centre of the LHC, the y axis pointing up (perpendicular to the LHC plane), and the z axis along the anticlockwise-beam direction. The polar angle θ is measured from the positive z axis and the azimuthal angle φ is measured in the transverse (x, y) plane. The pseudorapidity is defined as η ≡ −ln[tan(θ/2)].
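As a purely illustrative sketch (not part of the CMS software), the pseudorapidity definition above and its inverse can be evaluated numerically as follows. The function names are hypothetical.

```python
import math


def pseudorapidity(theta: float) -> float:
    """Return eta = -ln[tan(theta/2)] for a polar angle theta in radians."""
    if not 0.0 < theta < math.pi:
        raise ValueError("theta must lie strictly between 0 and pi")
    return -math.log(math.tan(theta / 2.0))


def polar_angle(eta: float) -> float:
    """Invert the definition: theta = 2 * atan(exp(-eta))."""
    return 2.0 * math.atan(math.exp(-eta))


# A particle emitted perpendicular to the beam (theta = pi/2) has eta close to 0;
# small polar angles (close to the +z axis) correspond to large positive eta.
eta_perpendicular = pseudorapidity(math.pi / 2.0)
eta_forward = pseudorapidity(0.05)
```

With this convention, the jet acceptance |η| < 4.7 used later in the analysis corresponds to polar angles within roughly one degree of the beam axis being excluded.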
The number of inelastic proton-proton collisions occurring per LHC bunch crossing was, on average, 9 in 2011 and 21 in 2012. The tracking system is able to separate collision vertices as close as 0.5 mm along the beam direction [23]. For each vertex, the sum of the squared transverse momenta of all associated tracks is computed. The vertex for which this quantity is the largest is assumed to correspond to the hard-scattering process and is referred to as the primary vertex. The additional proton-proton collisions happening in the same bunch crossing are termed pileup (PU).
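The primary-vertex choice described above amounts to taking the vertex that maximizes the sum of squared track transverse momenta. A minimal sketch, assuming each vertex is represented simply by a list of track p_T values in GeV (a hypothetical data layout chosen for illustration, not the CMS event format):

```python
from typing import Dict, List


def sum_pt_squared(track_pts: List[float]) -> float:
    """Sum of squared transverse momenta of the tracks associated with a vertex."""
    return sum(pt * pt for pt in track_pts)


def choose_primary_vertex(vertices: Dict[str, List[float]]) -> str:
    """Return the label of the vertex with the largest sum of track pT^2."""
    if not vertices:
        raise ValueError("at least one vertex is required")
    return max(vertices, key=lambda label: sum_pt_squared(vertices[label]))


# Hypothetical event: a pileup vertex with many soft tracks and a
# hard-scattering vertex with fewer but harder tracks.
example_vertices = {
    "pileup": [1.0, 1.0, 1.0, 1.0],   # sum pT^2 = 4.0
    "hard": [3.0, 0.5],               # sum pT^2 = 9.25
}
primary = choose_primary_vertex(example_vertices)
```

Squaring the p_T favours vertices with a few energetic tracks over vertices with many soft ones, which is why the quantity is suited to picking out the hard-scattering process.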
A particle-flow (PF) algorithm [24-26] combines the information from the CMS subdetectors to identify and reconstruct the particles emerging from proton-proton collisions: charged hadrons, neutral hadrons, photons, muons, and electrons. These particles are then used to reconstruct the missing transverse energy vector E⃗_T^miss, the jets, the τ_h candidates, and to quantify the lepton isolation. Jets are reconstructed from all particles using the anti-k_T jet clustering algorithm implemented in FASTJET [27,28], with a distance parameter of 0.5. The jet energy scale is calibrated through correction factors that depend on the p_T and η of the jet [29]. Jets originating from the hadronization of b quarks are identified using the combined secondary vertex (CSV) algorithm [30], which exploits observables related to the long lifetime of b hadrons. The b-tagging efficiencies in simulation are corrected for differences between simulated and recorded events. Jets originating from PU are identified and rejected based on both vertex information and jet shape information [31]. All particles reconstructed in the event are used to determine the E⃗_T^miss (and its magnitude, E_T^miss) with a high, PU-independent, resolution [32] using a multivariate regression technique based on a boosted decision tree (BDT) [33].
Muons are identified with additional requirements on the quality of the track reconstruction and on the number of measurements in the tracker and the muon systems [34]. Electrons are identified with a multivariate discriminant combining several quantities describing the track quality, the shape of the energy deposits in the electromagnetic calorimeter, and the compatibility of the measurements from the tracker and the electromagnetic calorimeter [35]. The τ_h are reconstructed and identified using the "hadron-plus-strips" algorithm [36], which uses charged hadrons and photons to reconstruct the main decay modes of the τ lepton: one charged hadron, one charged hadron + photons, and three charged hadrons. Electrons and muons misidentified as τ_h are suppressed using dedicated criteria based on the consistency between the measurements in the tracker, the calorimeters, and the muon detectors. Figure 2 shows the resulting τ_h mass distribution reconstructed from the visible decay products, m_vis^τh, in the µτ_h channel after the baseline selection described in section 4, illustrating the different decay modes.
To reject non-prompt or misidentified leptons, the absolute lepton isolation is defined as
$$ I_L \equiv \sum_{\text{charged}} p_T + \max\left(0,\ \sum_{\text{neutral}} p_T + \sum_{\gamma} p_T - \frac{1}{2} \sum_{\text{charged, PU}} p_T\right). $$
In this expression, ∑_charged p_T is the scalar sum of the transverse momenta of the charged hadrons, electrons, and muons originating from the primary vertex and located in a cone of size ∆R = √((∆η)^2 + (∆φ)^2) = 0.4 centred on the lepton direction. The sums ∑_neutral p_T and ∑_γ p_T represent the same quantity for neutral hadrons and photons, respectively. In the case of τ_h, the particles used in the reconstruction of the τ_h are excluded from the sums. The contribution of pileup photons and neutral hadrons is estimated from the scalar sum of the transverse momenta of charged hadrons from pileup vertices in the cone, ∑_charged,PU p_T. This sum is multiplied by a factor of 1/2, which corresponds approximately to the ratio of neutral to charged hadron production in the hadronization process of inelastic proton-proton collisions, as estimated from simulation. The relative lepton isolation is defined as R_L ≡ I_L/p_T^L, where p_T^L is the lepton transverse momentum.
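The isolation sums described above can be sketched in a few lines. This is an illustration under stated assumptions, not the CMS implementation: the pileup-corrected neutral term is assumed to be clipped at zero, particles are represented by bare p_T values in GeV that have already been selected inside the cone, and all function names are hypothetical.

```python
import math
from typing import Sequence


def delta_r(eta1: float, phi1: float, eta2: float, phi2: float) -> float:
    """Cone distance sqrt(d_eta^2 + d_phi^2), with d_phi wrapped into [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def absolute_isolation(charged: Sequence[float],
                       neutral: Sequence[float],
                       photons: Sequence[float],
                       charged_pu: Sequence[float]) -> float:
    """Charged sum plus the pileup-corrected neutral sum (assumed clipped at zero)."""
    neutral_term = sum(neutral) + sum(photons) - 0.5 * sum(charged_pu)
    return sum(charged) + max(0.0, neutral_term)


def relative_isolation(abs_iso: float, lepton_pt: float) -> float:
    """R_L = I_L / pT of the lepton."""
    if lepton_pt <= 0.0:
        raise ValueError("lepton pT must be positive")
    return abs_iso / lepton_pt


# Hypothetical 40 GeV muon with a few nearby particles (values in GeV).
iso = absolute_isolation(charged=[1.0, 2.0], neutral=[1.5],
                         photons=[0.5], charged_pu=[2.0])
rel_iso = relative_isolation(iso, lepton_pt=40.0)
```

The factor 0.5 is the neutral-to-charged ratio quoted in the text; only particles with `delta_r` below 0.4 with respect to the lepton would enter the input lists.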
The signal event samples with a SM Higgs boson produced through gluon-gluon fusion or VBF are generated with POWHEG 1.0 [37-41], while PYTHIA 6.4 [42] is used for the production of a SM Higgs boson in association with a W or Z boson, or with a tt̄ pair. The MADGRAPH 5.1 [43] generator is used for Z + jets, W + jets, tt̄ + jets, and diboson production, and POWHEG for single-top-quark production. The POWHEG and MADGRAPH generators are interfaced with PYTHIA for parton shower and fragmentation. The PYTHIA parameters affecting the description of the underlying event are set to the Z2 tune for the 7 TeV samples and to the Z2* tune for the 8 TeV samples [44]. All generators are interfaced with TAUOLA [45] for the simulation of the τ-lepton decays. The Higgs boson p_T spectrum from POWHEG is reweighted to the spectrum obtained from a next-to-next-to-leading-order (NNLO) calculation using HRES [46]. The reweighting increases by about 3% the fraction of gluon-gluon fusion signal events with a Higgs boson mass of 125 GeV and p_T > 100 GeV. The various production cross sections and branching fractions for SM processes and their corresponding uncertainties are taken from references [...].
The presence of pileup interactions is incorporated by simulating additional proton-proton collisions with PYTHIA. All generated events are processed through a detailed simulation of the CMS detector based on GEANT4 [74] and are reconstructed with the same algorithms as for data. Simulated and recorded Z + jets events are compared to extract event weighting factors and energy correction factors for the various physics objects. These are then applied to all simulated events in order to minimize the remaining discrepancies with data. In particular, (i) a recoil correction is applied to the response and resolution of the components of the E_T^miss [32], (ii) energy correction factors are applied to the leptons, and (iii) simulated events are weighted by the ratio between the observed and expected lepton selection efficiencies, which can differ by a few percent.
Figure 2: Observed and predicted distributions for the visible τ_h mass, m_vis^τh, in the µτ_h channel after the baseline selection described in section 4. The yields predicted for the Z → ττ, Z → µµ, electroweak, tt̄, and QCD multijet background contributions correspond to the result of the final fit presented in section 9. The Z → ττ contribution is then split according to the decay mode reconstructed by the hadron-plus-strips algorithm as shown in the legend. The mass distribution of the τ_h built from one charged hadron and photons peaks near the mass of the intermediate ρ(770) resonance; the mass distribution of the τ_h built from three charged hadrons peaks around the mass of the intermediate a_1(1260) resonance. The τ_h built from one charged hadron and no photons are reconstructed with the π± mass, assigned to all charged hadrons by the PF algorithm, and constitute the main contribution to the third bin of this histogram. The first two bins correspond to τ± leptons decaying into e±νν and µ±νν, respectively, and for which the electron or muon is misidentified as a τ_h. The electroweak background contribution is dominated by W + jets production. In most selected W + jets, tt̄, and QCD multijet events, a jet is misidentified as a τ_h. The "bkg. uncertainty" band represents the combined statistical and systematic uncertainty in the background yield in each bin. The expected contribution from the SM Higgs signal is negligible.

Baseline event selection
Events are selected and classified in the various channels according to the number of selected electrons, muons, and τ_h candidates. The resulting event samples are independent. Using simulated event samples, the trigger and offline selection criteria have been optimized for each channel to maximize the sensitivity to a SM Higgs boson signal. These criteria are summarized in table 1 for the LL′ and ℓ + Lτ_h channels, and in table 2 for the ℓℓ + LL′ channels.
The HLT requires a combination of electron, muon, and τ_h trigger objects [34,35,75]. A specific version of the PF algorithm is used in the HLT to quantify the isolation of τ_h trigger objects as done in the offline reconstruction. Channels with two ℓ are based on a di-ℓ trigger. Channels with a single ℓ are based on an ℓτ_h trigger except for the µ + τ_hτ_h channel, which uses a single-muon trigger. The fully hadronic τ_hτ_h channel relies on di-τ_h and di-τ_h + jet triggers, implemented for the 8 TeV data-taking period. For these triggers, the reconstruction of the two τ_h trigger objects is seeded by objects from the L1 trigger system. These objects can either be two calorimeter jets with p_T > 64 GeV and |η| < 3.0 or two narrow calorimeter jets with p_T > 44 GeV and |η| < 2.17. The offline isolation requirements range within the values given in table 1 depending on the lepton flavour, p_T, and η. For the channels considered in this analysis, the efficiency to reconstruct and select offline a τ decaying hadronically ranges from 60 to 70%, with a jet misidentification probability around 1%. For τ_h candidates selected offline, the efficiency of the HLT selection plateaus at 90%. For the µτ_h channel, the muon p_T threshold was raised in 2012 to 20 GeV to cope with the increased instantaneous luminosity. For the same reason, the electron p_T threshold was raised to 24 GeV in the eτ_h channel. For the ℓℓ + LL′ channels, the selection proceeds by first identifying a Z boson candidate (Z → ℓℓ) with mass between 60 and 120 GeV from opposite-charge electron or muon pairs, and then a Higgs boson candidate (H → LL′) from the remaining leptons. With some variations among the channels, all leptons meet the minimum requirement that the distance of closest approach to the primary vertex satisfies d_z < 0.2 cm along the beam direction, and d_xy < 0.045 cm in the transverse plane. The two leptons assigned to the Higgs boson decay are required to be of opposite charge.
In the ℓτ_h channels, the large W + jets background is reduced by requiring
$$ m_T \equiv \sqrt{2\, p_T\, E_T^{\text{miss}}\, (1 - \cos\Delta\phi)} < 30~\text{GeV}, $$
where p_T is the ℓ transverse momentum and ∆φ is the difference in azimuthal angle between the ℓ direction and the E_T^miss. In the eµ channel, the tt̄ background is reduced using a BDT discriminant that makes use of kinematic variables related to the eµ system and the E_T^miss, the distance of closest approach between the leptons and the primary vertex, and the value of the CSV b-tagging discriminator for the leading jet with p_T > 20 GeV, if any.
Table 1: Lepton selection for the LL′ and ℓ + Lτ_h channels. The HLT requirement is defined by a combination of trigger objects with p_T over a given threshold. The p_T and I_τh thresholds are given in GeV. The indices 1 and 2 denote, respectively, the leptons with the highest and next-to-highest p_T. The definitions of the lepton isolation, R and I, are given in the text. For a number of channels, the isolation requirements depend on the lepton flavour, p_T, and η. Similarly, a range of p_T thresholds is given when the HLT requirements change with the data-taking period.
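The quantity built from the lepton p_T, the E_T^miss, and ∆φ in the W + jets rejection is the transverse mass. A minimal sketch, assuming the standard definition m_T = √(2 p_T E_T^miss (1 − cos ∆φ)); the function names are hypothetical and the threshold is passed in explicitly, with the value in the usage lines chosen purely for illustration:

```python
import math


def transverse_mass(lep_pt: float, lep_phi: float,
                    met: float, met_phi: float) -> float:
    """m_T of the lepton + missing-transverse-energy system (GeV)."""
    dphi = lep_phi - met_phi
    mt2 = 2.0 * lep_pt * met * (1.0 - math.cos(dphi))
    return math.sqrt(max(0.0, mt2))  # guard against tiny negative rounding


def passes_mt_requirement(lep_pt: float, lep_phi: float,
                          met: float, met_phi: float,
                          mt_max: float) -> bool:
    """True if the event has m_T below the chosen upper threshold."""
    return transverse_mass(lep_pt, lep_phi, met, met_phi) < mt_max


# W -> l nu like topology: lepton and E_T^miss back to back, large m_T.
mt_w_like = transverse_mass(40.0, 0.0, 40.0, math.pi)
# tau -> l nu nu like topology: neutrinos roughly collinear with the lepton.
mt_tau_like = transverse_mass(25.0, 0.0, 15.0, 0.2)
rejected = not passes_mt_requirement(40.0, 0.0, 40.0, math.pi, mt_max=30.0)
```

In W → ℓν events the neutrino tends to recoil against the lepton, giving m_T values up to the W mass, whereas in τ decays the neutrinos are close to the lepton direction and m_T stays small, which is why an upper cut separates the two.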

[Table 1 columns: Channel, HLT requirement, Lepton selection.]
Table 2: Lepton selection for the ℓℓ + LL′ channels. The HLT requirement is defined by a combination of trigger objects over a p_T threshold indicated in GeV. The p_T and I_τh thresholds are given in GeV. The indices 1 and 2 denote, respectively, the leptons with the highest and next-to-highest p_T.
[Table 2 columns: Resonance, HLT requirement, Lepton selection.]
In the ℓ + τ_hτ_h channels, the background from QCD multijet, W + jets, and Z + jets production is suppressed using a BDT discriminant based on the E_T^miss and on kinematic variables related to the τ_hτ_h system. With τ_h,1 and τ_h,2 denoting the τ_h with highest and second-highest p_T, respectively, these variables are [...]. For the chosen threshold on the BDT score, the signal efficiency is ∼60% whereas the efficiency for the reducible background components is ∼13%.
In the ℓ + ℓ′τ_h channels, the large background from Z and tt̄ production is strongly reduced by requiring the ℓ and ℓ′ leptons to have the same charge. For the 7 TeV dataset, the requirement L_T ≡ p_T^ℓ + p_T^ℓ′ + p_T^τh > 80 GeV is imposed to further suppress the reducible background components. For the 8 TeV dataset, the L_T variable is instead used to divide the data into two event categories, one with high L_T (≥130 GeV) and one with low L_T (<130 GeV). The Z + jets background in the ℓℓ + LL′ channels is reduced by selecting events with high L_T^LL′ ≡ p_T^L + p_T^L′. The requirements are L_T^µτh > 45 GeV for ℓℓ + µτ_h, L_T^eτh > 30 GeV for ℓℓ + eτ_h, L_T^τhτh > 70 GeV for ℓℓ + τ_hτ_h, and L_T^eµ > 25 GeV for ℓℓ + eµ.
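The L_T selection quoted above is a scalar sum of lepton transverse momenta followed by a threshold. The sketch below uses the 80 GeV and 130 GeV values stated in the text as defaults; the function names and the example momenta are hypothetical.

```python
def l_t(pt_lepton_1: float, pt_lepton_2: float, pt_tau_h: float) -> float:
    """Scalar sum of the transverse momenta of the two light leptons and the tau_h (GeV)."""
    return pt_lepton_1 + pt_lepton_2 + pt_tau_h


def passes_7tev_requirement(lt: float, threshold: float = 80.0) -> bool:
    """7 TeV dataset: events are kept only if L_T exceeds the threshold."""
    return lt > threshold


def lt_category_8tev(lt: float, threshold: float = 130.0) -> str:
    """8 TeV dataset: events are split into 'high' (>= threshold) and 'low' categories."""
    return "high" if lt >= threshold else "low"


# Two hypothetical events (all momenta in GeV).
lt_soft = l_t(30.0, 20.0, 45.0)   # 95 GeV
lt_hard = l_t(60.0, 30.0, 40.0)   # 130 GeV
```

The same pattern applies to the two-lepton sums used in the Z-associated channels, with the channel-dependent thresholds listed in the text.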

The τ-pair invariant-mass reconstruction
The visible mass, m_vis, of the LL′ system could be used to separate the H → ττ signal events from the Z → ττ events, which constitute an important irreducible background. However, the neutrinos from the τ-lepton decay can take away a large amount of energy, thereby limiting the separation power of the m_vis variable. In Z → ττ events and in H → ττ events where the Higgs boson is produced through gluon-gluon fusion, VBF, or in association with a Z boson, the τ-lepton decay is the only source of neutrinos. The SVFIT algorithm described below combines the E_T^miss with the L and L′ momenta to calculate a more precise estimator of the mass of the parent boson, the τ-pair invariant mass m_ττ.
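For orientation, the visible mass is simply the invariant mass of the summed four-momenta of the two reconstructed leptons. A minimal sketch, assuming the leptons are given as (p_T, η, φ, mass) in GeV and radians; the helper names are hypothetical and this is not the SVFIT algorithm itself:

```python
import math
from typing import Iterable, Tuple

FourVector = Tuple[float, float, float, float]  # (E, px, py, pz) in GeV


def four_vector(pt: float, eta: float, phi: float, mass: float = 0.0) -> FourVector:
    """Build (E, px, py, pz) from collider coordinates."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return (energy, px, py, pz)


def invariant_mass(vectors: Iterable[FourVector]) -> float:
    """Invariant mass of the sum of the given four-vectors."""
    e = px = py = pz = 0.0
    for v in vectors:
        e += v[0]
        px += v[1]
        py += v[2]
        pz += v[3]
    m2 = e * e - (px * px + py * py + pz * pz)
    return math.sqrt(max(0.0, m2))  # guard against tiny negative rounding


# Hypothetical mu + tau_h pair, back to back in the transverse plane.
muon = four_vector(45.0, 0.0, 0.0)
tau_h = four_vector(45.0, 0.0, math.pi)
m_vis = invariant_mass([muon, tau_h])
```

Because the neutrino momenta are left out of the sum, m_vis systematically underestimates the parent boson mass, which is the limitation the likelihood-based m_ττ estimator is designed to address.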
Six parameters are needed to specify a hadronic τ-lepton decay: the polar and azimuthal angles of the visible decay product system in the τ-lepton rest frame, the three boost parameters from the τ-lepton rest frame to the laboratory frame, and the invariant mass m_vis of the visible decay products. In the case of a leptonic τ decay two neutrinos are produced and the invariant mass of the two-neutrino system is the seventh parameter. The unknown parameters are constrained by four observables that are the components of the four-momentum of the system formed by the visible decay products of the τ lepton, measured in the laboratory frame. For each hadronic (leptonic) τ-lepton decay, 2 (3) parameters are thus left unconstrained. These parameters are chosen to be:
• x, the fraction of the τ-lepton energy in the laboratory frame carried by the visible decay products;
• φ, the azimuthal angle of the τ-lepton direction in the laboratory frame;
• m_νν, the invariant mass of the two-neutrino system in leptonic τ decays; for hadronic τ-lepton decays, we take m_νν ≡ 0 in the fit described below.
The two components E_x^miss and E_y^miss of the E_T^miss provide two further constraints, albeit each with an experimental resolution of 10-15 GeV [76].
The fact that the reconstruction of the τ-pair decay kinematics is underconstrained by the measured observables is addressed by a maximum likelihood fit method. The mass, m_ττ, is reconstructed by combining the measured observables E_x^miss and E_y^miss with a likelihood model that includes terms for the τ-lepton decay kinematics and the E_T^miss resolution. The likelihood function f(z, y, a_1, a_2) of the parameters z = (E_x^miss, E_y^miss) in an event is constructed, given that the unknown parameters specifying the kinematics of the two τ-lepton decays have values a_1 = (x_1, φ_1, m_νν,1) and a_2 = (x_2, φ_2, m_νν,2), and that the four-momenta of the visible decay products have the measured values y = (p_1^vis, p_2^vis). This likelihood model is used to compute the probability
$$ P(m_{\tau\tau}^{i}) = \int \delta\big(m_{\tau\tau}^{i} - m_{\tau\tau}(y, a_1, a_2)\big)\, f(z, y, a_1, a_2)\, \mathrm{d}a_1\, \mathrm{d}a_2 $$
as a function of the mass hypothesis m_ττ^i. The best estimate, m̂_ττ, for m_ττ is taken to be the value of m_ττ^i that maximizes P(m_ττ^i). The likelihood f(z, y, a_1, a_2) is the product of three likelihood functions: the first two functions model the decay parameters a_1 and a_2 of the two τ leptons, and the last one quantifies the compatibility of a τ-pair decay hypothesis with the measured E_T^miss. The likelihood functions modelling the τ-lepton decay kinematics are different for leptonic and hadronic τ-lepton decays. Matrix elements for unpolarized τ-lepton decays from ref. [77] are used to model the differential distributions in the leptonic decays, within the physically allowed region 0 ≤ x ≤ 1 and 0 ≤ m_νν ≤ m_τ √(1 − x). For hadronic τ-lepton decays, a model based on the two-body phase space [78] is used, treating all the visible decay products of the τ lepton as a single system, within the physically allowed region m_vis^2/m_τ^2 ≤ x ≤ 1.
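The physically allowed regions quoted above act as integration boundaries for the decay parameters. They can be sketched as simple predicates; the function names are hypothetical and the τ-lepton mass is set to an approximate value of 1.777 GeV, which is an assumption of this illustration rather than a number taken from the text.

```python
import math

M_TAU = 1.777  # GeV, approximate tau-lepton mass (assumed value for illustration)


def leptonic_allowed(x: float, m_nunu: float, m_tau: float = M_TAU) -> bool:
    """Leptonic decay: 0 <= x <= 1 and 0 <= m_nunu <= m_tau * sqrt(1 - x)."""
    if not 0.0 <= x <= 1.0:
        return False
    return 0.0 <= m_nunu <= m_tau * math.sqrt(1.0 - x)


def hadronic_allowed(x: float, m_vis: float, m_tau: float = M_TAU) -> bool:
    """Hadronic decay: m_vis^2 / m_tau^2 <= x <= 1 (with m_nunu fixed to 0)."""
    return (m_vis / m_tau) ** 2 <= x <= 1.0


# The larger the visible energy fraction x, the less phase space is left
# for the invariant mass of the two-neutrino system.
ok_soft = leptonic_allowed(x=0.5, m_nunu=1.0)    # limit is about 1.26 GeV
ok_hard = leptonic_allowed(x=0.9, m_nunu=1.0)    # limit is about 0.56 GeV
# A rho-like tau_h (m_vis about 0.77 GeV) needs x above about 0.19.
ok_rho = hadronic_allowed(x=0.3, m_vis=0.77)
```

In a numerical integration over a_1 and a_2, points outside these regions would simply contribute zero to P(m_ττ^i).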
It has been verified that the two-body phase space model is adequate for representing hadronic τ-lepton decays by comparing distributions generated by a parameterized MC simulation based on the two-body phase space model with results from the detailed simulation implemented in TAUOLA. The likelihood functions for hadronic (leptonic) τ-lepton decays do not depend on the parameters x, φ, and m_νν (x and φ). The dependence on x enters via the integration boundaries. The dependence on φ comes from the likelihood function L_ν, which quantifies the compatibility of a τ-lepton decay hypothesis with the reconstructed E_T^miss in an event, assuming the neutrinos from the τ-lepton decays to be the only source of missing transverse energy. This likelihood function is defined as
$$ L_\nu(E_x^{\text{miss}}, E_y^{\text{miss}}) = \frac{1}{2\pi\sqrt{|V|}} \exp\left[-\frac{1}{2} \begin{pmatrix} E_x^{\text{miss}} - \sum p_x^{\nu} \\ E_y^{\text{miss}} - \sum p_y^{\nu} \end{pmatrix}^{T} V^{-1} \begin{pmatrix} E_x^{\text{miss}} - \sum p_x^{\nu} \\ E_y^{\text{miss}} - \sum p_y^{\nu} \end{pmatrix}\right]. $$
In this expression, the sums run over the neutrinos from the two τ-lepton decays, and the expected E_T^miss resolution is represented by the covariance matrix V, estimated on an event-by-event basis using an E_T^miss significance algorithm [76]; |V| is the determinant of this matrix.
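The E_T^miss compatibility term can be illustrated with a two-dimensional Gaussian. The sketch below assumes L_ν has the usual bivariate normal form with covariance matrix V, evaluated on the residual between the measured E_T^miss components and the summed neutrino transverse momenta of a given decay hypothesis; the function name and the example numbers are hypothetical.

```python
import math
from typing import Tuple

Vec2 = Tuple[float, float]
Mat2 = Tuple[Tuple[float, float], Tuple[float, float]]


def met_likelihood(met: Vec2, neutrino_sum: Vec2, cov: Mat2) -> float:
    """Bivariate Gaussian density of the residual met - neutrino_sum with covariance cov."""
    rx = met[0] - neutrino_sum[0]
    ry = met[1] - neutrino_sum[1]
    (vxx, vxy), (vyx, vyy) = cov
    det = vxx * vyy - vxy * vyx
    if det <= 0.0:
        raise ValueError("covariance matrix must have a positive determinant")
    # r^T V^{-1} r written out explicitly for a 2x2 matrix.
    chi2 = (vyy * rx * rx - (vxy + vyx) * rx * ry + vxx * ry * ry) / det
    return math.exp(-0.5 * chi2) / (2.0 * math.pi * math.sqrt(det))


# Hypothetical event with a 10 GeV resolution on each E_T^miss component.
cov = ((100.0, 0.0), (0.0, 100.0))          # GeV^2
perfect = met_likelihood((20.0, -5.0), (20.0, -5.0), cov)
one_sigma = met_likelihood((30.0, -5.0), (20.0, -5.0), cov)
```

A hypothesis whose neutrino momenta reproduce the measured E_T^miss exactly receives the maximal weight, and the weight falls off with the residual measured in units of the event-by-event resolution.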
The relative m_ττ resolution achieved by the SVFIT algorithm is estimated from simulation and found to be about 10% in the τ_hτ_h decay channel, 15% in the ℓτ_h channels, and 20% in the ℓℓ′ channels. The resolution varies at the level of a few percent between the different event categories defined in section 6 because in some categories events with a boosted (i.e. high-p_T) Higgs boson candidate and thus a better E_T^miss resolution are selected. The m_ττ resolution for each channel and each category is listed in table 4 of appendix B. Figure 3 shows the normalized distributions of m_vis and m_ττ in the µτ_h channel after the baseline selection for simulated Z → ττ events and simulated SM Higgs boson events with m_H = 125 GeV. The SVFIT mass reconstruction allows for a better separation between signal and background than m_vis alone, yielding an improvement in the final expected significance of ∼40%.
In the case of Higgs boson production in association with a W boson, the neutrino from the W-boson decay is an additional source of E_T^miss. Therefore, in the ℓ + Lτ_h channels, the signal is extracted from the distribution of the visible mass, m_vis, of the Lτ_h system. In the ℓ + ℓ′τ_h channels, the visible mass is calculated from the τ_h and the electron or muon with smaller p_T.

Event categories
The event sample is split into mutually exclusive categories, defined to maximize the sensitivity of the analysis to the presence of a SM Higgs boson with a mass, m_H, between 110 and 145 GeV. The categories for the LL′ channels are schematically represented in figure 4 and described below.
In each channel, events are first classified according to the number of reconstructed jets with transverse momentum and pseudorapidity p_T^j > 30 GeV and |η_j| < 4.7, and a separation in (η, φ) space between the jet and all selected leptons of ∆R_jL > 0.5. In all categories, events containing at least one b-tagged jet with p_T^j > 20 GeV are rejected to reduce the tt̄ background. In the µτ_h, eτ_h, τ_hτ_h, and eµ channels, events with at least two jets are further required to pass a set of criteria targeting signal events where the Higgs boson is produced via VBF, i.e. in association with two jets separated by a large pseudorapidity gap. This "VBF tag" strongly suppresses the background, in particular the irreducible Z → ττ background. This background is suppressed for two main reasons: first, because the requirement of two high-p_T jets is effective in rejecting the gluon-initiated jets from initial state radiation in the Drell-Yan production process; second, because such jets are typically produced in the central region of the detector. The VBF-tagged category consists of events for which the two highest-p_T jets have a large invariant mass, m_jj, and a large separation in pseudorapidity, |∆η_jj|. A central-jet veto is applied by not allowing any additional jet in the pseudorapidity region delimited by the two highest-p_T jets. For the analysis of the 8 TeV data, the VBF-tagged category is further split into tight and loose sub-categories. In the ee and µµ channels, events with two jets or more are not required to fulfil any additional selection criteria, except for the central-jet veto. Instead, a multivariate discriminant involving m_jj and |∆η_jj| is used to extract the signal, as described in section 9.
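The jet counting, dijet observables, and central-jet veto described above can be sketched as follows. This is an illustration under stated assumptions: jets are treated as massless and given as (p_T, η, φ), the lepton-jet separation requirement is assumed to have been applied upstream, and the m_jj and |∆η_jj| thresholds are left as explicit arguments because their values are not quoted in this section. All function names are hypothetical.

```python
import math
from typing import List, Tuple

Jet = Tuple[float, float, float]  # (pt in GeV, eta, phi); treated as massless


def select_jets(jets: List[Jet], pt_min: float = 30.0,
                abs_eta_max: float = 4.7) -> List[Jet]:
    """Keep jets passing the pT and |eta| requirements, ordered by decreasing pT."""
    selected = [j for j in jets if j[0] > pt_min and abs(j[1]) < abs_eta_max]
    return sorted(selected, key=lambda j: j[0], reverse=True)


def dijet_mass(j1: Jet, j2: Jet) -> float:
    """Invariant mass of two massless jets: m^2 = 2 pT1 pT2 (cosh d_eta - cos d_phi)."""
    m2 = 2.0 * j1[0] * j2[0] * (math.cosh(j1[1] - j2[1]) - math.cos(j1[2] - j2[2]))
    return math.sqrt(max(0.0, m2))


def vbf_tag(jets: List[Jet], mjj_min: float, deta_min: float) -> bool:
    """Two leading jets with large m_jj and |d_eta|, and no extra jet between them."""
    selected = select_jets(jets)
    if len(selected) < 2:
        return False
    lead, sublead = selected[0], selected[1]
    eta_lo, eta_hi = sorted((lead[1], sublead[1]))
    if any(eta_lo < j[1] < eta_hi for j in selected[2:]):
        return False  # central-jet veto
    return dijet_mass(lead, sublead) > mjj_min and abs(lead[1] - sublead[1]) > deta_min


# Hypothetical event: two forward/backward jets plus one soft jet below threshold.
event_jets = [(80.0, 2.5, 0.0), (60.0, -2.0, 3.0), (20.0, 0.0, 1.0)]
tagged = vbf_tag(event_jets, mjj_min=500.0, deta_min=3.5)  # illustrative thresholds
```

Splitting the tag into tight and loose sub-categories, as done for the 8 TeV data, would correspond to calling `vbf_tag` with two different sets of thresholds.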
Events failing the VBF tag requirements, or the 2-jet category selection in the case of the ee and µµ channels, are collected in the 1-jet category if they contain at least one jet, and in the 0-jet category otherwise. The latter has low sensitivity to the presence of a SM Higgs boson and is mainly used to constrain the Z → ττ background for the more sensitive categories. The τ_hτ_h channel does not feature a 0-jet category because of the large background from QCD multijet events.
The 1-jet and 2-jet categories are further split according to the transverse momentum of the Higgs boson candidate, defined as
$$ p_T^{\tau\tau} = \left| \vec{p}_T^{\,L} + \vec{p}_T^{\,L'} + \vec{E}_T^{\text{miss}} \right|, $$
where p_T^L and p_T^L′ denote the transverse momentum vectors of the two leptons. The p_T^ττ variable is used to select sub-categories in which the Higgs boson candidate is boosted in the transverse plane. The m_ττ resolution is improved for such events and a better separation between the H → ττ signal and the Z → ττ background is achieved. This selection also has the advantage of reducing the QCD multijet background, which is especially large in the τ_hτ_h channel.
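The transverse momentum of the Higgs boson candidate can be sketched as the magnitude of a two-dimensional vector sum. The example assumes that p_T^ττ is built from the transverse momenta of the two leptons and the missing transverse energy vector, each given as (magnitude, azimuthal angle); the function name is hypothetical.

```python
import math
from typing import Tuple

PtPhi = Tuple[float, float]  # (magnitude in GeV, azimuthal angle in radians)


def higgs_candidate_pt(lepton_1: PtPhi, lepton_2: PtPhi, met: PtPhi) -> float:
    """Magnitude of the vector sum of both lepton pT vectors and the E_T^miss vector."""
    px = sum(pt * math.cos(phi) for pt, phi in (lepton_1, lepton_2, met))
    py = sum(pt * math.sin(phi) for pt, phi in (lepton_1, lepton_2, met))
    return math.hypot(px, py)


# Unboosted topology: leptons back to back and no E_T^miss, so the sum vanishes.
pt_unboosted = higgs_candidate_pt((30.0, 0.0), (30.0, math.pi), (0.0, 0.0))
# Boosted topology: both leptons and the E_T^miss in the same hemisphere.
pt_boosted = higgs_candidate_pt((30.0, 0.0), (40.0, math.pi / 2.0), (0.0, 0.0))
```

Events with a large value of this sum are those in which the candidate recoils against jets, which is the configuration exploited by the boosted sub-categories.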
The 0-jet and 1-jet categories are further divided into low and high p_T^L categories, where (i) L = τ_h in the ℓτ_h channels, (ii) L = µ in the eµ channel, and (iii) L is the highest-p_T lepton in the ee and µµ channels. For m_H > m_Z, higher-p_T leptons are produced in the H → ττ process than in the Z → ττ process. Selecting high-p_T leptons also reduces the contribution of background events in which a jet is misidentified as a lepton. Figure 5 demonstrates a good modelling of the p_T^ττ and p_T^τh distributions for the µτ_h channel, after the baseline selection. The electroweak background contribution includes events from W + jets, diboson, and single-top-quark production. The "bkg. uncertainty" band represents the combined statistical and systematic uncertainty in the background yield in each bin. In each plot, the bottom inset shows the ratio of the observed and predicted numbers of events. The expected contribution from the SM Higgs signal is negligible.
Figure 4: Event categories for the LL′ channels. The p_T^ττ variable is the transverse momentum of the Higgs boson candidate. In the definition of the VBF-tagged categories, |∆η_jj| is the difference in pseudorapidity between the two highest-p_T jets, and m_jj their invariant mass. In the µµ and ee channels, events with two or more jets are not required to fulfil any additional VBF tagging criteria. For the analysis of the 7 TeV eτ_h and µτ_h data, the loose and tight VBF-tagged categories are merged into a single VBF-tagged category. In the eτ_h channel, the E_T^miss is required to be larger than 30 GeV in the 1-jet category. Therefore, the high-p_T^τh category is not used and is accordingly crossed out. The term "baseline" refers to the baseline selection described in section 4.
In the 1-jet category of the eτ h channel, the background from Z → ee events in which an electron is misidentified as a τ h is reduced by requiring E miss T > 30 GeV. This extra requirement makes it difficult to predict the m ττ distribution for the Z → ττ background events in the 1-jet high-p τ h T category. This category, which has relatively low sensitivity, is therefore ignored in the eτ h channel.
For the 8 TeV dataset, events in the ℓ + ℓ′τ h channels are categorized according to L T with a threshold at 130 GeV. The ℓ + τ h τ h and ℓℓ + LL samples are not split into categories.

Background estimation
The estimation of the shape and yield of the major backgrounds in each channel is based on the observed data. The experimental systematic uncertainties affecting the background shapes and yields are thus directly related to the background estimation techniques and are also discussed in this section.
In the µτ h , eτ h , τ h τ h , and eµ channels, the largest source of background is the Drell-Yan production of Z → ττ. This contribution is greatly reduced by the 1-jet and VBF tag selection criteria as the jet-multiplicity distribution in Drell-Yan production falls off steeply. It is modelled using "embedded" event samples recorded in each data-taking period under a loose Z → µµ selection. In each event, the PF muons are replaced by the PF particles reconstructed from the visible decay products of the τ leptons in simulated Z → ττ events, before reconstructing the E miss T , the jets, the τ h candidates, and the lepton isolation. The Drell-Yan event yield is rescaled to the observed yield using the inclusive sample of Z → µµ events; thus, for this dominant background, the systematic uncertainties in the jet energy scale, the missing transverse energy, and the luminosity measurement are negligible. Additional uncertainties arise due to the extrapolation to the different event categories. These include uncertainties in the event reconstruction and acceptance of the "embedded" event samples that are estimated in simulated events as well as statistical uncertainties due to the limited number of events in these samples. In the eτ h and µτ h channels, the largest remaining systematic uncertainty affecting the Z → ττ background yield is due to the τ h selection efficiency. This uncertainty, which includes the uncertainty in the efficiency to trigger on a τ h , is estimated to be 8% in an independent study based on a tag-and-probe method [79] and, in addition, a µτ h event sample recorded with single-muon triggers.
The Drell-Yan production of Z → ℓℓ is the largest background in the ℓℓ channels. The Z → ℓℓ event yield is normalized to the data in each category after subtracting all backgrounds. In the eτ h channel, Z → ℓℓ production is also an important source of background because of the 2-3% probability for electrons to be misidentified as a τ h [36] and the fact that the reconstructed m ττ distribution peaks in the Higgs boson mass search range. Because of the lower µ → τ h misidentification rate, the Z → ℓℓ contribution in the µτ h channel is small. The contribution of this background is estimated from simulation in both channels, after rescaling the simulated Drell-Yan yield to the one derived from Z → µµ data. The dominant systematic uncertainty in the Z → ℓℓ background yields arises from the ℓ → τ h misidentification rate. This uncertainty is estimated using the tag-and-probe method with Z → ℓℓ event samples, and is found to be 20% for electrons and 30% for muons. The small contribution from Z → ℓℓ events in the µτ h , eτ h , and τ h τ h channels where a lepton is lost and a jet is misidentified as a τ h candidate is also estimated from simulation. Depending on the event category, the uncertainties range from 20% to 80%, including uncertainties in the jet → τ h misidentification rate and statistical uncertainties due to the limited number of simulated events.
The background from W + jets production contributes significantly to the eτ h and µτ h channels when the W boson decays leptonically and a jet is misidentified as a τ h . The background shape for these channels is modelled using the simulation. Figure 6 shows the observed and predicted m T distribution obtained in the 8 TeV µτ h channel after the baseline selection but without the m T < 30 GeV requirement. In each category, the W + jets background yield in a high-m T control region is normalized to the observed yield. The extrapolation factor to the low-m T signal region is obtained from the simulation and has an estimated systematic uncertainty of 10% to 25%, depending on the event category. The uncertainty is estimated by comparing the m T distribution in simulated and recorded Z(→ µµ) + jets events in which a reconstructed muon is removed from the event to emulate W + jets events. In the high-m T region of figure 6 the observed and predicted yields match by construction, and the agreement in shape indicates good modelling of the E miss T in the simulation. In the VBF-tagged categories, where the number of simulated W + jets events is small, smooth m ττ templates are obtained by loosening the VBF selection criteria. The m ττ bias introduced in doing so was found to be negligible in a much larger sample of events obtained by relaxing the m T selection. For the τ h τ h channel, a µτ h control sample is used to define the same categories as in the τ h τ h channel. In each of these categories, the W + jets background is normalized to the yield observed in the high-m T control region through a factor that is then used to scale the W + jets background in the τ h τ h sample, with a 30% systematic uncertainty.
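The W + jets normalization described above amounts to simple arithmetic, sketched below in Python. The sketch assumes that the other backgrounds are subtracted from the data in the high-m T control region before normalizing, which the text does not state explicitly. All function and argument names, and the numbers in the example, are hypothetical.

```python
def wjets_signal_region_yield(n_data_high_mt, n_other_bkg_high_mt,
                              n_w_sim_high_mt, n_w_sim_low_mt):
    """Estimate the W + jets yield in the low-mT signal region.

    Step 1: take the W + jets yield in the high-mT control region from
            data (assumption: other backgrounds are subtracted first).
    Step 2: multiply by the extrapolation factor low-mT / high-mT,
            taken from the simulated W + jets sample.
    """
    w_high_mt_from_data = n_data_high_mt - n_other_bkg_high_mt
    extrapolation_factor = n_w_sim_low_mt / n_w_sim_high_mt
    return w_high_mt_from_data * extrapolation_factor

# Hypothetical yields for one event category.
estimate = wjets_signal_region_yield(1000.0, 100.0, 450.0, 90.0)
print(f"W + jets yield in the signal region: {estimate:.1f}")
```

The 10% to 25% systematic uncertainty quoted in the text would apply to the extrapolation factor in this picture.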
The tt production process is one of the main backgrounds in the eµ channel. Its shape for all LL channels is predicted by the simulation, and the yield is adjusted to the one observed using a tt-enriched control sample, extracted by requiring b-tagged jets in the final state. The systematic uncertainty in the yield includes, among others, the systematic uncertainty in the b-tagging efficiency, which ranges from 1.5% to 7.4% depending on the b-tagged jet p T [30]. Furthermore, it is affected by systematic uncertainties in the jet energy scale, the E miss T scale, and the background yields in the control sample. Figure 7 shows a good agreement between the observed and predicted distributions for the number of jets after the baseline selection in the 8 TeV eµ analysis, in particular for events with three or more jets, for which the tt process dominates.

Figure caption (fragment): The electroweak background contribution includes events from W + jets, diboson, and single-top-quark production. The "bkg. uncertainty" band represents the combined statistical and systematic uncertainty in the background yield in each bin. The bottom inset shows the ratio of the observed and predicted numbers of events. The expected contribution from a SM Higgs signal is negligible.
QCD multijet events, in which one jet is misidentified as a τ h and another as an ℓ, constitute another important source of background in the ℓτ h channels. In the 0-jet and 1-jet low-p τ h T categories that have a high event yield, the QCD multijet background yield is obtained using a control sample where the ℓ and the τ h are required to have the same charge. In this control sample, the QCD multijet yield is obtained by subtracting from the data the contribution of the Drell-Yan, tt, and W + jets processes, estimated as explained above. The expected contribution of the QCD multijet background in the opposite-charge signal sample is then derived by rescaling the yield obtained in the same-charge control sample by a factor of 1.06, which is measured using a pure QCD multijet sample obtained by inverting the ℓ isolation requirement and relaxing the τ h isolation requirement. The 10% systematic uncertainty in this factor accounts for a small dependence on p τ h T and the statistical uncertainty in the measurement, and dominates the uncertainty in the yield of this background contribution. In the VBF-tagged and 1-jet high-p τ h T boosted categories, the number of events in the same-charge control sample is too small to use this procedure. Instead, the QCD multijet yield is obtained by multiplying the QCD multijet yield estimated after the baseline selection by the category selection efficiency. This efficiency is measured using a sample dominated by QCD multijet production in which the ℓ and the τ h are not isolated. The yield is affected by a 20% systematic uncertainty. In all categories, the m ττ template is obtained from a same-charge control region in which the ℓ isolation requirement is inverted. In addition, the VBF tagging and the τ h isolation criteria are relaxed in the VBF-tagged and 1-jet high-p τ h T boosted categories, respectively, to obtain a smooth template shape.
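The same-charge estimate for the high-yield categories can be written as a short Python sketch. The factor of 1.06 and its 10% uncertainty are taken from the text; the function name, argument names, and example yields are hypothetical, and only the uncertainty from the factor itself is propagated here.

```python
def qcd_opposite_charge_yield(n_data_ss, n_dy_ss, n_tt_ss, n_wjets_ss,
                              os_ss_factor=1.06, factor_rel_unc=0.10):
    """QCD multijet yield in the opposite-charge signal sample.

    The QCD yield in the same-charge (ss) control sample is the data
    minus the Drell-Yan, tt, and W + jets contributions. It is then
    rescaled by the opposite-charge / same-charge factor.
    Returns (yield, absolute uncertainty from the factor only).
    """
    qcd_ss = n_data_ss - n_dy_ss - n_tt_ss - n_wjets_ss
    qcd_os = os_ss_factor * qcd_ss
    return qcd_os, factor_rel_unc * qcd_os

# Hypothetical same-charge yields for one category.
central, unc = qcd_opposite_charge_yield(500.0, 50.0, 20.0, 130.0)
print(f"QCD multijet (opposite charge): {central:.1f} +/- {unc:.1f}")
```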
In the τ h τ h channel, the large QCD multijet background is estimated from a control region with a relaxed τ h isolation requirement, disjoint from the signal region. In this region, the QCD multijet background shape and yield are obtained by subtracting from the observed data the contribution of the Drell-Yan, tt, and W + jets processes, estimated as explained above. The QCD multijet background yield in the signal region is obtained by multiplying the yield in the control region by an extrapolation factor, obtained using identical signal region and control region definitions applied to a sample of same-charge τ h τ h events. Depending on the event category, the systematic uncertainty in this yield is estimated to range from 35% to 50%. The uncertainty includes contributions from the limited number of events in the control region and uncertainties in the expected yields of the subtracted background components. The QCD multijet background shape in the signal region is taken to be the same as in the control region with a relaxed τ h isolation requirement, an assumption that is verified by comparing the QCD multijet shapes obtained in the signal region and in the control region of the same-charge sample.
The small background due to W + jets and QCD multijet production in the eµ channel corresponds to events in which one or two jets are misidentified as leptons, and is denoted as the "misidentified-ℓ" background. A misidentified-ℓ control region is defined by requiring the ℓ to pass relaxed selection criteria, and to fail the nominal selection criteria. The expected contribution from processes with a dilepton final state is subtracted. The number of events N ℓ in the signal region in which a jet is misidentified as an ℓ is estimated as the yield in the misidentified-ℓ control region multiplied by the ratio between the yields measured in the signal and control regions using a multijet sample. The procedure is applied separately for electrons and muons, leading to the estimation of N e and N µ . The number of events for which two jets are misidentified as an electron and a muon, N eµ , is estimated from a control region in which both the electron and the muon pass the relaxed selection criteria and fail the nominal selection criteria. The background yield in the signal region is then estimated as N e + N µ − N eµ . In the 0-jet and 1-jet low-p τ h T categories, the m ττ template is taken from a same-charge control region with inverted electron isolation requirement. In the 1-jet high-p τ h T and VBF-tagged categories, the number of events in this control region is too small and the template is instead taken from the opposite-charge misidentified-electron control region.
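The combination N e + N µ − N eµ can be sketched in Python as follows. The subtraction of N eµ avoids counting twice the events in which both leptons are misidentified, since such events enter both N e and N µ . The text does not specify how the doubly misidentified control region is scaled; the sketch assumes the product of the two single-lepton ratios, and all names and numbers are hypothetical.

```python
def misidentified_lepton_yield(cr_e, dilepton_e, ratio_e,
                               cr_mu, dilepton_mu, ratio_mu,
                               cr_both):
    """Signal-region yield of the misidentified-lepton background.

    cr_e, cr_mu   : data yields in the single-lepton control regions
    dilepton_*    : expected genuine dilepton contributions to subtract
    ratio_*       : signal-region / control-region ratios from a
                    multijet sample
    cr_both       : data yield where both leptons fail the nominal
                    selection (scaled by ratio_e * ratio_mu, which is
                    an assumption of this sketch)
    """
    n_e = ratio_e * (cr_e - dilepton_e)
    n_mu = ratio_mu * (cr_mu - dilepton_mu)
    n_emu = ratio_e * ratio_mu * cr_both
    # Doubly misidentified events are contained in both n_e and n_mu.
    return n_e + n_mu - n_emu

# Hypothetical control-region yields and ratios.
print(misidentified_lepton_yield(200.0, 50.0, 0.2, 120.0, 20.0, 0.1, 100.0))
```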
The small contributions from diboson and single-top-quark production in the LL channels are taken from simulation. For m H = 125 GeV, the contribution from H → WW decays amounts to up to 45% of the expected SM Higgs boson signal in the VBF-tagged categories in the eµ channel, and to up to 60% of the complete expected SM Higgs boson signal in the 2-jet categories of the ee and µµ channels. In all other LL channels, the contribution from H → WW decays is negligible.
In the ℓ + Lτ h channels, the irreducible background is due to WZ and ZZ production, while the reducible background comes from QCD multijet, W + jets, Z + jets, W + γ, Z + γ, and tt production. In the ℓℓ + LL channels, ZZ production and tt production in association with a Z boson constitute the sources of irreducible background; the reducible background comes from WZ + jets, Z + jets, and tt production. Events from reducible background sources contain at least one jet misidentified as a lepton. For each channel, the reducible background contribution is estimated from sidebands in which one or more lepton candidates satisfy relaxed selection criteria but do not satisfy the nominal selection criteria. The number of reducible background events for which all leptons satisfy the nominal selection criteria is obtained by weighting the sideband events according to the probability for a lepton passing the relaxed selection criteria to also pass the nominal selection criteria. These misidentification probabilities are measured in independent control samples of QCD multijet, W(→ ℓν) + jets, and Z(→ ℓℓ) + jets events with one lepton passing the relaxed selection criteria in addition to the well-identified leptons corresponding to the decay of the W or Z boson, if any. The misidentification probabilities are parameterized as a function of the lepton p T , the number of jets in the event, and, in case a jet is found close to the lepton, the jet p T . To obtain a smooth template shape, the isolation criteria for the leptons associated to the Higgs boson decay are relaxed in the ℓℓ + LL channels. The systematic uncertainties in the event yield of the reducible background components range from 15 to 30%. They include contributions from the limited number of events in the sideband, uncertainties in the estimation of the misidentification probabilities, and uncertainties in the background composition in the sidebands.
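The sideband weighting can be illustrated with a minimal Python sketch for the case of a single lepton candidate failing the nominal selection. If f is the probability for a lepton passing the relaxed criteria to also pass the nominal ones, a sideband of relaxed-but-not-nominal leptons corresponds to a fraction 1 − f, so each sideband event is weighted by f/(1 − f). The parameterization below is a toy stand-in with invented numbers, not the measured one, and all names are hypothetical.

```python
def sideband_weight(fake_prob):
    """Weight for one sideband event with a single failing lepton."""
    if not 0.0 <= fake_prob < 1.0:
        raise ValueError("misidentification probability must be in [0, 1)")
    return fake_prob / (1.0 - fake_prob)

def toy_fake_prob(event):
    """Toy parameterization in lepton pT and jet multiplicity.
    The numerical values are invented for illustration."""
    base = 0.10 if event["lepton_pt"] < 30.0 else 0.20
    return base + 0.02 * min(event["n_jets"], 2)

def reducible_yield(sideband_events, fake_prob_of=toy_fake_prob):
    """Sum of per-event weights over the sideband."""
    return sum(sideband_weight(fake_prob_of(ev)) for ev in sideband_events)

sideband = [
    {"lepton_pt": 25.0, "n_jets": 0},
    {"lepton_pt": 45.0, "n_jets": 1},
    {"lepton_pt": 60.0, "n_jets": 3},
]
print(f"Estimated reducible background: {reducible_yield(sideband):.3f}")
```

Events with more than one failing lepton would need a product of such weights with the appropriate sign; that case is left out of the sketch.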

Systematic uncertainties
The values of a number of imprecisely known quantities can affect the rates and shapes of the m ττ distributions for the signal and background processes. These systematic uncertainties can be grouped into theory-related uncertainties, which are predominantly relevant for the expected signal yields, and uncertainties from experimental sources, which can further be subdivided into uncertainties related to the reconstruction of physics objects and uncertainties in the background estimation. The uncertainties related to the reconstruction of physics objects apply to processes estimated with simulated samples, most importantly the signal processes. As outlined in the previous section, the distributions for several background processes are estimated from data, and the corresponding systematic uncertainties are therefore mostly uncorrelated with the ones affecting the signal distributions.
The main experimental uncertainties in the decay channels involving a τ h are related to the reconstruction of this object. The τ h energy scale is obtained from a template fit to the τ h mass distribution, such as the one shown in figure 2. In this fit, the shape of the mass distribution is morphed as a function of the τ h energy scale parameter. The uncertainty of ±3% in the energy scale of each τ h affects both the shape and the rate of the relevant signal and background distributions in each category. The τ h identification and trigger efficiencies for genuine τ leptons sum up to an overall rate uncertainty of 6 to 10% per τ h , depending on the decay channel, due to the different trigger and rejection criteria and additional uncertainties for higher p τ h T . For Z → ℓℓ events where jets, muons, or electrons are misidentified as τ h , the estimation of the τ h identification efficiency leads to rate uncertainties of 20 to 80%, including the statistical uncertainty due to the limited number of simulated events.
In the decay channels with muons or electrons, the uncertainties in the muon and electron identification, isolation, and trigger efficiencies lead to rate uncertainties of 2 to 4% for muons and 2 to 6% for electrons. The uncertainty in the electron energy scale is relevant only in the eµ and ee channels, where it affects the normalization and shape of the simulated m ττ and final discriminant distributions. The uncertainty in the muon energy scale is found to have a negligible effect for all channels. The relative E miss T scale uncertainty of 5% affects the event yields for all channels making use of the E miss T in the event selection, in particular for the ℓτ h channels due to the m T selection [32]. This translates into yield uncertainties of 1 to 12%, depending on the channel and the event category. The uncertainties are largest for event categories with a minimum E miss T requirement and for background contributions with no physical source of E miss T , e.g. the Z → ee contribution in the high-p τ h T boosted category in the eτ h channel. The uncertainty in the jet energy scale varies with jet p T and jet η [29] and leads to rate uncertainties for the signal contributions of up to 20% in the VBF-tagged categories. For the most important background samples, the effect on the rate is, however, well below 5%. Because of the veto of events with b-tagged jets, uncertainties in the tagging efficiency for b-quark jets and in the mistagging efficiency for c-quark, light-flavour, and gluon jets result in rate uncertainties of up to 8% for the different signal and background components. The uncertainty in the integrated luminosity amounts to 2.2% for the 7 TeV analysis [80] and 2.6% for the 8 TeV analysis [81], yielding corresponding rate uncertainties for the affected signal and background samples.
The uncertainties related to the estimation of the different background processes are discussed in detail in the previous section, and only a summary is given here. For the different Drell-Yan decay modes, the uncertainty in the inclusive Z → ττ yield is 3%, with additional extrapolation uncertainties in the different categories in the range of 2 to 14%. The uncertainties in the W + jets event yields estimated from data are in the range of 10-100%. The values are dominated by the statistical uncertainties involved in the extrapolation from high to low m T and due to the limited number of data events in the high-m T control region. As a consequence, they are treated as uncorrelated with any other uncertainty. The QCD multijet background estimation results in 6 to 35% rate uncertainties for the LL channels, except for the very pure dimuon final state and the VBF-tagged categories where uncertainties of up to 100% are estimated. Additional shape uncertainties are included in the eτ h , µτ h , and eµ channels to account for the uncertainty in the shape estimation from the control regions.
The rate and acceptance uncertainties for the signal processes related to the theoretical calculations are due to uncertainties in the parton distribution functions (PDF), variations of the renormalization and factorization scales, and uncertainties in the modelling of the underlying event and parton showers. The magnitude of the rate uncertainty depends on the production process and on the event category. In the VBF-tagged categories, the theoretical uncertainties concerning the qq → H process are 4% from the PDFs and 3% from scale variations. The rate and acceptance uncertainties in the gg → H process in the VBF-tagged categories are estimated by comparing the four different MC generators POWHEG, MADGRAPH, POWHEG interfaced with MINLO [82], and aMC@NLO [83]. They amount to 30% and thus become of similar absolute size as the uncertainties in the qq → H process.
For the gg → H process, additional uncertainties are incorporated to account for missing higher-order corrections ranging from 10 to 41% depending on the category and on the decay channel. The combined systematic uncertainty in the background yield arising from diboson and single-top-quark production processes is estimated to be 15% for the LL channels based on recent CMS measurements [84,85]. In the ℓ + Lτ h and ℓℓ + LL channels, the uncertainties in the event yields of WZ production and ZZ production arise from scale variations and uncertainties in the PDFs, including the PDF uncertainties in the gg → ZZ event yields which are 44%. The resulting overall uncertainties range from 4 to 8%. The uncertainties in the small background from tt + Z production in the ℓℓ + LL channels amount to 50% [86].
In addition, uncertainties due to the limited number of simulated events or due to the limited number of events in data control regions are taken into account. These uncertainties are uncorrelated across bins in the individual templates. A summary of the considered systematic uncertainties is given in table 3.

Results
The search for an excess of SM Higgs boson events over the expected background involves a global maximum likelihood fit based on final discriminating variables which are either m ττ or m vis in all channels except for ee and µµ [87,88]. In these two channels the final discriminating variable D is built for a given event from the output of two boosted decision trees B 1 and B 2 . The two BDTs are based on kinematic variables related to the ℓℓ system and the E miss T , on the distance of closest approach between the leptons, and, in the 2-jet category, the m jj and |∆η jj | variables. The first BDT is trained to discriminate Z → ττ from Z → ℓℓ events, whereas the second BDT is trained to discriminate H → ττ from Z → ττ events. Both BDTs are separately trained in the 2-jet category and in the combined 0-jet and 1-jet categories. The final discriminant is defined as

$$D = \int_{-\infty}^{B_1} \int_{-\infty}^{B_2} f_{\mathrm{sig}}(B_1', B_2')\, \mathrm{d}B_1'\, \mathrm{d}B_2'.$$

In this expression, f sig is the two-dimensional joint probability density for the signal. Therefore, D represents the probability for a signal event to have a value lower than B 1 for the first BDT and B 2 for the second BDT.

For the global fit, the distributions of the final discriminating variable obtained for each category and each channel at 7 and 8 TeV are combined in a binned likelihood, involving the expected and observed numbers of events in each bin. The expected number of signal events is the one predicted by the SM for the production of a Higgs boson of mass m H decaying into a pair of τ leptons, multiplied by a signal strength modifier µ treated as a free parameter in the fit.
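The role of the signal strength modifier µ in a binned likelihood can be illustrated with a small Python sketch. It is a simplified stand-in that assumes a product of Poisson terms with expectation µ·s + b in each bin and omits all nuisance parameters, so it is not the fit used in the analysis. The function names, the search range for µ, and the toy inputs are hypothetical.

```python
import math

def binned_nll(mu, observed, signal, background):
    """Negative log-likelihood of a product of Poisson terms, dropping
    the constant log(n!) term. Expectation per bin: mu * s + b."""
    total = 0.0
    for n, s, b in zip(observed, signal, background):
        expected = mu * s + b
        total += expected - n * math.log(expected)
    return total

def fit_signal_strength(observed, signal, background,
                        lo=0.0, hi=5.0, tol=1e-7):
    """Minimize the NLL in mu with a golden-section search.
    The NLL is convex in mu, so the search finds the global minimum
    within [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if binned_nll(c, observed, signal, background) < \
                binned_nll(d, observed, signal, background):
            b = d
        else:
            a = c
    return 0.5 * (a + b)

# Toy templates: pseudo-data built with a signal strength of 0.5.
sig = [2.0, 6.0, 3.0]
bkg = [100.0, 80.0, 60.0]
obs = [b_i + 0.5 * s_i for s_i, b_i in zip(sig, bkg)]
print(f"best-fit mu = {fit_signal_strength(obs, sig, bkg):.3f}")
```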
The systematic uncertainties are represented by nuisance parameters that are varied in the fit according to their probability density function. A log-normal probability density function is assumed for the nuisance parameters affecting the event yields of the various background contributions. Systematic uncertainties that affect the template shapes, e.g. the τ h energy scale uncertainty, are represented by nuisance parameters whose variation results in a continuous perturbation of the spectrum [89] and which are assumed to have a Gaussian probability density function.
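A log-normal yield uncertainty is often implemented by scaling the nominal yield with κ^θ, where κ = 1 + (relative uncertainty) and θ is a nuisance parameter with a unit Gaussian constraint. The Python sketch below follows that common convention, which is an assumption here since the text does not spell out the parameterization. All names are hypothetical.

```python
def lognormal_scaled_yield(nominal, rel_unc, theta):
    """Yield after shifting a log-normal nuisance parameter.

    theta = 0 gives the nominal yield; theta = +1 / -1 correspond to
    one-standard-deviation shifts. The yield stays positive for any
    theta, unlike a Gaussian-constrained multiplicative shift.
    """
    kappa = 1.0 + rel_unc
    return nominal * kappa ** theta

def constraint_penalty(theta):
    """Contribution of the unit Gaussian constraint to -ln L."""
    return 0.5 * theta * theta

# Example: an 8% yield uncertainty shifted by one standard deviation.
for theta in (-1.0, 0.0, 1.0):
    shifted = lognormal_scaled_yield(100.0, 0.08, theta)
    print(f"theta = {theta:+.0f}: yield = {shifted:.2f}")
```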
Nuisance parameters affect the yields and template shapes across categories and channels when applicable. For example, in the VBF-tagged categories of the µτ h channel, the most important nuisance parameters related to background normalization are the ones affecting the Z → ττ yield (τ h selection efficiency) and the W + jets yield (statistical uncertainty for the normalization to the yield in the high-m T region, and extrapolation to the low-m T region). The nuisance parameter affecting the W + jets yield is constrained only by the events observed in the given category, in particular in the high-mass region. The nuisance parameters related to the τ h identification efficiency and to the τ h energy scale are, however, mostly constrained by the 0-jet and 1-jet categories, for which the number of events in the Z → ττ peak is very large. Overall, the statistical uncertainty in the observed event yields is the dominant source of uncertainty for all combined results.
The excess of events observed in the most sensitive categories of figures 8 and 9 is highlighted in figure 11, which shows the observed and expected m ττ distributions for all categories of the ℓτ h , eµ, and τ h τ h channels combined. The ee and µµ channels are not included because the final discrimination is based on D instead of m ττ in these channels. The distributions are weighted in each category of each channel by the S/(S + B) ratio where S is the expected signal yield for a SM Higgs boson with m H = 125 GeV (µ = 1) and B is the predicted background yield corresponding to the result of the global fit. The ratio is obtained in the central m ττ interval containing 68% of the signal events. The figure also shows the difference between the observed data and expected background distributions, together with the expected distribution for a SM Higgs boson signal with m H = 125 GeV.
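The S/(S + B) weighting used to combine the m ττ distributions can be sketched in Python. Each category contributes its histogram scaled by one weight, computed from the signal and background yields in the central signal window. The dictionary layout, the function names, and the numbers are hypothetical, and all histograms are assumed to share the same binning.

```python
def category_weight(s_window, b_window):
    """S/(S + B) from yields in the central window containing 68% of
    the signal events."""
    return s_window / (s_window + b_window)

def combine_weighted(categories):
    """Bin-by-bin sum of category histograms, each scaled by its
    S/(S + B) weight."""
    n_bins = len(categories[0]["hist"])
    combined = [0.0] * n_bins
    for cat in categories:
        w = category_weight(cat["s"], cat["b"])
        for i, content in enumerate(cat["hist"]):
            combined[i] += w * content
    return combined

# Two toy categories: one with low purity, one with higher purity.
toy = [
    {"hist": [10.0, 20.0], "s": 1.0, "b": 9.0},   # weight 0.1
    {"hist": [4.0, 8.0], "s": 1.0, "b": 1.0},     # weight 0.5
]
print(combine_weighted(toy))
```

The same weights would be applied to the observed data, the background prediction, and the signal prediction so that their comparison stays meaningful after weighting.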
The visible excess in the weighted m ττ distribution is quantified by calculating the corresponding local p-values for the LL channels using a profile-likelihood ratio test statistic [87,88]. Figure 12 shows the distribution of local p-values and significances as a function of the Higgs boson mass hypothesis. The expected significance for a SM Higgs boson with m H = 125 GeV is 3.6 standard deviations. For m H between 110 and 130 GeV, the observed significance is larger than three standard deviations, and equals 3.4 standard deviations for m H = 125 GeV. The corresponding best-fit value for µ is µ̂ = 0.86 ± 0.29 at m H = 125 GeV.
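Local p-values and significances are related through the tail probability of a standard normal distribution. The Python sketch below assumes the one-sided convention that is customary in collider searches; the text does not state the convention explicitly, and the function names are hypothetical.

```python
from statistics import NormalDist

_STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def pvalue_from_significance(z):
    """One-sided tail probability above z standard deviations."""
    return _STANDARD_NORMAL.cdf(-z)

def significance_from_pvalue(p):
    """Number of standard deviations corresponding to a one-sided
    p-value. Uses -inv_cdf(p) rather than inv_cdf(1 - p) to keep
    precision for small p."""
    return -_STANDARD_NORMAL.inv_cdf(p)

for z in (3.0, 3.2, 3.4):
    print(f"{z:.1f} standard deviations -> p = {pvalue_from_significance(z):.2e}")
```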
The m vis or m ττ distributions obtained for the 8 TeV dataset in the ℓ + Lτ h and ℓℓ + LL channels are shown in figure 13. Because of the small number of expected events in each event category, different event categories are combined. The complete set of distributions is presented in appendix A, and the event yields for the individual event categories are given in table 6 in appendix B.
The following results include all decay channels considered. Figure 14 shows the observed 95% CL upper limit obtained using the modified frequentist construction CL s [90,91] together with the expected limit obtained for the background-only hypothesis for Higgs boson mass hypotheses ranging from 90 to 145 GeV. The background-only hypothesis includes the expected contribution from H → WW decays for m H = 125 GeV. The difference between evaluating this contribution at m H = 125 GeV or at the corresponding m H value for m H ≠ 125 GeV is less than 5%. An excess is visible in the observed limit with respect to the limit expected for the background-only hypothesis. The observed limit is compatible with the expected limit obtained in the signal-plus-background hypothesis for a SM Higgs boson with m H = 125 GeV (figure 14 right). The excess is quantified in figure 15 which shows the local p-value as a function of m H . For m H = 125 GeV, the expected p-value is smallest, corresponding to a significance of 3.7 standard deviations. The expected p-value is slightly smaller when including the ℓ + Lτ h and ℓℓ + LL channels. The observed p-value is minimal for m H = 120 GeV with a significance of 3.3 standard deviations. The observed significance is larger than three standard deviations for m H between 115 and 130 GeV, and is equal to 3.2 standard deviations for m H = 125 GeV.

Figure 11: Combined observed and predicted m ττ distributions for the µτ h , eτ h , τ h τ h , and eµ channels. The normalization of the predicted background distributions corresponds to the result of the global fit. The signal distribution, on the other hand, is normalized to the SM prediction (µ = 1). The distributions obtained in each category of each channel are weighted by the ratio between the expected signal and signal-plus-background yields in the category, obtained in the central m ττ interval containing 68% of the signal events. The inset shows the corresponding difference between the observed data and expected background distributions, together with the signal distribution for a SM Higgs boson at m H = 125 GeV. The distribution from SM Higgs boson events in the WW decay channel does not significantly contribute to this plot.
The best-fit value for µ, combining all channels, is µ̂ = 0.78 ± 0.27 at m H = 125 GeV. Figure 16 shows the results of the fits performed in each decay channel for all categories, and in each category for all decay channels. These compatibility tests do not constitute measurements of any physical parameter per se, but rather show the consistency of the various observations with the expectation for a SM Higgs boson with m H = 125 GeV. The uncertainties of the individual µ values in the 1-jet and 2-jet (VBF-tagged) categories are of similar size, showing that both contribute about equally to the sensitivity of the analysis. The fraction of signal events from VBF production in the 1-jet categories and of signal events produced via gluon-gluon fusion in the 2-jet (VBF-tagged) categories are each of the order of 20 to 30%; hence, it is not possible to fully disentangle the two production modes.
The combined distribution of the decimal logarithm log(S/(S + B)) obtained in each bin of the final discriminating variables for all event categories and channels is shown in figure 17. Here, S denotes the expected signal yield for a SM Higgs boson with m H = 125 GeV (µ = 1) and B denotes the expected background yield in a given bin. The plot illustrates the contribution from the different event categories that are sensitive to the different Higgs boson production mechanisms. In addition, it provides a visualization of the observed excess of data events over the background expectation in the region of high S/(S + B).

Additional contributions to the overall uncertainty of the mass measurement arise due to uncertainties in the absolute energy scale and its variation with p T of 1 to 2% for τ h candidates, electrons, muons, and the E miss T , summing up to an uncertainty of 4 GeV. Given the coarse m H granularity, a parabolic fit is performed to −2∆ ln L values below 4. The combined measured mass of the Higgs boson is m H = 122 ± 7 GeV.

Figure 18 right shows a likelihood scan in the two-dimensional (κ V , κ f ) parameter space for m H = 125 GeV. The κ V and κ f parameters quantify the ratio between the measured and the SM value for the coupling of the Higgs boson to vector bosons and fermions, respectively [49]. To consistently measure deviations of the fermionic and the bosonic couplings of the Higgs boson, the H → WW contribution is considered as a signal process in this likelihood scan. For the VBF production of a Higgs boson that decays to a WW pair, the bosonic coupling enters both in the production and in the decay, thus providing sensitivity to the bosonic coupling despite the small expected event rates. All nuisance parameters are profiled for each point in the parameter space. The observed likelihood contour is consistent with the SM expectation of κ V = κ f = 1.
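The parabolic fit to the −2∆ ln L scan mentioned for the mass measurement can be sketched in Python. The sketch performs an unweighted least-squares fit of a parabola to the scan points below 4, takes the minimum as the mass estimate, and takes the half-width at which the parabola rises by one unit as the 68% uncertainty. The unweighted fit and the toy scan values are assumptions of this example; only the threshold of 4 comes from the text.

```python
import math

def solve_linear(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small system."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    sol = [0.0] * n
    for r in reversed(range(n)):
        acc = aug[r][n] - sum(aug[r][k] * sol[k] for k in range(r + 1, n))
        sol[r] = acc / aug[r][r]
    return sol

def mass_from_scan(masses, scan_values, threshold=4.0):
    """Fit y = a*u^2 + b*u + c (u = m - mean) to scan points below the
    threshold. Returns (mass at minimum, 68% uncertainty)."""
    points = [(m, q) for m, q in zip(masses, scan_values) if q < threshold]
    mean = sum(m for m, _ in points) / len(points)
    us = [m - mean for m, _ in points]   # centring improves conditioning
    ys = [q for _, q in points]
    s = [sum(u ** k for u in us) for k in range(5)]
    t = [sum(y * u ** k for u, y in zip(us, ys)) for k in range(3)]
    a, b, _ = solve_linear(
        [[s[4], s[3], s[2]], [s[3], s[2], s[1]], [s[2], s[1], s[0]]],
        [t[2], t[1], t[0]],
    )
    best_mass = mean - b / (2.0 * a)
    sigma = 1.0 / math.sqrt(a)           # rise of 1 unit above minimum
    return best_mass, sigma

# Toy scan on a coarse 5 GeV grid, built from an exact parabola.
grid = list(range(110, 145, 5))
scan = [((m - 122.0) / 7.0) ** 2 for m in grid]
print(mass_from_scan(grid, scan))
```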

Summary
We report a search for the standard model Higgs boson decaying into a pair of τ leptons. The search is based on the full proton-proton collision sample recorded by CMS in 2011 and 2012, corresponding to an integrated luminosity of 4.9 fb −1 at a centre-of-mass energy of 7 TeV and 19.7 fb −1 at 8 TeV. The analysis is performed in six channels corresponding to the final states µτ h , eτ h , τ h τ h , eµ, µµ, and ee. The gluon-gluon fusion and vector-boson fusion production of a Higgs boson are probed in the one-jet and two-jet final states, respectively, whereas the production of a Higgs boson in association with a W or Z boson decaying leptonically is targeted by requiring additional electrons or muons in the final state. An excess of events over the background-only hypothesis is observed with a local significance in excess of 3 standard deviations for Higgs boson mass hypotheses between m H = 115 and 130 GeV, and equal to 3.2 standard deviations at m H = 125 GeV, to be compared to an expected significance of 3.7 standard deviations. The best fit of the observed H → ττ signal cross section times branching fraction for m H = 125 GeV is 0.78 ± 0.27 times the standard model expectation. Assuming that this excess corresponds to a Higgs boson decaying to ττ, its mass is measured to be m H = 122 ± 7 GeV. These results constitute evidence for the coupling between the τ lepton and the 125 GeV Higgs boson discovered in 2012 by the ATLAS and CMS Collaborations.

Table 4: Observed and predicted event yields in all event categories of the µτ h , eτ h , τ h τ h , and eµ channels in the full m ττ mass range. The event yields of the predicted background distributions correspond to the result of the global fit. The signal yields, on the other hand, are normalized to the standard model prediction. The different signal processes are labelled as ggH (gluon-gluon fusion), VH (production in association with a W or Z boson), and VBF (vector-boson fusion). The S/(S + B) variable denotes the ratio of the signal and the signal-plus-background yields in the central m ττ range containing 68% of the signal events for m H = 125 GeV. The RMS variable denotes the standard deviation of the m ττ distribution for corresponding signal events. SM