Observation of a new boson with mass near 125 GeV in pp collisions at sqrt(s) = 7 and 8 TeV

A detailed description is reported of the analysis used by the CMS Collaboration in the search for the standard model Higgs boson in pp collisions at the LHC, which led to the observation of a new boson. The data sample corresponds to integrated luminosities of up to 5.1 inverse femtobarns at sqrt(s) = 7 TeV and up to 5.3 inverse femtobarns at sqrt(s) = 8 TeV. The results for five Higgs boson decay modes (gamma gamma, ZZ, WW, tau tau, and bb), which show a combined local significance of 5 standard deviations near 125 GeV, are reviewed. A fit to the invariant mass in the two high-resolution channels, gamma gamma and ZZ to four leptons, gives a mass estimate of 125.3 +/- 0.4 (stat) +/- 0.5 (syst) GeV. The measurements are interpreted in the context of the standard model Lagrangian for the scalar Higgs field interacting with fermions and vector bosons. The measured values of the corresponding couplings are compared to the standard model predictions. The hypothesis of custodial symmetry is tested through the measurement of the ratio of the couplings to the W and Z bosons. All the results are consistent, within their uncertainties, with the expectations for a standard model Higgs boson.


Introduction
A light Higgs boson has a natural width of a few MeV [25], and therefore the precision of the mass measurement from fully reconstructed decays would be limited by the detector resolution. The first two channels, H → γγ and H → ZZ → 4ℓ, produce a narrow mass peak. These two high-resolution channels were used to measure the mass of the newly observed particle [22,23].
In the SM, the properties of the Higgs boson are fully determined once its mass is known. All cross sections and decay fractions are predicted [25,26], and thus the measured rates into each channel provide a test of the SM. The individual measurements can be combined, and from them the coupling constants of the Higgs boson with fermions and bosons can be extracted.
The measured values can shed light on the nature of the newly observed particle because the Higgs boson couplings to fermions are qualitatively different from those to bosons.
The data described in this paper are identical to those reported in the observation publication [23]. The main focus of this paper is an in-depth description of the five main analyses and a more detailed comparison of the various channels with the SM predictions by evaluating the couplings to fermions and vector bosons, as well as various coupling ratios.
The paper is organized into several sections. Sections 2 and 3 contain a short description of the CMS detector and the event reconstruction of physics objects relevant for the Higgs boson search. Section 4 describes the data sample, the Monte Carlo (MC) event generators used for the signal and background simulation, and the evaluation of the signal sensitivity. Then the analyses of the five decay channels are described in detail in Sections 5 to 9. In the last section, the statistical method used to combine the five channels and the statistical treatment of the systematic uncertainties are explained. Finally, the results are combined and the first measurements of the couplings of the new particle to bosons and fermions are presented.

The CMS experiment
The discovery capability for the SM Higgs boson is one of the main benchmarks that went into optimizing the design of the CMS experiment [27][28][29][30].
The central feature of the detector [30] is a superconducting solenoid 13 m long, with an internal diameter of 6 m. The solenoid generates a uniform 3.8 T magnetic field along the axis of the LHC beams. Within the field volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass/scintillator hadron calorimeter (HCAL). Muons are identified and measured in gas-ionization detectors embedded in the outer steel magnetic flux return yoke of the solenoid. The detector is subdivided into a cylindrical barrel and endcap disks on each side of the interaction point. Forward calorimeters complement the coverage provided by the barrel and endcap detectors.
The CMS experiment uses a right-handed coordinate system, with the origin at the nominal interaction point, the x axis pointing to the centre of the LHC, the y axis pointing up (perpendicular to the LHC plane), and the z axis along the anticlockwise-beam direction. The azimuthal angle φ is measured in the x-y plane. The pseudorapidity is defined as η = − ln[tan (θ/2)] where the polar angle θ is measured from the positive z axis. The centre-of-mass momentum of the colliding partons in a proton-proton collision is subject to Lorentz boosts along the beam direction relative to the laboratory frame. Because of this effect, the pseudorapidity, rather than the polar angle, is a more natural measure of the angular separation of particles in the rest frame of the detector.
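The pseudorapidity defined above can be computed directly from the polar angle; a minimal sketch in Python (illustrative only, not part of the CMS software):

```python
import math

def pseudorapidity(theta):
    """eta = -ln(tan(theta/2)), with theta the polar angle in radians
    measured from the positive z axis."""
    return -math.log(math.tan(theta / 2.0))

# theta = 90 deg (perpendicular to the beam) gives eta ~ 0, while a
# polar angle of 0.1 rad (~5.7 deg) gives eta ~ 3, illustrating how
# |eta| grows toward the beam direction.
print(pseudorapidity(math.pi / 2.0))
print(pseudorapidity(0.1))
```

Differences in pseudorapidity, unlike differences in polar angle, are invariant under the longitudinal boosts mentioned above, which is why η is the preferred angular coordinate.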
Charged particles are tracked within the pseudorapidity range |η| < 2.5. The silicon pixel tracker is composed of 66 million pixels of area 100 × 150 µm², arranged in three barrel layers and two endcap disks at each end. The silicon strip tracker, organized in ten barrel layers and twelve endcap disks at each end, is composed of 9.3 million strips with pitch between 80 and 205 µm, with a total silicon surface area of 198 m². The performance of the tracker is essential to most analyses in CMS and has reached the design performance in transverse-momentum (p_T) resolution, efficiency, and primary- and secondary-vertex resolutions. The tracker has an efficiency larger than 99% for muons with p_T > 1 GeV, a p_T resolution between 2 and 3% for charged tracks of p_T ≈ 100 GeV in the central region (|η| < 1.5), and unprecedented capabilities for b-jet identification. Measurements of the impact parameters of charged tracks and secondary vertices are used to identify jets that are likely to contain the hadronization and decay products of b quarks ("b jets"). A b-jet tagging efficiency of more than 50% is achieved with a rejection factor for light-quark jets of ≈200, as measured with tt events in data [31]. The dimuon mass resolution at the Υ mass, dominated by instrumental effects, is measured to be 0.6% in the barrel region [32], consistent with the design goal. Owing to the high spatial granularity of the pixel detector, the channel occupancy is less than 10⁻³, allowing charged-particle trajectories to be measured in the high-rate environment of the LHC without loss of performance.
The ECAL is a fine-grained, homogeneous calorimeter consisting of more than 75 000 lead tungstate crystals, arranged in a quasi-projective geometry and distributed in a barrel region (|η| < 1.48) and two endcaps that extend up to |η| = 3.0. The front-face cross section of the crystals is approximately 22 × 22 mm² in the barrel region and 28.6 × 28.6 mm² in the endcaps. Preshower detectors consisting of two planes of silicon sensors interleaved with a total of three radiation lengths of lead absorber are located in front of the endcaps. Electromagnetic (EM) showers are narrowly distributed in the lead tungstate crystals (Molière radius of 21 mm), which have a transverse size comparable to the shower radius. The precise measurement of the transverse shower shape is the primary method used for EM particle identification, and measurements in the surrounding crystals are used for isolation criteria. The energy resolution of the ECAL is the single most important performance benchmark for the measurement of the Higgs boson decay into two photons, and to a lesser extent for the decay to ZZ when the Z bosons subsequently decay to electrons. In the central barrel region, the energy resolution measured with electrons that interact minimally with the tracker material indicates that the resolution for unconverted photons is consistent with the design goals. The energy resolution for photons with transverse energy of ≈60 GeV varies between 1.1% and 2.5% over the solid angle of the ECAL barrel, and from 2.2% to 5% in the endcaps. For unconverted photons in the ECAL barrel, the diphoton mass resolution is estimated to be 1.1 GeV at a mass of 125 GeV.
The HCAL barrel and endcaps are sampling calorimeters composed of brass and plastic scintillator tiles, covering |η| < 3.0. The hadron calorimeter thickness varies from 7 to 11 interaction lengths within the solenoid, depending on |η|; a scintillator "tail catcher" placed outside the coil of the solenoid, just in front of the innermost muon detector, extends the instrumented thickness to more than 10 interaction lengths. Iron forward calorimeters with quartz fibres, read out by photomultipliers, extend the calorimeter coverage up to |η| = 5.0.
Muons are measured in the range |η| < 2.4, with detection planes based on three technologies: drift tubes (|η| < 1.2), cathode strip chambers (0.9 < |η| < 2.4), and resistive-plate chambers (|η| < 1.6). The first two technologies provide a precise position measurement and trigger, whilst the third one provides precise timing information, as well as a second independent trigger. The muon system consists of four stations in the barrel and endcaps, designed to ensure robust triggering and detection of muons over a large angular range. In the barrel region, each muon station consists of twelve drift-tube layers, except for the outermost station, which has eight layers. In the endcaps, each muon station consists of six detection planes. The precision of the r-φ measurement is 100 µm in the drift tubes and varies from 60 to 140 µm in the cathode strip chambers, where r is the radial distance from the beamline and φ is the azimuthal angle.
The CMS trigger and data acquisition systems ensure that data samples with potentially interesting events are recorded with high efficiency. The first-level (L1) trigger, composed of the calorimeter, muon, and global-trigger processors, uses coarse-granularity information to select the most interesting events in less than 4 µs. The detector data are pipelined to ensure negligible deadtime up to a L1 rate of 100 kHz. After L1 triggering, data are transferred from the readout electronics of all subdetectors through the readout network to the high-level-trigger (HLT) processor farm, which assembles the full event and executes global reconstruction algorithms. The HLT filters the data, resulting in an event rate of ≈500 Hz stored for offline processing.
All data recorded by the CMS experiment are accessible for offline analysis through the worldwide LHC computing grid. The CMS experiment employs a highly distributed computing infrastructure, with a primary Tier-0 centre at CERN, supplemented by seven Tier-1, more than 50 Tier-2, and over 100 Tier-3 centres at national laboratories and universities throughout the world. The CMS software running on this high-performance computing system executes a multitude of crucial tasks, including the reconstruction and analysis of the collected data, as well as the generation and detector modelling of MC simulation. Figure 1 shows the distribution of the number of vertices reconstructed per event in the 2011 and 2012 data, and the display of a four-lepton event recorded in 2012. The large number of proton-proton interactions occurring per LHC bunch crossing ("pileup"), on average 9 in 2011 and 19 in 2012, makes the identification of the vertex corresponding to the hard-scattering process nontrivial, and affects most of the physics objects: jets, lepton isolation, etc. The tracking system is able to separate collision vertices as close as 0.5 mm along the beam direction [33]. For each vertex, the sum of the squared transverse momenta (p_T²) of all tracks associated with the vertex is computed. The vertex for which this quantity is largest is assumed to correspond to the hard-scattering process, and is referred to as the primary vertex in the event reconstruction. In the H → γγ final state, a large fraction of the transverse momentum produced in the collision is carried by the photons, and a dedicated algorithm, described in Section 5.2, is therefore used to assign the photons to a vertex.
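The primary-vertex ranking described above (largest sum of squared track transverse momenta) can be sketched as follows; the list-of-track-p_T representation is a simplified, hypothetical stand-in for the full reconstruction objects:

```python
def select_primary_vertex(vertices):
    """Pick the vertex with the largest sum of squared track pT.

    `vertices` is a list of (vertex_id, [track_pT, ...]) pairs -- a
    simplified stand-in for the reconstructed vertex collection.
    """
    return max(vertices, key=lambda v: sum(pt ** 2 for pt in v[1]))[0]

# A pileup vertex often has many soft tracks; the hard-scattering
# vertex has fewer but much harder ones, so the pT^2 sum favours it.
vertices = [("pileup", [0.7, 0.9, 1.2, 0.8]), ("hard", [45.0, 38.0, 12.0])]
print(select_primary_vertex(vertices))  # → hard
```

Squaring the track momenta makes the ranking dominated by a few hard tracks rather than by the sheer multiplicity of soft pileup tracks.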

Event reconstruction
A particle-flow (PF) algorithm [34,35] combines the information from all CMS subdetectors to identify and reconstruct the individual particles emerging from all vertices: charged hadrons, neutral hadrons, photons, muons, and electrons. These particles are then used to reconstruct the missing transverse energy, jets, and hadronic τ-lepton decays, and to quantify the isolation of leptons and photons.
Electrons and photons can interact with the tracker material before reaching the ECAL, creating additional electrons and photons through pair production and bremsstrahlung radiation. A calorimeter superclustering algorithm is therefore used to combine the ECAL energy deposits that could correspond to a photon or electron. In the barrel region, superclusters are formed from five-crystal-wide areas in η, centred on the locally most-energetic crystal and having a variable extension in φ. In the endcaps, where the crystals are arranged according to an x-y rather than η-φ geometry, matrices of 5 × 5 crystals around the most-energetic crystals are merged if they lie within a narrow road in η. The stability and uniformity of the ECAL response must be calibrated to a fraction of a percent to maintain the excellent intrinsic energy resolution of the ECAL [36]. A dedicated monitoring system, based on the injection of laser light into each crystal, is used to track and correct for channel response changes caused by radiation damage and subsequent recovery of the crystals [37]. Response variations are a few percent in the barrel region, and increase up to a few tens of percent in the most-forward endcap regions. The channel-to-channel response is equalized using several techniques that exploit reference signatures from collision events (mainly π⁰, η → γγ) [38]. The residual miscalibration of the channel response varies between 0.5% in the central barrel and a few percent in the endcaps [39]. At the reconstruction level, additional correction factors to the photon energy are applied. These corrections are sizeable for photons that convert before entering the ECAL, for which the resolution is mainly limited by shower-loss fluctuations. Given the distribution of the tracker material in front of the ECAL, these effects are sizeable for |η| > 1 [39].
Candidate photons for the H → γγ search are reconstructed from the superclusters, and their identification is discussed in Section 5.3. The photon energy is computed starting from the raw supercluster energy. In the region covered by the preshower detector (|η| > 1.65), the energy recorded in that detector is added. In order to obtain the best resolution, the raw energy is corrected for the containment of the shower in the clustered crystals and for the shower losses of photons that convert in the tracker material before reaching the calorimeter. These corrections are computed using a multivariate regression technique based on the boosted decision tree (BDT) implementation in TMVA [40]. The regression is trained on photons from a sample of simulated events using the ratio of the true photon energy to the raw energy as the target variable. The input variables are the η and φ coordinates of the supercluster, a collection of shower-shape variables, and a set of energy-deposit coordinates defined with respect to the supercluster. A second BDT, using the same input variables, is trained on a separate sample of simulated photons to provide an estimate of the uncertainty in the energy value provided by the first BDT.
The width of the reconstructed Z resonance is used to quantify the ECAL performance, using decays to two electrons whose energies are measured using the ECAL alone, with their direction determined from the tracks. In the 7 TeV data set, the dielectron mass resolution at the Z boson mass, fitting for the measurement contribution separately from the natural width, is 1.56 GeV in the barrel and 2.57 GeV in the endcaps, while in the 8 TeV data sample, reconstructed with preliminary calibration constants, the corresponding values are 1.61 GeV and 3.75 GeV.
Electron reconstruction is based on two methods: the first where an ECAL supercluster is used to seed the reconstruction of a charged-particle trajectory in the tracker [41,42], and the second where a candidate track is used to reconstruct an ECAL supercluster [43]. In the latter, the electron energy deposit is found by extrapolating the electron track to the ECAL, and the deposits from possible bremsstrahlung photons are collected by extrapolating a straight line tangent to the electron track from each tracker layer, around which most of the tracker material is concentrated. In both cases, the trajectory is fitted with a Gaussian sum filter [44] using a dedicated modelling of the electron energy loss in the tracker material. Merging the output of these two methods provides high electron reconstruction efficiency within |η| < 2.5 and p T > 2 GeV. The electron identification relies on a TMVA BDT that combines observables sensitive to the amount of bremsstrahlung along the electron trajectory, the geometrical and momentum matching between the electron trajectory and the associated supercluster, as well as the shower-shape observables.
Muons are reconstructed within |η| < 2.4 and down to a p T of 3 GeV. The reconstruction combines the information from both the silicon tracker and the muon spectrometer. The matching between the tracker and the muon system is initiated either "outside-in", starting from a track in the muon system, or "inside-out", starting from a track in the silicon tracker. Loosely identified muons, characterized by minimal requirements on the track components in the muon system and taking into account small energy deposits in the calorimeters that match to the muon track, are identified with an efficiency close to 100% by the PF algorithm. In some analyses, additional tight muon identification criteria are applied: a good global muon-track fit based on the tracker and muon chamber hits, muon track-segment reconstruction in at least two muon stations, and a transverse impact parameter with respect to the primary vertex smaller than 2 mm.
Jets are reconstructed from all the PF particles using the anti-k T jet algorithm [45] implemented in FASTJET [46], with a distance parameter of 0.5. The jet energy is corrected for the contribution of particles created in pileup interactions and in the underlying event. This contribution is calculated as the product of the jet area and an event-by-event p T density ρ, also obtained with FASTJET using all particles in the event. Charged hadrons, photons, electrons, and muons reconstructed by the PF algorithm have a calibrated momentum or energy scale. A residual calibration factor is applied to the jet energy to account for imperfections in the neutral-hadron calibration, the jet energy containment, and the estimation of the contributions from pileup and underlying-event particles. This factor, obtained from simulation, depends on the jet p T and η, and is of the order of 5% across the whole detector acceptance. Finally, a percent-level correction factor is applied to match the jet energy response in the simulation to the one observed in data. This correction factor and the jet energy scale uncertainty are extracted from a comparison between the data and simulation of γ+jets, Z+jets, and dijet events [47]. Particles from different pileup vertices can be clustered into a pileup jet, or significantly overlap a jet from the primary vertex below the p T threshold applied in the analysis. Such jets are identified and removed using a TMVA BDT with the following input variables: momentum and spatial distribution of the jet particles, charged-and neutral-particle multiplicities, and consistency of charged hadrons within the jet with the primary vertex.
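The area-based pileup subtraction described above amounts to removing the event p_T density ρ times the jet catchment area from the raw jet momentum. A simplified scalar sketch (the actual correction is applied to the full jet four-momentum, and the numbers here are hypothetical):

```python
def pileup_corrected_pt(raw_pt, jet_area, rho):
    """Area-based pileup subtraction: subtract the event pT density rho
    (GeV per unit area) times the jet area, clamped at zero."""
    return max(0.0, raw_pt - rho * jet_area)

# An anti-kT jet with distance parameter 0.5 has a catchment area of
# roughly pi * 0.5^2 ~ 0.79; with rho = 10 GeV per unit area, a 40 GeV
# raw jet loses about 8 GeV of pileup energy.
print(pileup_corrected_pt(40.0, 0.785, 10.0))
```

Because ρ is measured event by event, the subtraction automatically tracks the pileup level of each bunch crossing.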
The missing transverse energy (E_T^miss) vector is calculated as the negative of the vectorial sum of the transverse momenta of all particles reconstructed by the PF algorithm. The resolution σ(E_{x,y}^miss) of either the x or y component of the E_T^miss vector is measured in Z → µµ events and parametrized as σ(E_{x,y}^miss) = 0.5 × √(ΣE_T), where ΣE_T is the scalar sum of the transverse momenta of all particles, with σ and ΣE_T expressed in GeV. In 2012, with an average number of 19 pileup interactions, ΣE_T ≈ 600 GeV for the analyses considered here.
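As a numerical check of the resolution parametrization above (a sketch; the function name is ours):

```python
import math

def met_xy_resolution(sum_et):
    """sigma(E^miss_{x,y}) = 0.5 * sqrt(Sum E_T), both in GeV, as
    measured in Z -> mumu events."""
    return 0.5 * math.sqrt(sum_et)

# For the 2012 pileup conditions quoted in the text, Sum E_T ~ 600 GeV,
# giving a per-component resolution of about 12 GeV.
print(met_xy_resolution(600.0))
```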
Jets originating from b-quark hadronization are identified using different algorithms that exploit particular properties of such objects [31]. These properties, which result from the relatively large mass and long lifetime of b quarks, include the presence of tracks with large impact parameters, the presence of secondary decay vertices displaced from the primary vertex, and the presence of low-p T leptons from semileptonic b-hadron decays embedded in the jets [31]. A combined secondary-vertex (CSV) b-tagging algorithm, used in the H → bb and H → ττ searches, makes use of the information about track impact parameters and secondary vertices within jets in a likelihood discriminant to provide separation of b jets from jets originating from gluons, light quarks, and charm quarks. The efficiency to tag b jets and the rate of misidentification of non-b jets depends on the algorithm used and the operating point chosen. These are typically parameterized as a function of the transverse momentum and rapidity of the jets. The performance measurements are obtained directly from data in samples that can be enriched in b jets, such as tt and multijet events.
Hadronically decaying τ leptons (τ_h) are reconstructed and identified using an algorithm [48] which targets the main decay modes by selecting candidates with one charged hadron and up to two neutral pions, or with three charged hadrons. A photon from a neutral-pion decay can convert in the tracker material into an electron and a positron, which can then radiate bremsstrahlung photons. These particles give rise to several ECAL energy deposits at the same η value but separated in azimuthal angle, and are reconstructed as several photons by the PF algorithm. To increase the acceptance for such converted photons, the neutral pions are identified by clustering the reconstructed photons in narrow strips along the φ direction. The τ_h from W, Z, and Higgs boson decays are typically isolated from the other particles in the event, in contrast to misidentified τ_h from jets, which are surrounded by the jet particles not used in the τ_h reconstruction. The τ_h isolation parameter R_Iso^τ is obtained from a multivariate discriminator, taking as input a set of transverse momentum sums S_j = Σ_i p_T,i,j, where p_T,i,j is the transverse momentum of a particle i in a ring j centred on the τ_h candidate direction and defined in (η, φ) space. Five equal-width rings are used up to a distance ∆R = √((∆η)² + (∆φ)²) = 0.5 from the τ_h candidate, where ∆η and ∆φ are the pseudorapidity and azimuthal angle differences (in radians), respectively, between the particle and the τ_h candidate direction. The effect of pileup on the isolation parameter is reduced mainly by discarding from the S_j calculation the charged hadrons with a track originating from a pileup vertex. The contribution of pileup photons and neutral hadrons is handled by the discriminator, which also takes as input the p_T density ρ.
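The ring geometry used for the τ_h isolation sums can be sketched as follows; the function names (`delta_r`, `ring_index`) are hypothetical helpers, not CMS software:

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance in (eta, phi) space, with the azimuthal
    difference wrapped into [-pi, pi]."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)

def ring_index(dr, n_rings=5, r_max=0.5):
    """Index (0 .. n_rings-1) of the equal-width ring a particle falls
    in, or None if it lies outside Delta R = r_max."""
    if dr >= r_max:
        return None
    return int(dr / (r_max / n_rings))

# A particle at Delta R = 0.05 from the tau_h axis falls in the
# innermost of the five rings.
print(ring_index(delta_r(0.05, 0.0, 0.0, 0.0)))
```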
The isolation parameter of electrons and muons is defined relative to their transverse momentum p_T^ℓ as

R_Iso^ℓ ≡ [ Σ_charged p_T + max(0, Σ_neut. had. p_T + Σ_γ p_T − ρ_neutral × A_eff) ] / p_T^ℓ,

where Σ_charged p_T, Σ_neut. had. p_T, and Σ_γ p_T are, respectively, the scalar sums of the transverse momenta of charged hadrons, neutral hadrons, and photons located in a cone centred on the lepton direction in (η, φ) space. The cone size ∆R is taken to be 0.3 or 0.4, depending on the analysis. Charged hadrons associated with pileup vertices are not considered, and the contribution of pileup photons and neutral hadrons is estimated as the product of the neutral-particle p_T density ρ_neutral and an effective cone area A_eff, and subtracted from the neutral sums as in the expression above. The neutral-particle p_T density is obtained with FASTJET using all PF photons and neutral hadrons in the event, and the effective cone area differs slightly from the geometric cone area, being computed so as to absorb the residual dependence of the isolation efficiency on the number of pileup collisions.
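The relative-isolation definition described above can be sketched as follows; clamping the pileup-subtracted neutral component at zero is assumed here (the standard convention, since the ρ × A_eff estimate can exceed the measured neutral energy), and all numbers are hypothetical:

```python
def lepton_rel_isolation(pt_lepton, sum_charged, sum_neutral_had,
                         sum_photon, rho_neutral, area_eff):
    """Relative isolation with the rho x A_eff pileup estimate
    subtracted from the neutral component and clamped at zero
    (assumed convention; a sketch of the definition in the text)."""
    neutral = max(0.0, sum_neutral_had + sum_photon - rho_neutral * area_eff)
    return (sum_charged + neutral) / pt_lepton

# A well-isolated 40 GeV muon: little surrounding energy remains once
# the expected pileup neutral contribution is subtracted.
print(lepton_rel_isolation(40.0, 1.2, 2.0, 1.5, 8.0, 0.3))
```

Analyses then place an upper cut on this ratio; a smaller value means a more isolated lepton.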

Data sample and analysis performance
The data have been collected by the CMS experiment at a centre-of-mass energy of 7 TeV in the year 2011, corresponding to an integrated luminosity of about 5.1 fb −1 , and a centre-of-mass energy of 8 TeV in the year 2012, corresponding to an integrated luminosity of about 5.3 fb −1 .
A summary of all analyses described in this paper is presented in Table 1, where we list their main characteristics, namely: exclusive final states, Higgs boson mass range of the search, integrated luminosity used, and the approximate experimental mass resolution. The presence of a signal in one of the channels at a certain value of the Higgs boson mass, m H , should manifest itself as an excess in the corresponding invariant-mass distribution extending around that value for a range corresponding to the m H resolution.

Simulated samples
MC simulation samples for the SM Higgs boson signal and background processes are used to optimize the event selection, evaluate the acceptance and systematic uncertainties, and predict the expected yields. They are processed through a detailed simulation of the CMS detector based on GEANT4 [49] and are reconstructed with the same algorithms used for the data. The simulations include pileup interactions properly reweighted to match the distribution of the number of such interactions observed in data. For leading-order generators the default set of parton distribution functions (PDF) used to produce these samples is CTEQ6L [50], while CT10 [51] is employed for next-to-leading-order (NLO) generators. For all generated samples the hadronization is handled by PYTHIA 6.4 [52] or HERWIG++ [53], and the TAUOLA [54] package is used for τ decays. The PYTHIA parameters for the underlying event and pileup interactions are set to the Z2 tune [55] for the 7 TeV data sample and to the Z2* tune [55] for the 8 TeV data sample.
For the four-fermion final states the total cross section is scaled by the branching fraction B(H → 4ℓ) calculated with the PROPHECY4F program [89,90]. The calculations include NLO QCD and EW corrections, and all interference effects up to NLO [25,26,89-94]. For all the other final states HDECAY [91,92] is used, which includes NLO QCD and NLO EW corrections. The predicted signal cross sections at 8 TeV and the branching fractions for a low-mass Higgs boson are shown in the left and right plots of Fig. 2, respectively [25,26].
The uncertainty in the signal cross section related to the choice of PDFs is determined with the PDF4LHC prescription [95-99]. The uncertainty due to missing higher-order terms is calculated by varying the renormalization and factorization scales in each process, as explained in Ref. [25].
For the dominant gluon-gluon fusion process, the transverse momentum spectrum of the Higgs boson in the 7 TeV MC simulation samples is reweighted to match the NNLL + NLO distribution computed with HqT [100,101] (and FEHIPRO [102,103] for the high-p_T range in the ττ analysis), except in the H → ZZ analysis, where the reweighting is not necessary. At 8 TeV, POWHEG was tuned to reach good agreement of the p_T spectrum with the NNLL + NLO prediction, making reweighting unnecessary [26].

Background simulation
The background contribution from ZZ production via qq is generated at NLO with POWHEG, while other diboson processes (WW, WZ) are generated with MADGRAPH [104,105], with cross sections rescaled to NLO predictions. The PYTHIA generator is also used to simulate all diboson processes. The gg → VV contributions are generated with GG2VV [106]. The V+jets and Vγ samples are generated with MADGRAPH, as are contributions to inclusive Z and W production, with cross sections rescaled to NNLO predictions. Single-top-quark and tt events are generated at NLO with POWHEG. The PYTHIA generator takes into account the initial-state and final-state radiation effects that can lead to the presence of additional hard photons in an event. The MADGRAPH generator is also used to generate samples of tt events. QCD events are generated with PYTHIA. Table 2 summarizes the generators used for the different analyses.

[Caption of Fig. 2 (signal cross sections and branching fractions [25,26]): the width of the lines represents the total theoretical uncertainty in the cross section and in the branching fractions.]

Search sensitivities
The search sensitivities of the different channels, for the recorded luminosity used in the analyses, expressed in terms of the median expected 95% CL upper limit on the ratio of the measured signal cross section, σ, and the predicted SM Higgs boson cross section, σ SM , are shown in Fig. 3 (left) as a function of the Higgs boson mass. A channel showing values below unity (dashed horizontal line) for a given mass hypothesis would be expected, in the absence of a Higgs boson signal, to exclude the standard model Higgs boson at 95% CL or more at that mass. Figure 3 (right) shows the expected sensitivities for the observation of the Higgs boson in terms of local p-values and significances as a function of the Higgs boson mass. The local p-value is defined as the probability of a background fluctuation; it measures the consistency of the data with the background-only hypothesis.
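The correspondence between the local p-values and the significances quoted throughout the paper follows the one-sided Gaussian tail convention; a minimal sketch using only the Python standard library:

```python
from statistics import NormalDist

def significance_from_p(p):
    """One-sided Gaussian significance Z for a local p-value."""
    return NormalDist().inv_cdf(1.0 - p)

def p_from_significance(z):
    """Local p-value for a significance of z standard deviations."""
    return 1.0 - NormalDist().cdf(z)

# 5 standard deviations corresponds to a local p-value of ~2.9e-7,
# the conventional threshold for claiming an observation.
print(p_from_significance(5.0))
print(significance_from_p(0.05))  # ~1.64 sigma
```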
The overall statistical methodology used in this paper was developed by the ATLAS and CMS Collaborations in the context of the LHC Higgs Combination Group [107]. A summary of our usage of this methodology in the search for the Higgs boson is given in Section 10.

H → γγ
In the H → γγ analysis, a search is made for a narrow peak, of width determined by the experimental resolution of ∼1%, in the diphoton invariant-mass distribution in the range 110-150 GeV, on top of a large irreducible background from the production of two photons originating directly from the hard-scattering process. In addition, there is a sizeable reducible background in which one or both of the reconstructed photons originate from the misidentification of particles in jets that deposit substantial energy in the ECAL, typically photons from the decay of π⁰ or η mesons. Early studies indicated this to be one of the most promising channels in the search for a SM Higgs boson in the low-mass range [108].

[Figure 3: The median expected 95% CL upper limits on the cross section ratio σ/σ_SM in the absence of a Higgs boson (left) and the median expected local p-value for observing an excess, assuming that a Higgs boson with that mass exists (right), as a function of the Higgs boson mass for the five Higgs boson decay channels and their combination.]
To enhance the sensitivity of the analysis, candidate diphoton events are separated into mutually exclusive classes with different expected signal-to-background ratios, based on the properties of the reconstructed photons and on the presence or absence of two jets satisfying criteria aimed at selecting events in which a Higgs boson is produced through the VBF process. The analysis uses multivariate techniques for the selection and classification of the events. As independent cross-checks, two additional analyses are performed. The first is almost identical to the CMS analysis described in Ref. [109], but uses simpler criteria, based on the properties of the reconstructed photons, to select and classify events. The second incorporates the same multivariate techniques described here but relies on a completely independent modelling of the background. These two analyses are described in more detail in Section 5.6.

Diphoton trigger
All the data under consideration have passed at least one of a set of diphoton triggers, each using transverse energy thresholds and a set of additional photon selections, including criteria on the isolation and the shapes of the reconstructed energy clusters. The transverse energy thresholds were chosen to be at least 10% lower than the envisaged final-selection thresholds. This set of triggers enabled events passing the later offline H → γγ selection criteria to be collected with a trigger efficiency greater than 99.5%.

Interaction vertex location
In order to construct a photon four-momentum from the measured ECAL energies and the impact position determined during the supercluster reconstruction, the photon production vertex, i.e. the origin of the photon trajectory, must be determined. Without incorporating any additional information, any of the reconstructed pp event vertices is potentially the origin of the photon. If the distance in the longitudinal direction between the assigned and the true interaction point is larger than 10 mm, the resulting contribution to the diphoton mass resolution becomes comparable to the contribution from the ECAL energy resolution. It is, therefore, desirable to use additional information to assign the correct interaction vertex for the photon with high probability. This can be achieved by using the kinematic properties of the tracks associated with the vertices and exploiting their correlation with the diphoton kinematic properties, including the transverse momentum of the diphoton (p γγ T ). In addition, if either of the photons converts into an e + e − pair and the tracks from the conversion are reconstructed and identified, the direction of the converted photon, determined by combining the conversion vertex position and the position of the ECAL supercluster, can be extrapolated to identify the diphoton interaction vertex.
For each reconstructed interaction vertex the following set of variables are calculated: the sum of the squared transverse momenta of all tracks associated with the vertex and two variables that quantify the p T balance with respect to the diphoton system. In the case of a reconstructed photon conversion, an additional "pull" variable is used, defined as the distance between the vertex z position and the beam-line extrapolated z position coming from the conversion reconstruction, divided by the uncertainty in this extrapolated z position. These variables are used as input to a BDT algorithm trained on simulated Higgs signal events and the interaction point ranking highest in the constructed classifier is chosen as the origin of the photons. The vertex-finding efficiency, defined as the efficiency to locate the vertex to within 10 mm of its true position, is studied using Z → µµ events where the muon tracks were removed from the tracks considered, and the muon momenta were replaced by the photon momenta. The result is shown in Fig. 4. The overall efficiency in signal events with a Higgs boson mass of 120 GeV, integrated over its p T spectrum, is (83.0 ± 0.4)% in the 7 TeV data set, and (79.0 ± 0.2)% in the 8 TeV data set. The statistical uncertainties in these numbers are propagated to the uncertainties in the final result.
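As an illustration of the per-vertex inputs described above, a minimal sketch follows; the two pT-balance definitions here are plausible stand-ins, not the exact CMS definitions:

```python
import numpy as np

def vertex_features(track_pts, track_phis, diphoton_pt, diphoton_phi):
    """Per-vertex kinematic variables in the spirit of those listed in the
    text: the sum of squared track pT and two pT-balance variables with
    respect to the diphoton system (illustrative definitions)."""
    pts = np.asarray(track_pts, dtype=float)
    phis = np.asarray(track_phis, dtype=float)
    # transverse recoil: vector sum of the tracks associated with the vertex
    px, py = np.sum(pts * np.cos(phis)), np.sum(pts * np.sin(phis))
    sum_pt2 = float(np.sum(pts ** 2))
    # balance 1: recoil projected onto the diphoton direction (sign flipped,
    # so a vertex recoiling against the diphoton scores high)
    pt_balance = -(px * np.cos(diphoton_phi) + py * np.sin(diphoton_phi))
    # balance 2: asymmetry between the recoil magnitude and the diphoton pT
    recoil = float(np.hypot(px, py))
    pt_asym = (recoil - diphoton_pt) / (recoil + diphoton_pt)
    return sum_pt2, pt_balance, pt_asym
```

A vertex whose tracks recoil back-to-back against the diphoton system yields a large, positive balance and a small asymmetry, which is the correlation the BDT exploits.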
A second vertex-related multivariate discriminant is employed to estimate, event-by-event, the probability for the vertex assignment to be within 10 mm of the diphoton interaction point. This BDT is trained using simulated H → γγ events. The input variables are the classifier values of the vertex BDT described above for the three highest-ranked vertices, the number of vertices, the diphoton transverse momentum, the distances between the chosen vertex and the second and third choices, and the number of photons with an associated conversion track. These variables allow for a reliable quantification of the probability that the selected vertex is close to the diphoton interaction point.
The resulting vertex-assignment probability from simulated events is used when constructing the Higgs boson signal models. The signal modelling is described in Section 5.5.

Photon selection
The event selection requires two photon candidates with transverse momenta satisfying p γ T (1) > m γγ /3 and p γ T (2) > m γγ /4, where m γγ is the diphoton invariant mass, within the ECAL fiducial region |η| < 2.5, and excluding the barrel-endcap transition region 1.44 < |η| < 1.57. The fiducial region requirement is applied to the supercluster position in the ECAL and the p T threshold is applied after the vertex assignment. The requirements on the mass-scaled transverse momenta are mainly motivated by the fact that dividing the transverse momenta by the diphoton mass strongly reduces turn-on effects on the background shape in the low-mass region. In the rare cases where the event contains more than two photons passing all the selection requirements, the pair with the highest summed (scalar) p T is chosen.
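The kinematic selection above can be sketched as follows; photons are taken as massless (pt, eta, phi) tuples, and the candidate-pair choice follows the highest-scalar-pT-sum rule from the text:

```python
import math
from itertools import combinations

def pair_mass(p1, p2):
    """Invariant mass of two massless particles given (pt, eta, phi)."""
    pt1, eta1, phi1 = p1
    pt2, eta2, phi2 = p2
    m2 = 2.0 * pt1 * pt2 * (math.cosh(eta1 - eta2) - math.cos(phi1 - phi2))
    return math.sqrt(max(m2, 0.0))

def in_fiducial(eta):
    """ECAL fiducial region |eta| < 2.5, excluding the transition region."""
    a = abs(eta)
    return a < 2.5 and not (1.44 < a < 1.57)

def select_diphoton(photons):
    """Return (sum_pt, leading, subleading, mass) for the best pair
    passing pt(1) > m/3, pt(2) > m/4, or None if no pair qualifies."""
    best = None
    for p1, p2 in combinations(photons, 2):
        if not (in_fiducial(p1[1]) and in_fiducial(p2[1])):
            continue
        lead, sub = (p1, p2) if p1[0] >= p2[0] else (p2, p1)
        m = pair_mass(lead, sub)
        if m <= 0 or lead[0] <= m / 3.0 or sub[0] <= m / 4.0:
            continue
        if best is None or lead[0] + sub[0] > best[0]:
            best = (lead[0] + sub[0], lead, sub, m)
    return best
```

For example, two back-to-back 60 GeV photons at η = 0 form a 120 GeV candidate that passes both mass-scaled thresholds.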
The relevant backgrounds in the H → γγ channel consist of the irreducible background from prompt diphoton production, i.e. processes in which both photons originate directly from the hard-scattering process, and the reducible backgrounds from γ + jet and dijet events, where the objects misidentified as photons correspond to particles in jets that deposit substantial energy in the ECAL, typically photons from the decay of isolated π 0 or η mesons. These misidentified objects are referred to as fake or nonprompt photons.
In order to optimize the photon identification to exclude such nonprompt photons, a BDT classifier is trained using simulated pp → γ + jet event samples, where prompt photons are used as the signal and nonprompt photons as the background. The variables used in the training are divided into two groups. The first contains information on the detailed electromagnetic shower topology, the second has variables describing the photon isolation, i.e. kinematic information on the particles in the geometric neighbourhood of the photon. Examples of variables in the first group are the energy-weighted shower width of the cluster of ECAL crystals assigned to the photon and the ratio of the energy of the most energetic 3 × 3 crystal cluster to the total cluster energy. The isolation variables include the magnitude of the sum of the transverse momenta of all other reconstructed particles inside a cone of size ∆R = 0.3 around the candidate photon direction. In addition, the geometric position of the ECAL crystal cluster, as well as the event energy density ρ, are used. The photon ID classifier is based on the measured properties of a single photon and makes no use of any properties that are specific to the production mechanism. Any small residual dependence on the production mechanism, e.g. through the isolation distribution, arises from the different event environments in Higgs boson decays and in photon + jet events.
Instead of having a requirement on the trained multivariate classifier value to select photons with a high probability of being prompt photons, the classifier value itself is used as input to subsequent steps of the analysis. To reduce the number of events, a loose requirement is imposed on the classifier value (> − 0.2) for candidate photons to be considered further. This requirement retains more than 99% of signal photons. The efficiency of this requirement, as well as the differential shape of the classifier variable for prompt photons, have been studied by comparing Z → ee data to simulated events, given the similar response of the detector to photons and electrons. The comparisons between the differential shape in data and MC simulation for the 8 TeV analysis are shown in Fig. 5, for electrons in the barrel (left) and endcap (right) regions.

Event classification
The strategy of the analysis is to look for a narrow peak over the continuum in the diphoton invariant-mass spectrum. To increase the sensitivity of the search, events are categorized according to their expected diphoton mass resolution and signal-to-background ratio. Categories with good resolution and a large signal-to-background ratio dominate the sensitivity of the search. To accomplish this, an event classifier variable is constructed based on multivariate techniques, that assigns a high classifier value to events with signal-like kinematic characteristics and good diphoton mass resolution, as well as prompt-photon-like values for the photon identification classifier. However, the classifier should not be sensitive to the value of the diphoton invariant mass, in order to avoid biasing the mass distribution that is used to extract a possible signal. To achieve this, the input variables to the classifier are made dimensionless. Those that have units of energy (transverse momenta and resolutions) are divided by the diphoton invariant-mass value. The variables used to train this diphoton event classifier are the scaled photon transverse momenta (p γ T (1)/m γγ and p γ T (2)/m γγ ), the pseudorapidities of both photons, the photon identification classifier values of both photons, the cosine of the angle between the two photons in the transverse plane, and the expected relative diphoton invariant-mass resolutions under the correct- and incorrect-interaction-vertex hypotheses. The relative resolution σ m /m γγ is computed using the single-photon resolution estimated by the dedicated BDT described in Section 3. A vertex is labeled as correct if its distance from the true interaction point is smaller than 10 mm.
To ensure the classifier assigns a high value to events with good mass resolution, the events are weighted by a factor inversely proportional to the mass resolution. This factor incorporates the resolutions under both the correct- and incorrect-interaction-vertex hypotheses, properly weighted by the probabilities of having assigned the vertex correctly. The training is performed on simulated background and Higgs boson signal events. The training procedure makes full use of the signal kinematic properties, which are assumed to be those of the SM Higgs boson. The classifier, though still valid, would not be fully optimal for a particle produced with significantly different kinematic properties.
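One plausible form of such a resolution-based training weight is sketched below; the text does not spell out the exact combination, so the probability-weighted average used here is an assumption for illustration:

```python
def training_weight(p_vtx, sigma_right, sigma_wrong):
    """Illustrative event weight for the classifier training: inversely
    proportional to an effective mass resolution combining the correct-
    and incorrect-vertex resolutions, weighted by the probability p_vtx
    of a correct vertex assignment. The exact CMS combination may differ;
    this is one plausible choice."""
    sigma_eff = p_vtx * sigma_right + (1.0 - p_vtx) * sigma_wrong
    return 1.0 / sigma_eff
```

An event with a well-determined vertex (p_vtx near 1) is then weighted by the narrow correct-vertex resolution, giving it more influence in the training.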
The uncertainties in the diphoton event classifier output come from potential mismodelling of the input variables. The dominant sources are the uncertainties in the shapes of the photon identification (ID) classifier and the individual photon energy resolutions, which are used to compute the relative diphoton invariant-mass resolutions.
The first of these amounts to a potential shift in the photon ID classifier value of at most ±0.01 in the 8 TeV and ±0.025 in the 7 TeV analysis. These values are set by examining the observed differences between the photon ID classifier value distributions from data and simulation. This comparison for the 7 TeV analysis is shown in Fig. 6, where the distributions for the leading (highest p T ) candidate photons in the ECAL barrel (left) and endcaps (right) are compared between data and MC simulation for m γγ > 160 GeV, where most photons are prompt. In addition to the three background components described in Section 5.3 (prompt-prompt, prompt-nonprompt, and nonprompt-nonprompt), an additional component composed of Drell-Yan events, in which both final-state electrons are misidentified as photons, has been studied and found to be negligible. As discussed previously, a variation of the classifier value by ±0.025, represented by the cross-hatched histogram, covers the differences.
For the second important variable, the photon energy resolution estimate (calculated by a BDT, as discussed in Section 3), a similar comparison is shown in Fig. 7. Again, the 7 TeV data distributions for candidate photons in the ECAL barrel (left) and endcap (right) are compared to MC simulation for m γγ > 160 GeV. The systematic uncertainty of ±10% is again shown as the cross-hatched histogram.
The effect of both these uncertainties propagated to the diphoton event classifier distribution can be seen in Fig. 8, where the 7 TeV data diphoton classifier variable is compared to the MC simulation predictions. The data and MC simulation distributions in both the left and right plots of Fig. 8 are the same. In the left plot, the uncertainty band arises from propagating the photon ID classifier uncertainty, while in the right plot, it is from propagating the energy resolution uncertainty. From these plots one can see that the uncertainty in the photon ID classifier dominates the overall uncertainty, and by itself almost covers the full difference between the data and MC simulation distributions. Both uncertainties are propagated into the final result.
The diphoton event classifier output is then used to divide events into different classes, prior to fitting the diphoton invariant-mass spectrum. The procedure successively splits events into classes by introducing a boundary value for the diphoton classifier output. The first boundary results in two classes, and then these classes are further split. Each split is introduced using the boundary value that gives rise to the best expected exclusion limit. The procedure is terminated once additional splitting results in a negligible (<1%) gain in sensitivity. Additionally, the lowest score class is dropped since it does not contribute significantly to the sensitivity. This procedure results in four event classes for both the 7 and 8 TeV data sets. The systematic uncertainties in the diphoton identification classifier and photon energy resolution discussed above can cause events to migrate between classes. In the 8 TeV analysis, these class migrations are up to 4.3% and 8.1%, respectively. They are defined as the relative change of the expected signal yield in each category under variation of the photon ID BDT classifier and the per-photon energy resolution estimate, within their uncertainties as explained above.

Figure 8: The effect of the systematic uncertainty assigned to the photon identification classifier output (left) and the photon resolution estimate (right) on the diphoton BDT output for background MC simulation (100 GeV < m γγ < 180 GeV) and for data. The nominal BDT output is shown as a stacked histogram and the variation due to the uncertainty is shown as a cross-hatched band. These plots show only the systematic uncertainties that are common to both signal and background. There are additional significant uncertainties that are not shown here.
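The successive-splitting procedure can be sketched as a greedy one-dimensional optimization. The sketch below uses a simple quadrature s/√b figure of merit on binned classifier distributions in place of the full expected-limit calculation, but it implements the same termination rule (stop when the relative gain falls below 1%):

```python
import numpy as np

def class_fom(s, b, cuts):
    """Quadrature sum of s/sqrt(b) over the classes defined by the
    boundary indices `cuts` on binned signal (s) and background (b)
    classifier histograms (illustrative figure of merit)."""
    edges = [0] + sorted(cuts) + [len(s)]
    z2 = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        sb, bb = s[lo:hi].sum(), b[lo:hi].sum()
        if bb > 0:
            z2 += sb * sb / bb
    return np.sqrt(z2)

def greedy_split(s, b, min_gain=0.01):
    """Add class boundaries one at a time, each time choosing the cut
    that maximizes the figure of merit; stop once the relative gain
    drops below min_gain (the <1% criterion in the text)."""
    cuts = []
    best = class_fom(s, b, cuts)
    while True:
        candidates = [c for c in range(1, len(s)) if c not in cuts]
        if not candidates:
            return cuts
        trial = max(candidates, key=lambda c: class_fom(s, b, cuts + [c]))
        new = class_fom(s, b, cuts + [trial])
        if new < best * (1.0 + min_gain):
            return cuts
        cuts.append(trial)
        best = new
```

With a signal-rich bin next to a background-rich bin, the procedure immediately isolates the signal-rich region into its own class.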
The sensitivity of the analysis is enhanced by using the special kinematics of Higgs bosons produced by the VBF process [110]. Dedicated classes of events are selected using dijet-tagging criteria. The 7 TeV data set has one class of dijet-tagged events, while the 8 TeV data set has two.
In the 7 TeV analysis, dijet-tagged events are required to contain two jets with transverse energies exceeding 20 and 30 GeV, respectively. The dijet invariant mass is required to be greater than 350 GeV, and the absolute value of the difference of the pseudorapidities of the two jets has to be larger than 3.5. In the 8 TeV analysis, dijet-tagged events are required to contain two jets and are categorized as "Dijet tight" or "Dijet loose". The jets in Dijet tight events must have transverse energies above 30 GeV and a dijet invariant mass greater than 500 GeV. For the jets in the Dijet loose events, the leading (subleading) jet transverse energy must exceed 30 (20) GeV and the dijet invariant mass be greater than 250 GeV, where leading and subleading refer to the jets with the highest and next-to-highest transverse momentum, respectively. The pseudorapidity separation between the two jets is also required to be greater than 3.0. Additionally, in both analyses the difference between the average pseudorapidity of the two jets and the pseudorapidity of the diphoton system must be less than 2.5 [111], and the difference in azimuthal angle between the diphoton system and the dijet system is required to be greater than 2.6 radians. To further reduce the background in the dijet classes, the p T threshold on the leading photon is increased to p γ T (1) > m γγ /2. Systematic uncertainties in the efficiency of dijet tagging for signal events arise from the uncertainty in the MC simulation modelling of the jet energy corrections and resolution, and from uncertainties in simulating the number of jets and their kinematic properties. These uncertainties are estimated by using different underlying-event tunes, PDFs, and renormalization and factorization scales as suggested in Refs. [25,26]. 
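The 8 TeV dijet-tag requirements can be summarized in a short classifier. This sketch covers only the dijet part of the selection (the raised leading-photon threshold is left to the caller), treats jets as massless when computing m jj, and uses (pt, eta, phi) tuples:

```python
import math

def dphi(a, b):
    """Azimuthal separation folded into [0, pi]."""
    d = abs(a - b) % (2 * math.pi)
    return d if d <= math.pi else 2 * math.pi - d

def dijet_tag_8tev(j1, j2, gg):
    """Return 'tight', 'loose', or None for the 8 TeV dijet selection as
    summarized in the text. j1 is the leading jet; gg is the diphoton
    system, all as (pt, eta, phi). Jets are treated as massless."""
    pt1, eta1, phi1 = j1
    pt2, eta2, phi2 = j2
    mjj = math.sqrt(max(2.0 * pt1 * pt2 * (math.cosh(eta1 - eta2)
                                           - math.cos(phi1 - phi2)), 0.0))
    if abs(eta1 - eta2) <= 3.0:          # pseudorapidity separation > 3.0
        return None
    if abs(0.5 * (eta1 + eta2) - gg[1]) >= 2.5:  # Zeppenfeld-type variable
        return None
    # azimuthal separation between the diphoton and dijet systems > 2.6 rad
    jx = pt1 * math.cos(phi1) + pt2 * math.cos(phi2)
    jy = pt1 * math.sin(phi1) + pt2 * math.sin(phi2)
    if dphi(gg[2], math.atan2(jy, jx)) <= 2.6:
        return None
    if pt1 > 30 and pt2 > 30 and mjj > 500:
        return "tight"
    if pt1 > 30 and pt2 > 20 and mjj > 250:
        return "loose"
    return None
```

Two hard, widely separated jets recoiling against the diphoton system land in the tight class; softer or lower-mass configurations fall back to the loose class.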
A total systematic uncertainty of 10% is assigned to the efficiency for VBF signal events to pass the dijet-tag criteria, and an uncertainty of 50%, dominated by the uncertainty in the underlying-event tune, to the efficiency for signal events produced by gluon-gluon fusion. Table 3 shows the predicted number of signal events for a SM Higgs boson with m H = 125 GeV, as well as the estimated number of background events per GeV of invariant mass at m γγ = 125 GeV, for each of the eleven event classes in the 7 and 8 TeV data sets. The table also gives the fraction of each Higgs boson production process in each class (as predicted by MC simulation) and the mass resolution, represented both as σ eff , half the width of the narrowest interval containing 68.3% of the distribution, and as the full-width-at-half-maximum (FWHM) of the invariant-mass distribution divided by 2.35.
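The σ eff quoted in Table 3 (half the width of the narrowest interval containing 68.3% of the distribution) is straightforward to compute from a sample of mass values; a minimal sketch:

```python
import numpy as np

def sigma_eff(values, fraction=0.683):
    """Half the width of the narrowest interval containing `fraction`
    of the sample, computed by scanning all contiguous windows of
    k = ceil(fraction * n) sorted points."""
    x = np.sort(np.asarray(values, dtype=float))
    n = len(x)
    k = int(np.ceil(fraction * n))
    # width of every contiguous k-point window
    widths = x[k - 1:] - x[:n - k + 1]
    return 0.5 * float(widths.min())
```

For a Gaussian sample, sigma_eff approaches the Gaussian σ; for asymmetric or tailed mass shapes it gives a robust resolution measure, which is why it is quoted alongside FWHM/2.35.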

Signal and background modelling
The modelling of the Higgs boson signal used in the estimation of the sensitivity has two aspects: first, the normalization, i.e. the expected number of signal events for each of the considered Higgs boson production processes; second, the diphoton invariant-mass shape. To model both aspects, including their respective uncertainties, the MC simulation events and theoretical considerations described in Section 4 are used. To account for the interference between the signal and background diphoton final states [112], the expected gluon-gluon fusion process cross section is reduced by 2.5% for all values of m H .

Table 3: Expected number of SM Higgs boson events (m H = 125 GeV) and estimated background (at m γγ = 125 GeV) for the event classes in the 7 (5.1 fb −1 ) and 8 TeV (5.3 fb −1 ) data sets. The composition of the SM Higgs boson signal in terms of the production processes and its mass resolution are also given.
Additional systematic uncertainties in the normalization of each event class arise from potential class-to-class migration of signal events caused by uncertainties in the diphoton event classifier value. The instrumental uncertainties in the classifier value and their effect have been discussed previously. The theoretical ones, arising from the uncertainty in the theoretical predictions for the photon kinematics, are estimated by measuring the amount of class migration under variation of the renormalization and factorization scales within the range m H /2 < µ < 2m H , (class migrations up to 12.5%) and the PDFs (class migrations up to 1.3%). These uncertainties are propagated to the final statistical analysis.
To model the diphoton invariant-mass spectrum properly, it is essential that the simulated diphoton mass and scale are accurately predicted. This is done by comparing the dielectron invariant-mass distribution in Z → ee events between data and MC simulation, where the electrons have been reconstructed as photons. This comparison is shown for the 8 TeV data in Fig. 9, where the points represent data, and the histogram MC simulation. Before correction, the dielectron invariant-mass distribution from simulation is narrower than the one from data, caused by an inadequate modelling of the photon energy resolution in the simulation.
To correct this effect, the photon energies in the Higgs boson signal MC simulation events are smeared and the data events scaled, so that the dielectron invariant-mass scale and resolution as measured in Z → ee events agree between data and MC simulation. These scaling and smearing factors are determined in a total of eight photon categories, i.e. separately for photons in four pseudorapidity regions (|η| < 1, 1 ≤ |η| < 1.5, 1.5 ≤ |η| < 2, and |η| ≥ 2), and separately for high R9 (>0.94) and low R9 (≤0.94) photons, where R9 is the ratio of the energy of the most energetic 3 × 3 crystal cluster and the total cluster energy.
Additionally, the factors are computed separately for different running periods in order to account for changes in the running conditions, for example the change in the average beam in-  tensity. These modifications reconcile the discrepancy between data and simulation, as seen in the comparison of the dots and solid curve of Fig. 9. The uncertainties in the scaling and smearing factors, which range from 0.2% to 0.9% depending on the photon properties, are taken as systematic uncertainties in the signal evaluation and mass measurement. The final signal model is then constructed separately for each event class and each of the four production processes as the weighted sum of two submodels that assume either the correct or incorrect primary vertex selection (as described in Section 5.2). The two submodels are weighted by the corresponding probability of picking the right (p vtx ) or wrong (1 − p vtx ) vertex. The uncertainty in the parameter p vtx is taken as a systematic uncertainty.
To describe the signal invariant-mass shape in each submodel, two different approaches are used. In the first, referred to as the parametric model, the MC simulated diphoton invariant-mass distribution is fitted to a sum of Gaussian distributions. The number of Gaussian functions ranges from one to three, depending on the event class and on whether the model is for a correct- or incorrect-vertex hypothesis. The systematic uncertainties in the signal shape are estimated from the variations in the parameters of the Gaussian functions. In the second approach, referred to as the binned model, the signal mass shape for each event class is taken directly from the binned histogram of the corresponding simulated Higgs boson events. The systematic uncertainties are included by parametrizing the change in each bin of the histogram as a linear function under variation of the corresponding nuisance parameter, i.e. the variable that parametrizes this uncertainty in the statistical interpretation of the data. The two approaches yield consistent final results and serve as an additional verification of the signal modelling. The presented results are derived using the parametric-model approach.
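The per-class signal shape, built as a vertex-probability-weighted sum of two Gaussian-sum submodels, can be sketched as follows (parameter values are illustrative; in the analysis they come from fits to simulation):

```python
import numpy as np

def gaussian_sum(m, params):
    """Normalized sum of Gaussians; params is a list of
    (fraction, mean, sigma) with fractions summing to one."""
    m = np.asarray(m, dtype=float)
    out = np.zeros_like(m)
    for f, mu, sig in params:
        out += (f * np.exp(-0.5 * ((m - mu) / sig) ** 2)
                / (sig * np.sqrt(2.0 * np.pi)))
    return out

def signal_model(m, p_vtx, right_params, wrong_params):
    """Per-class signal shape: correct-vertex submodel weighted by p_vtx
    plus incorrect-vertex submodel weighted by (1 - p_vtx), as in the
    text."""
    return (p_vtx * gaussian_sum(m, right_params)
            + (1.0 - p_vtx) * gaussian_sum(m, wrong_params))
```

Because both submodels are normalized densities and the weights sum to one, the combined shape integrates to unity regardless of p_vtx.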
The parametric signal models for a Higgs boson mass of 120 GeV in two of the 8 TeV BDT event classes are shown in Fig. 10. The signal models are summed over the four production processes, each weighted by their respective expected yield as computed from MC simulation. In Fig. 10, the left distribution corresponds to the event class with the highest classifier values and has a mass resolution σ eff = 1.34 GeV, while the right distribution is for classifier values between −0.05 and 0.50 and has σ eff = 2.77 GeV. This is the intended behaviour of the event class implementation.
The uncertainties in the weighting factors for each of the production processes arise from variations in the renormalization and factorization scales, and uncertainties in the PDFs. They range from several percent for associated production with W/Z to almost 20% for the gluon-gluon fusion process. The detailed values for the 8 TeV analysis, together with all the other systematic uncertainties discussed above, are summarized in Table 4. The corresponding uncertainties in the 7 TeV analysis are very similar, with the exception of the already mentioned uncertainty on the photon ID classifier, which was significantly larger in the 7 TeV analysis. The reason for this is a worse agreement between data and MC simulation.
In addition to the per-photon energy scale uncertainties, which are derived in the eight η-R9 categories, additional fully correlated energy scale uncertainties are assigned in order to account for possible non-linearity as a function of energy and for additional electron-photon differences. The uncertainty associated with possible non-linearities in the energy measurement as a function of the cluster energy is evaluated by measuring the energy scale of Z → ee events as a function of the scalar sum of the transverse momenta of the two electrons. The change in energy scale due to possible non-linearities is estimated to be around 0.2%; since this correction is not applied, a systematic uncertainty of 0.4% is assigned. An additional fully correlated uncertainty of 0.25% is assigned for electron-photon differences, amounting to half of the absolute energy scale difference between electrons and photons for non-showering electrons/photons in the barrel. Adding these two numbers in quadrature results in an additional energy scale uncertainty of 0.47%, which is treated as fully correlated among all event classes.
The modelling of the background relies entirely on the data. The observed diphoton invariant-mass distributions for the eleven event classes (five in the 7 TeV and six in the 8 TeV analysis) are fitted separately over the range 100 < m γγ < 180 GeV. This has the advantage that there are no systematic uncertainties due to potential mismodelling of the background processes by the MC simulation. The procedure is to fit the diphoton invariant-mass distribution to the sum of a signal mass peak and a background distribution. Since the exact functional form of the background in each event class is not known, the parametric model has to be flexible enough to describe an entire set of potential underlying functions. Using a wrong background model can lead to biases in the measured signal strength. Such a bias can, depending on the Higgs boson mass and the event class, reach or even exceed the size of the expected signal, and therefore dramatically reduce the sensitivity of the analysis to any potential signal. In what follows, a procedure for selecting the background function is described that results in a potential bias small enough to be neglected.

Table 4: Largest sources of systematic uncertainty in the analysis of the 8 TeV data set. Eight photon categories are defined, depending on their η and R9, where R9 is the ratio of the energy of the most energetic 3 × 3 crystal cluster to the total cluster energy. The four pseudorapidity regions are: |η| < 1 (low η), 1 ≤ |η| < 1.5 (high η) for the barrel, and 1.5 ≤ |η| < 2 (low η), |η| ≥ 2 (high η) for the endcaps; the two R9 regions are: high R9 (>0.94) and low R9 (≤0.94).
Sources of systematic uncertainty (per photon), barrel / endcap:
Photon selection efficiency 0.8% / 2.2%
Energy resolution (∆σ/E MC ) …
If the true underlying background model could be used in the extraction of the signal strength, and no signal is present in the fitted data, the median fitted signal strength would be zero in the entire mass region of interest. The deviation of the median fitted signal strength from zero in background-only pseudo-experiments can thus be used to quantify the potential bias. These pseudodata sets are generated from a set of hypothetical truth models, with each model using a different analytical function that adequately describes the observed diphoton invariant-mass distribution. The set of truth-models contains exponential and power-law functions, as well as polynomials (Bernstein polynomials) and Laurent series of different orders. None of these functions is required to describe the actual (unknown) underlying background distribution. Instead, we argue that they span the phase-space of potential underlying models in such a way that a fit model resulting in a negligible bias against all of them would also result in a negligible bias against the (unknown) true underlying distribution.
The first step in generating such pseudodata sets consists of constructing a truth model, from which the pseudodata set is drawn. This is done by fitting the data in each of the eleven event classes separately, and for each of the four general types of background functions, resulting in four truth-models for each event class. The order of the background function required to adequately describe the data for each of the models is determined by increasing the order until an additional increase does not result in a significant improvement of the fit to the observed data. A χ 2 -goodness-of-fit is used to quantify the fit quality, and an F-test to determine the termination criterion. "Increasing the order" here means adding additional terms of higher order in the case of the polynomial and the Laurent series, and adding additional exponential or power-law terms with different parameters in the case of the exponential and power-law truth models.
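The order-selection step (increase the order until an F-test shows no significant improvement) can be sketched for the polynomial case. The sketch substitutes a plain least-squares residual sum of squares for the paper's binned χ², and a fixed critical value for the proper F-distribution quantile:

```python
import numpy as np

def rss(x, y, order):
    """Residual sum of squares of a least-squares polynomial fit."""
    coeffs = np.polyfit(x, y, order)
    return float(np.sum((y - np.polyval(coeffs, x)) ** 2))

def choose_order(x, y, max_order=7, f_crit=3.9):
    """Increase the polynomial order until an F-test comparing order n+1
    to order n no longer shows a significant improvement. f_crit is an
    illustrative stand-in for the appropriate F-distribution quantile."""
    n = len(x)
    order = 1
    while order < max_order:
        r1 = rss(x, y, order)
        if r1 < 1e-12:   # current order already fits (numerically) perfectly
            return order
        r2 = rss(x, y, order + 1)
        # F statistic for one extra fit parameter
        f = float('inf') if r2 == 0 else (r1 - r2) / (r2 / (n - order - 2))
        if f < f_crit:
            return order
        order += 1
    return order
```

On exactly linear data the procedure stops at order 1; on cubic data it climbs to order 3 and stops, mirroring the termination criterion described in the text.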
Once the four truth models are determined for a given event class, ∼40 000 pseudodata sets are generated for each by randomly drawing diphoton mass values from them. The next step is then to find a function (in what follows referred to as fit model), that results in a negligible bias against all four sets of toy data in the entire mass region of interest, i.e. an analytical function that when used to extract the signal strength in all the 40 000 pseudodata sets, gives a mean value for the fitted strength consistent with zero.
The criterion for the bias to be negligible is that it must be five times smaller than the statistical uncertainty in the number of fitted events in a mass window corresponding to the FWHM of the corresponding signal model. With this procedure, any potential bias from the background fit function can be neglected in comparison with the statistical uncertainty from the finite data sample. We find that only the polynomial background function produces a sufficiently small bias for all four truth models. Therefore, we only use this background function to fit the data. The required order of the polynomial function needed to reach the sufficiently small bias is determined separately for each of the 11 event classes, and ranges from 3 to 5.
The entire procedure results in a background model for each of the event classes as a polynomial function of a given, class-dependent order. The parameters of this polynomial, i.e. the coefficients for each term, are left free in the fit, and their variations are therefore the only source of uncertainty from the modelling of the background.
The results of the simultaneous fit of the signal-plus-background models, derived as explained above, to the m γγ distributions of the data are shown for the eleven event classes in Figs. 11 and 12 for the 7 and 8 TeV data samples, respectively. The uncertainty bands shown in the background component of the fit arise from the variation of the background fit parameters, and correspond to the uncertainties in the expected background yield. The fit is performed on the data from all event class distributions simultaneously, with an overall floating signal strength. In these fits, the mass hypothesis is scanned in steps of 0.5 GeV between 110 and 150 GeV. At the point with the most significant excess over the background-only hypothesis (m H = 125 GeV), the best fit value is σ/σ SM = 1.56 ± 0.43.
In order to better visualize any overall excess/significance in the data, each event is weighted by a class-dependent factor, and its corresponding diphoton invariant mass is plotted with that weight in a single distribution. The weight depends on the event class and is proportional to S/(S + B), where S and B are the number of expected signal and background events in a mass window corresponding to 2σ eff , centered on m γγ = 125 GeV and calculated from the signal-plus-background fit to all data event classes simultaneously. The particular choice of the weights is motivated in Ref. [113]. The resulting distribution is shown in Fig. 13, where for reference the distribution for the unweighted sum of events is shown as an inset. The binning for the distributions is chosen to optimize the visual effect of the excess at 125 GeV, which is evident in both the weighted and unweighted distributions. It should be emphasized that this figure is for visualization purposes only, and no results are extracted from it.
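The S/(S + B) weighting is a simple per-class lookup; a minimal sketch, with the per-class expected yields supplied directly rather than taken from the fit:

```python
import numpy as np

def event_weights(classes, s_counts, b_counts):
    """Per-event weights proportional to S/(S+B) of the event's class,
    where S and B are the expected signal and background yields in a
    2*sigma_eff window around 125 GeV (here supplied per class).
    `classes` is the class index of each event."""
    w = [s / (s + b) for s, b in zip(s_counts, b_counts)]
    return np.array([w[c] for c in classes])
```

Events in high-purity classes then contribute with weight near one, while background-dominated classes are suppressed, which is what makes the excess visually prominent without changing any extracted result.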

Alternative analyses
In order to verify the results described above, two alternative analyses are performed. The first (referred to as the cut-based analysis) refrains from relying on multivariate techniques, except for the photon energy corrections described in Section 3. Instead, the photon identification is performed by an optimized set of requirements on the discriminating variables explained in Section 5.3. Additionally, instead of using a BDT event-classifier variable to separate events into classes, the event classes are built using requirements on the photons directly. Four mutually exclusive classes are constructed by splitting the events according to whether both candidate photons are reconstructed in the ECAL barrel or endcaps, and whether the R9 variable exceeds 0.94. This categorization is motivated by the fact that photons in the barrel with high R9 values are typically measured with better energy resolution than ones in the endcaps with low R9. Thus, the classification serves a similar purpose to the one using the BDT event classifier: events with good diphoton mass resolution are grouped together into one class. The four event classes used in this analysis are then:
• both photons are in the barrel, with R9 > 0.94,
• both photons are in the barrel and at least one of them with R9 ≤ 0.94,
• at least one photon is in the endcap and both photons with R9 > 0.94,
• at least one photon is in the endcap and at least one of them with R9 ≤ 0.94.
The second alternative analysis (referred to as the sideband analysis) uses the identical multivariate technique as the baseline analysis, as well as an identical event sample, but relies on different procedures to model the signal and background contributions. This approach uses data in the sidebands of the invariant mass distribution to model the background. Consequently, this analysis is much less sensitive to the parametric form used to describe the diphoton mass spectrum and allows the explicit inclusion of a systematic uncertainty for the possible bias in the background mass fit. For any given mass hypothesis m_H, a signal region is defined in the range ±2% on either side of m_H. A contiguous set of sidebands is defined in the mass distribution on either side of the signal region, from which the background is extracted. Each sideband has a width equivalent to ±2% relative to the mass hypothesis corresponding to the centre of the sideband. A total of six sidebands are used in the analysis (three on either side of the signal region), with the two sidebands adjacent to the signal region omitted in order to avoid signal contamination, as illustrated in Fig. 14.

The result is extracted by counting events in the signal region, in classes defined by the output distribution of a BDT. This mass-window BDT takes two dimensionless inputs: the diphoton BDT output (described in Section 5.4) and the mass, in the form ∆m/m_H, where ∆m = m_γγ − m_H and m_H is the Higgs boson mass hypothesis. The output of the BDT is binned to define the event classes. The bin boundaries are optimized to give the maximum expected significance in the presence of a SM Higgs boson signal, and the number of bins is chosen such that any further increase in the number of bins improves the expected significance by less than 0.1%. The same bin boundaries are used for the signal region and for the six sidebands.
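The sideband geometry can be made concrete with a short sketch. Since every window spans ±2% of its own centre, contiguous windows have centres in a geometric series of ratio (1 + w)/(1 − w); this construction and the function names are our illustration of the description above, not analysis code.

```python
# Sketch of the contiguous +-2% mass windows of the sideband analysis.
# Signal window: +-2% around m_H. Each sideband spans +-2% of its own
# centre; the sideband adjacent to the signal region on each side is
# skipped, and the next three on each side are used (six in total).

def mass_windows(m_h, w=0.02, n_skip=1, n_use=3):
    ratio = (1.0 + w) / (1.0 - w)        # centre-to-centre scaling
    signal = (m_h * (1.0 - w), m_h * (1.0 + w))
    lower, upper = [], []
    for k in range(n_skip + 1, n_skip + n_use + 1):
        c_up = m_h * ratio**k            # centre of k-th window above m_H
        c_dn = m_h / ratio**k            # centre of k-th window below m_H
        upper.append((c_up * (1.0 - w), c_up * (1.0 + w)))
        lower.append((c_dn * (1.0 - w), c_dn * (1.0 + w)))
    return signal, lower[::-1], upper

signal, low_sb, high_sb = mass_windows(125.0)
```

For m_H = 125 GeV the signal window is 122.5 to 127.5 GeV, and each used sideband shares an edge with its neighbour, as in Fig. 14.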
The dijet-tagged events constitute an additional bin (two bins for the 8 TeV data set) appended to the bins of the mass-window BDT output value.
The background model (i.e. the BDT output distribution for background events in the signal region) is constructed from the BDT output distributions of the data in each of the six sidebands. The only assumptions made concerning the background model shape, both verified within the assigned systematic errors, are that the fraction of events in each BDT output bin varies linearly as a function of invariant mass (and thus with sideband position), and that there is negligible signal contamination in the sidebands. Only the overall normalization of the background model (the total number of background events in the signal region) is obtained from a parametric fit to the mass spectrum. The signal region is excluded from this fit. The bias incurred by the choice of the functional form used in the fit has been studied in a similar fashion to that described in Section 5.5, and is covered with a systematic uncertainty of 1%.
The mass-window BDT is trained using simulated Higgs boson events with m H = 123 GeV and simulated background events, including prompt-prompt, prompt-fake, and fake-fake processes. The training samples are not used in any other part of the analysis, except as input to the binning algorithm, thus avoiding any biases from overtraining.
The signal region for mass hypothesis m H = 125 GeV is estimated from simulation to contain 93% of the signal. The number of expected signal events in each bin is determined using MC simulation, as in the baseline analysis. Systematic uncertainties in the signal modelling lead to event migrations between the BDT bins, that are accounted for as additional nuisance parameters in the limit-setting procedure.
Examples of distributions in this analysis are shown in Fig. 15, for the 7 (left) and 8 TeV (right) data sets. The different event classes are listed along the x axis. The first seven classes are the mass-window BDT classes. They are ordered by increasing expected signal-to-background ratio. The class labeled as "Dijet" contains the dijet-tagged events. The number of data events, displayed as points, is compared to the expected background events determined from the sideband population, shown by the histogram. The expected signal yield for a Higgs boson mass of m H = 125 GeV is shown with the dotted line. The statistical interpretation of the results is given in Section 10.

Event selection and kinematics
The search for the decay H → ZZ → 4ℓ with ℓ = e, µ is performed by looking for a narrow four-lepton invariant-mass peak in the presence of a small continuum background. The background sources include an irreducible four-lepton contribution from direct ZZ (Zγ*) production via the qq annihilation and gg fusion processes. Reducible contributions arise from Z + bb and tt production, where the final state contains two isolated leptons and two b-quark jets that produce two nonprompt leptons. Additional background arises from Z+jets and WZ+jets events, where jets are misidentified as leptons. Since there are differences in the reducible background rates and mass resolutions between the subchannels 4e, 4µ, and 2e2µ, they are analyzed separately and the results are then combined statistically.
Compared to the first CMS ZZ → 4ℓ analysis reported in Ref. [114], this analysis employs improved muon reconstruction, improved lepton identification and isolation, the recovery of final-state-radiation (FSR) photons, and a kinematic discriminant that exploits the expected decay kinematics of the signal events. New mass and spin-parity results obtained from an H → ZZ → 4ℓ analysis using additional integrated luminosity at the centre-of-mass energy of 8 TeV are described in a recent CMS publication [115], and are not discussed further here.
Candidate events are first selected by triggers that require the presence of a pair of electrons or muons. An additional trigger requiring an electron and a muon in the event is also used for the 8 TeV data. The minimum p_T requirements on the two trigger leptons are 17 and 8 GeV. The trigger efficiency is determined by first adjusting the simulation to reproduce the efficiencies obtained on single lepton legs in special tag-and-probe measurements, and then using the simulation to combine lepton legs within the acceptance of the analysis. The efficiency for a Higgs boson of mass > 120 GeV is greater than 99% (98%, 95%) in the 4µ (2e2µ, 4e) channel. The candidate events are selected using identified and isolated leptons. The electrons are required to have transverse momentum p_T^e > 7 GeV and pseudorapidity within the tracker geometrical acceptance, |η^e| < 2.5. The corresponding requirements for muons are p_T^µ > 5 GeV and |η^µ| < 2.4. No gain in expected significance for a Higgs boson signal is obtained by lowering the p_T thresholds for the leptons, since the improvement in signal detection efficiency is accompanied by a large increase in the Z+jets background.
The lepton-identification techniques have been described in Section 3. The multivariate electron identification is trained using a Higgs boson MC simulation sample for the H → ZZ signal and a sample of W+1-jet events from data for the background. The working point is optimized using a Z+1-jet data sample. For each lepton ℓ = e, µ, an isolation requirement of R_Iso^ℓ < 0.4 is applied to suppress the Z+jets, Z+bb, and tt backgrounds. In addition, the lepton impact parameter significance with respect to the primary vertex, defined as SIP_3D = IP/σ_IP, with IP the impact parameter in three dimensions and σ_IP its uncertainty, is used to further reduce the background. The requirement |SIP_3D| < 4 suppresses the Z + bb and tt backgrounds with negligible effect on the signal efficiency.
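The two lepton-quality requirements above reduce, as an illustration, to the following sketch (field and function names are ours; units of IP are arbitrary as only the ratio enters):

```python
# Sketch of the lepton selection quantities: the isolation requirement
# R_Iso < 0.4 and the 3D impact-parameter significance
# SIP_3D = IP / sigma_IP, with |SIP_3D| < 4 required.

def sip_3d(ip, sigma_ip):
    """3D impact-parameter significance."""
    return ip / sigma_ip

def passes_selection(r_iso, ip, sigma_ip, iso_cut=0.4, sip_cut=4.0):
    return r_iso < iso_cut and abs(sip_3d(ip, sigma_ip)) < sip_cut

# A prompt-like lepton: small isolation sum, small displacement (SIP_3D = 2).
prompt_ok = passes_selection(r_iso=0.1, ip=0.002, sigma_ip=0.001)
# A lepton from a b-hadron decay: displaced vertex, SIP_3D = 20.
displaced = passes_selection(r_iso=0.1, ip=0.020, sigma_ip=0.001)
```

The significance, rather than the raw impact parameter, is cut on so that the requirement adapts automatically to the per-lepton measurement resolution.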
The efficiencies for the reconstruction, identification, and isolation of electrons and muons are measured in data, using a tag-and-probe technique [116] based on an inclusive sample of Z → ℓℓ events. The measurements are performed in bins of p_T and |η|. Additional samples of dileptons with p_T < 15 GeV from J/ψ decays are used for the efficiency measurements (in the case of muons) or for consistency checks (in the case of electrons). Examples of tag-and-probe results for the lepton identification efficiencies obtained with data and MC simulation are shown for electrons (top) and muons (bottom) in Fig. 16. The efficiencies measured with data are in agreement with those obtained using MC simulation. The mean differences (at the percent level) are used to correct the MC simulation predictions, and the uncertainty in the difference is propagated as a systematic uncertainty per lepton. The overall lepton selection efficiencies are obtained as the product of the reconstruction, identification, and isolation efficiencies. The overall efficiency for selecting electrons in the ECAL barrel (endcaps) varies from about 71% (65%) for 7 < p_T^e < 10 GeV to 82% (73%) just above 10 GeV, and reaches 90% (89%) for p_T^e ≳ 20 GeV. The efficiency for electrons drops to about 85% in the transition region, 1.44 < |η^e| < 1.57, between the ECAL barrel and endcaps. The muons are selected with an efficiency above 98% in the full |η^µ| < 2.4 range for p_T^µ > 5 GeV. Photons reconstructed with pseudorapidity |η^γ| < 2.4 are possible FSR candidates. The photon selection criteria are optimized as a function of the angular distance between the photon and the closest lepton in (η, φ) space. In an inner cone of ∆R = 0.07, photons are accepted if p_T^γ > 2 GeV, with no further requirements.
In an outer annulus 0.07 < ∆R < 0.5, where the rate of photons from the underlying event and pileup is much larger, a tighter threshold of p_T^γ > 4 GeV is used, and the photons are also required to be isolated: the sum of the p_T of all charged hadrons, neutral hadrons, and photons in a cone of radius ∆R = 0.3 centred on the photon must not exceed the p_T of the photon itself. In contrast to lepton isolation, and in order to take into account the fact that the photon might come from a pileup interaction, the photon isolation also uses the charged hadrons associated with other primary vertices. The selection criteria have been tuned to achieve approximately the same purity in the two angular regions. When reconstructing the Z → ℓℓ candidates, only FSR photons associated with the closest lepton, and that make the dilepton-plus-photon invariant mass closer to the nominal Z mass than the dilepton invariant mass, are kept. The dilepton-plus-photon invariant mass must also be less than 100 GeV. The performance of the FSR selection algorithm is measured using MC simulation samples of H → ZZ events, and the rate is verified with inclusive Z-boson events in data. Photons within the acceptance for the FSR selection are measured with an efficiency of 50% and a mean purity of 80%. FSR photons are selected in 5% of inclusive Z-boson events in the muon channel and 0.5% in the electron channel. In the case of electrons, the FSR photons are often implicitly combined into the electron superclusters, resulting in a lower FSR recovery efficiency.
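The FSR recovery logic described above can be sketched in two steps: a cone-dependent photon acceptance, and the decision to attach the photon to the dilepton system. The helper names and the flat inputs are illustrative.

```python
# Sketch of the FSR photon recovery. Masses and pT in GeV.

Z_MASS = 91.19  # nominal Z boson mass

def accept_fsr_photon(pt, delta_r, iso_sum):
    """Cone-dependent acceptance of an FSR photon candidate."""
    if delta_r < 0.07:
        return pt > 2.0                   # inner cone: pT cut only
    if delta_r < 0.5:
        return pt > 4.0 and iso_sum < pt  # outer annulus: tighter pT + isolation
    return False

def keep_fsr(m_ll, m_llgamma):
    """Attach the photon only if it brings the mass closer to the
    nominal Z mass and m(ll+gamma) stays below 100 GeV."""
    return abs(m_llgamma - Z_MASS) < abs(m_ll - Z_MASS) and m_llgamma < 100.0
```

A typical use: a 3 GeV photon at ∆R = 0.05 from a muon is accepted, and is kept if it moves a 80 GeV dimuon mass to, say, 90.5 GeV.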
The Z boson candidates are reconstructed from pairs of leptons of the same flavour and opposite charge (ℓ+ℓ−). The lepton pair with an invariant mass closest to the nominal Z mass is denoted Z1, with mass m_Z1, and is retained if it satisfies 40 < m_Z1 < 120 GeV. The invariant mass of the second Z candidate, denoted Z2, must satisfy 12 < m_Z2 < 120 GeV. The minimum value of 12 GeV is found from simulation to provide the optimal sensitivity for a Higgs boson mass in the range 110 < m_H < 160 GeV. If more than one Z2 candidate satisfies all the criteria, we choose the candidate reconstructed from the two leptons with the highest scalar sum of their p_T. Among the four selected leptons forming Z1 and Z2, at least one is required to have p_T > 20 GeV and another p_T > 10 GeV. These p_T thresholds ensure that the selected leptons are on the high-efficiency plateau of the trigger. To further reject leptons originating from weak semileptonic hadron decays or decays of low-mass hadronic resonances, we require that all opposite-charge pairs of leptons chosen from among the four selected leptons (irrespective of flavour) have an invariant mass greater than 4 GeV. The phase space for the Higgs boson search is defined by restricting the four-lepton mass range to m_4ℓ > 100 GeV. The predicted lepton p_T distributions from the MC simulation for a Higgs boson with m_H = 125 GeV are shown in Fig. 17 for the 4e, 4µ, and 2e2µ channels. Also given in Fig. 17 (bottom right) are the event selection efficiencies for each of the three lepton channels, as a function of the Higgs boson mass. These distributions clearly emphasize the importance of low lepton-p_T thresholds and high lepton efficiencies. The selection efficiencies shown in Fig. 17 are relative to events where all four leptons are within the geometrical acceptance and all dilepton invariant masses satisfy m_ℓℓ > 1 GeV.
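The Z1/Z2 pairing rule can be sketched as below. This is a simplification for illustration: it works on precomputed dilepton pairs and omits, among other things, the requirement that the two candidates not share a lepton and the per-lepton p_T and 4 GeV cuts.

```python
# Sketch of the Z candidate pairing: Z1 is the same-flavour,
# opposite-charge pair closest to the nominal Z mass; Z2 is, among the
# remaining valid pairs, the one with the highest scalar pT sum.

Z_MASS = 91.19  # nominal Z boson mass in GeV

def choose_z_candidates(pairs):
    """pairs: list of dicts with 'mass' and 'sum_pt' in GeV, each a
    same-flavour, opposite-charge dilepton. Returns (Z1, Z2) or None."""
    z1 = min(pairs, key=lambda p: abs(p["mass"] - Z_MASS))
    if not 40.0 < z1["mass"] < 120.0:
        return None
    others = [p for p in pairs if p is not z1 and 12.0 < p["mass"] < 120.0]
    if not others:
        return None
    z2 = max(others, key=lambda p: p["sum_pt"])  # highest scalar pT sum
    return z1, z2

pairs = [{"mass": 90.0, "sum_pt": 80.0},
         {"mass": 30.0, "sum_pt": 60.0},
         {"mass": 25.0, "sum_pt": 70.0}]
z1, z2 = choose_z_candidates(pairs)
```

In the example, the 90 GeV pair becomes Z1, and the 25 GeV pair wins the Z2 choice on its larger p_T sum despite being further from the Z mass.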
The combined signal reconstruction and selection efficiency, for a Higgs boson with m H = 125 GeV, is 18% for the 4e channel, 40% for the 4µ channel, and 27% for the 2e2µ channel. The expected resolution on the per-event mass measurement is on average 2.2% for the 4e channel, 1.1% for the 4µ channel, and 1.6% for the 2e2µ channel.
The kinematics of the H → ZZ → 4ℓ process, as of any boson decaying to ZZ, has been extensively studied in the literature [117–129]. Since the Higgs boson is spinless, the angular distribution of its decay products is independent of the production mechanism. In the Higgs boson rest frame, for a given invariant mass of the 4ℓ system, the kinematics are fully described by five angles, denoted Ω, and the invariant masses of the two lepton pairs, m_Z1 and m_Z2. These seven variables provide significant discriminating power between signal and background.
A kinematic discriminant (K_D) is introduced using the full probability density in the dilepton masses and angular variables, P(m_Z1, m_Z2, Ω | m_4ℓ). The K_D is constructed for each candidate event from the probability ratio of the signal and background hypotheses, K_D = P_sig/(P_sig + P_bkg), as described in Refs. [23,130]. For the signal, the phase-space and Z-propagator terms [119] are included in a fully analytic parametrization of the Higgs boson signal [126]. An analytic parametrization is also used for the background probability distribution in the mass range above the ZZ threshold, while below this threshold it is tabulated using a MC simulation of the qq → ZZ(Zγ*) process.
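The discriminant itself is a one-line likelihood ratio; the sketch below shows only that combination step, with the signal and background densities as plain numbers standing in for the analytic/tabulated parametrizations.

```python
# Sketch of the kinematic discriminant K_D = P_sig / (P_sig + P_bkg):
# the inputs are the signal and background probability densities
# evaluated on the event's (m_Z1, m_Z2, Omega) at the measured m_4l.

def kinematic_discriminant(p_sig, p_bkg):
    """Lies in [0, 1]; close to 1 for signal-like events."""
    return p_sig / (p_sig + p_bkg)
```

For instance, an event with a signal density three times its background density gets K_D = 0.75, well above the K_D > 0.5 requirement used in Fig. 21 (right).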

Background estimation and systematic uncertainties
The small number of observed candidate events precludes a precise direct determination of the background by extrapolating from the mass sidebands of the signal region. Instead, we rely on MC simulation to evaluate the local density (∆N/∆m_4ℓ) of ZZ background events expected as a function of m_4ℓ. The cross section for ZZ production at NLO is calculated with MCFM [131–133]. This includes the dominant process from qq annihilation, as well as the contribution from gluon-gluon fusion. The uncertainties in the predicted number of background events owing to the variation of the QCD renormalization and factorization scales and of the PDF set are on average 8% for each final state [26]. The number of predicted ZZ → 4ℓ events and their systematic uncertainties after the signal selection are given in Table 5.
The reducible Z + bb, tt, Z + jets, Z + γ + jets, and WZ + jets backgrounds contain at least one nonprompt lepton in the four-lepton final state. The main sources of nonprompt leptons are electrons and muons from decays of heavy-flavour quarks, misidentified jets (usually originating from light-flavour quarks), and electrons from photon conversions. The lepton misidentification probabilities are measured in data samples of Z + jet events with one additional reconstructed lepton, which are dominated by final states that include a Z boson and a fake lepton. The contamination from WZ production in these events is suppressed by requiring E_T^miss < 25 GeV. The lepton misidentification probabilities measured from these events are consistent with those derived from MC simulation. These misidentification probabilities are applied to dedicated Z1 + X control samples, where X contains two reconstructed leptons with relaxed isolation and identification criteria. Starting from these samples, two complementary approaches are used to extract the reducible Z + X background yield expected in the 4ℓ signal region. The first approach avoids signal contamination in the background sample by reversing the opposite-sign requirement on the Z2 lepton candidates, and then applies the misidentification probabilities to the additional leptons to calculate the expected number of background events in the signal sample. The second approach uses a control region defined by two opposite-sign leptons failing the isolation and identification criteria, and uses the misidentification probability to extrapolate to the signal region. In addition, a control region with three passing leptons and one failing lepton is used to estimate the background with three prompt leptons and one misidentified lepton. The two methods give comparable background predictions in the signal region within their uncertainties.
The average of the two predictions is used for the background estimate, with an uncertainty that includes the difference between them (see Table 5).
Systematic uncertainties are evaluated from the data for the trigger efficiency (1.5%) and for the combined four-lepton reconstruction, identification, and isolation efficiencies, which vary from 1.2% in the 4µ channel at m_H = 150 GeV to about 11% in the 4e channel at m_H = 120 GeV. The effects of the systematic uncertainties in the lepton energy-momentum calibration (0.4%) and energy resolution on the four-lepton invariant-mass distribution are taken into account. The accuracy of the absolute mass scale and resolution is validated using Z → ℓℓ, Υ → ℓℓ, and J/ψ → ℓℓ events. The effect of the energy resolution uncertainty is taken into account by introducing a 20% variation on the simulated width of the signal mass peak. An uncertainty of 50% is assigned to the reducible background rate. This arises from the finite statistical precision in the reducible background control regions, differences in the background composition between the various control regions, and differences between the data samples used to measure the lepton misidentification probabilities. Since all the reducible and instrumental backgrounds are estimated using control regions in the data, they are independent of the uncertainty in the integrated luminosity. However, this uncertainty (2.2% at 7 TeV [134] and 4.4% at 8 TeV [135]) does affect the prediction of the ZZ background and the normalization of the signal in determining the Higgs boson cross section. Finally, the systematic uncertainties in the theoretical Higgs boson cross section (17–20%) and in the H → 4ℓ branching fraction (2%) are taken from Ref. [25].

Results
The number of selected ZZ → 4ℓ candidate events in the mass range 110 < m_4ℓ < 160 GeV for each of the three final states is given in Table 5. The number of predicted background events in each of the three final states and their uncertainties are also given, together with the number of signal events expected from a SM Higgs boson of m_H = 125 GeV.

Table 5: The number of observed selected events, compared to the expected background yields and the expected number of signal events (m_H = 125 GeV) for each lepton final state in the H → ZZ → 4ℓ analysis. The estimates of the ZZ background are from MC simulation, and those of the Z + X background are based on data. These results are given for the four-lepton invariant-mass range from 110 to 160 GeV. The total expected background and the observed numbers of events are also given integrated over the three bins ("signal region", defined as 121.5 < m_4ℓ < 130.5 GeV) of Fig. 18, centred on the bin where the most significant excess is seen. The uncertainties shown include both statistical and systematic components.
The observed m_4ℓ distribution from the data is shown in Fig. 18. There is a clear peak at the Z boson mass from the decay Z → 4ℓ [136]. The size and shape of the peak are consistent with the background prediction. Over the full Higgs boson search region from 110 to 160 GeV, the reducible background from Z + X events is much smaller than the irreducible ZZ (Zγ*) background. There is an excess of events above the expected background near 125 GeV. The total number of observed events and the expected number of background events in the three bins centred on the excess (121.5 < m_4ℓ < 130.5 GeV), referred to as the "signal" region, are given in Table 5. The expected four-lepton invariant-mass distribution for a Higgs boson with a mass of 125 GeV is shown by the open histogram in Fig. 18.
The distributions of the reconstructed Z1 and Z2 dilepton invariant masses for the events in the signal region are shown in the left and right plots of Fig. 19, respectively. The Z1 distribution has a tail towards low invariant mass, indicating that even the higher-mass Z boson is often off-shell.
The observed distribution of the K_D discriminant values for invariant masses in the signal range 121.5 < m_4ℓ < 130.5 GeV is shown in Fig. 21 (left). The m_4ℓ distribution of events satisfying K_D > 0.5 is shown in Fig. 21 (right). The clustering of events near m_4ℓ ≈ 125 GeV is clearly visible.

H → WW
The decay mode H → WW is highly sensitive to a SM Higgs boson with a mass around the WW threshold of 160 GeV. With the lepton identification and E_T^miss reconstruction optimized for LHC pileup conditions, it is possible to extend the sensitivity down to 120 GeV. The analysis of the 7 TeV data is described in Ref. [137] and remains unchanged, while the 8 TeV analysis is modified to cope with the more difficult conditions induced by the higher pileup in the 2012 data taking, and is explained below.

WW event selection
To improve the signal sensitivity, events are separated by jet multiplicity into three mutually exclusive categories, which are characterized by different expected signal yields and signal-to-background ratios. We call these the 0-jet, 1-jet, and 2-jet categories. Jets are reconstructed using the selection described in Section 3, and events are classified according to the number of selected jets with E_T > 30 GeV and |η| < 4.7. To exclude electrons and muons from the jet sample, the jets are required to be separated from the selected leptons by ∆R(jet, lepton) > 0.3. Events with more than two jets are considered only if there are no additional jets above this threshold in the pseudorapidity region between the two highest-E_T jets. Furthermore, the search splits candidate signal events into three final states: e+e−, µ+µ−, and e±µ∓.
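The jet counting and categorization can be sketched as follows, with simple dict objects standing in for reconstructed jets and leptons (the field names are illustrative):

```python
# Sketch of the jet-multiplicity categorization: count selected jets
# with E_T > 30 GeV and |eta| < 4.7 that are separated from the chosen
# leptons by Delta R > 0.3, then map the count to the 0/1/2-jet category.

import math

def delta_r(a, b):
    dphi = math.remainder(a["phi"] - b["phi"], 2 * math.pi)  # wrap to [-pi, pi]
    deta = a["eta"] - b["eta"]
    return math.hypot(deta, dphi)

def count_jets(jets, leptons):
    selected = [j for j in jets
                if j["et"] > 30.0 and abs(j["eta"]) < 4.7
                and all(delta_r(j, l) > 0.3 for l in leptons)]
    return len(selected)

def category(n_jets):
    return min(n_jets, 2)  # 0-jet, 1-jet, or 2-jet (two or more) category

jets = [
    {"et": 40.0, "eta": -2.0, "phi": 2.0},  # counted
    {"et": 40.0, "eta": 1.0, "phi": 0.05},  # too close to a lepton
    {"et": 25.0, "eta": 0.5, "phi": 1.0},   # below the E_T threshold
    {"et": 50.0, "eta": 5.0, "phi": 2.0},   # outside |eta| < 4.7
]
leptons = [{"eta": 1.0, "phi": 0.1}]
n = count_jets(jets, leptons)
```

The extra central-jet veto applied between the two leading jets in the multijet case is omitted here for brevity.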
The bulk of the signal arises through direct WW decays to dileptons of opposite charge, where the small contribution proceeding through intermediate leptonic τ decays is implicitly included. The events are selected by triggers that require the presence of one or two high-p_T electrons or muons. The trigger efficiency for signal events that pass the full event selection is measured to be above 97% in the µ+µ− final state, and above 98% in the e+e− and e±µ∓ final states, for a Higgs boson mass of about 125 GeV. The trigger efficiencies increase with the Higgs boson mass. These efficiencies are measured using Z/γ* → ℓ+ℓ− events [116], with associated uncertainties of about 1%. A tight muon selection is applied, as described in Section 3. Muons are required to be isolated to distinguish muon candidates from W boson decays from those due to QCD background processes, which are usually in or near jets. For each muon candidate, the scalar sum of the transverse energy of all particles consistent with originating from the primary vertex is reconstructed in cones of several widths around the muon direction, excluding the contribution from the muon itself. This information is combined using a multivariate algorithm that exploits the differences in the energy deposition between prompt muons and muons from hadron decays inside a jet.
Electron candidates are identified using the multivariate approach described in Section 3. Electrons are required to be isolated by applying a threshold on the sum of the transverse energy of the particles that are reconstructed in a cone around them, excluding the contribution from the electron itself. For both electrons and muons, a correction is applied to account for the contribution to the energy in the isolation cone from pileup, as explained in Section 3.
In addition to high-momentum, isolated leptons and minimal jet activity, missing transverse momentum is present in signal events, but generally not in the background. In this analysis, a projected E_T^miss variable is used to suppress backgrounds with misreconstructed missing transverse momentum. To suppress the top-quark background, a top-quark tagging technique, based on low-momentum muon identification and b-jet tagging [31], is applied. The first selection is designed to veto events containing muons from b hadrons coming from top-quark decays. The second selection uses a b-jet tagging algorithm that looks for tracks with large impact parameters within jets. The combined rejection of the two selections for the top-quark background is about 50% in the 0-jet category and above 80% for events with at least one jet passing the selection criteria.
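The text does not spell out the projected E_T^miss definition; the sketch below uses the definition commonly employed in CMS H → WW analyses (an assumption on our part): when the nearest lepton lies within π/2 in azimuth of the missing-momentum direction, only the component of E_T^miss transverse to that lepton is kept, which suppresses fake E_T^miss from a mismeasured lepton or a leptonic τ decay.

```python
# Sketch of a projected E_T^miss variable (definition assumed, as used
# in CMS H -> WW analyses): project onto the direction transverse to
# the nearest lepton if that lepton is within pi/2 in azimuth.

import math

def projected_met(met, met_phi, lepton_phis):
    dphi = min(abs(math.remainder(met_phi - phi, 2 * math.pi))
               for phi in lepton_phis)
    if dphi < math.pi / 2:
        return met * math.sin(dphi)  # keep only the transverse component
    return met                       # no nearby lepton: keep E_T^miss as is
```

A 50 GeV E_T^miss pointing along a lepton is thus projected to zero, while the same E_T^miss pointing away from all leptons is left unchanged.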
Various selection criteria are used to reduce the other background contributions. For the W+jets background, a minimum dilepton transverse momentum (p_T^ℓℓ) of 45 GeV is required. To reduce the background from WZ production, any event that has a third lepton passing the identification and isolation requirements is rejected. This requirement rejects less than 1% of the WW → 2ℓ2ν events, while rejecting around 35% of the remaining WZ events. The contribution from Wγ production, where the photon converts into an electron pair, is reduced by about 90% in the dielectron final state by requirements that reject γ conversions. These requirements consist of searching for tracks that, when associated with the electron, form good conversion candidates. The background from low-mass resonances is rejected by requiring a dilepton invariant mass (m_ℓℓ) greater than 12 GeV.
The Drell-Yan process produces same-flavour lepton pairs (e+e− and µ+µ−). In order to suppress this background, a few additional requirements are applied in the same-flavour final states. First, the resonant Z component of the Drell-Yan production is rejected by requiring a dilepton mass outside a 30 GeV window centred on the Z mass. Then, the remaining off-peak contribution is suppressed by exploiting different E_T^miss-based approaches, depending on the number of jets and the Higgs boson mass hypothesis. At large Higgs boson masses (m_H > 140 GeV), signal events are associated with large E_T^miss, and thus, to suppress the Drell-Yan background, it is sufficient to require the minimum of the two projected E_T^miss variables to be greater than 45 GeV. In contrast, for low-mass Higgs boson events (m_H ≤ 140 GeV) it is more difficult to separate the signal from the Drell-Yan background; in this case, a dedicated multivariate selection, combining the missing transverse momentum with kinematic and topological variables, is used to reject Drell-Yan events and maximize the signal yield. A third approach is employed in events with two jets. Here, the dominant source of E_T^miss is the mismeasurement of the hadronic jet energy, and the optimal performance is obtained by requiring E_T^miss > 45 GeV. Finally, the angle in the transverse plane between the dilepton system and the most energetic jet must be smaller than 165°. These selections reduce the Drell-Yan background by three orders of magnitude, while rejecting less than 50% of the signal, as determined from simulation.
After applying the full set of selection criteria, referred to as the WW selection, the observed yields in the combined 7 and 8 TeV data set are 1594, 1186, and 1295 events in the 0-jet, 1-jet, and 2-jet categories, respectively. This sample is dominated by nonresonant WW events in the 0-jet category and by similar fractions of WW and top-quark events in the other two categories. The main efficiency loss is due to the lepton selection and the stringent E_T^miss requirements; the expected yields are summarized in Table 6. The m_ℓℓ distributions in the 0-jet (left) and 1-jet (right) categories for the eµ candidate events are shown in Fig. 24, along with the predictions for the background and for a SM Higgs boson with m_H = 125 GeV.

The 2-jet category is mainly sensitive to VBF production [74,75,77,138], whose cross section is roughly ten times smaller than that from gluon-gluon fusion. The VBF channel offers a different production mechanism to test the consistency of a signal with the SM Higgs boson hypothesis. The VBF signal can be extracted using simple selection criteria, especially in the relatively low-background environment of the fully leptonic WW decay mode, providing additional search sensitivity. The H → WW events from VBF production are characterized by two energetic forward-backward jets and very little hadronic activity in the rest of the event. Events passing the WW criteria are further required to satisfy p_T > 30 GeV for the two highest-E_T jets, with no jets above this threshold present in the pseudorapidity region between these two jets. Both leptons are required to be within the pseudorapidity region between the two jets. To reject the main background from top-quark decays, the two jets must have a pseudorapidity difference larger than 3.5 and a dijet invariant mass greater than 450 GeV. In addition, m_T is required to be between 30 GeV and the Higgs boson mass hypothesis. Finally, an m_H-dependent upper limit on the dilepton mass is applied.

Background predictions
A combination of techniques is used to determine the contributions from the background processes that remain after the final selection. The largest background contributions are estimated directly from data, avoiding uncertainties related to the simulation of these sources. The remaining contributions estimated from simulation are small.
The W+jets and QCD multijet backgrounds arise from semileptonic decays of heavy quarks, hadrons misidentified as leptons, and electrons from photon conversions. Estimates of these contributions are derived directly from data, using a control sample of events in which one lepton passes the standard criteria and the other does not, but instead satisfies a relaxed set of requirements ("loose" selection), resulting in a "tight-loose" sample. Then the efficiency, ε_loose, for a lepton candidate that satisfies the loose selection to also pass the tight selection is determined, using data from an independent multijet event sample dominated by nonprompt leptons, and parametrized as a function of the p_T and η of the lepton. Finally, the background contamination is estimated using the events in the "tight-loose" sample, weighted by ε_loose/(1 − ε_loose). The systematic uncertainty in the determination of ε_loose dominates the overall uncertainty of this method, which is estimated to be about 36%. The uncertainty is obtained by varying the requirements used to obtain ε_loose, and from a closure test, in which the tight-loose rate derived from QCD simulated events is applied to a simulated W+jets sample to predict the rate of events with one real and one misidentified lepton.
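The ε_loose/(1 − ε_loose) weighting can be sketched as follows; the flat misidentification rate and the event representation are invented for illustration (in the analysis ε_loose is binned in the p_T and η of the lepton).

```python
# Sketch of the "tight-loose" misidentification-rate method: each
# tight-loose event is weighted by eps/(1 - eps), with eps the measured
# loose-to-tight efficiency for the loose lepton's (pT, eta).

def fake_weight(eps_loose):
    return eps_loose / (1.0 - eps_loose)

def predicted_background(tight_loose_events, eps_lookup):
    """Sum of per-event weights; eps taken at the loose lepton's (pT, eta)."""
    return sum(fake_weight(eps_lookup(ev["pt"], ev["eta"]))
               for ev in tight_loose_events)

events = [{"pt": 15.0, "eta": 0.4}, {"pt": 22.0, "eta": 1.1}]
flat_eps = lambda pt, eta: 0.2   # flat 20% rate, purely illustrative
n_bkg = predicted_background(events, flat_eps)
```

With a 20% rate, each tight-loose event contributes 0.25 predicted tight-tight background events, so two control events predict 0.5 events in the signal selection.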
The normalization of the top-quark background is estimated from data by counting the number N_tagged of top-quark-tagged events and applying the corresponding top-quark-tagging efficiency, ε_top. The efficiency ε_top is measured with a control sample dominated by tt and Wt events, which is selected by requiring a b-tagged jet in the event. The number of top-quark background events in the signal region is then given by N_tagged × (1 − ε_top)/ε_top. Background sources from non-top events are subtracted by estimating the misidentification probability from data control samples. The main uncertainty comes from the statistical uncertainty in the b-tagged control sample and from the systematic uncertainties related to the measurement of ε_top. The uncertainty is about 20% in the 0-jet category and about 5% in the 1-jet category.
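The extrapolation formula above is a one-liner; the numbers in the example are invented to show the arithmetic.

```python
# Sketch of the data-driven top-quark background estimate:
#   N_untagged = N_tagged * (1 - eps_top) / eps_top
# infers the untagged (signal-region) top yield from the tagged count.

def untagged_top_yield(n_tagged, eps_top):
    return n_tagged * (1.0 - eps_top) / eps_top

# Illustrative numbers: 60 tagged events and a 60% tagging efficiency
# imply 40 untagged top events leaking into the signal region.
estimate = untagged_top_yield(60.0, 0.6)
```
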
For the low-mass H → WW signal region, m H ≤ 200 GeV, the nonresonant WW background prediction is estimated from data. This contribution is measured using events with a dilepton mass larger than 100 GeV, where the Higgs boson signal contamination is negligible, and the MC simulation is then used to extrapolate into the signal region. The total uncertainty is about 10%, where the statistical uncertainty of the data control region is the largest component. For larger Higgs boson masses there is a significant overlap between the nonresonant WW and Higgs boson signal, and the simulation is used for the estimation of the background.
The Z/γ * → ℓ + ℓ − contribution to the e + e − and µ + µ − final states is estimated by extrapolating the observed number of events with a dilepton mass within ±7.5 GeV of the Z mass, with the residual background in that region subtracted using e ± µ ∓ events. The extrapolation to the signal region is then performed using the simulation. The results are cross-checked with data, using the same algorithm and subtracting the background in the Z-mass region, estimated from the number of e ± µ ∓ events. The largest uncertainty in the estimate is the statistical uncertainty in the control sample, which is about 20% to 50%. The Z/γ * → τ + τ − contamination is estimated using Z/γ * → e + e − and µ + µ − events selected in data, where the leptons are replaced with simulated τ decays, thus providing a better description of the process Z/γ * → τ + τ − . The TAUOLA [54] program is used in the simulation of the τ decays to account for τ-polarization effects.
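A minimal sketch of this extrapolation, with invented inputs; the factor 0.5 assumes the e ± µ ∓ control sample is shared equally between the e + e − and µ + µ − final states, which is an assumption of this illustration:

```python
# Z/gamma* -> ll estimate: count events in the Z-mass window, subtract the
# non-Z component measured with e-mu events, and extrapolate to the signal
# region with a simulation-derived factor r_extrap.

def zll_background(n_z_window, n_emu_window, r_extrap):
    n_z = n_z_window - 0.5 * n_emu_window  # residual background subtracted
    return n_z * r_extrap

print(zll_background(1000, 100, 0.25))  # -> 237.5
```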
Finally, to estimate the Wγ * background contribution from asymmetric virtual photon decays to dileptons [139], where one lepton escapes detection, the MADGRAPH generator [104] with dedicated cuts is used. In particular, all the leptons are required to have a p T larger than 5 GeV and the mass of each lepton is considered in the generation of the samples. To normalize the simulated events, a control sample of high-purity Wγ * events from data with three reconstructed leptons is compared to the simulation prediction. A normalization factor of 1.6 ± 0.5 with respect to the theoretical leading-order Wγ * cross section is found.
Other minor backgrounds from WZ, ZZ (when the two selected leptons come from different boson decays), and Wγ are estimated from simulation. The Wγ background estimate is cross-checked in data using events passing all the selection requirements, except that the two leptons must have the same charge; this sample is dominated by W+jets and Wγ events. The agreement between data and the background prediction in this test is at the 20% level.
The number of observed events and the expected number of events from all background processes after the WW selection are summarized in Table 7. The number of events observed in data and the signal and background predictions after the final selection are listed in Table 8 for two Higgs boson mass hypotheses.

Efficiencies and systematic uncertainties
The signal efficiency is estimated using simulations. All Higgs boson production mechanisms are considered: gluon-gluon fusion, associated production with a W or Z boson (VH), and VBF processes.
Residual discrepancies in the lepton reconstruction and identification efficiencies between data and simulation are corrected for by data-to-simulation scale factors measured using Z/γ * → ℓ + ℓ − events in the Z-peak region [116], recorded with dedicated unbiased triggers. These factors depend on the lepton p T and |η|, and are typically in the range 0.9-1.0. The uncertainties in the lepton and trigger efficiencies are about 2% per lepton leg.
Experimental effects, theoretical predictions, and the choice of MC event generators are considered as sources of systematic uncertainty, and their impact on the signal efficiency is assessed. The experimental uncertainties in lepton efficiency, momentum scale and resolution, E miss T modelling, and jet energy scale are applied to the reconstructed objects in simulated events by smearing and scaling the relevant observables, and propagating the effects to the kinematic variables used in the analysis. The 36% normalization uncertainty in the W + jets background is included by varying the efficiency for misidentified leptons to pass the tight lepton selection and by comparing the results of a closure test using simulated samples.
The relative systematic uncertainty in the signal efficiency from pileup is evaluated to be 1%. It is obtained by shifting the mean of the expected distribution of the number of pp collisions per beam crossing, used to reweight the simulation, up and down by one pp interaction. The systematic uncertainty in the integrated luminosity measurement is 4.4% [135].
The systematic uncertainties from theoretical input are separated into two components, which are assumed to be independent. The first component is the uncertainty in the fraction of events classified into the different jet categories and the effect of migration between categories. The second component is the uncertainty in the lepton acceptance and the selection efficiency of the other requirements. The effects of variations in the PDFs, the value of α s , and the higher-order corrections are considered for both components, using the PDF4LHC prescription [95][96][97][98][99] and the recommendations from [25]. For the jet categorization, the effects of higher-order logarithmic terms via the uncertainty in the parton shower model and the underlying event are also considered by comparing different generators. These uncertainties range between 10% and 30%, depending on the jet category. The uncertainties related to the diboson cross sections are calculated using the MCFM program [131].
The systematic uncertainty in the overall signal efficiency is estimated to be about 20% and is dominated by the theoretical uncertainty in the missing higher-order corrections and the PDF uncertainties. The total uncertainty in the background estimations in the H → WW signal region is about 15%, dominated by the statistical uncertainty in the observed number of events in the background-control regions.
The interpretation of the results in terms of upper limits on the Higgs boson production cross section will be given in Section 10.

H → ττ
The H → ττ decay mode is sensitive to a SM Higgs boson with a mass below about 145 GeV, for which the branching fraction is large. The search uses final states where the two τ leptons are identified either by their leptonic decay to an electron or muon, or by their hadronic decay, designated as τ h . Four independent channels are studied: eτ h , µτ h , eµ, and µµ. In each channel, the signal is separated from the background, and in particular from the irreducible Z → ττ process, using the τ-lepton pair invariant mass m ττ , reconstructed from the four-momenta of the visible decay products of the two τ leptons and the E miss T vector, as explained in Section 8.2. Events are classified by the number of additional jets in the final state, in order to enhance the contribution of different Higgs boson production mechanisms. The 0- and 1-jet categories select primarily signal events with a Higgs boson produced by gluon-gluon fusion, or in association with a W or Z vector boson that decays hadronically. These two categories are further classified according to the p T of the τ-lepton decay products, because high-p T events benefit from a higher signal-to-background ratio. Events in the VBF category are required to have two jets separated by a large rapidity interval, which preferentially selects signal events from the vector-boson fusion production mechanism and strongly enhances the signal purity.

Trigger and inclusive event selection
The high-level trigger requires a combination of electron, muon, and τ h trigger objects [42,140,141]. The electron and muon HLT reconstruction is seeded by electron and muon level-1 trigger objects, respectively, while the τ h trigger object reconstruction is done entirely at the HLT stage. A specific version of the particle-flow algorithm is used in the HLT to reconstruct these objects and quantify their isolation, as done in the offline reconstruction. The identification and isolation criteria and the transverse momentum thresholds for these objects were progressively tightened as the LHC instantaneous luminosity increased over the data-taking period. In the eτ h and µτ h channels, the trigger requires the presence of a lepton and a τ h , both loosely isolated with respect to the offline isolation criteria described below. In the eµ and µµ channels, the lepton trigger objects are not required to be isolated. For the eτ h , µτ h , and µµ channels, the muon and electron trigger efficiencies are measured with respect to the offline selection in the data and the simulation using Z → ℓℓ (ℓ = e, µ) events passing a single-lepton trigger. For the eµ channel, they are determined using Z → ττ → eµ events passing a single-lepton trigger. The τ h triggering efficiency is obtained using Z → ττ → µτ h events passing a single-muon trigger. In the analysis, simulated events are weighted by the ratio between the efficiencies measured in the data and in the simulation, which are parametrized as a function of the lepton or τ h transverse momentum and pseudorapidity.
To be considered in the offline event selection, electrons and muons must fulfill tight isolation criteria. The electron and muon isolation parameter R Iso is calculated as in Eq. (1) using a cone size ∆R = 0.4, but with the following differences. The sum ∑ charged p T is performed considering all charged particles associated with the primary vertex, including other electrons and muons. The contribution of neutral pileup particles is estimated as 0.5 ∑ charged,PU p T , where the sum is computed for all charged hadrons from pileup interactions in the isolation cone, and where the factor 0.5 corresponds approximately to the ratio of neutral-to-charged hadron energy in the hadronization process, as estimated from simulation. Electrons and muons are required to have R Iso < 0.1. This criterion is relaxed to 0.15 in the eµ channel for leptons in the barrel, and in the µµ channel for muons with p T < 20 GeV. The τ-isolation discriminator R τ Iso defined in Section 3 is used to select loosely isolated τ h so that the overall τ h identification efficiency is 60-65%, for a jet misidentification probability of 2-3%. Finally, electrons and muons misidentified as τ h are suppressed using dedicated criteria based on the consistency between the tracker, calorimeter, and muon-chamber measurements.
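Under the modifications just described, the lepton isolation variable can be sketched as follows; the baseline Eq. (1) is defined earlier in the paper, and the max(0, ·) clamp on the pileup-corrected neutral sum is an assumption of this sketch:

```latex
R_{\mathrm{Iso}} =
  \frac{\displaystyle \sum_{\text{charged}} p_{\mathrm{T}}
        + \max\!\Big(0,\; \sum_{\text{neutral}} p_{\mathrm{T}}
        - 0.5 \sum_{\text{charged,PU}} p_{\mathrm{T}} \Big)}
       {p_{\mathrm{T}}^{\ell}},
\qquad \Delta R = 0.4
```

Here the sums run over particles in the isolation cone, and the 0.5 factor is the neutral-to-charged hadron energy ratio quoted in the text.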
In the eτ h and µτ h channels, we select events containing either an electron with p T > 20 GeV or a muon with p T > 17 GeV, and |η| < 2.1, accompanied by an oppositely charged τ h with p T > 20 GeV and |η| < 2.3. In the 8 TeV data set analysis, the electron and muon p T thresholds are increased to 24 and 20 GeV, respectively, to account for the higher trigger thresholds. In these channels, events with more than one loosely identified electron or muon with p T > 15 GeV are rejected to reduce the Drell-Yan background. In the eµ channel, we demand an electron within |η| < 2.3 and an oppositely charged muon within |η| < 2.1. The higher-p T lepton must have p T > 20 GeV and the other lepton p T > 10 GeV. In the µµ channel, the higher-p T muon is required to have p T > 20 GeV and the other muon p T > 10 GeV. Both muons must be within |η| < 2.1.
Neutrinos produced in the τ-lepton decay are nearly collinear with the visible decay products, because the τ-lepton energy is much larger than its mass after the event selection. Conversely, in W+jets events where a jet is misidentified as a τ h , one of the main backgrounds in the τ h channels, the high mass of the W results in a neutrino direction approximately opposite to the lepton in the transverse plane. In the eτ h and µτ h channels, we therefore require the transverse mass m T = √( 2 p T E miss T (1 − cos ∆φ) ) to be less than 40 GeV, where p T is the lepton transverse momentum and ∆φ is the azimuthal angle difference between the lepton momentum and the E miss T vector. In the eµ channel, instead of an m T requirement, we demand D ζ ≡ p ζ − 0.85 · p vis ζ > −25 GeV, where p vis ζ = p⃗ T,1 · ζ̂ + p⃗ T,2 · ζ̂ and p ζ = (p⃗ T,1 + p⃗ T,2 + E⃗ miss T ) · ζ̂ .
Here, as illustrated in Fig. 25, ζ̂ is a unit vector along the ζ axis, defined as the bisector of the lepton directions in the transverse plane [142], p⃗ T,i are the lepton transverse momenta, and E⃗ miss T is the missing transverse energy vector.
The D ζ distribution is shown in Fig. 27(b). Requiring a large D ζ rejects W+jets and tt events, for which the E miss T vector is typically oriented in the opposite direction of the two-lepton system, resulting in a small D ζ . Conversely, in H → ττ or Z → ττ events, the neutrinos are emitted along the directions of the two τ leptons, resulting in a large D ζ . The 0.85 factor is introduced to bring the mean of the D ζ distribution to 0 for Z → ττ.
In the µµ channel, the sample of dimuon events is largely dominated by the Z → µµ background, which is suppressed using a BDT discriminant combining a set of variables related to the kinematics of the dimuon system, and the distance of closest approach between the two muons.

The ττ invariant-mass reconstruction
The invariant mass m vis of the visible decay products of the two τ leptons can be used as an estimator of the mass of a possible parent boson, in order to separate the H → ττ signal from the irreducible Z → ττ background. However, the neutrinos from τ-lepton decays can carry away substantial energy, limiting the separation power of this estimator. An alternative approach is to reconstruct the neutrino energy using a collinear approximation [143], which has the disadvantage of providing an unphysical solution for about 20% of the events, in particular when the E miss T and the parent boson p T are small. The SVFit algorithm described below reconstructs the ττ invariant mass m ττ with improved resolution and gives a physical solution for every event.
Six parameters are needed to specify τ-lepton decays to hadrons: the polar and azimuthal angles of the visible decay product system in the τ-lepton rest frame, the three boost parameters from the τ-lepton rest frame to the laboratory frame, and the invariant mass m vis of the visible decay products. In the case of a leptonic τ-lepton decay, two neutrinos are produced, and the invariant mass of the two-neutrino system constitutes a seventh parameter. The unknown parameters are constrained by four observables that are the components of the four-momentum of the system formed by the visible τ-lepton decay products, measured in the laboratory frame. For each hadronic (leptonic) τ-lepton decay, 2 (3) parameters are thus left unconstrained. We choose these parameters to be:
• x, the fraction of the τ-lepton energy in the laboratory frame carried by the visible decay products.
• φ, the azimuthal angle of the τ-lepton direction in the laboratory frame.
The two components E miss x and E miss y of the missing transverse energy vector provide two further constraints, albeit with an experimental resolution of 10-15 GeV on each [144].
The fact that the reconstruction of the τ-lepton pair decay kinematics is underconstrained by the measured observables is addressed by a maximum-likelihood fit method. The mass m ττ is reconstructed by combining the measured observables E miss x and E miss y with a likelihood model that includes terms for the τ-lepton decay kinematics and the E miss T resolution. The model gives the probability density f ( z | y, a 1 , a 2 ) to observe the values z = (E miss x , E miss y ) in an event, given that the unknown parameters specifying the kinematics of the two τ-lepton decays have values a 1 = (x 1 , φ 1 , m νν,1 ) and a 2 = (x 2 , φ 2 , m νν,2 ), and that the four-momenta of the visible decay products have the measured values y = (p vis 1 , p vis 2 ). The likelihood model is used to compute the probability P(m i ττ ) as a function of the mass hypothesis m i ττ . The best estimate m̂ ττ for m ττ is taken to be the value of m i ττ that maximizes P(m i ττ ). The probability density f ( z | y, a 1 , a 2 ) is the product of three likelihood functions: the first two model the decay parameters a 1 and a 2 of the two τ leptons, and the third quantifies the consistency of a τ-lepton decay hypothesis with the measured E miss T . The likelihood functions modelling the τ-lepton decay kinematics are different for leptonic and hadronic τ-lepton decays. Matrix elements from Ref. [145] are used to model the differential distributions in the leptonic decays, within the physically allowed region 0 ≤ x ≤ 1 and 0 ≤ m νν ≤ m τ √(1 − x). For hadronic τ-lepton decays, a model based on two-body phase space [146] is used, treating all the τ-lepton visible decay products as a single system, within the physically allowed region m 2 vis /m 2 τ ≤ x ≤ 1.
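The scan over mass hypotheses described above can be summarized schematically as follows; the precise integration measure and normalization are assumptions of this sketch:

```latex
P(m_{\tau\tau}^{i}) \propto \int \mathrm{d}\vec{a}_{1}\, \mathrm{d}\vec{a}_{2}\;
  \delta\!\big( m_{\tau\tau}^{i} - m_{\tau\tau}(\vec{y}, \vec{a}_{1}, \vec{a}_{2}) \big)\,
  f(\vec{z} \mid \vec{y}, \vec{a}_{1}, \vec{a}_{2}),
\qquad
\hat{m}_{\tau\tau} = \arg\max_{m_{\tau\tau}^{i}} P(m_{\tau\tau}^{i})
```

The integral runs over the unconstrained decay parameters a 1 and a 2 compatible with the mass hypothesis m i ττ.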
We have verified that the two-body phase space model is adequate for representing hadronic τ-lepton decays by comparing distributions generated by a parameterized MC simulation based on the two-body phase-space model with the detailed simulation implemented in TAUOLA. The likelihood functions for leptonic (hadronic) τ-lepton decays do not depend on the parameters x and φ (x, φ, and m νν ). The dependence on x enters via the integration boundaries, and the dependence on φ comes from the E miss T likelihood function.
The E miss T likelihood function L MET quantifies the compatibility of a τ-lepton decay hypothesis with the reconstructed missing transverse momentum in an event, assuming the neutrinos from the τ-lepton decays are the only source of E miss T , and is defined as

L MET (E miss x , E miss y ) = (2π √|V|)^{−1} exp[ −(1/2) ( E⃗ miss T − ∑ p⃗ ν T )^T V^{−1} ( E⃗ miss T − ∑ p⃗ ν T ) ],

where the sum runs over the neutrinos of the two τ-lepton decay hypotheses. In this expression, the expected E miss T resolution is represented by the covariance matrix V, estimated on an event-by-event basis using an E miss T -significance algorithm [144], and |V| is the determinant of this matrix.
The m ττ resolution achieved by the SVFit algorithm is estimated to be about 20% from simulation. Figure 26 shows the normalized distributions of m vis and m ττ in the µτ h channel from simulated Z → ττ events and simulated SM Higgs boson events with m H = 125 GeV. The SVFit mass reconstruction allows for a better separation between signal and background than m vis .

Event categories
To further enhance the sensitivity of the search for the SM Higgs boson, the selected events are split into mutually exclusive categories based on the jet multiplicity, and the transverse momentum of the visible τ-lepton decay products. The jet multiplicity categories are defined using jets within |η| < 5. In some cases, events are rejected if they contain a b-tagged jet, identified using the CSV algorithm described in Section 3. From simulation, the efficiency for b-jet tagging is 75%, with a misidentification rate of 1%. The event categories are:
• VBF: In this category, two jets with p T > 30 GeV are required in the event. A rapidity gap is demanded by requiring that there be no third jet with p T > 30 GeV between these two jets. A BDT discriminator is used to discriminate between VBF Higgs boson production and the background processes. This discriminator takes as input the invariant mass of the two jets, the differences in η and φ between the directions of the two jets, the p T of the τ h τ h system, the p T of the τ h τ h -E miss T system, the p T of the dijet system, and the difference in η between the τ h τ h system direction and the closest jet. In the eµ channel, the large tt background is suppressed by rejecting events with a b-tagged jet with p T > 20 GeV.
• 1-jet: Events in this category are required to have ≥1 jet with p T > 30 GeV, to not fulfill the VBF criteria, and to not contain any b-tagged jet with p T > 20 GeV. This category addresses the production of a high-p T Higgs boson recoiling against a jet. Events with high-p T Higgs bosons typically have much larger E miss T and thus benefit from a more precise measurement of m ττ , owing to the improved E miss T resolution. In the eτ h channel, the large background from Z → ee + jets events with one electron misidentified as a τ h is reduced by requiring E miss T > 30 GeV.
• 0-jet: This category requires events to have no jet with p T > 30 GeV and no b-tagged jet with p T > 20 GeV. In the eτ h channel, E miss T is required to be larger than 30 GeV as in the 1-jet category.
The 0- and 1-jet categories are each further divided into two subsets, using the p T of the visible τ-lepton decay products, either hadronic or leptonic. We label these subsets "low-p T " and "high-p T ". In the eτ h and µτ h channels, the boundary between the two subsets is defined as p T (τ h ) = 40 GeV. In the eµ and µµ channels, the threshold is at 35 GeV on the muon p T and 30 GeV on the leading muon p T , respectively. Thus, five independent categories of events are used in the SM Higgs boson search: VBF, 1-jet/high-p T , 1-jet/low-p T , 0-jet/high-p T , and 0-jet/low-p T .

Background estimation and systematic uncertainties
For each channel and each category, Table 9 shows the overall number of events observed in the 7 and 8 TeV data, as well as the corresponding number of expected events from the various background contributions, in the full m ττ range. The expected number of events from a SM Higgs boson signal of mass m H = 125 GeV is also shown. The numbers in Table 9 cannot be used to estimate the global significance of a possible signal, since the expected significance varies considerably with m ττ , and the sensitive 1-jet/high-p T category is merged with the 1-jet/low-p T category.
The largest source of background is the Drell-Yan production of Z → ττ. This contribution is greatly reduced by the 1-jet and VBF selection criteria, and is modelled using a data sample of Z → µµ events, in which the reconstructed muons are replaced by the reconstructed particles from simulated τ-lepton decays, a technique called "embedding". The background yield is rescaled to the Z → µµ yield in the data before any jet selection, thus, for this dominant background, the systematic uncertainties in the efficiency of the jet-category selections and the luminosity measurement are negligible. In the eτ h and µτ h channels, the largest remaining systematic uncertainty affecting this background yield is in the τ h selection efficiency. This uncertainty, which includes the uncertainty in the τ h triggering efficiency, is estimated to be 7% from an independent study based on a tag-and-probe technique [116].
The Drell-Yan production of Z → ℓℓ, labelled as Z+jets in Table 9, is an important source of background in the eτ h channel, owing to the 2-3% probability for electrons to be misidentified as τ h [48], and the fact that the reconstructed ττ invariant-mass distribution peaks in the Higgs boson mass search range. The contribution of this background in the eτ h and µτ h channels is estimated from simulation. The simulated Drell-Yan yield is rescaled to the data using Z → µµ events, and the efficiencies of the jet category selections are measured in a Z → µµ data sample. The dominant systematic uncertainty in the background yield is from the ℓ → τ h misidentification rate, which is obtained by comparing tag-and-probe measurements from Z → ℓℓ events in the data and the simulation, and is 30% for electrons and 100% for muons. The very small probability for a muon to be misidentified as a τ h makes it difficult to estimate the systematic uncertainty in this probability, but also makes this background very small in the µτ h channel.
The background from W+jets production contributes significantly to the eτ h and µτ h channels when the W boson decays leptonically and one jet is misidentified as a τ h . The background is modelled for these channels using the simulation. The W+jets background yield is normalized to the data in a high-m T control region dominated by the background in each of the five categories. The factor for extrapolating to the low-m T signal region is obtained from the simulation, and has a 30% systematic uncertainty. In the 1-jet/high-p T and VBF categories, where the number of simulated events is marginal, mass-shape templates are obtained by relaxing the τ h isolation requirement, after ensuring that the bias introduced in the shape is negligible. Figure 27 (upper left) shows the m T distribution obtained in the µτ h channel after the inclusive selection from data and simulation. In the high-m T region, the agreement between the observed and expected yields comes from the normalization of the W+jets prediction to the data.

Table 9: Observed and expected numbers of events in the four H → ττ decay channels and the three event categories, for the combined 7 and 8 TeV data. The uncertainties include the statistical and systematic uncertainties added in quadrature. In the 0- and 1-jet categories, the low- and high-p T subcategories have been combined. The expected number of signal events for a SM Higgs boson of mass m H = 125 GeV is also given.

The tt production process is the main remaining background in the eµ channel. The predicted yield for all channels is obtained from simulation, with the yield rescaled to the one observed in the data in a tt-enriched control sample, selected by requiring b-tagged jets. The systematic uncertainty in the yield includes a 10% systematic uncertainty in the b-tagging efficiency. QCD multijet events, in which one jet is misidentified as a τ h and another as a lepton, constitute another important source of background in the eτ h and µτ h channels.
In the 0-and 1-jet categories, the QCD multijet background prediction is obtained using a control sample where the lepton and the τ h are required to have the same charge. In this control sample, the QCD multijet distribution and yield are obtained by subtracting from the data the contribution of the Drell-Yan, tt, and W+jets processes, estimated as explained above. The expected contribution of the QCD multijet background in the opposite-charge signal sample is then derived by rescaling the yield obtained in the same-charge control sample by a factor of 1.1, which is measured in the data using a pure QCD multijet sample obtained by inverting the lepton isolation and relaxing the τ h isolation. The 10% systematic uncertainty in this factor covers its small dependence on p T (τ h ) and the statistical uncertainty in its measurement, and dominates the uncertainty in this background contribution. In the VBF category, the number of events in the same-charge control sample is too small to use this procedure. Instead, the QCD multijet yield is obtained by multiplying the inclusive QCD yield by the VBF selection efficiency measured in data using a QCD-dominated sample in which the lepton and the τ h are not isolated. The mass shape template is obtained from data by relaxing the muon and τ h isolation criteria.
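A minimal sketch of the same-charge extrapolation used in the 0- and 1-jet categories, with invented yields:

```python
# QCD multijet estimate: subtract the non-QCD processes from the same-charge
# control sample, then scale by the measured opposite-charge/same-charge
# ratio of 1.1 (with a 10% systematic uncertainty).

def qcd_estimate(n_same_charge_data, n_same_charge_other, os_ss_ratio=1.1):
    return (n_same_charge_data - n_same_charge_other) * os_ss_ratio

print(qcd_estimate(500, 120))  # (500 - 120) * 1.1
```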

The small background from W+jets and QCD multijet events in the eµ channel is estimated from the number of events with one identified lepton and a second lepton that passes relaxed selection criteria, but fails the nominal lepton selection. This number is converted to the expected background yield using the efficiencies for such loosely identified lepton candidates to pass the nominal lepton selection. These efficiencies are measured in data using QCD multijet events.
Finally, the small background contribution in each channel from diboson and single top-quark production is estimated using the simulation. The main experimental systematic uncertainties affecting the expected signal yield are from the τ h identification efficiency (7%), the E miss T scale (5%), owing to the m T requirement and the E miss T selection applied to the 0-and 1-jet categories of the eτ h channel, the integrated luminosity (5%), and the jet energy scale (< 4%). The uncertainties in the muon and electron selection efficiencies, including trigger, identification, and isolation, are both 2%. The theoretical uncertainty in the signal yield comes from the uncertainties in the PDFs, the renormalization and factorization scales, and the modelling of the underlying event and parton showers. The magnitude of the theoretical uncertainty depends on the production process (gluon-gluon fusion, VBF, or associated production) and on the event category. In particular, the scale uncertainty in the VBF production yield is 10%. The scale uncertainty in the gluon-gluon fusion production yield is 10% in the 1-jet/high-p T category and 30% in the VBF category. The τ h (3%) and electron (1%) energy scale uncertainties cause an uncertainty in the m ττ spectrum shape, and are discussed in the next section. The muon energy scale uncertainty is negligible.

Results
The statistical methodology described in Section 10.1 is used to search for the presence of a SM Higgs boson signal, combining the five categories of the four final states in the 7 and 8 TeV data sets as forty independent channels in a binned likelihood based on the m ττ distributions obtained for each channel. Systematic uncertainties are represented by nuisance parameters in the likelihood. A log-normal prior is assumed for the systematic uncertainties affecting the background normalization, discussed in the previous section. The τ h and electron energy scale uncertainties, which affect the shape of the m ττ spectrum, are represented by nuisance parameters whose variation results in a continuous change of this shape [147]. Figures 28 and 29 show the observed m ττ distributions in the eτ h , µτ h , eµ, and µµ channels, for each event category, compared with the background predictions. The 7 and 8 TeV data sets are merged, as are the low- and high-p T subcategories of the 0- and 1-jet categories. The binning given in the figures corresponds to the binning used in the likelihood. The background mass distributions are the result of the global maximum-likelihood fit under the background-only hypothesis. This fit finds the best set of values for the nuisance parameters to match the data, assuming no signal is present. The variation of the nuisance parameters is limited by the systematic uncertainties estimated for each of the background contributions and used as input to the fit. For example, in the VBF category of the eτ h channel, the most important nuisance parameters related to background normalization are the ones affecting the Z → ττ yield (τ h selection efficiency), the Z → ee yield (e → τ h misidentification rate), the W+jets yield (extrapolation from the high-m T to the low-m T region), and the QCD yield (ratio between the yields in the opposite-charge and same-charge regions).
The fit makes use of the high-m ττ region of the VBF category to constrain the nuisance parameters affecting the W+jets yield. The nuisance parameter related to the τ h identification efficiency is mostly constrained by the 0-and 1-jet categories, where the number of events in the Z → ττ peak is much larger. It is also the case for the nuisance parameter related to the τ h energy scale, which affects the shape of the Z → ττ distribution.
The interpretation of the results in terms of upper limits on the Higgs boson production cross section is given in Section 10.

H → bb
The decay H → bb has the largest branching fraction of the five search modes for m H ≤ 135 GeV, but the signal is overwhelmed by the QCD multijet production of b quarks. The analysis is therefore designed to search for a dijet resonance in events where a Higgs boson is produced at high p T , in association with a W or Z boson that decays leptonically, which largely suppresses the QCD multijet background. The following final states are included in the search: W(µν)H, W(eν)H, Z(µµ)H, Z(ee)H, and Z(νν)H, all with the Higgs boson decaying to bb. Backgrounds arise from the production of vector bosons in association with jets (from all quark flavours), singly- and pair-produced top quarks, dibosons, and QCD multijet processes. Simulated samples of signal and background events are used to optimize the analysis. Control regions in data are selected to adjust the predicted event yields from simulation for the main background processes and to estimate their contribution in the signal region.
Several different high-level triggers are used to collect events consistent with the signal hypothesis in all five channels. For the WH channels, the trigger paths consist of several single-lepton triggers with tight lepton identification. Leptons are also required to be isolated from other tracks and calorimeter energy depositions to maintain an acceptable trigger rate. For the W(µν)H channel, in the 7 TeV data set, the trigger thresholds for the muon transverse momentum, p T , vary from 17 to 40 GeV. The higher thresholds are implemented for periods of higher instantaneous luminosity. For the 8 TeV data set, the muon p T threshold is 24 GeV for the isolated-muon trigger, and 40 GeV for muons without any isolation requirements. The combined single-muon trigger efficiency is ≈90% for signal events that pass all offline requirements, described in Section 9.1. For the W(eν)H channel, in the 7 TeV data set, the electron p T threshold ranges from 17 to 30 GeV. In addition, two jets and a minimum value on the missing transverse energy are required. These additional requirements help to maintain acceptable trigger rates during the periods of high instantaneous luminosity. For the 8 TeV data set, a single-isolated-electron trigger is used with a 27 GeV p T threshold. The combined efficiency for these triggers for signal events that pass the final offline selection criteria is larger than 95%.
The Z(µµ)H channel uses the same single-muon triggers as the W(µν)H channel. For the Z(ee)H channel, dielectron triggers with lower-p T thresholds of 17 and 8 GeV and tight isolation requirements are used. These triggers are ≈99% efficient for ZH signal events that pass the final offline selection criteria. For the Z(νν)H channel, combinations of several triggers are used, all with the requirement that the missing transverse energy be above a certain threshold. Additional jet requirements are made to keep the trigger rates acceptable as the luminosity increases and to reduce the E miss T thresholds, in order to increase the signal acceptance.

Figure 28: Observed (points with error bars) and expected (histograms) m ττ distributions for the eτ h (left) and µτ h (right) channels, and, from top to bottom, the 0-jet, 1-jet, and VBF categories for the combined 7 and 8 TeV data sets. In the 0- and 1-jet categories, the low- and high-p T subcategories have been summed. The electroweak background combines the expected contributions from W+jets, Z+jets, and diboson processes. In the case of eτ h , the Z → ee background is shown separately. The dotted histogram shows the expected distribution for a SM Higgs boson with m H = 125 GeV (multiplied by a factor of 5 for clarity).

Figure 29: Observed (points with error bars) and expected (histograms) m ττ distributions for the eµ (left) and µµ (right) channels, and, from top to bottom, the 0-jet, 1-jet, and VBF categories for the combined 7 and 8 TeV data sets. In the 0- and 1-jet categories, the low- and high-p T subcategories have been summed. The electroweak background combines the contributions from W+jets, Z+jets, and diboson processes. In the case of µµ, the Z → µµ background is shown separately. The dotted histogram shows the expected distribution for a SM Higgs boson with m H = 125 GeV (multiplied by a factor of 5 for clarity).
A trigger requiring E miss T > 150 GeV is implemented for both the 7 and 8 TeV data sets. For the 7 TeV data, triggers that require the presence of two jets with |η| < 2.6, p T > 20 GeV, and E miss T thresholds of 80 and 100 GeV, depending on the instantaneous luminosity, are also used. For the 8 TeV data set, a trigger that requires two jets, each with |η| < 2.6 and p T > 30 GeV, and E miss T > 80 GeV is also implemented. As the instantaneous luminosity increased further, this trigger was replaced by one requiring E miss T > 100 GeV, two jets with |η| < 2.6, one with p T > 60 GeV and the other with p T > 25 GeV, the dijet p T > 100 GeV, and no jet with p T > 40 GeV within 0.5 radians in azimuthal angle of the E miss T vector. For Z(νν)H signal events with E miss T > 160 GeV, the overall trigger efficiency is ≈98% with respect to the offline event reconstruction and selection described below. The corresponding efficiency for 120 < E miss T < 160 GeV is about 66%.
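The turn-on behaviour quoted above (≈66% efficiency for 120 < E miss T < 160 GeV, ≈98% above 160 GeV) is typically measured by fitting an S-shaped curve to the efficiency as a function of offline E miss T. A common choice is an error-function parametrization; the sketch below fits one to invented efficiency points (the actual CMS parametrization is not specified in the text).

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.special import erf

# Error-function model of a MET-trigger turn-on curve (a common choice;
# not necessarily the CMS parametrization). Data points are invented.
def turn_on(met, plateau, midpoint, width):
    return 0.5 * plateau * (1.0 + erf((met - midpoint) / (np.sqrt(2.0) * width)))

met = np.array([100, 110, 120, 130, 140, 150, 160, 180, 200], dtype=float)
eff = np.array([0.10, 0.25, 0.45, 0.66, 0.82, 0.92, 0.97, 0.98, 0.98])

popt, _ = curve_fit(turn_on, met, eff, p0=[1.0, 130.0, 15.0])
print("plateau=%.2f midpoint=%.1f GeV width=%.1f GeV" % tuple(popt))
```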

Event selection
The final-state objects used in the H → bb event reconstruction are described in Section 3. Electron candidates are considered in the pseudorapidity range |η| < 2.5, excluding the 1.44 < |η| < 1.57 transition region between the ECAL barrel and endcaps. Tight muon candidates are considered in the |η| < 2.4 range. An isolation requirement of approximately 10% on R Iso , calculated as in Eq. (1) and consistent with the expectation for leptons originating from W and Z boson decays, is applied to electron and muon candidates. The exact requirement depends on the lepton η, p T , and flavour. To identify b jets, different values for the CSV output discriminant, which can range between 0 and 1, are used, with corresponding different efficiencies and misidentification rates. For example, with a CSV > 0.90 requirement, the efficiencies to tag b quarks, c quarks, and light quarks are 50%, 6%, and 0.15%, respectively [31]. The corresponding efficiencies for CSV > 0.50 are 72%, 23%, and 3%. All events from data and simulation are required to pass the same trigger and event reconstruction algorithms. Scale factors that account for differences in the performance of these algorithms between data and simulation are computed and used in the analysis.
The background processes to VH production are V+jets, tt, single-top-quark, diboson (VV), and QCD multijet production. These overwhelm the signal by several orders of magnitude. The event selection is based on the kinematic reconstruction of the vector boson and the Higgs boson decay into two b-tagged jets. Backgrounds are then substantially reduced by requiring a significant boost of the p T of the vector boson and the Higgs boson [148], which tend to recoil from each other with a large azimuthal opening angle, ∆φ(V,H), between them. For each channel, two ranges of p T (V) are considered. These are referred to as "low" and "high". Owing to different signal and background compositions, each p T (V) range has a different sensitivity, and the analysis is performed separately for each range. The results from all the ranges are then combined for each channel. The ranges for the WH channels are 120 < p T (V) < 170 GeV and p T (V) > 170 GeV, for the Z(νν)H channel 120 < p T (V) < 160 GeV and p T (V) > 160 GeV, and for the Z(ℓℓ)H channel 50 < p T (V) < 100 GeV and p T (V) > 100 GeV.
Candidate W → ℓν decays are identified by requiring the presence of a single isolated lepton and missing transverse energy. Muons (electrons) are required to have a p T above 20 (30) GeV. For the W(eν)H channel only, to reduce contamination from QCD multijet processes, E miss T is required to be greater than 35 GeV. Candidate Z → ℓℓ decays are reconstructed by combining isolated, oppositely charged pairs of electrons or muons with p T > 20 GeV and a dilepton invariant mass satisfying 75 < m ℓℓ < 105 GeV. The identification of Z → νν decays requires the E miss T in the event to be within the p T (V) ranges described above. Two requirements suppress events from QCD multijet processes with an E miss T arising from mismeasured jets. First, the E miss T vector must be isolated from jet activity, using the requirement that the azimuthal angle difference ∆φ(E miss T , j) between the E miss T direction and any jet with |η| < 2.5 and p T > 20 (30) GeV be greater than 0.5 radians for the 7 (8) TeV data sample. Second, the azimuthal angle between the E miss T vector calculated using only charged particles with p T > 0.5 GeV and |η| < 2.5 and the direction of the standard E miss T vector (calculated using all particles, charged and neutral) must be greater than 0.5 radians. Subject to these two requirements, background from QCD multijet processes is reduced to a negligible level in the Z(νν)H channel. To reduce the tt and WZ background in the WH and Z(νν)H channels, events containing any additional isolated lepton with p T > 20 GeV are rejected.
Reconstruction of the H → bb decay is done by requiring two jets above the minimum p T thresholds listed in Table 10, having |η| < 2.5, and tagged by the CSV algorithm. If more than two such jets are found in the event, the pair with the highest total dijet transverse momentum, p T (jj), is selected. The background from V+jets and dibosons is reduced significantly through b tagging, and subprocesses where the two jets originate from genuine b quarks dominate the final selected data sample. After all the event selection criteria are applied, the invariant-mass resolution for the Higgs boson decay to bb is approximately 10%, as found in a previous CMS analysis [149]. The mass resolution is improved here by applying regression techniques similar to those used by the CDF experiment [150]. Through this procedure, a further correction, beyond the standard jet energy corrections, is computed for individual b jets in order to better measure the true parton energy. A BDT algorithm is trained on simulated H → bb signal events, with inputs that include detailed information about each jet that helps to differentiate b-quark jets from light-flavour jets. The resulting improvement in the bb invariant-mass resolution is approximately 15%, resulting in an increase in the analysis sensitivity of 10-20%, depending on the specific channel. The BDT regression is implemented in the TMVA framework [40]. 
The complete set of input variables is (though not all variables are used for every channel):
• transverse momentum of the jet before and after energy corrections;
• transverse energy and mass of the jet after energy correction;
• uncertainty in the jet energy correction;
• transverse momentum of the highest-p T constituent in the jet;
• pseudorapidity of the jet;
• total number of jet constituents;
• length and uncertainty of the displacement of the jet's secondary vertex;
• mass and transverse momentum of the jet's secondary vertex;
• number and fraction of jet constituents that are charged;
• event energy density, ρ, calculated using constituents with |η| < 2.5;
• missing transverse energy in the event;
• azimuthal angle between the missing transverse energy vector and the direction of the nearest jet in pseudorapidity.
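As a rough illustration of the regression idea — a boosted regressor trained on jet-level features to recover the parton energy — the sketch below uses an invented smearing model in which a neutrino carries away part of the b-jet energy, together with a secondary-vertex-mass-like feature correlated with that loss. None of the numbers correspond to CMS data, and the real analysis uses many more input variables.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Toy b-jet energy regression: map jet-level features to the "true" parton pT.
# The smearing model and the proxy feature are invented for illustration.
rng = np.random.default_rng(0)
n = 5000
true_pt = rng.uniform(30.0, 200.0, n)                     # parton pT (GeV)
nu_frac = rng.uniform(0.0, 0.3, n)                        # energy lost to neutrino
reco_pt = true_pt * (1.0 - nu_frac) * rng.normal(1.0, 0.08, n)
vtx_mass = 1.5 + 5.0 * nu_frac + rng.normal(0.0, 0.2, n)  # proxy vertex feature

X = np.column_stack([reco_pt, vtx_mass])
reg = GradientBoostingRegressor(n_estimators=200, max_depth=3).fit(X, true_pt)

pred = reg.predict(X)
res_before = np.std((reco_pt - true_pt) / true_pt)
res_after = np.std((pred - true_pt) / true_pt)
print(f"relative pT resolution: {res_before:.3f} -> {res_after:.3f}")
```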
To better discriminate the signal from background for different Higgs boson mass hypotheses, an event classification BDT algorithm is trained separately for each mass value using simulated samples of signal and background events that pass the selection criteria described above, together with the requirements listed in Table 10. The set of input variables used in training this BDT is chosen by iterative optimization from a larger number of potentially discriminating variables. Table 11 lists these variables. The number N aj of additional jets in an event counts jets that satisfy p T > 20 GeV and |η| < 4.5 for W(ℓν)H, p T > 20 GeV and |η| < 2.5 for Z(ℓℓ)H, or p T > 30 GeV and |η| < 4.5 for Z(νν)H. The output distribution of this BDT algorithm is fitted to search for events from Higgs boson production. Fitting this distribution, rather than simply counting events in a range of the distribution with a good signal-to-background ratio, as in Ref. [149], improves the sensitivity of the analysis by approximately 20%.

Table 10: Selection criteria for the simulated event samples used in training of the signal and background BDT algorithm. Variables marked "–" are not used in the given channel. Entries in parentheses indicate the selection for the high-p T (V) range. The second and third rows refer to the p T threshold for the highest- and second-highest-p T jet, respectively, for the pair with the highest total dijet transverse momentum, p T (jj). The parameter N al is the number of additional isolated leptons in the event. Kinematic variables are given in GeV and angles in radians.

Table 11: Input variables of the event classification BDT:
p T (V): vector boson transverse momentum
CSV max : value of CSV for the b-tagged jet with the largest CSV value
CSV min : value of CSV for the b-tagged jet with the second largest CSV value
∆φ(V,H): azimuthal angle between the vector boson (or E miss T vector) and the dijet direction
|∆η(jj)|: difference in η between b jets from Higgs boson decay
∆R(j 1 , j 2 ): distance in η-φ between b jets from Higgs boson decay (not for Z(ℓℓ)H)
N aj : number of additional jets
∆φ(E miss T , j): azimuthal angle between E miss T and the closest jet (only for Z(νν)H)
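The gain from fitting the full BDT-output shape rather than counting events in one high-purity window can be illustrated with Asimov significances, Z = sqrt(2((s+b) ln(1+s/b) − s)), added in quadrature over bins. The yields below are invented and the ~20% figure quoted above is from the real analysis, not from this toy.

```python
import numpy as np

# Compare the Asimov significance of a per-bin shape fit (bin significances
# added in quadrature) with a counting experiment in the two most sensitive
# BDT-output bins. Yields are invented for illustration.
s = np.array([0.2, 0.5, 1.0, 2.0, 3.0])     # signal per BDT-output bin
b = np.array([50.0, 30.0, 15.0, 6.0, 2.0])  # background per bin

def z_asimov(s, b):
    return np.sqrt(2.0 * ((s + b) * np.log1p(s / b) - s))

z_shape = np.sqrt((z_asimov(s, b) ** 2).sum())  # all bins, combined
z_count = z_asimov(s[-2:].sum(), b[-2:].sum())  # count last two bins only
print(f"shape fit Z = {z_shape:.2f}, counting Z = {z_count:.2f}")
```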

Background control regions
Control regions are identified in the data and used to correct the estimated yields from the MC simulation for two of the important background processes: tt production and V+jets, originating either from light-flavour partons (u, d, s, or c quarks and gluons) or from heavy-flavour partons (b quarks). Simultaneous fits are then performed to the distributions of the discriminating variables in the control regions to obtain scale factors by which the simulation yields are adjusted. This procedure is performed separately for each channel. For the Z(ℓℓ)H and WH modes the scale factors derived for the electron and muon decay channels are combined. These scale factors account not only for possible simulation cross-section discrepancies with the data, but also for potential differences in the selection efficiencies for the various physics objects. Therefore, separate scale factors are used for each background process in the different channels. The uncertainties in the scale factor determination include a statistical uncertainty from the fits (owing to the finite size of the samples) and an associated systematic uncertainty. The latter is estimated by refitting the distributions in the control regions after applying estimates for sources of potential systematic shifts such as the b-jet-tagging efficiency, jet energy scale, and jet energy resolution.
Tables 12-14 list the selection criteria used for the control regions in the Z(ℓℓ)H, Z(νν)H, and WH channels, respectively. Table 15 summarizes the fit results for all channels, separately for the 7 TeV and 8 TeV data sets. The fit results are found to be robust, and the fitted scale factors are consistent with the values from the previous analysis [149].
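A minimal sketch of such a simultaneous control-region fit, assuming two control regions with different admixtures of two background processes and one multiplicative scale factor per process (all counts invented):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import poisson

# Two control regions (CR1 W+jets enriched, CR2 ttbar enriched) fitted
# simultaneously for one scale factor per process. All counts are invented.
exp_yields = np.array([[80.0, 15.0],   # CR1: W+jets, ttbar from simulation
                       [10.0, 70.0]])  # CR2: W+jets, ttbar from simulation
obs = np.array([105, 88])              # observed counts in CR1, CR2

def nll(sf):
    pred = np.clip(exp_yields @ sf, 1e-9, None)  # scale each process
    return -poisson.logpmf(obs, pred).sum()

fit = minimize(nll, x0=[1.0, 1.0], method="Nelder-Mead")
sf_w, sf_tt = fit.x
print(f"SF(W+jets) = {sf_w:.2f}, SF(ttbar) = {sf_tt:.2f}")
```

Because each region is enriched in a different process, the two scale factors are only weakly correlated and both are well constrained by the fit.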

Systematic uncertainties
Sources of systematic uncertainty in the expected signal and background yields and distribution shapes are listed in Table 16. The uncertainty in the integrated luminosity measurement is 2.2% for the 7 TeV data [151] and 4.4% for the 8 TeV data [135]. Muon and electron trigger, reconstruction, and identification efficiencies are determined in data from samples of leptonic Z boson decays. The uncertainty in the yields due to the trigger efficiency is 2% per charged lepton and the uncertainty in the identification efficiency is also 2% per lepton. The parameters describing the Z(νν)H trigger efficiency turn-on curve are varied within their statistical uncertainties and for different assumptions on the methodology. A 2% systematic uncertainty in the yield is estimated.
The jet energy scale is varied by ±1 standard deviation as a function of the jet p T and η, and the efficiency of the analysis selection is recomputed. A 2-3% yield variation is found, depending on the particular decay channel and production process. The effect of the uncertainty in the jet energy resolution is evaluated by smearing the jet energies by the measured uncertainty, giving a 3-6% variation in yields. The uncertainties in the jet energy scale and resolution also affect the shape of the BDT output distribution. The impact of the jet energy scale uncertainty is determined by recomputing the BDT distribution after shifting the energy scale up and down by its uncertainty. Similarly, the impact of the jet energy resolution is determined by recomputing the BDT distribution after increasing or reducing the jet energy resolution.

Tables 12-14 (caption): The parameter N al is the number of additional isolated leptons in the event, and METsig is the ratio of the E miss T value to its uncertainty [144]. The values for kinematical variables are in GeV. The symbols e and µ mean that the selection is used only for the W(eν)H mode or W(µν)H mode, respectively.
Data-to-simulation b-tagging-efficiency scale factors, measured in tt events and multijet events, are applied to the jets in signal and background events. The estimated systematic uncertainties in the b-tagging scale factors are 6% per b tag, 12% per c tag, and 15% per mistagged jet (originating from gluons and light quarks) [31]. These translate into yield uncertainties in the 3-15% range, depending on the channel and the production process. The shape of the BDT output distribution is also affected by the shape of the CSV distribution, and it is therefore recomputed in accordance with the range of variation of the CSV distributions.
The theoretical VH signal cross section is calculated to NNLO, and the systematic uncertainty is 4% [25], including the effects of scale and PDF variations [95][96][97][98][99]. The analysis described in this paper is performed in the regime where the V and H have a significant boost in p T , and thus potential differences in the p T spectrum of the V and H between the data and the MC simulation generators could introduce systematic effects in the estimates of the signal acceptance and efficiency. Theoretical calculations are available that estimate the NLO electroweak (EW) [83,152,153] and NNLO QCD [84] corrections to VH production in the boosted regime. The estimated effect from electroweak corrections for a boost of ≈150 GeV is 5% for ZH and 10% for WH. For the QCD correction, a 10% uncertainty is estimated for both ZH and WH, which includes effects due to additional jet activity from initial- and final-state radiation. The finite size of the signal MC simulation samples, after all selection criteria are applied, contributes an uncertainty of 1-5% in the various channels.
The total uncertainty in the prediction of the background yields from estimates using data is approximately 10%. For the V+jets background, the differences in the BDT output distribution for events from the MADGRAPH and HERWIG++ MC simulation generators are considered. For the single-top-quark and diboson yield predictions, which are obtained solely from simulation, a 30% systematic uncertainty in the cross sections is used.

Results
Maximum-likelihood fits are performed to the output distributions of the BDT algorithms, trained separately for each channel and each Higgs boson mass value hypothesis in the 110-135 GeV range. In the fit, the BDT shapes and normalizations, for signal and each background component, are allowed to vary within the systematic and statistical uncertainties described in Section 9.3. These uncertainties are treated as nuisance parameters, with appropriate correlations taken into account.
Tables 17-20 summarize the expected signal and background yields for both p T (V) bins in each channel from the 7 TeV and 8 TeV data. All the data/MC scale factors determined in Section 9.2 have been applied to the corresponding background yields. Examples of output BDT distributions, for the m H = 125 GeV training and for the high p T (V) bin, are shown in Figure 30. The signal and background shapes and normalizations are those returned by the fits. Figure 30 also shows the dijet invariant-mass distribution for the combination of all five channels in the combined 7 and 8 TeV data sets, using an event selection that is more restrictive than the one used in the BDT analysis and that is more suitable for a counting experiment in just this observable.
The events considered are those in the high p T (V) bin with tighter b-tagging requirements on both jets, and with requirements that there be no additional jets in the events and that the azimuthal opening angle between the dijet system and the reconstructed vector boson be large. The H → bb search with such a selection is significantly less sensitive than the search using the BDT discriminant and it is therefore not elaborated on further in this article.
The interpretation of the results from the BDT discriminant analysis, in terms of upper limits on the Higgs boson production cross section, is given in Section 10.

Combined results
In this section, we present the results obtained by combining the measurements from all five search channels described above. We begin with a short summary of the statistical method used to combine the analyses.

Combination methodology
Combining the Higgs boson search results requires a simultaneous analysis of the data selected by the individual decay modes, accounting for their correlations and for all the statistical and systematic uncertainties. The statistical methodology used in this combination was developed by the ATLAS and CMS Collaborations in the context of the LHC Higgs Combination Group. A description of the general methodology can be found in Refs. [20,107]. Results presented in this paper are obtained using asymptotic formulae from Ref. [154] and recent updates available in the ROOSTATS package [155]. The Higgs boson mass is tested in steps consistent with the expected Higgs boson width and the experimental mass resolution [107].

Characterizing the absence of a signal: limits
For the calculation of exclusion limits, we adopt the modified frequentist criterion CL s [156,157]. The chosen test statistic q, used to determine how signal- or background-like the data are, is based on a profile likelihood ratio. Systematic uncertainties are incorporated via nuisance parameters and are treated according to the frequentist paradigm, as described in Ref. [107]. The profile likelihood ratio is defined as

q µ = −2 ln [ L(obs | µ · s + b, θ̂ µ ) / L(obs | µ̂ · s + b, θ̂) ], with 0 ≤ µ̂ ≤ µ,

where "obs" stands for the observed data; s stands for the number and distribution of signal events expected under the SM Higgs boson hypothesis; µ is a signal-strength modifier, introduced to accommodate deviations from the SM Higgs boson predictions; b is the number and distribution of background events; µ · s + b is the signal-plus-background hypothesis, with the expected SM signal event yields s multiplied by the signal-strength modifier µ; and θ are nuisance parameters describing the systematic uncertainties. The value θ̂ µ maximizes the likelihood in the numerator for a given µ, while µ̂ and θ̂ define the point at which the likelihood reaches its global maximum.
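A one-bin numerical illustration of the profile likelihood ratio, with a single log-normal nuisance parameter profiled at each value of µ. The yields are invented, and for brevity the physical constraint 0 ≤ µ̂ ≤ µ used in the limit-setting test statistic is omitted.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import norm

# One-bin toy: n_obs observed events, signal s, background b scaled by a
# log-normal nuisance parameter theta. All numbers are illustrative.
n_obs, s, b, kappa = 18, 6.0, 10.0, 1.2

def nll(mu, theta):
    lam = mu * s + b * kappa**theta
    return lam - n_obs * np.log(lam) - norm.logpdf(theta)

def profiled_nll(mu):
    """Minimize over the nuisance parameter at fixed mu (the theta-hat_mu step)."""
    return minimize_scalar(lambda t: nll(mu, t), bounds=(-5.0, 5.0),
                           method="bounded").fun

mus = np.linspace(0.0, 3.0, 301)
vals = np.array([profiled_nll(m) for m in mus])
mu_hat = mus[int(np.argmin(vals))]
q_mu1 = 2.0 * (profiled_nll(1.0) - vals.min())  # test statistic at mu = 1
print(f"mu_hat = {mu_hat:.2f}, q(mu=1) = {q_mu1:.3f}")
```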
The ratio of the probabilities to observe a value of the test statistic at least as large as the one observed in data, q obs µ , under the signal-plus-background (µ · s + b) and background-only (b) hypotheses,

CL s (µ) = P(q µ ≥ q obs µ | µ · s + b) / P(q µ ≥ q obs µ | b),

is used as the criterion for excluding the presence of a signal at the 1 − α confidence level.
A signal with a cross section σ = µ · σ SM is defined to be excluded at 95% CL if CL s (µ) ≤ 0.05. Here, σ SM stands for the SM Higgs boson cross section.
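In the asymptotic approximation of Ref. [154], both tail probabilities can be written in terms of the observed test statistic and its Asimov (expected) value, so CL s reduces to a ratio of Gaussian tail integrals. The sketch below uses that simplified form with illustrative inputs.

```python
import numpy as np
from scipy.stats import norm

# Asymptotic CLs sketch: CLsb = 1 - Phi(sqrt(q_obs)) and
# CLb = 1 - Phi(sqrt(q_obs) - sqrt(q_Asimov)). Inputs are illustrative.
def cls(q_obs, q_asimov):
    clsb = 1.0 - norm.cdf(np.sqrt(q_obs))
    clb = 1.0 - norm.cdf(np.sqrt(q_obs) - np.sqrt(q_asimov))
    return clsb / clb

value = cls(4.2, 6.0)
print(f"CLs = {value:.3f}  (signal excluded at 95% CL if <= 0.05)")
```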

Characterizing an excess of events: p-values and significance
To quantify the presence of an excess of events beyond what is expected for the background, we use the test statistic

q 0 = −2 ln [ L(obs | b, θ̂ 0 ) / L(obs | µ̂ · s + b, θ̂) ],

where the likelihood in the numerator is for the background-only hypothesis. The local statistical significance Z local of a signal-like excess is computed from the probability p 0 = P(q 0 ≥ q obs 0 | b), henceforth referred to as the local p-value, using the one-sided Gaussian-tail convention:

p 0 = ∫_{Z local}^{∞} (1/√(2π)) exp(−x²/2) dx.

In the Higgs boson search, we scan over the Higgs boson mass hypotheses and find the value giving the minimum local p-value, p min local , which describes the probability of a background fluctuation for that particular Higgs boson mass hypothesis. The probability to find a fluctuation with a local p-value lower than or equal to the observed p min local anywhere in the explored mass range is referred to as the global p-value, p global :

p global = P(p local ≤ p min local | b, any m H in the explored range).

The fact that the global p-value can be significantly larger than p min local is often referred to as the "look-elsewhere effect" (LEE). The global significance (and global p-value) of an observed excess can be evaluated following the method described in Ref. [158], using

p global ≈ p min local + C · exp(−Z² local /2). (16)

The constant C is found by generating a set of pseudo-experiments and using it to evaluate the global p-value corresponding to the p min local value observed in the data. Pseudo-experiments are simulated outcomes of an experiment obtained by randomly varying the average expected event yields and their distributions according to a specified model of statistical and systematic uncertainties. For example, a Poisson distribution is used to model statistical variations, while a Gaussian distribution is used to describe the systematic uncertainties.
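The conversions between p-values and significances, and the look-elsewhere correction, are straightforward to evaluate numerically. The constant C below is purely illustrative; in the analysis it is determined from pseudo-experiments.

```python
import math
from scipy.stats import norm

# One-sided Gaussian-tail conversions and the LEE correction
# p_global ~ p_local_min + C * exp(-Z**2 / 2). The constant C is invented.
p_local = 1.8e-5
z_local = norm.isf(p_local)          # local significance (one-sided tail)
C = 4.0                              # illustrative trials-factor constant
p_global = p_local + C * math.exp(-z_local**2 / 2.0)
print(f"Z_local = {z_local:.2f} sigma, p_global ~ {p_global:.1e}")
```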

Extracting signal-model parameters
The values of a set of signal-model parameters a (the signal-strength modifier µ is one of them) are evaluated from a scan of the profile likelihood ratio q(a):

q(a) = −2 ln [ L(obs | s(a) + b, θ̂ a ) / L(obs | s(â) + b, θ̂) ].

The values of the parameters â and θ̂ that maximize the likelihood, L(obs | s(â) + b, θ̂), are called the best-fit set. The 68% (95%) CL interval for a given signal-model parameter a i is evaluated from q(a i ) = 1 (3.84), with all other unconstrained model parameters treated as nuisance parameters. The two-dimensional (2D) 68% (95%) CL contours for pairs of signal-model parameters a i , a j are derived from q(a i , a j ) = 2.3 (6.0). Note that the boundaries of the 2D confidence-level region projected onto either parameter axis are not identical to the one-dimensional (1D) confidence intervals for this parameter.

In the H → γγ analysis, the SM Higgs boson signal is searched for in a simultaneous statistical analysis of the diphoton invariant-mass distributions for the eleven exclusive event classes: five classes (four untagged and one VBF-tagged) for the 7 TeV data and six classes (four untagged and two VBF-tagged) for the 8 TeV data, as described in Section 5.

In the H → ZZ → 4ℓ analysis, the SM Higgs boson signal is searched for in a simultaneous statistical analysis of six 2D distributions of the four-lepton invariant mass m 4ℓ and the matrix-element-based kinematic discriminant K D , as described in Section 6. The six distributions correspond to the three lepton final states (4e, 4µ, 2e2µ) and the 7 and 8 TeV data sets. Figure 32 (upper left) shows the 95% CL upper limits on the Higgs boson production cross section. The H → ZZ → 4ℓ search has reached the sensitivity for excluding the SM Higgs boson at 95% CL in the mass range 120-180 GeV, while the observed data exclude it in the following two mass ranges: 130-164 GeV and 170-180 GeV.
The observed exclusion limits for m H = 120-130 GeV are much weaker than the expected limits for the background-only hypothesis, suggesting a significant excess of four-lepton events in this mass range. As a cross-check, the statistical analysis using only the m 4ℓ distributions has been performed. The results are found to be consistent with those of the 2D analysis, although with less sensitivity.
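The rule quoted above for reading intervals off a likelihood scan — the 68% CL interval is the set of parameter values with q ≤ 1 — can be illustrated with a one-bin Poisson toy without nuisance parameters (all yields invented):

```python
import numpy as np

# One-bin Poisson toy: the 68% CL interval on mu is the set of values with
# q(mu) = -2 * Delta ln L <= 1. Yields are invented for illustration.
n_obs, s, b = 20, 10.0, 10.0

def q(mu):
    lam = mu * s + b
    mu_hat = max((n_obs - b) / s, 0.0)  # analytic best fit in one bin
    lam_hat = mu_hat * s + b
    return 2.0 * ((lam - lam_hat) - n_obs * np.log(lam / lam_hat))

mus = np.linspace(0.0, 3.0, 3001)
inside = mus[np.array([q(m) for m in mus]) <= 1.0]
print(f"68% CL interval ~ [{inside.min():.2f}, {inside.max():.2f}]")
```

The interval is asymmetric around the best-fit point because the Poisson likelihood is not Gaussian at these event counts.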

Results of searches in the five decay modes
In the H → WW → ℓνℓν analysis, the SM Higgs boson signal is searched for in a simultaneous statistical analysis of eleven exclusive final states: same-flavour (e + e − and µ + µ − ) dilepton events with 0 and 1 jet for the 7 and 8 TeV data sets, different-flavour e ± µ ∓ dilepton events with 0 and 1 jet for the 7 and 8 TeV data sets, dilepton events in the VBF-tag category for the 7 TeV data set, and same-flavour and different-flavour dilepton events in the VBF-tag category for the 8 TeV data set. All analysis details can be found in Section 7. Figure 32 (upper right) shows the 95% CL upper limits on the Higgs boson production cross section in this channel. The H → WW search has reached the sensitivity for excluding the SM Higgs boson at 95% CL in the mass range 122-160 GeV (the higher-mass range is not discussed in this paper), while the observed data exclude it in the mass range 129-160 GeV. The observed exclusion limits are weaker than the expected ones for the background-only hypothesis in the entire mass range, suggesting an excess of events in data. Given the mass resolution of about 20% in this channel, owing to the presence of the two undetectable neutrinos, a broad excess is observed across the mass range from 110 to about 130 GeV. The dotted line in Figure 32 (upper right) indicates the median expected exclusion limits in the presence of a SM Higgs boson with a mass near 125 GeV. The observed limits in this channel are consistent with the expectation for a SM Higgs boson of mass 125 GeV.
In the H → ττ channel, the 0-, 1-jet, and VBF categories are used to set 95% CL upper limits on the Higgs boson production. The ditau system is reconstructed in four final states: eτ h , µτ h , eµ, µµ, where the leptons come from τ → eνν or τ → µνν decays. The 0-and 1-jet categories are further split into two categories of low or high ditau transverse momentum. The 7 and 8 TeV data are treated independently giving a total of 40 ditau mass distributions. All analysis details can be found in Section 8. Figure 32 (lower left) shows the 95% CL upper limits on the Higgs boson production cross section in this channel. The H → ττ search has not yet reached the SM Higgs boson exclusion sensitivity; the expected limits on the signal event rates are 1.3-2.4 times larger than the event rates expected for the SM Higgs boson in this channel.
In the H → bb analysis, five final states are considered: two b-tagged jets accompanied by E miss T (Z → νν), by e + e − or µ + µ − (Z → ℓ + ℓ − ), or by e + E miss T or µ + E miss T (W → ℓν). Each of these categories is further split into two categories of low or high bb transverse momentum. The 7 and 8 TeV data are treated independently, giving a total of 20 BDT-output distributions. All analysis details can be found in Section 9. Figure 32 (lower right) shows the 95% CL upper limits on the Higgs boson production cross section in this channel. The H → bb search has not yet reached the SM Higgs boson exclusion sensitivity; the expected limits on the signal event rates are 1.2-2.8 times larger than the event rates expected for the SM Higgs boson in this channel.

Combined results
The five individual search channels described above are combined into a single search for the SM Higgs boson. Figure 33 (left) shows the 95% CL upper limits on the signal-strength modifier, µ = σ/σ SM , as a function of m H . We exclude a SM Higgs boson at 95% CL in two mass ranges: 110-121.5 GeV and 128.0-145 GeV.
The CL s value for the SM Higgs boson hypothesis as a function of its mass is shown in Fig. 33 (right). The horizontal lines indicate CL s values of 0.05, 0.01, and 0.001. The mass regions where the observed CL s values are below these lines are excluded with the corresponding (1 − CL s ) confidence levels of 95%, 99%, and 99.9%, respectively. The 95% CL exclusion range for the SM Higgs boson is identical to that shown in Fig. 33 (left), as both results are simply different representations of the same underlying information. At 99% CL, we exclude the SM Higgs boson in three mass ranges: 110.0-111.5 GeV, 113.5-121.0 GeV, and 128.5-145.0 GeV. Figure 33 (right) shows that, in the absence of a signal, we would expect to exclude the entire m H range of 110-145 GeV at the 99.9% CL or higher. In most of the Higgs boson mass range, the differences between the observed and expected limits are consistent with statistical fluctuations, since the observed limits are generally within the 68% or 95% bands of the expected limit values. However, in the range 121.5 < m H < 128.0 GeV, we observe an excess of events, making the observed limits considerably weaker than expected in the absence of the SM Higgs boson and, hence, not allowing the exclusion of the SM Higgs boson in this range.

Results of searches in the H → γγ and H → ZZ → 4ℓ decay modes
As presented in Section 10.2.1, the searches for the SM Higgs boson in the γγ and ZZ → 4ℓ modes reveal a substantial excess of events with diphoton and four-lepton invariant masses near 125 GeV. Figure 34 shows the local p-value as a function of the SM Higgs boson mass in the γγ channel. The results are presented for three analyses: (a) the baseline analysis, and two alternative analyses, (b) a cut-based analysis and (c) a sideband analysis. Figure 34 (top) shows about a 3σ excess near 125 GeV in both the 7 and 8 TeV data. The minimum local p-value p 0 = 1.8 × 10 −5 , corresponding to a maximum local significance of 4.1σ, occurs at a mass of 125.0 GeV for the combined 7 and 8 TeV data sets. The median expected significance for a SM Higgs boson of this mass is 2.7σ. In the asymptotic approximation, 68% (95%) of repeated experiments would give results within ±1σ (±2σ) of the median expected significance. The excess seen in the data, although larger than the median expected for a Higgs boson signal, is therefore consistent with a SM Higgs boson, with a probability of about 16%. The consistency of the results from the three analyses is a good check of the robustness of the measurement.
The local p-value as a function of the Higgs boson mass m H for the ZZ → 4ℓ channel is shown in Fig. 35. The minimum of the local p-value is at m H = 125.5 GeV and corresponds to a local significance of 3.2σ. A local significance of 2.2σ is found for a 1D fit of the invariant mass without using the K D discriminant. The median expected significance for a SM Higgs boson of this mass is 3.8σ and 3.2σ for the 2D and 1D fits, respectively.

Combined results
To quantify the inconsistency of the observed excesses with the background-only hypothesis, we show in Fig. 36 (left) the local p-value p 0 for the five decay modes combined, for the 7 and 8 TeV data sets. The 7 and 8 TeV data sets exhibit excesses of 3.2σ and 3.8σ, respectively, for a SM Higgs boson with a mass near 125 GeV. In the combination, the minimum local p-value of p min = 3 × 10 −7 , corresponding to a local significance of 5.0σ, occurs at m H = 125.5 GeV. Figure 36 (right) gives the local p-values for each of the individual decay channels. The largest contributions to the overall excess are from the γγ and ZZ → 4ℓ channels. Both channels have good mass resolution and allow a precise measurement of the mass of the resonance corresponding to the excess. Their combined significance is 5.0σ, as displayed in Fig. 37 (left). Figure 37 (right) shows the combined local p-values for the channels with poorer mass resolution: WW, ττ, and bb. Table 21 summarizes the median expected and observed local significances for a SM Higgs boson mass hypothesis of 125.5 GeV from the individual decay modes and their combinations. In the ττ channel, we do not observe an excess of events at this mass. The expected significance is evaluated assuming the expected background and signal rates. The observed significance is expected to be within ±1σ of the expected significance with 68% probability.
The LEE-corrected significance is evaluated by generating 10 000 pseudo-experiments. After fitting for the constant C in Eq. (16), we find that the global significance of the signal at m H = 125.5 GeV is 4.6σ (4.5σ) for the mass search range 115-130 GeV (110-145 GeV).
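If Eq. (16) has the usual Gross-Vitells form p_global ≈ p_local + C · exp(−q_local/2), the trials-factor correction can be sketched numerically. In the paper the constant C is fitted to pseudo-experiments; the value 0.5 used below is purely illustrative, chosen only so that the numbers come out close to the quoted ones.

```python
from math import exp
from statistics import NormalDist

def z_to_p(z):
    """One-sided Gaussian tail probability for significance z."""
    return 1.0 - NormalDist().cdf(z)

def p_to_z(p):
    """Inverse: significance for a one-sided p-value."""
    return NormalDist().inv_cdf(1.0 - p)

def global_p(z_local, trials_constant):
    """Look-elsewhere correction p_global ~ p_local + C * exp(-q_local / 2),
    with q_local = z_local**2. C depends on the searched mass range and is
    normally fitted to pseudo-experiments; the value here is illustrative."""
    return z_to_p(z_local) + trials_constant * exp(-z_local**2 / 2.0)

# Local 5.0 sigma at 125.5 GeV, with an illustrative trials constant.
z_global = p_to_z(global_p(5.0, trials_constant=0.5))  # ~4.6 sigma
```

The correction grows with the width of the searched mass range (through C), which is why the 110-145 GeV window yields a slightly smaller global significance than the 115-130 GeV window.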
The low probability for an excess at least as large as the observed one to arise from a statistical fluctuation of the background leads to the conclusion that we observe a new particle with a mass near 125 GeV. The γγ and ZZ → 4ℓ decay modes indicate that the new particle is a boson, and the diphoton decay implies that its spin is different from 1 [159,160].

Mass of the observed state
To measure the mass of the observed state, we use the γγ and ZZ → 4ℓ decay modes. Figure 38 (left) shows the 2D 68% CL regions for the signal cross section (normalized to the SM Higgs boson cross section) versus the mass of the new boson, m X , separately for the untagged γγ, VBF-tagged γγ, and ZZ → 4ℓ events, and for their combination. The combined 68% CL contour, shown with a solid line in Fig. 38 (left), assumes that the relative event yields among the three channels are fixed to the SM expectations, while the overall signal strength is a free parameter.
The energy scale uncertainties for photons, electrons, and muons are treated as independent. The Z → ee peak is used to correct both the photon and the electron energy scales. Nevertheless, we find that the two scales are only very weakly correlated, since photons in H → γγ decays and electrons in H → ZZ → 4ℓ decays have substantially different energies. Moreover, the photons carry an additional systematic uncertainty associated with the extrapolation of the energy scale corrections derived for electrons to the corrections to be used for photons.
To measure the value of m X in a model-independent way, the untagged γγ, VBF-tagged γγ, and ZZ → 4ℓ channels are assumed to have independent signal cross sections. This is achieved by scaling the expected SM Higgs boson event yields in these channels by independent factors µ i , where i denotes the individual channel. The signal is assumed to be a particle with a unique mass m X . The mass and its uncertainty are extracted from a scan of the combined test statistic q, frequently referred to as −2∆ln L, versus m X . The signal strengths µ i in such a scan are treated in the same way as the other nuisance parameters. Figure 38 (right) shows the test statistic as a function of m X for the three final states separately and for their combination. The crossings of the q(m X ) curves with the horizontal thick (thin) lines at q = 1 (3.8) define the 68% (95%) CL intervals for the mass of the observed particle. These intervals include both the statistical and systematic uncertainties. The resulting mass measurement and 68% CL interval from such a combination is m X = 125.3 ± 0.6 GeV.

Figure 38: (Left) The 2D 68% CL contours for a hypothesized boson mass m X versus µ = σ/σ SM for the untagged γγ, VBF-tagged γγ, and ZZ → 4ℓ decay channels, and their combination, from the combined 7 and 8 TeV data. In the combination, the relative signal strengths for the three final states are fixed to those for the SM Higgs boson. (Right) The maximum-likelihood test statistic q versus m X for the untagged γγ, VBF-tagged γγ, and ZZ → 4ℓ final states, and their combination, from the combined 7 and 8 TeV data. Neither the absolute nor the relative signal strengths for the three final states are constrained to the SM Higgs boson expectations. The crossings with the thick (thin) horizontal line q = 1 (3.8) define the 68% (95%) CL interval for the measured mass, shown by the vertical lines.
To determine the statistical component of the overall uncertainty, we evaluate the test statistic q(m X ) with all the nuisance parameters fixed to their best-fit values. The result is shown by the dashed line in Fig. 39. The crossing of the dashed line with the thick horizontal line q = 1 gives the statistical uncertainty (68% CL interval) in the mass measurement: ±0.4 GeV. The quadrature difference between the overall and statistical-only uncertainties determines the systematic component of the uncertainty in the mass measurement: ±0.5 GeV. Therefore, the final result for the mass measurement is m X = 125.3 ± 0.4 (stat.) ± 0.5 (syst.) GeV.

Figure 39: The maximum-likelihood test statistic q versus the hypothesized boson mass m X for the combination of the γγ and ZZ → 4ℓ modes from the combined 7 and 8 TeV data. The solid line is obtained including all the nuisance parameters and, hence, includes both the statistical and systematic uncertainties. The dashed line is found with all nuisance parameters fixed to their best-fit values and, hence, represents the statistical uncertainties only. The crossings with the thick (thin) horizontal line q = 1 (3.8) define the 68% (95%) CL interval for the measured mass, shown by the vertical lines.
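The separation into statistical and systematic components amounts to subtracting the two interval half-widths in quadrature. A minimal sketch, noting that the quoted ±0.5 GeV reflects rounding of the inputs before the subtraction:

```python
from math import sqrt

total = 0.6  # full 68% CL half-width (GeV), all nuisance parameters floating
stat = 0.4   # half-width (GeV) with nuisance parameters fixed to best-fit values

# Systematic component by quadrature subtraction, ~0.45 GeV before rounding.
syst = sqrt(total**2 - stat**2)
```

Because total and stat are themselves rounded to one decimal place, the quadrature difference is only determined to within roughly ±0.05 GeV.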

Consistency of the observed state with the SM Higgs boson hypothesis
The p-value characterizes the probability of the background producing the observed excess of events or greater, but it does not give information about the consistency of the observed excess with the expected signal. The current data sample allows for only a limited number of such consistency tests, which we present in this section. These consistency tests do not constitute measurements of any physics parameters per se, but rather show the consistency of the various observations with the expectations for the SM Higgs boson. Unless stated otherwise, all consistency tests presented in this section are for the hypothesis of the SM Higgs boson with mass 125.5 GeV and all quoted uncertainties include both the statistical and systematic ones.

Measurement of the signal strength
The value of the signal-strength modifier μ̂ = σ/σ SM , obtained by combining all the search channels, provides the first consistency test. Note that μ̂ becomes negative if the observed number of events is smaller than the rate expected for the background-only hypothesis.

Figure 40: The signal-strength modifier μ̂ = σ/σ SM as a function of the hypothesized SM Higgs boson mass m H , using all the decay modes and the combined 7 and 8 TeV data sets. The bands correspond to ±1 standard deviation, including both statistical and systematic uncertainties.

Figure 41 shows a consistency test of the μ̂ values obtained in different combinations of search channels. The combinations are organized by decay mode and by additional features that allow the selection of events with an enriched purity of a particular production mechanism. The expected purities of the different combinations are discussed in the sections describing the individual analyses. For example, assuming the SM Higgs boson cross sections, the channels with the VBF dijet requirements still have a substantial fraction (20-50%) of gluon-gluon fusion events. There is consistency among all the channels contributing to the overall measurement and their various combinations.
The four main Higgs boson production mechanisms can be associated with either top-quark couplings (gluon-gluon fusion and ttH) or vector-boson couplings (VBF and VH). Therefore, combinations of channels associated with a particular decay mode and explicitly targeting different production mechanisms can be used to test the relative strengths of the couplings of the new state to the vector bosons and the top quark. Figure 42 shows the 68% and 95% CL contours for the signal-strength modifiers µ ggH+ttH , for gluon-gluon fusion plus ttH production, and µ VBF+VH , for VBF plus VH production. The three sets of contours correspond to the channels associated with the γγ, ττ, and WW decay modes; the searches in these decay modes have subchannels with VBF dijet tags. The SM Higgs boson point, shown by the diamond at (µ ggH+ttH , µ VBF+VH ) = (1, 1), is within the 95% CL contour for each of the three decay modes.

Consistency of the data with the SM Higgs boson couplings
The event yield N of Higgs bosons produced in collisions of partons x (xx → H) and decaying to particles y (H → yy) is proportional to the partial and total Higgs boson decay widths as follows:

N ∝ σ(xx → H) · B(H → yy) ∝ Γ xx Γ yy / Γ tot ,

where σ(xx → H) is the Higgs boson production cross section, B(H → yy) = Γ yy /Γ tot is the branching fraction for the decay mode, Γ xx and Γ yy are the partial widths associated with the H → xx and H → yy processes, and Γ tot is the total width.
Seven partial widths (Γ WW , Γ ZZ , Γ tt , Γ bb , Γ ττ , Γ gg , Γ γγ ) and the total width Γ tot are relevant for the current analysis, where Γ gg is the partial width for the Higgs boson decay to two gluons. The partial widths Γ gg and Γ γγ are generated by loop diagrams and are thus directly sensitive to the presence of new physics. The possibility of Higgs boson decays to beyond-the-standard-model (BSM) particles, with a partial width Γ BSM , is accommodated by setting Γ tot equal to the sum of the partial widths of all allowed decays to SM particles plus Γ BSM .
The partial widths are proportional to the square of the effective Higgs boson couplings to the corresponding particles. To test for possible deviations of the measurements from the rates expected in different channels for the SM Higgs boson, we introduce different sets of coupling scale factors κ and fit the data to these new parameters. One can introduce up to eight independent parameters relevant for the current analysis. Significant deviations of the scale factors from unity would imply new physics beyond the SM Higgs boson hypothesis.
The current data set is insufficient to measure all eight independent parameters. Therefore, we measure different subsets, with the remaining unmeasured parameters either constrained to equal the SM Higgs boson expectations or included in the likelihood fit as unconstrained nuisance parameters.
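The dependence of an event yield on the coupling scale factors can be sketched with a small yield-modifier function: the production and decay widths each scale as κ², while the total width scales as the branching-fraction-weighted sum of the squared modifiers. This is a minimal illustration, not the fit model used in the paper; the branching fractions below are rough, round placeholder numbers chosen only so that they sum to one.

```python
def mu(kappa_prod, kappa_decay, kappas_sq_by_br, br_bsm=0.0):
    """Event-yield modifier mu = N / N_SM for production xx -> H -> yy:
    mu = kappa_xx^2 * kappa_yy^2 / kappa_tot^2, where
    kappa_tot^2 = [sum_i BR_SM(i) * kappa_i^2] / (1 - BR_BSM).
    kappas_sq_by_br is a list of (SM branching fraction, squared coupling
    modifier) pairs for the decays entering the total width."""
    kappa_tot_sq = sum(br * k2 for br, k2 in kappas_sq_by_br) / (1.0 - br_bsm)
    return kappa_prod**2 * kappa_decay**2 / kappa_tot_sq

# Illustrative, approximate BR values for a ~125 GeV SM Higgs boson
# (bb, WW, gg, tautau, other), each paired with its squared modifier.
sm = [(0.58, 1.0), (0.22, 1.0), (0.09, 1.0), (0.06, 1.0), (0.05, 1.0)]

# All modifiers at their SM value of 1 must give mu = 1 for any BR set.
mu_sm = mu(1.0, 1.0, sm)
```

The denominator is what couples the channels together in the fit: changing the coupling that dominates the total width (e.g. to b quarks) rescales the expected yields in every decay mode at once.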

A. Test of custodial symmetry
In the SM, the Higgs boson sector possesses a global SU(2) L × SU(2) R symmetry, which is broken by the Higgs boson vacuum expectation value down to the diagonal subgroup SU(2) L+R .
As a result, the tree-level relations between the ratios of the W and Z boson masses, m W /m Z , and their couplings to the Higgs boson, g W /g Z , are protected against large radiative corrections, a phenomenon known as "custodial symmetry" [161,162]. However, large violations of custodial symmetry are possible in BSM theories. To test custodial symmetry, we introduce two scaling factors κ W and κ Z that modify the SM Higgs boson couplings to W and Z bosons, and perform two different procedures to determine the consistency of the ratio λ WZ = κ W /κ Z with unity.
The dominant Higgs boson production mechanism for the inclusive H → ZZ and untagged H → WW channels is gg → H. Therefore, the ratio of the event yields in these channels provides a test of custodial symmetry. To quantify the test, we introduce two event-rate modifiers, µ ZZ and R WZ . The expected H → ZZ → 4ℓ event yield is scaled by µ ZZ , while the expected untagged H → WW → ℓνℓν event yield is scaled by R WZ · µ ZZ . The mass of the observed state is fixed to 125.5 GeV. The test statistic q(R WZ ) as a function of R WZ , with µ ZZ included among the other nuisance parameters, is shown in Fig. 43 (left) and yields R WZ = 0.9 +1.1 −0.6 , where the uncertainty is the combined statistical and systematic one. The contributions from VBF and VH production introduce only a small bias, of 0.02, when relating the observed event-yield ratio R WZ to the square of the ratio of the couplings, λ 2 WZ . Hence, the current measurements are consistent, within the uncertainties, with the expectation from custodial symmetry.

Figure 43 (right): The test statistic q(λ WZ ) as a function of the ratio of the couplings to the W and Z bosons, λ WZ , from the combination of all channels. The intersections of the curve with the horizontal lines q = 1 and 3.8 give the 68% and 95% CL intervals, respectively.
In the second method, we extract λ WZ directly from the combination of all search channels. In this approach, we use three parameters: λ WZ , κ Z , and κ F . The latter is a single event-rate modifier for all Higgs boson couplings to fermions. The BSM Higgs boson width Γ BSM is set to zero. The partial width Γ gg , induced by quark loops, scales as κ 2 F . The partial width Γ γγ is also induced via loop diagrams, with the W boson and top quark being the dominant contributors; hence, it scales as |α κ W + β κ F | 2 , where κ W = λ WZ · κ Z and the ratio of the factors α and β, β/α ≈ −0.22, is taken from the prediction for the SM Higgs boson with m H = 125.5 GeV [66]. In the evaluation of q(λ WZ ), both κ Z and κ F are included among the other nuisance parameters. Assuming a common scaling factor for all fermions makes this measurement model dependent, but using all the channels gives it greater sensitivity. The results are shown in Fig. 43 (right) by the solid line. The dashed line indicates the median expected result for the SM Higgs boson, given the integrated luminosity. The measured value is λ WZ = 1.1 +0.5 −0.3 , where the uncertainty is the combined statistical and systematic one. The result is consistent with the expectation of λ WZ = 1 from custodial symmetry. In all further combinations presented below, we assume λ WZ = 1 and use a common factor κ V to modify the Higgs boson couplings to the W and Z bosons.
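The quoted scaling of Γ γγ lends itself to a small numerical check. The sketch below normalizes |α κ W + β κ F |² to the SM point and uses the β/α ≈ −0.22 value quoted above; it also shows the enhancement obtained when the relative sign of the two couplings is flipped, which is relevant for the coupling fits discussed later.

```python
def kappa_gamma_sq(kappa_w, kappa_f, beta_over_alpha=-0.22):
    """Effective scaling of the loop-induced H -> gamma gamma partial width,
    Gamma_gammagamma ~ |alpha*kappa_W + beta*kappa_F|^2, normalized so that
    the SM point (kappa_W, kappa_F) = (1, 1) gives exactly 1. The default
    beta/alpha is the value quoted in the text for m_H = 125.5 GeV."""
    r = beta_over_alpha
    return (kappa_w + r * kappa_f)**2 / (1.0 + r)**2

sm_point = kappa_gamma_sq(1.0, 1.0)   # = 1 by construction
flipped = kappa_gamma_sq(1.0, -1.0)   # negative kappa_F enhances gamma gamma
```

With the relative sign flipped, the destructive W-top interference becomes constructive, so the γγ partial width is enhanced by more than a factor of two: this is the mechanism behind the (+, −) minimum discussed in the (κ V , κ F ) fits.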

B. Test of the couplings to vector bosons and fermions
We further test the consistency of the measurements with the SM Higgs boson hypothesis by fitting for the two free parameters κ V and κ F introduced above. We assume Γ BSM = 0, i.e. no BSM Higgs boson decay modes. At lowest order, all partial widths, except for Γ γγ , scale either as κ 2 V or κ 2 F . As discussed above, the partial width Γ γγ scales as |α κ V + β κ F | 2 . Hence, γγ is the only channel sensitive to the relative sign of κ V and κ F . Figure 44 shows the 2D likelihood test statistic over the (κ V , κ F ) plane. The left plot allows for different signs of κ V and κ F , while the right plot constrains both of them to be positive. The 68%, 95%, and 99.7% CL contours are shown by the solid, dashed, and dotted lines, respectively. The global minimum in the left plot occurs in the (+, −) quadrant, which is due to the observed excess in the γγ channel. If the relative sign between κ V and κ F is negative, the interference term between the W and top-quark loops responsible for the H → γγ decays becomes positive and helps boost the γγ branching fraction. However, the difference between the global minimum in the (+, −) quadrant and the local minimum in the (+, +) quadrant is not statistically significant since the 95% CL contours encompass both of them. The data are consistent with the expectation for the SM Higgs boson: the point at (κ V , κ F ) = (1, 1), shown by the diamond, is within the 95% CL contour. Any significant deviation from (κ V , κ F ) = (1, 1) would imply BSM physics, with the magnitude and sign of the κ V and κ F measurements providing a clue to the most plausible BSM scenarios. Figure 45 displays the corresponding 68% and 95% contours of κ V versus κ F from each of the individual decay modes, restricting the parameters to the (+, +) and (+, −) quadrants (left), and the (+, +) quadrant (right). The hypothesis of a "fermiophobic" Higgs boson that couples only to bosons is represented by the point at (1, 0). 
The point is just outside the 95% CL contour, which implies that a fermiophobic Higgs boson with m H = 125.5 GeV is excluded at 95% CL.
The 1D likelihood scans versus κ V and κ F , setting one parameter at a time to the SM value of 1, are given in the left and right plots of Fig. 46, respectively. The resulting fit values are: κ V = 1.00 ± 0.13 and κ F = 0.5 ± 0.2, where the uncertainties are combined statistical and systematic, with corresponding 95% CL intervals of [0.7; 1.3] and [0.2; 1.0], respectively.

C. Test for the presence of BSM particles
The presence of BSM particles can considerably modify the Higgs boson phenomenology, even if the underlying Higgs boson sector of the model remains unaltered. Processes induced by loop diagrams (H → γγ and gg → H) can be particularly sensitive to the presence of new particles. Therefore, we fit the data for the scale factors κ γ and κ g of these two processes. The partial widths associated with the tree-level production processes and decay modes are assumed to be unaltered.

Figure 46: The likelihood test statistic q(κ V ; κ F = 1) (left) and q(κ F ; κ V = 1) (right). The intersections with the horizontal lines q = 1 and q = 3.84 mark the 68% and 95% CL intervals, respectively, as shown by the vertical lines.

Figure 47 displays the likelihood test statistic in the κ g versus κ γ plane, under the assumption that Γ BSM = 0. The results are consistent with the expectation for the SM Higgs boson, (κ γ , κ g ) = (1, 1); the best-fit value is (κ γ , κ g ) = (1.5, 0.75). Figure 48 gives the likelihood test statistic versus BR BSM = Γ BSM /Γ tot , with κ g and κ γ included as unconstrained nuisance parameters. The resulting 95% CL upper limit is BR BSM < 0.89.
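Under the stated assumption that the tree-level couplings are unaltered, the effect of a BSM width on a visible signal strength is a simple dilution factor. A minimal sketch, neglecting the small feedback of the modified loop-induced partial widths on the total width:

```python
def mu_gg_to_gammagamma(kappa_g, kappa_gamma, br_bsm):
    """Approximate signal strength for gg -> H -> gamma gamma when only the
    loop factors and an undetected BSM width are modified. With tree-level
    couplings at their SM values, the total width grows by 1 / (1 - BR_BSM),
    so every visible branching fraction is diluted by (1 - BR_BSM)."""
    return kappa_g**2 * kappa_gamma**2 * (1.0 - br_bsm)

# A sizeable BSM width can be compensated by larger loop factors, which is
# why kappa_g and kappa_gamma are profiled when setting the BR_BSM limit.
diluted = mu_gg_to_gammagamma(1.0, 1.0, br_bsm=0.5)  # -> 0.5
```

This degeneracy between BR BSM and the loop factors is what makes the resulting upper limit on BR BSM relatively weak.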

D. Test for differences in the couplings to fermions
In two-Higgs-doublet models (2HDM) [163], the couplings of the neutral Higgs bosons to fermions can be substantially modified with respect to the Yukawa couplings of the SM Higgs boson. For example, in the minimal supersymmetric standard model (MSSM), the couplings of the neutral Higgs bosons to up-type and down-type fermions are modified, with the modification being the same for all three generations and for quarks and leptons. In more general 2HDMs, the leptons can be nearly decoupled from the Higgs boson that otherwise behaves like a SM Higgs boson with respect to the W and Z bosons and the quarks. To test for such modifications of the fermion couplings, we evaluate two different combinations of the corresponding parameters: one in which we allow different ratios of the couplings to up- and down-type fermions (λ du = κ d /κ u ), and another in which we allow different ratios of the couplings to leptons and quarks (λ ℓq = κ ℓ /κ q ). We assume that Γ BSM = 0. Figure 49 (left) shows the resulting test statistic versus λ du , with the other free coupling modifiers, κ V and κ u , included as unconstrained nuisance parameters. The fit is nearly degenerate with respect to the relative sign of the couplings to up- and down-type fermions, which manifests itself in the left-right symmetry observed in the plot. The symmetry is not perfect, since there is some sensitivity to the sign of λ du because of the nonvanishing role of the b quark (in comparison to the top quark) in generating the Higgs boson coupling to gluons. Figure 49 (right) displays the corresponding results versus λ ℓq , with the two coupling modifiers, κ V and κ q , treated as unconstrained nuisance parameters. There are no loop-induced processes measurably sensitive to the relative sign of the couplings to leptons and quarks; hence, the plot exhibits a perfect left-right symmetry. Both |λ du | and |λ ℓq | are consistent with 0 and 1, with a 95% CL upper limit of 1.5 for both.
The main reason for both parameters having their best-fit values close to 0 is the absence of an event excess in the H → ττ channel. However, neither the H → ττ nor the H → bb channel has yet reached sufficient sensitivity to place strong constraints on the parameters associated with the corresponding Higgs boson couplings.

Figure 47: The likelihood test statistic q(κ γ , κ g ), assuming Γ BSM = 0. The cross indicates the best-fit values. The solid, dashed, and dotted lines show the 68%, 95%, and 99.7% CL contours, respectively. The diamond shows the SM point (κ γ , κ g ) = (1, 1). The partial widths associated with the tree-level production processes and decay modes are assumed to be unaltered (κ = 1).

Figure 48: The likelihood test statistic q versus BR BSM = Γ BSM /Γ tot , with the parameters κ g and κ γ included as nuisance parameters. The solid curve is the data; the dashed curve indicates the median expected result in the presence of the SM Higgs boson. The intersections with the horizontal lines q = 1 and 3.8 give the 68% and 95% CL intervals, respectively. The partial widths associated with the tree-level production processes and decay modes are assumed to be unaltered (κ = 1).

Figure 49: (Left) The likelihood test statistic q as a function of the ratio λ du of the couplings to up- and down-type fermions, with the coupling modifiers κ V and κ u treated as nuisance parameters. (Right) The likelihood test statistic as a function of the ratio λ ℓq of the couplings to leptons and quarks, with the coupling modifiers κ V and κ q treated as nuisance parameters. The solid curves are the results from the data; the dashed curves show the expectations for the SM Higgs boson. The intersections of the curves with the horizontal lines q = 1 and 3.8 give the 68% and 95% CL intervals, respectively.

Summary
In this paper, the analyses that were the basis for the discovery of a new boson at a mass of approximately 125 GeV have been described in detail. The data were collected by the CMS experiment at the LHC in proton-proton collisions at √ s = 7 and 8 TeV, corresponding to integrated luminosities of up to 5.1 fb −1 and 5.3 fb −1 , respectively.
The particle is observed in the search for the SM Higgs boson in five decay modes: γγ, ZZ, WW, ττ, and bb. An excess of events is found above the expected background, with a local significance of 5.0σ, signaling the production of a new particle. The expected significance for a SM Higgs boson of that mass is 5.8σ.
The excess is most significant in the two decay modes with the best mass resolution, γγ and ZZ → 4ℓ, and a fit to these invariant-mass peaks gives a mass of 125.3 ± 0.4 (stat.) ± 0.5 (syst.) GeV. The decay to two photons indicates that the new particle is a boson with spin different from one. Within the SM hypothesis, the couplings of the new particle to vector bosons, fermions, gluons, and photons have been measured. All the results are consistent, within their uncertainties, with expectations for a SM Higgs boson. More data are needed to ascertain whether the properties of this new state imply physics beyond the SM.
for delivering so effectively the computing infrastructure essential to our analyses. Finally, we acknowledge the enduring support for the construction and operation of the LHC and the CMS detector provided by the following funding agencies: the Austrian Federal Ministry of Science and Research and the Austrian Science Fund; the Belgian Fonds de la Recherche Scientifique, and Fonds voor Wetenschappelijk Onderzoek; the Brazilian Funding Agencies (CNPq, CAPES, FAPERJ, and FAPESP); the Bulgarian Ministry of Education, Youth and Science; CERN; the Chinese Academy of Sciences, Ministry of Science and Technology, and National Natural