Measurement of Higgs boson production and properties in the WW decay channel with leptonic final states

A search for the standard model Higgs boson decaying to a W-boson pair at the LHC is reported. The event sample corresponds to integrated luminosities of 4.9 and 19.4 inverse femtobarns collected with the CMS detector in pp collisions at sqrt(s) = 7 and 8 TeV, respectively. The Higgs boson candidates are selected in events with two or three charged leptons. An excess of events above background is observed, consistent with the expectation from the standard model Higgs boson with a mass of around 125 GeV. The probability to observe an excess equal to or larger than the one seen, under the background-only hypothesis, corresponds to a significance of 4.3 standard deviations for m_H = 125.6 GeV. The observed signal cross section times the branching fraction to WW for m_H = 125.6 GeV is 0.72 +0.20/−0.18 times the standard model expectation. The spin-parity J^P = 0^+ hypothesis is favored against a narrow resonance with J^P = 2^+ or J^P = 0^− that decays to a W-boson pair. This result provides strong evidence for a Higgs-like boson decaying to a W-boson pair.


Introduction
The origin of the masses of the fundamental particles is one of the main open questions in the standard model (SM) of particle physics [1][2][3]. Within the SM, the masses of the electroweak vector bosons arise from the spontaneous breaking of electroweak symmetry by the Higgs field [4][5][6][7][8][9]. Precision electroweak data constrain the mass of the SM Higgs boson (m_H) to be less than 158 GeV at the 95% confidence level (CL) [10,11]. The ATLAS and CMS experiments at the Large Hadron Collider (LHC) have reported the discovery of a new boson with a mass of approximately 125 GeV with a significance of five or more standard deviations each [12][13][14]. Both observations show consistency with the expected properties of the SM Higgs boson at that mass. The CDF and D0 experiments at the Tevatron have also reported evidence for a new particle in the mass range 120-135 GeV with a significance of up to three standard deviations [15,16]. The determination of the properties of the observed boson, such as its couplings to other particles, mass, and quantum numbers, including spin and parity, is crucial for establishing the nature of this boson. Some of these properties are measured using the H → W+W− decay channel with leptonic final states.
Finding such a signal in the complex environment of a hadron collider is not straightforward. A complete reconstruction of all the final-state particles is not possible because of the presence of neutrinos, which are not directly detected. Kinematic observables such as the opening angle between the two charged leptons in the transverse plane, the dilepton mass, and the transverse mass of the system of the two leptons and the neutrinos can be used to distinguish not only the Higgs boson signal from background processes with a similar signature [17,18], but also the SM Higgs boson hypothesis from other narrow exotic resonances with different spin or parity. Phenomenological studies of the amplitudes for the decay of a Higgs or an exotic boson into the WW final state demonstrate a good sensitivity to distinguish between the SM Higgs boson hypothesis (spin-parity 0^+) and a spin-2 resonance that couples to the bosons through minimal couplings, referred to as 2^+_min [19]. Some sensitivity has also been shown with this final state to distinguish between the 0^+ and the pseudoscalar 0^− boson hypotheses.
Searches for the SM Higgs boson in the H → WW final state at the LHC have previously been performed using data at √s = 7 TeV by CMS [20][21][22], excluding the presence of the SM Higgs boson at the 95% CL in the mass range 129-270 GeV, and by ATLAS [23], excluding the mass range 133-261 GeV. Using their full dataset at 7 and 8 TeV, ATLAS has reported an H → WW signal with a statistical significance of 3.8 standard deviations [24] as well as evidence for the spin-zero nature of the Higgs boson [25]. This paper reports a measurement of the production and properties of the Higgs boson in the WW decay channel using the entire dataset collected by the CMS experiment during the 2011 and 2012 LHC running period. Various production modes, using events with two or three charged leptons (ℓ), electrons or muons, are investigated. The small contribution proceeding through an intermediate τ lepton is included. For Higgs boson masses around 125 GeV, the expected branching fraction of the Higgs boson to a pair of W bosons is about 22%. The production modes of the SM Higgs boson targeted by this analysis are the dominant gluon fusion (ggH), the vector-boson fusion (VBF), and the associated production with a W or Z boson (VH). The fraction of events from associated production with a top-quark pair (ttH) passing the analysis selection is negligible, and therefore this process is not considered in any of the measurements described in this paper. The analysis is performed in five exclusive event categories based on the final-state leptons and jets: 2ℓ2ν + 0/1 jet targeting the ggH production, 2ℓ2ν + 2 jets targeting the VBF production, 2ℓ2ν + 2 jets targeting the VH production, 3ℓ3ν targeting the WH production, and 3ℓν + 2 jets targeting the ZH production with one hadronically decaying W boson. The overall sensitivity is dominated by the first category while the other categories probe different production modes of the SM Higgs boson.
The search discussed here is performed for a Higgs boson with mass in the range 110-600 GeV. The search range stops at m H = 200 GeV for the analyses targeting the VH production since for larger masses the expected VH cross section becomes negligible. In the dilepton categories, non-resonant WW production gives rise to the largest background contribution while top-quark production is dominant in events with high jet multiplicity. In the trilepton categories, WZ and ZZ production are the main background processes. Because of the large inclusive cross section, the instrumental backgrounds from W-boson and Z-boson production with associated jets or photons are also present in the kinematic regions similar to that of the Higgs boson signal.
The paper is organized as follows. After a brief description of the CMS detector in Section 2 and the data and simulated samples in Section 3, the event reconstruction is detailed in Section 4. The statistical procedure applied and the uncertainties considered for the interpretation of the results are explained in Section 5, followed by the description of analysis strategies and performance for the dilepton categories and trilepton categories in sections 6 and 7, respectively. Finally, the results from the measurements of the Higgs boson production and properties combining all analysis categories are reported in Section 8, and the summary given in Section 9.

CMS detector
The CMS detector, described in detail in ref. [26], is a multipurpose apparatus designed to study high transverse momentum (p T ) physics processes in proton-proton and heavy-ion collisions. CMS uses a right-handed coordinate system, with the origin at the nominal interaction point, the x axis pointing to the center of the LHC, the y axis pointing upwards, perpendicular to the plane of the LHC ring, and the z axis along the counterclockwise beam direction. A superconducting solenoid occupies its central region, providing a magnetic field of 3.8 T parallel to the beam direction. Charged-particle trajectories are measured by the silicon pixel and strip trackers, which cover a pseudorapidity region of |η| < 2.5. Here, the pseudorapidity is defined as η = − ln [tan (θ/2)], where θ is the polar angle of the particle trajectory with respect to the direction of the counterclockwise beam. A crystal electromagnetic calorimeter (ECAL) and a brass/scintillator hadron calorimeter surround the tracking volume and cover |η| < 3. The steel/quartz-fiber Cherenkov hadron forward (HF) calorimeter extends the coverage to |η| < 5. The muon system consists of gas-ionization detectors embedded in the steel flux return yoke outside the solenoid, and covers |η| < 2.4. The first level of the CMS trigger system, composed of custom hardware processors, is designed to select the most interesting events in less than 4 µs, using information from the calorimeters and muon detectors. The high-level trigger processor farm further reduces the event rate to a few hundred Hz before data storage.
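The pseudorapidity definition above can be illustrated with a short numerical sketch. The function names are hypothetical and are not part of any CMS software; the code only evaluates η = −ln[tan(θ/2)] and its inverse.

```python
import math


def pseudorapidity(theta: float) -> float:
    """Return eta = -ln(tan(theta / 2)) for a polar angle theta in radians."""
    if not 0.0 < theta < math.pi:
        raise ValueError("theta must lie strictly between 0 and pi")
    return -math.log(math.tan(theta / 2.0))


def polar_angle(eta: float) -> float:
    """Inverse mapping: theta = 2 * atan(exp(-eta)), in radians."""
    return 2.0 * math.atan(math.exp(-eta))
```

Under this definition a particle emitted perpendicular to the beam has η = 0, and the tracker boundary at |η| = 2.5 corresponds to a polar angle of roughly 9.4 degrees from the beam axis.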

Data samples
The data samples used in this analysis correspond to an integrated luminosity of 4.9 fb⁻¹ at a center-of-mass energy of √s = 7 TeV collected in 2011 and of 19.4 fb⁻¹ at √s = 8 TeV collected in 2012. The integrated luminosity is measured using data from the HF system and the pixel detector [27,28]. The uncertainties in the integrated luminosity measurement are 2.2% in 2011 and 2.6% in 2012.
originating from a single primary vertex. Among the vertices identified in the event, the vertex with the largest ∑ p 2 T , where the sum runs over all tracks associated with the vertex, is chosen as the primary vertex.
Electron candidates are defined by a reconstructed charged-particle track in the tracking detector pointing to a cluster of energy deposition in the ECAL. A multivariate [84] approach to identify electrons is employed combining several measured quantities describing the track quality, the ECAL cluster shapes, and the compatibility of the measurements from the two detectors. The electron energy is measured primarily from the ECAL cluster energy. For low-p T electrons, a dedicated algorithm combines the momentum of the track and the ECAL cluster energy, improving the energy resolution [85]. Muon candidates are identified by signals of charged-particle tracks in the muon system that are compatible with a track reconstructed in the central tracking system. The precision of the muon momentum measurement from the curvature of the track in the magnetic field is ensured by minimum requirements on the number of hits in the layers of sensors and on the quality of the full track fit. Uncertainties in the lepton momentum scale and resolution are 0.5-4% per lepton depending on the kinematic properties, and the effect on the yields at the analysis selection level is approximately 2% for electrons and 1.5% for muons.
Electrons and muons are required to be isolated to distinguish between prompt leptons from W/Z-boson decays and those from QCD production or misidentified leptons, usually situated inside or near jets of hadrons. The variable $\Delta R = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2}$ is used to measure the separation between reconstructed objects in the detector, where φ is the azimuthal angle (in radians) of the trajectory of the object in the plane transverse to the direction of the proton beams. Isolation criteria are set based on the distribution of low-momentum particles in the (η, φ) region around the leptons. To remove the contribution from the overlapping pileup interactions in this isolation region, the charged particles included in the computation of the isolation variable are required to originate from the lepton vertex. A correction is applied to the neutral component in the isolation ∆R cone based on the average energy density deposited by the neutral particles from additional interactions [86]. The correction is measured in a region of the detector away from the known hard scatter in a control sample. Electron isolation is characterized by the ratio of the total transverse momentum of the particles reconstructed in a ∆R = 0.3 cone around the electron, excluding the candidate itself, to the transverse energy of the electron. Isolated electrons are selected by requiring this ratio to be below ∼10%. The exact threshold value depends on the electron η and p_T [79,87]. For each muon candidate, the scalar sum of the transverse energy of all particles originating from the primary vertex is reconstructed in ∆R cones of several radii around the muon direction, excluding the contribution from the muon itself. This information is combined using a multivariate algorithm that exploits the differential energy deposition in the isolation region to discriminate between the signal of prompt muons and muons from hadron decays inside a jet.
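As an illustration of the cone-based isolation described above, the following minimal sketch computes the separation ∆R = sqrt((∆η)² + (∆φ)²) and a relative isolation ratio in a cone of ∆R = 0.3. The dictionary layout of the particles and the function names are assumptions made for this example, the lepton p_T is used as a stand-in for its transverse energy, and the pileup corrections discussed in the text are not included.

```python
import math


def delta_phi(phi1: float, phi2: float) -> float:
    """Azimuthal difference wrapped into the interval (-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    return dphi


def delta_r(eta1: float, phi1: float, eta2: float, phi2: float) -> float:
    """Separation Delta R = sqrt(d_eta^2 + d_phi^2) between two objects."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


def relative_isolation(lepton: dict, particles: list, cone: float = 0.3) -> float:
    """Scalar pT sum of particles inside the cone, divided by the lepton pT.

    `lepton` and each entry of `particles` are dicts with keys
    'pt', 'eta', 'phi' (a hypothetical layout chosen for this sketch).
    """
    cone_sum = sum(
        p["pt"]
        for p in particles
        if p is not lepton
        and delta_r(lepton["eta"], lepton["phi"], p["eta"], p["phi"]) < cone
    )
    return cone_sum / lepton["pt"]
```

With this definition, a lepton would be considered isolated when the returned ratio is below a threshold of the order of 0.1, in line with the electron requirement quoted above.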
Lepton selection efficiencies are determined using Z → ℓℓ events [29]. Simulated samples are corrected by the difference in the efficiencies found in data and simulation. The total uncertainty in lepton efficiencies, which includes effects from reconstruction, trigger, and various identification criteria, amounts to about 2% per lepton. The lepton selection criteria in the 7 and 8 TeV samples were tuned to maintain an efficiency independent of the instantaneous luminosity.
Jets are reconstructed using the anti-k_T clustering algorithm [88] with a distance parameter of 0.5, as implemented in the FASTJET package [89,90]. A correction similar to that used for the lepton isolation is applied to account for the contribution to the jet energy from pileup events. Furthermore, the properties of the hard jets are modified by particles from pileup interactions. A combinatorial background arises from low-p_T jets from pileup interactions which get clustered into high-p_T jets. At √s = 8 TeV the number of pileup events is larger than at √s = 7 TeV and a multivariate selection is applied to separate jets from the primary interaction and those reconstructed due to energy deposits associated with pileup interactions [91]. The discrimination is based on the differences in the jet shapes, on the relative multiplicity of charged and neutral components, and on the different fraction of transverse momentum which is carried by the hardest components. Within the tracker acceptance the tracks belonging to each jet are also required to be compatible with the primary vertex. Jet energy corrections are applied as a function of the jet p_T and η [92]. The jet energy scale and resolution give rise to an uncertainty in the yields of 2% (5%) for the low (high) jet multiplicity events. Jets considered for the event categorization are required to have p_T > 30 GeV and |η| < 4.7. Studies have been performed selecting Z + jets events and comparing the jet multiplicity distribution as a function of the number of reconstructed vertices. A rather flat behavior has been found, which indicates that the effect from pileup interactions is properly mitigated.
Identification of decays of the bottom (b) quark is used to discriminate against the background processes containing a top quark that subsequently decays to a bottom quark and a W boson. The bottom-quark decay is identified by the presence of a soft muon in the event from the semileptonic decay of the bottom quark and by bottom-quark jet (b-jet) tagging criteria based on the impact parameter of the constituent tracks [93]. In particular, the Track Counting High Efficiency algorithm is used, and a jet is considered b-tagged if its discriminator value is greater than 2.1. Soft-muon candidates are defined without isolation requirements and are required to have p_T > 3 GeV. The set of veto criteria retains about 95% of the light-quark jets, while rejecting about 70% of the b-jets. The performance of b-jet identification for light-quark jets is verified in Z/γ* → ℓℓ candidate events, and is found to be consistent between data and simulation within 1% for the events with up to one jet and within 3% for the events with two central jets.
The missing transverse energy vector $\vec{E}_T^{\mathrm{miss}}$ is defined as the negative vector sum of the transverse momenta of all reconstructed particles (charged or neutral) in the event, with $E_T^{\mathrm{miss}} = |\vec{E}_T^{\mathrm{miss}}|$. For the dilepton analyses, a projected E_T^miss variable is defined as the component of $\vec{E}_T^{\mathrm{miss}}$ transverse to the nearest lepton if the lepton is situated within the azimuthal angular window of ±π/2 from the $\vec{E}_T^{\mathrm{miss}}$ direction, or the E_T^miss itself otherwise. A selection using this observable efficiently rejects Z/γ* → ττ background events, in which the $\vec{E}_T^{\mathrm{miss}}$ is preferentially aligned with leptons, as well as Z/γ* → ℓℓ events with mismeasured $\vec{E}_T^{\mathrm{miss}}$ associated with poorly reconstructed leptons or jets. Since the E_T^miss resolution is degraded by pileup, the minimum of two projected E_T^miss variables is used (E_T^miss∠): one constructed from all identified particles (full E_T^miss), and another constructed from the charged particles only (track E_T^miss). The uncertainty in the resolution of the E_T^miss measurement is approximately 10%, which is estimated from Z → ℓℓ events with the same lepton selection applied as in the rest of the analysis. Randomly smearing the measured E_T^miss by one standard deviation gives rise to a 2% variation in the estimation of signal yields after the full selection for all analyses.

Statistical procedure
The statistical methodology used to interpret subsets of data selected for the H → WW analyses and to combine the results from the independent categories has been developed by the ATLAS and CMS collaborations in the context of the LHC Higgs Combination Group. A general description of the methodology can be found in refs. [94,95]. Results presented in this paper also make use of asymptotic formulae from ref. [96] and recent updates available in the ROOSTATS package [97].
Several quantities are defined to compare the observation in data with the expectation for the analyses: upper limits on the production cross section of the H → WW process with and without the presence of the observed new boson; a significance, or a p-value, characterizing the probability of background fluctuations to reproduce an observed excess; signal strengths (σ/σ_SM) that quantify the compatibility of the sizes of the observed excess with the SM signal expectation; and results from a test of two independent signal hypotheses, namely a SM-like Higgs boson with spin-parity 0^+ with respect to a 2^+_min resonance or a pseudoscalar 0^− boson. The modified frequentist method, CL_s [98,99], is used to define the exclusion limits. A description of the statistical formulae defining these quantities is found in refs. [13,94].
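The correspondence between a p-value and a significance in standard deviations can be illustrated numerically. The sketch below assumes the one-sided Gaussian tail convention, Z = Φ⁻¹(1 − p), which is not spelled out in the text above; the function names are hypothetical.

```python
from statistics import NormalDist

_STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def significance_from_p_value(p_value: float) -> float:
    """One-sided Gaussian significance Z such that P(X > Z) = p_value."""
    return _STANDARD_NORMAL.inv_cdf(1.0 - p_value)


def p_value_from_significance(z: float) -> float:
    """One-sided tail probability of a standard normal above z."""
    return 1.0 - _STANDARD_NORMAL.cdf(z)
```

Under this convention, a significance of 4.3 standard deviations corresponds to a background-only tail probability slightly below 1e-5.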
The number of events in each category and in each bin of the discriminant distributions used to extract the signal is modeled as a Poisson random variable, whose mean value is the sum of the contributions from the processes under consideration. Systematic uncertainties are represented by individual nuisance parameters with log-normal distributions. An exception is applied to the qq → WW normalization in the 0-jet and 1-jet dilepton shape-based fit analyses, described in Section 6.2, which is an unconstrained parameter in the fit. The uncertainties affect the overall normalization of the signal and backgrounds as well as the shape of the predictions across the distribution of the observables. Correlations between systematic uncertainties in different categories and final states are taken into account. In particular, the main sources of correlated systematic uncertainties are those in the experimental measurements such as the integrated luminosity, the lepton and trigger selection efficiencies, the lepton momentum scale, the jet energy scale and missing transverse energy resolution (Section 4), and the theoretical uncertainties affecting the signal and background processes (Section 3). Uncertainties in the background normalizations or background model parameters from control regions (sections 6 and 7) and uncertainties of statistical nature are uncorrelated. A summary of the systematic uncertainties is shown in Table 1, with focus on the 0-jet and 1-jet dilepton categories.
Table 1: Summary of systematic uncertainties relative to the yields (in %) from various signal and background processes. Precise values depend on the final state, jet category, and data taking period. The values listed in the table apply to the 0-jet and 1-jet dilepton categories. The horizontal bar (-) indicates that the corresponding uncertainty is not applicable.
The jet categorization uncertainty originates from the uncertainties in the renormalization and factorization scales that change the fraction of events in each jet category. The systematic uncertainty from the same source is considered fully correlated across all relevant processes listed.
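The structure of the likelihood described in this section, Poisson-distributed bin contents with a log-normal nuisance parameter, can be sketched in a few lines. This is a deliberately reduced illustration and not the implementation used for the results: it assumes a single nuisance parameter θ that scales all background yields by κ^θ with a unit-Gaussian constraint on θ, and all names are hypothetical.

```python
import math


def negative_log_likelihood(mu, theta, observed, signal, background, kappa):
    """Binned Poisson negative log-likelihood with one log-normal nuisance.

    mu         -- signal strength multiplying the expected signal yields
    theta      -- nuisance parameter, constrained by a unit Gaussian
    observed   -- observed event counts per bin
    signal     -- expected signal yields per bin for mu = 1
    background -- nominal expected background yields per bin
    kappa      -- log-normal width, e.g. 1.10 for a 10% uncertainty
    """
    background_scale = kappa ** theta  # log-normal response of the yield
    nll = 0.5 * theta ** 2  # Gaussian constraint term on theta
    for n, s, b in zip(observed, signal, background):
        expected = mu * s + background_scale * b
        nll += expected - n * math.log(expected) + math.lgamma(n + 1)
    return nll
```

In a full analysis each source of systematic uncertainty would carry its own nuisance parameter, with correlations across categories expressed by sharing the same parameter between them.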

Final states with two charged leptons
The H → WW → 2ℓ2ν decay features a signature with two isolated, high-p_T, charged leptons and moderate E_T^miss. After all selection criteria are applied, the contribution from other Higgs boson decay channels is negligible. Kinematic distributions of the decay products exhibit the characteristic properties of the parent boson. The three main observables are: the azimuthal opening angle between the two leptons (∆φ_ℓℓ), which is correlated to the spin of the Higgs boson; the dilepton mass (m_ℓℓ), which is one of the most discriminating kinematic variables for a Higgs boson with low mass, especially against the Z/γ* → ℓℓ background; and the transverse mass (m_T) of the final state objects, which scales with the Higgs boson mass. The transverse mass is defined as $m_T = \sqrt{2\, p_T^{\ell\ell}\, E_T^{\mathrm{miss}} \left(1 - \cos\Delta\phi(\ell\ell, \vec{E}_T^{\mathrm{miss}})\right)}$, where p_T^ℓℓ is the dilepton transverse momentum and ∆φ(ℓℓ, E_T^miss) is the azimuthal angle between the dilepton momentum and the E_T^miss vector.
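A minimal numerical sketch of the transverse mass follows. It assumes the customary form m_T = sqrt(2 p_T^ℓℓ E_T^miss (1 − cos ∆φ)), built from the dilepton transverse momentum, the missing transverse energy, and the azimuthal angle between them; the function name is hypothetical.

```python
import math


def transverse_mass(pt_ll: float, met: float, dphi_ll_met: float) -> float:
    """Transverse mass of the dilepton plus missing-energy system, in the
    same units as the inputs (GeV); dphi_ll_met is given in radians."""
    return math.sqrt(2.0 * pt_ll * met * (1.0 - math.cos(dphi_ll_met)))
```

For example, a dilepton system with p_T = 40 GeV recoiling back-to-back against 40 GeV of missing transverse energy gives m_T = 80 GeV, while a collinear configuration gives m_T = 0.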

WW selection and background rejection
To increase the sensitivity to the SM Higgs boson signal, events are categorized into lepton pairs of same flavor (two electrons or two muons, ee/µµ) and of different flavor (one electron and one muon, eµ), and according to jet multiplicities in zero (0-jet), one (1-jet), and two or more jet (2-jet) categories, where the jets are selected as described in Section 4. Splitting the events into categories that differ in signal and background composition imposes additional constraints on the backgrounds and defines regions with high signal purity.
The Higgs boson signal events in 0-jet and 1-jet categories are mostly produced by the gluon fusion process. These categories have relatively high yield and purity and allow measurements of the Higgs boson properties. The 2-jet category is further separated into events with a characteristic signature of VBF production with two energetic forward-backward jets and heavily suppressed additional hadronic activity due to the lack of color flow between the parent quarks, and those with a VH signature in which two central jets originate from the vector boson decay. While the sensitivity of the 2-jet category is limited with the current dataset, the two sub-categories explore specific production modes. A summary of the selection requirements and analysis approach, as well as the most important background processes in the dilepton categories, is shown in Table 2. For all jet multiplicity categories, candidate events are composed of exactly two oppositely charged leptons with p_T > 20 GeV for the leading lepton (p_T^ℓ,max) and p_T > 10 GeV for the trailing lepton (p_T^ℓ,min). Events with additional leptons are analyzed separately, as described in Section 7. The electrons and muons considered in the analysis include a small contribution from decays via intermediate τ leptons. The E_T^miss∠ variable is required to be above 20 GeV.
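The preselection and categorization described above can be summarized in a short sketch. The thresholds are those quoted in the text (leading and trailing lepton p_T above 20 and 10 GeV, projected missing energy above 20 GeV, jets with p_T > 30 GeV and |η| < 4.7); the event layout with dictionaries and the category labels are assumptions of this example, and the further split of the 2-jet category into VBF and VH tags is not attempted.

```python
def categorize_dilepton_event(leptons, jets, min_projected_met):
    """Return (flavor_category, jet_category) or None if the event fails.

    leptons -- list of dicts with keys 'flavor' ('e' or 'mu'), 'charge', 'pt'
    jets    -- list of dicts with keys 'pt', 'eta'
    min_projected_met -- minimum projected missing transverse energy in GeV
    """
    if len(leptons) != 2:
        return None  # events with additional leptons are handled elsewhere
    leading, trailing = sorted(leptons, key=lambda l: l["pt"], reverse=True)
    if leading["charge"] * trailing["charge"] >= 0:
        return None  # require opposite electric charge
    if leading["pt"] <= 20.0 or trailing["pt"] <= 10.0:
        return None
    if min_projected_met <= 20.0:
        return None

    n_jets = sum(1 for j in jets if j["pt"] > 30.0 and abs(j["eta"]) < 4.7)
    flavor = "emu" if leading["flavor"] != trailing["flavor"] else "ee/mumu"
    jet_category = "2-jet" if n_jets >= 2 else f"{n_jets}-jet"
    return flavor, jet_category
```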

The analysis is restricted to the kinematic region with m_ℓℓ > 12 GeV, p_T^ℓℓ > 30 GeV, and m_T > 30 GeV, where the signal-to-background ratio is high and the background content is correctly described.
The main background processes from non-resonant WW production and from top-quark production, including top-quark pair (tt) and single-top-quark (mainly tW) processes, are estimated using data. Instrumental backgrounds arising from misidentified ("non-prompt") leptons in W+jets production and mismeasurement of E miss T in Z/γ * +jets events are also estimated from data. Contributions from Wγ, Wγ * , and other sub-dominant diboson (WZ and ZZ) and triboson (VVV, V = W/Z) production processes are estimated partly from simulated samples, see Section 3. The Wγ * cross section is measured from data, as described in Appendix A. The shapes of the discriminant variables used in the signal extraction for the Wγ process are obtained from data, as explained in Appendix B.
The non-prompt lepton background, originating from leptonic decays of heavy quarks, hadrons misidentified as leptons, and electrons from photon conversions in W + jets and QCD multijet production, is suppressed by the identification and isolation requirements on electrons and muons, as described in Section 4. The remaining contribution from the non-prompt lepton background is estimated directly from data. A control sample is defined by one lepton that passes the standard lepton selection criteria and another lepton candidate that fails the criteria, but passes a looser selection, resulting in a sample of "pass-fail" lepton pairs. The efficiency, ε_pass, for a jet that satisfies the loose lepton requirements to pass the standard selection is determined using an independent sample dominated by events with non-prompt leptons from QCD multijet processes. This efficiency, parameterized as a function of p_T and η of the lepton, is then used to weight the events in the pass-fail sample by ε_pass/(1 − ε_pass), to obtain the estimated contribution from the non-prompt lepton background in the signal region. The systematic uncertainties from the determination of ε_pass dominate the overall uncertainty of this method. The systematic uncertainty has two sources: the dependence of ε_pass on the sample composition, and the method. The first source is estimated by modifying the jet p_T threshold in the QCD multijet sample, which modifies the jet sample composition. The uncertainty in the method is obtained from a closure test, where ε_pass is derived from simulated QCD multijet events and applied to simulated samples to predict the number of background events. The total uncertainty in ε_pass, including the statistical precision of the control sample, is of the order of 40%. Validation of the estimate of this background using lepton pairs with the same charge is described in Section 6.2.
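The weighting of the pass-fail sample can be sketched as follows, with each event weighted by ε/(1 − ε), where ε is the efficiency for a loosely identified candidate to pass the standard selection. The callable efficiency map and the event layout are hypothetical; in practice the efficiency would be looked up in bins of p_T and η.

```python
def fake_weight(eff_pass: float) -> float:
    """Weight eff / (1 - eff) applied to one pass-fail lepton pair."""
    if not 0.0 <= eff_pass < 1.0:
        raise ValueError("efficiency must be in [0, 1)")
    return eff_pass / (1.0 - eff_pass)


def nonprompt_yield(pass_fail_events, efficiency_map) -> float:
    """Estimated non-prompt background in the signal region.

    pass_fail_events -- iterable of (pt, eta) of the failing lepton candidate
    efficiency_map   -- callable (pt, eta) -> efficiency (hypothetical)
    """
    return sum(fake_weight(efficiency_map(pt, eta)) for pt, eta in pass_fail_events)
```

Because the weight grows quickly as the efficiency approaches one, a relative uncertainty on the efficiency translates into a larger relative uncertainty on the predicted yield, which is consistent with this method being dominated by the efficiency determination.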
The Drell-Yan Z/γ* production is the largest source of same-flavor lepton pair production because of its large production cross section and the finite resolution of the E_T^miss measurement. In order to suppress this background, a few additional selection requirements are applied in the same-flavor final states. The resonant component of the Drell-Yan production is rejected by requiring m_ℓℓ to be more than 15 GeV away from the Z boson mass. To suppress the remaining off-peak contribution, in the 8 TeV sample, a dedicated multivariate selection combining E_T^miss and kinematic and topological variables is used. In the 7 TeV sample the amount of pileup interactions is smaller on average and a selection based on a set of simple kinematic variables is adopted. The p_T^ℓ,min and m_ℓℓ thresholds are raised to 15 GeV and 20 GeV respectively, and the requirement on E_T^miss∠ is tightened progressively as a function of the number of reconstructed vertices, N_vtx: E_T^miss∠ > (37 + N_vtx/2) GeV. This requirement is chosen to obtain a background efficiency nearly constant as a function of N_vtx. Events in which the direction of the dilepton momentum and that of the most energetic jet with p_T > 15 GeV have an angular difference in the transverse plane greater than 165 degrees are rejected. For the 2-jet category, the dominant source of E_T^miss is the mismeasurement of the hadronic recoil and the best performance in terms of signal-to-background separation is obtained by simply requiring E_T^miss > 45 GeV and the azimuthal separation of the dilepton and dijet momenta to be ∆φ(ℓℓ, jj) < 165 degrees. These selection requirements effectively reduce the Drell-Yan background by three orders of magnitude, while retaining more than 50% of the signal.
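The cut-based Drell-Yan suppression for the same-flavor final states can be written compactly. The sketch below follows one reading of the text, in which the raised lepton p_T and dilepton mass thresholds belong to the 7 TeV selection together with the vertex-dependent missing-energy requirement; the function name, its signature, and the Z boson mass constant (91.1876 GeV) are assumptions of this example.

```python
Z_MASS_GEV = 91.1876  # world-average Z boson mass, not quoted in the text


def passes_dy_suppression_7tev(m_ll, pt_trailing, min_projected_met, n_vtx,
                               dphi_ll_jet_deg=None):
    """Same-flavor Drell-Yan suppression with simple kinematic cuts.

    dphi_ll_jet_deg is the azimuthal angle in degrees between the dilepton
    system and the leading jet with pT > 15 GeV, or None if no such jet.
    """
    if abs(m_ll - Z_MASS_GEV) <= 15.0:
        return False  # reject the resonant Z peak
    if pt_trailing <= 15.0 or m_ll <= 20.0:
        return False  # raised trailing-lepton pT and dilepton mass thresholds
    if min_projected_met <= 37.0 + n_vtx / 2.0:
        return False  # requirement tightens with the number of vertices
    if dphi_ll_jet_deg is not None and dphi_ll_jet_deg > 165.0:
        return False  # dilepton system back-to-back with the leading jet
    return True
```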
The Z/γ * → ee/µµ contribution to the analysis in the same-flavor final states is obtained by normalizing the Drell-Yan background to data in the region within ±7.5 GeV of the Z boson mass after flavor symmetric contributions from other processes are subtracted using eµ events. The extrapolation to the signal region is performed using the simulation together with a cross-check using data. A more detailed explanation of the Drell-Yan background estimation is given in Appendix C. The largest uncertainty in the estimate arises from the dependence of this extrapolation factor on E miss T and the multivariate Drell-Yan discriminant, and is about 20 to 50%. The contribution of this background is also evaluated with an alternative method using γ + jets events, which provides results consistent with the primary method. The Z boson and the photon exhibit similar kinematic properties at high p T and the hadronic recoil is similar in the two cases, and therefore a γ + jets sample is suitable to estimate the Drell-Yan background.
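The normalization procedure described above amounts to a simple extrapolation from the Z-peak region to the signal region. In the sketch below, the factor relating eµ yields to the flavor-symmetric contamination in a same-flavor channel (`k_flavor`) and the out/in extrapolation ratio (`r_out_in`) are inputs whose names and values are assumptions of this example; the text states only that the subtraction uses eµ events and that the extrapolation is taken from simulation and cross-checked with data.

```python
def drell_yan_estimate(n_in_same_flavor, n_in_emu, k_flavor, r_out_in):
    """Estimated Drell-Yan yield outside the Z-peak region.

    n_in_same_flavor -- same-flavor data events within the Z-peak window
    n_in_emu         -- e-mu data events within the same window
    k_flavor         -- factor converting e-mu yields into the flavor-symmetric
                        contribution to the same-flavor channel (assumption)
    r_out_in         -- ratio of Drell-Yan yields outside and inside the window
    """
    n_in_drell_yan = n_in_same_flavor - k_flavor * n_in_emu
    # A negative subtracted yield is not physical; clip it at zero.
    return r_out_in * max(n_in_drell_yan, 0.0)
```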
To suppress the background from top-quark production, events that are top-tagged are rejected based on soft-muon and b-jet identification (Section 4). The reduction of the top-quark background is about 50% in the 0-jet category and above 80% for events with at least one jet with p_T > 30 GeV. The top-quark background contribution in the analysis is estimated using top-tagged events (N_tagged). The top-tagging efficiency (ε_top-tagged) is measured in a control sample dominated by tt and tW events, which is selected by requiring one jet to be b-tagged. The number of top-quark background events (N_not-tagged) expected in the signal region is estimated as $N_{\text{not-tagged}} = N_{\text{tagged}} \times (1 - \epsilon_{\text{top-tagged}})/\epsilon_{\text{top-tagged}}$. Background contributions from other sources are subtracted from the top-tagged sample. The total uncertainty in N_not-tagged amounts to about 20% in the 0-jet, 5% in the 1-jet, and 30-40% in the 2-jet category. Additional selection requirements in the 2-jet category limit the precision of the control sample. A more detailed explanation of the top-quark background estimation is given in Appendix D.
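The top-quark background estimate reduces to a single arithmetic step: the tagged yield, after subtraction of other backgrounds, is scaled by (1 − ε)/ε, where ε is the top-tagging efficiency. A minimal sketch with hypothetical names:

```python
def top_background_estimate(n_tagged, n_other_tagged, eff_top_tagged):
    """Expected number of untagged top-quark events in the signal region.

    n_tagged       -- observed top-tagged events
    n_other_tagged -- non-top backgrounds expected in the tagged sample
    eff_top_tagged -- top-tagging efficiency measured in the control sample
    """
    if not 0.0 < eff_top_tagged <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    n_top_tagged = n_tagged - n_other_tagged
    return n_top_tagged * (1.0 - eff_top_tagged) / eff_top_tagged
```

For instance, 1000 tagged events with 100 expected from other sources and an efficiency of 0.8 yield an estimate of 225 untagged top-quark events; the numbers are illustrative only.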
The criteria described above define the WW selection. The remaining data sample is dominated by non-resonant WW events, in particular in the 0-jet category. The normalization of the WW background is obtained from data in the 0-jet and 1-jet categories. The procedure depends on the analysis strategy being pursued, as described in Section 6.2.1. In the counting analysis, the WW contribution is normalized to data after subtracting backgrounds from other sources in the signal-free region of high dilepton mass, m_ℓℓ > 100 GeV, for m_H ≤ 200 GeV. For the higher Higgs boson mass hypotheses and in the 2-jet category, the control region for WW production is contaminated by the signal together with other backgrounds. In this case the WW background prediction is obtained from simulation and the theoretical uncertainty is 20-30% for the VH and the VBF selection requirements. Both shape and normalization of the WW background in the eµ final state for the 0-jet and 1-jet categories are determined from a fit to data, as described in Section 6.2. Studies to validate the fitting procedure are also summarized in that section.
A summary of the estimation of the background processes in the dilepton categories is shown in Table 3.
The m_ℓℓ distributions after the WW selection in the eµ final state for the 0-jet and 1-jet categories are shown in Fig. 1, together with the expectation for a SM Higgs boson with m_H = 125 GeV. The clear difference in the shape between the H → WW and the non-resonant WW processes for m_ℓℓ is mainly due to the spin-0 nature of the SM Higgs boson. For a SM Higgs boson with m_H = 125 GeV, an excess of events with respect to the backgrounds is expected at low m_ℓℓ. For the 2-jet category, the dijet variables which are used to distinguish VH production from VBF production are shown in Fig. 2. Control regions in a similar kinematic topology are studied to cross-check the background normalization and distribution.

Table 3: Summary of the estimation of the background processes in dilepton categories in cases where data events are used to estimate either the normalization or the shape of the discriminant variables. A brief description of the control/template sample is given. The WW estimation in the 2-jet category is purely from simulation.
Process                    Normalization   Shape        Control/template sample
WW                         data            simulation   events at high m_ℓℓ and m_T
Top-quark                  data            simulation   top-tagged events
W + jets                   data            data         events with loosely identified leptons
Wγ                         simulation      data         events with an identified photon
Wγ*                        data            simulation   Wγ* → 3µ sample
Z/γ* → µµ & Z/γ* → ee      data            simulation   events at low E_T^miss
Z/γ* → ττ                  data            data         τ embedded sample

The zero-jet and one-jet ggH tag
The analysis in this category provides good sensitivity to identify Higgs boson production, and to test the spin-0 hypothesis against the spin-2 hypothesis. The majority of the SM Higgs boson events originate from the gluon fusion process, and the event selection relies entirely on the Higgs boson decay signature of two leptons and E miss T .
While the dominant background is the non-resonant WW production, a relatively small contamination from W + jets and Wγ ( * ) production nevertheless contributes sizeably to the total uncertainty in the measurements, since these processes are less precisely known and can mimic the signal topology. Separating the analysis in lepton flavor pairs isolates the most sensitive eµ final state from the ee/µµ final states, which have additional background contributions from processes with a Z/γ * → ℓℓ decay. Splitting the sample into jet multiplicity categories with zero and one jet distinguishes the kinematic region dominated by top-quark background (1-jet category), which has jets from bottom-quark fragmentation, as shown in Fig. 1.

Analysis strategy
To enhance the sensitivity to a Higgs boson signal, a counting analysis is performed in each final state and category using a selection optimized for each m H hypothesis considered. In addition, a two-dimensional shape analysis is pursued for the different-flavor final state only. In this case, a binned template fit is performed using the variables most sensitive to the presence of signal. This shape-based analysis is more sensitive than the counting analysis to the presence of a Higgs boson, as shown in Section 6.2.2, and is used as the default analysis for the eµ final state. The counting analysis is used as the default analysis for the ee/µµ final states, for which modeling of the Z/γ * background template is challenging. Furthermore, an unbinned parametric fit is pursued using alternative variables and a selection suitable for the measurement of the Higgs boson mass in the different-flavor final state. The mass measurement using the parametric fit and the test of spin hypotheses using a binned template fit are performed in the eµ final state.

Binned template fit in the different-flavor final states
Kinematic variables such as m ℓℓ and m T are independent quantities that effectively discriminate the signal against most of the backgrounds in the dilepton analysis in the 0-jet and 1-jet categories.
The binned fit is performed using template histograms that are obtained from the signal and background models at the level of the WW selection. For the Higgs boson mass hypotheses up to m H = 250 GeV the template ranges are 12 GeV < m ℓℓ < 200 GeV and 60 GeV < m T < 280 GeV. For mass hypotheses above 250 GeV the template ranges are 12 GeV < m ℓℓ < 600 GeV and 80 GeV < m T < 600 GeV, and a higher leading-lepton p T threshold of p ℓ,max T > 50 GeV is required. The templates have 9 bins in m ℓℓ and 14 bins in m T . The bin widths vary within the given range, and are optimized to achieve good separation between the SM Higgs boson signal and backgrounds, as well as between the two spin hypotheses, while retaining adequate template statistics for all processes in the bins.
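A minimal sketch of how a two-dimensional template with variable bin widths could be filled is given below. Only the ranges and the numbers of bins (9 in the dilepton mass, 14 in m T) follow the text; the bin edges themselves, the function name, and the example events are hypothetical, since the optimized binning is not listed here.

```python
import numpy as np

# Hypothetical variable-width bin edges in GeV (not the optimized analysis binning):
# 10 edges give 9 dilepton-mass bins, 15 edges give 14 transverse-mass bins.
MLL_EDGES = np.array([12, 30, 45, 60, 75, 100, 125, 150, 175, 200], dtype=float)
MT_EDGES = np.array([60, 70, 80, 90, 100, 110, 120, 140,
                     160, 180, 200, 220, 240, 260, 280], dtype=float)


def fill_template(mll, mt, weights=None):
    """Fill a 2D (dilepton mass, transverse mass) template histogram.

    Events outside the template ranges are dropped by np.histogram2d.
    """
    hist, _, _ = np.histogram2d(mll, mt, bins=[MLL_EDGES, MT_EDGES], weights=weights)
    return hist


# Three illustrative events; the last one lies outside the dilepton-mass range.
mll_values = np.array([20.0, 50.0, 250.0])
mt_values = np.array([65.0, 130.0, 100.0])
template = fill_template(mll_values, mt_values)
```

The resulting array has shape (9, 14), and the out-of-range event does not contribute.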
The signal and background templates, as well as the distribution observed in data, are shown in Fig. 3 for the 0-jet category and in Fig. 4 for the 1-jet category for the 8 TeV analysis. The distributions are restricted to the signal region expected for a low mass Higgs boson, that is: m  GeV and m T  GeV. The distributions of the two variables and the correlation between them are distinct for the Higgs boson signal and the backgrounds, and clearly separate the two spin hypotheses. Pseudo-experiments have been performed to assess the stability of the (m ℓℓ , m T ) template fit method by randomly varying the expected signal and background yields according to Poisson statistics and to the spread of the systematic uncertainties, as discussed below.
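The generation of one pseudo-experiment can be sketched as follows. This is an illustration under simplifying assumptions: each process receives a single Gaussian normalization variation (truncated at zero) in place of the full set of systematic uncertainties, and the function name, templates, and uncertainties are hypothetical.

```python
import numpy as np


def throw_pseudo_experiment(templates, rel_norm_unc, rng):
    """Generate one pseudo-dataset from expected 2D templates.

    templates    : dict mapping process name -> 2D array of expected yields
    rel_norm_unc : dict mapping process name -> relative normalization uncertainty
    rng          : numpy.random.Generator
    """
    first = np.asarray(next(iter(templates.values())), dtype=float)
    total = np.zeros_like(first)
    for name, template in templates.items():
        # Systematic variation: Gaussian scale factor, truncated at zero.
        scale = max(0.0, rng.normal(1.0, rel_norm_unc[name]))
        total += scale * np.asarray(template, dtype=float)
    # Statistical variation: independent Poisson fluctuation in each bin.
    return rng.poisson(total)


rng = np.random.default_rng(12345)
expected = {"signal": np.full((9, 14), 0.5), "WW": np.full((9, 14), 10.0)}
uncertainty = {"signal": 0.2, "WW": 0.1}
toy = throw_pseudo_experiment(expected, uncertainty, rng)
```

Repeating the call many times and fitting each pseudo-dataset would give the distribution of fitted yields from which a bias or instability could be read off.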

Unbinned parametric fit in the different-flavor final states
A dedicated analysis to probe the Higgs boson mass is performed using a two-dimensional parametric maximum likelihood fit to variables computed in the estimated decay frame of the Higgs boson candidate, the so-called "razor frame" [100]. One of the two variables is an estimator of the Higgs boson mass and the other is the opening angle of the two charged leptons in the razor frame. This analysis is performed for the Higgs boson mass range 115-180 GeV.
The razor mass variable is based on the generic process of pair production of heavy particles, each decaying to an unseen particle plus jets or leptons that are reconstructed in the detector. The application of this technique in SUSY analyses with hadronic and leptonic final states has been extensively studied [101].
Given the presence of the two neutrinos in the final state, the longitudinal and transverse boosts of the Higgs boson candidate cannot be determined. The razor frame is an approximation of the Higgs boson rest frame, defined unambiguously from measured quantities in the laboratory frame. A longitudinal boost to an intermediate frame, where the visible energies are written in terms of an overall scale that is invariant under longitudinal boosts, is defined as: where p i z is the component along the z axis of the four-momentum and E i is the energy of the ith lepton. In order to also account for the recoil of the Higgs boson candidate when produced in association with jets, a transverse boost is further applied, estimated with the measured E miss T . In the razor frame, an invariant quantity that serves as per-event estimator of the mass scale of the decaying Higgs boson candidate is defined as: This variable has a resolution of around 15% for a Higgs boson with m H = 125 GeV, regardless of the jet multiplicity. The distribution of the m R variable is parameterized with a relatively simple function with a linear dependence on the Higgs boson mass, enabling an unbinned fit to data and a smooth interpolation between mass hypotheses.
The parameterized distributions of the m R variable for different signal mass hypotheses and backgrounds are shown in Fig. 5. The functional form of the Higgs boson signal in m R is described by the convolution of a Breit-Wigner function, centered on the expected m H and with a width equal to the expected Higgs boson width, and a Crystal Ball function [102] to describe the resolution of the Gaussian core and the tail. For the Higgs boson mass hypotheses considered in this analysis, the theoretical width of the SM Higgs boson is negligible with respect to the experimental resolution.
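For reference, the Crystal Ball shape that models the resolution can be written compactly. The sketch below is unnormalized and omits the Breit-Wigner convolution, which the text describes as negligible for the mass hypotheses considered. Placing the power-law tail on the low side, as well as the function name and the parameter values, are assumptions for illustration.

```python
import numpy as np


def crystal_ball(x, mean, sigma, alpha, n):
    """Unnormalized Crystal Ball shape: Gaussian core with a low-side power-law tail.

    alpha > 0 is the transition point in units of sigma; n > 0 is the tail exponent.
    """
    t = (np.asarray(x, dtype=float) - mean) / sigma
    a = (n / alpha) ** n * np.exp(-0.5 * alpha**2)
    b = n / alpha - alpha
    # In the tail region b - t >= n / alpha, so the clip only affects the
    # unused branch of np.where and keeps it finite.
    tail_arg = np.maximum(b - t, n / alpha)
    return np.where(t > -alpha, np.exp(-0.5 * t**2), a * tail_arg ** (-n))


# Illustrative parameters only: peak at 125 GeV with a width of 20 GeV.
peak_value = float(crystal_ball(125.0, 125.0, 20.0, 1.5, 3.0))
tail_value = float(crystal_ball(45.0, 125.0, 20.0, 1.5, 3.0))
```

The core and the tail join continuously at t = −alpha, and far from the peak the tail lies well above the corresponding Gaussian.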
The m R distribution for the majority of the backgrounds is described with a Landau function [103], except for the Z → ττ process which is modeled with a double Gaussian function. The parametric fit is carried out in bins of ∆φ R , which is the azimuthal separation between the two leptons computed in the same reference frame as m R . The two variables are largely uncorrelated in the decay of the Higgs boson, while the distributions for backgrounds are correlated. A total of 10 bins in ∆φ R are used with finer (coarser) bin widths at smaller (larger) value of ∆φ R .
A selection tighter than that of the (m T , m ℓℓ ) template fits is chosen for this analysis by applying p ℓℓ T > 45 GeV and m T > 80 GeV. The reason for the tighter selection is to reject a larger fraction of the W + jets and Wγ ( * ) background processes, which otherwise show a maximum at m R ∼ 125 GeV because of kinematic requirements. The upper bounds on m ℓℓ and m T that are used for the (m T , m ℓℓ ) template fits are removed. The range of 50 GeV < m R < 500 GeV, which contains almost 100% of the signal, is used for the fit.
All the theoretical and experimental systematic uncertainties are taken into account in the parametric fit. The shape uncertainties are estimated by refitting the distribution produced with the systematic variation for each source. The parametric fit to the (m R , ∆φ R ) distribution has been validated using pseudo-experiments, and the results show no bias in the measurement of the signal and background yields for either the 0-jet or the 1-jet category.

Counting analysis
A simple counting experiment is performed as a basic cross-check for all categories, and as the default approach for the same-flavor ee/µµ final states. A tighter selection is applied to increase the signal-to-background ratio using kinematic variables that characterize the Higgs boson final state. The minimum requirement on the dilepton p T is raised to p ℓℓ T > 45 GeV, and a series of selections are applied based on the lepton momenta (p ℓ,max T and p ℓ,min T ), m ℓℓ , the azimuthal separation between the two leptons (∆φ ℓℓ ), and m T . The threshold values are optimized for each Higgs boson mass hypothesis. Table 4 summarizes the selection requirements used in the counting analysis for a few representative mass points.

Table 4: Event selection requirements for the counting analysis in 0-jet and 1-jet categories. For the 2-jet categories the lower threshold on m T is set at 30 GeV.

Results
The data yields and the expected yields for the SM Higgs boson signal and various backgrounds in each of the jet categories and lepton-flavor final states are listed in Tables 5 and 6 for the counting analysis for representative Higgs boson mass hypotheses up to m H = 600 GeV, and for the selection used for the shape-based analyses. For a SM Higgs boson with m H = 125 GeV, a couple of hundred signal events are expected in total, and the purity of the counting analysis selection is around 20% in the most sensitive eµ final state. The looser selection used for the shape-based analyses recovers a large fraction of the signal events, and also accommodates background-dominated regions, allowing the fit to impose constraints on the background contributions. The overall signal efficiency uncertainty is estimated to be about 20% and is dominated by the theoretical uncertainty due to missing higher-order corrections and PDF uncertainties.

Table 5: Signal prediction, observed number of events in data, and background estimates for √ s = 7 TeV after applying the requirements used for the H → WW counting analysis and for the shape-based analyses (eµ final state only). The combination of statistical uncertainties with experimental and theoretical systematic uncertainties is reported. The Z/γ * → ℓℓ process includes the ee, µµ and ττ final states. The shape-based selections correspond to the m H = 125 GeV selection.

Table 6: Signal prediction, observed number of events in data, and background estimates for √ s = 8 TeV after applying the requirements used for the H → WW counting analysis and for the shape-based analyses (eµ final state only). The combination of statistical uncertainties with experimental and theoretical systematic uncertainties is reported. The Z/γ * → ℓℓ process includes the ee, µµ and ττ final states. The shape-based selections correspond to the m H = 125 GeV selection.
The total uncertainty in the background estimations in the signal region is about 15%, dominated by the statistical uncertainty in the number of observed events in the background control regions and the theoretical uncertainties affecting the non-resonant WW production. A summary of the systematic uncertainties is given in Table 1. The obtained WW continuum normalization uncertainty is between 3% and 12% depending on the jet category and center-of-mass energy.
Given the expected number of signal and background events, the sensitivity is limited by the systematic uncertainties for the counting analysis. The additional information from the distributions of the kinematic variables enables a significant improvement over the counting analysis. Expected and observed 95% CL upper limits on the production cross section of the H → WW process relative to the SM prediction are shown in Fig. 6, for counting and shape-based analyses. An excess of events is observed for low Higgs boson mass hypotheses, which makes the observed limits weaker than expected.
After the template fit to the (m T , m ℓℓ ) distribution, the observed signal events as a function of m T and m ℓℓ are shown in Figures 7 and 8, respectively. In these figures, each process is normalized to the fit result and weighted using the other variable. This means that, for the m T distribution, the m ℓℓ distribution is used to compute the ratio of the fitted signal (S) to the sum of signal and background (S+B). Similarly, the fit results for the parametric approach using the (m R , ∆φ R ) distribution are shown in Figures 9 and 10. The fit projection of the m R variable integrated over ∆φ R is shown superimposed on the data distribution. The background-subtracted data distributions are shown weighted by the S/(S+B) ratio using the same weighting method previously described.
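The S/(S+B) weighting described above can be sketched for the m T projection. This is a minimal illustration assuming the weight is computed per dilepton-mass bin from the fitted signal and background integrated over m T, with no further normalization of the weights; the function name and the toy histograms are hypothetical.

```python
import numpy as np


def s_over_sb_weighted_mt(hist, signal, background):
    """Project a 2D histogram onto m_T with S/(S+B) weights from the other variable.

    All inputs have shape (n_mll_bins, n_mt_bins); signal and background are
    the fitted yields. Returns an array of length n_mt_bins.
    """
    hist = np.asarray(hist, dtype=float)
    s_per_mll = np.asarray(signal, dtype=float).sum(axis=1)
    b_per_mll = np.asarray(background, dtype=float).sum(axis=1)
    total = s_per_mll + b_per_mll
    # Weight of each dilepton-mass bin; bins without any expectation get weight 0.
    weights = np.divide(s_per_mll, total, out=np.zeros_like(total), where=total > 0)
    return weights @ hist


# Toy histograms with two dilepton-mass bins and two m_T bins.
toy_signal = [[1.0, 1.0], [0.0, 0.0]]
toy_background = [[1.0, 1.0], [4.0, 4.0]]
toy_data = [[2.0, 4.0], [10.0, 10.0]]
weighted = s_over_sb_weighted_mt(toy_data, toy_signal, toy_background)
```

In this toy case the first dilepton-mass bin has weight 0.5 and the second has weight 0, so background-dominated bins do not contribute to the weighted projection.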
The expected and observed results for the H → WW → 2ℓ2ν analyses in the 0/1-jet bin are summarized in Table 7. The upper limits on the H → WW production cross section are slightly higher than the SM expectation. The observed significance is 4.0 standard deviations for the default shape-based analysis for m H = 125 GeV using a template fit to the (m T , m ℓℓ ) distribution, and the expected significance is 5.2 standard deviations. The best-fit signal strength, σ/σ SM , which is the ratio of the measured H → WW signal yield to the expectation for a SM Higgs boson, is 0.76 ± 0.21.

Figure 6: Expected and observed 95% CL upper limits on the H → WW production cross section relative to the SM Higgs boson expectation, σ/σ SM , as a function of the Higgs boson mass, using the counting analysis (left) and the shape-based template fit approach (right) in the 0-jet and 1-jet categories. The shape-based analysis results use a binned template fit to (m T , m ℓℓ ) for the eµ final state, combined with the counting analysis results for the ee/µµ final states.

Validation of the template fits
Table 7: A summary of the expected and observed 95% CL upper limits on the H → WW production cross section relative to the SM prediction, the significances for the background-only hypothesis to account for the excess in units of standard deviations (sd), and the best-fit signal strength σ/σ SM , the ratio of measured signal yield to the expected yield at m H = 125 GeV for the 0-jet and 1-jet categories. The eµ and ee/µµ final states are combined for these results. The shape-based analysis results using a binned template fit or a parametric fit for the eµ final state are combined with counting analysis results for the ee/µµ final states. The binned template fit to (m T , m ℓℓ ) is used to obtain the default results.

The two-dimensional fit procedure has been extensively validated through pseudo-experiments and fits in data control regions. The former are used to validate the fit under known input conditions, while the latter are used to check the accuracy of background templates and the model. Assuming the SM expectation, the fit performance has been evaluated with pseudo-experiments in terms of process normalizations and nuisance parameters, both under default conditions and in the presence of input biases, which correspond to ±1 standard deviation on either normalization or shape of the most important backgrounds. Fit results are very stable and in most cases the signal yield is determined with no significant bias. The largest deviation is observed for an input bias applied to the W + jets background normalization, with an average shift no larger than 10%, which is more than three times smaller than the uncertainty in the signal yield. All nuisance parameter values and uncertainties resulting from the fit performed on data are compatible with expectations from pseudo-experiments. The most constrained parameters are related to the WW (and, secondarily, top-quark) background, as the fit can gauge it from a large signal-free region.
It is therefore crucial to verify with data that the WW correlation model is correct.
For the purpose of checking the WW model a dedicated test is developed. First, the signal-free WW control sample is separated into two non-overlapping regions with a similar number of events. Then, each region is fitted separately. In this fit, only the WW background is allowed to change. In order to avoid fluctuations due to non-WW components, all other processes are fixed to the values obtained in the fit performed in the full range. The first region (CR1, high m T ) is defined by requiring 120 GeV < m T < 280 GeV and 12 GeV < m ℓℓ < 200 GeV, while the second region (CR2, high m ℓℓ ) is defined by requiring 60 GeV < m T < 120 GeV and 60 GeV < m ℓℓ < 200 GeV. The WW normalization and shape obtained from the fit in one region are extrapolated to the other region and compared to data. Figure 11 shows the m T and m ℓℓ distributions in the control regions CR1 and CR2 using fit results from the other control region. The uncertainty band is evaluated from pseudo-experiments. In each bin of the two-dimensional distribution, the uncertainty in the background processes is obtained from the fit in the full range. All distributions show generally good agreement with data, indicating that the WW fit model is not biased.
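The two control regions defined above can be encoded in a few lines. The boundaries follow the text; the function name and the treatment of events lying exactly on a boundary (excluded here) are assumptions.

```python
def ww_control_region(mll, mt):
    """Return "CR1", "CR2", or None for a given dilepton mass and m_T (both in GeV)."""
    if 120.0 < mt < 280.0 and 12.0 < mll < 200.0:
        return "CR1"  # high transverse mass
    if 60.0 < mt < 120.0 and 60.0 < mll < 200.0:
        return "CR2"  # high dilepton mass
    return None       # outside both control regions


region_high_mt = ww_control_region(mll=100.0, mt=150.0)
region_high_mll = ww_control_region(mll=100.0, mt=90.0)
region_neither = ww_control_region(mll=30.0, mt=90.0)
```

Because the two m T windows are disjoint, no event can be assigned to both regions, which is what allows a fit in one region to be tested against the data in the other.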
Fits are performed in two types of control samples, one defined by b-tagged jets and the other by two leptons with the same charge. The first sample is dominated by top-quark processes, while the second sample is dominated by the W + jets and Wγ ( * ) processes. In both cases the background yields agree with the expectations and no signal component is found. Distributions of the discriminating variables in some of these control regions are shown in Fig. 12.
In summary, the templates for all main backgrounds (WW, tt + tW, W + jets, and Wγ ( * ) ) have been tested in dedicated control regions with data. Both the fit procedure and the background estimations are found to be robust.
Finally, the template shape for the dominant qq → WW background process has been cross-checked by replacing the template histogram obtained from the default generator by another one and rederiving the shape uncertainty templates that are allowed to vary in the fit. Table 8 summarizes the results of this procedure using MADGRAPH (the default used in the analysis), MC@NLO, and POWHEG. The signal significance and the best-fit signal strength are found to be consistent with one another for the three different qq → WW template models tested.

The two-jet VBF tag
The second-largest production mode for the SM Higgs boson is through VBF, for which the cross section is approximately an order of magnitude smaller than that of the gluon fusion process. In this process two vector bosons are radiated from initial-state quarks and produce a Higgs boson at tree level. In the scattering process, the two initial-state partons may scatter at a polar angle from the beam axis large enough to be detected as additional jets in the signal events. Furthermore, these two jets, being remnants of the incoming proton beams, feature the distinct signature of having high momentum and large separation in pseudorapidity, hence sizeable invariant mass, with an absence of additional hadronic activity in the central rapidity region due to the lack of color exchange between the parent quarks. By exploiting this specific signature, VBF searches typically have a good signal-to-background ratio. In this analysis the signal-to-background ratio approaches one after all the selection criteria are applied.
To select events with the characteristics of the VBF process, the two highest p T jets in the event are required to have pseudorapidity separation of |∆η jj | > 3.5 and to form an invariant mass m jj > 500 GeV. Events with an additional jet situated in the pseudorapidity range between the two leading jets are rejected. Both leptons are also required to be within the pseudorapidity region defined by the two highest p T jets.
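The VBF tagging requirements can be summarized in a short sketch. It assumes that the input jets already satisfy the jet selection of the analysis and uses the massless approximation for the dijet invariant mass; the class and function names and the example kinematics are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Obj:
    pt: float   # transverse momentum in GeV
    eta: float  # pseudorapidity
    phi: float  # azimuthal angle in radians


def dijet_mass(j1, j2):
    """Invariant mass of two jets, treating both as massless."""
    return math.sqrt(2.0 * j1.pt * j2.pt
                     * (math.cosh(j1.eta - j2.eta) - math.cos(j1.phi - j2.phi)))


def passes_vbf_tag(jets, leptons):
    """Apply the VBF tag: |delta eta_jj| > 3.5, m_jj > 500 GeV, central jet veto,
    and both leptons between the two leading jets in pseudorapidity."""
    if len(jets) < 2:
        return False
    ordered = sorted(jets, key=lambda j: j.pt, reverse=True)
    j1, j2 = ordered[0], ordered[1]
    eta_lo, eta_hi = sorted((j1.eta, j2.eta))
    if eta_hi - eta_lo <= 3.5:
        return False
    if dijet_mass(j1, j2) <= 500.0:
        return False
    # Veto events with an additional jet between the two leading jets.
    if any(eta_lo < j.eta < eta_hi for j in ordered[2:]):
        return False
    return all(eta_lo < lep.eta < eta_hi for lep in leptons)


tag_jets = [Obj(100.0, 2.5, 0.0), Obj(80.0, -2.0, 3.0)]
central_leptons = [Obj(30.0, 0.5, 1.0), Obj(20.0, -1.0, 2.0)]
is_vbf = passes_vbf_tag(tag_jets, central_leptons)
is_vbf_with_central_jet = passes_vbf_tag(tag_jets + [Obj(35.0, 0.0, 1.5)], central_leptons)
```

The first example passes all requirements, while adding a third jet between the tagging jets triggers the veto.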

Analysis strategy
Given the small event yield for the 2-jet category with VBF tag with the currently available datasets, the signal extraction uses a template fit to a single kinematic variable with appropriately sized bins. The dilepton mass, m ℓℓ , has been chosen for its simple definition and discrimination power, and also because the hadronic information is already extensively used in the event selection. The counting analysis is pursued for the same-flavor category, and also used as a cross-check of the shape-based approach for the different-flavor final state.
Since the fit to data uses only the m ℓℓ distribution, the events are preselected to satisfy m T smaller than the Higgs boson mass of the given hypothesis. For Higgs boson mass hypotheses of 250 GeV and above, p ℓ,max T is required to be greater than 50 GeV. The m ℓℓ template has 14 bins for the 8 TeV sample and 10 bins for the 7 TeV sample, covering the range from 12 GeV to 600 GeV.
For the counting analysis, the same requirements as the 0-jet and 1-jet analyses are applied, as summarized in Table 4, except for the lower m T threshold which is kept at 30 GeV for all Higgs boson mass hypotheses. The results of the same-flavor counting analysis are combined with the results of the different-flavor shape analysis to provide the result for this category.

Results
The data yields and the expected yields for the SM Higgs boson signal and various backgrounds in each of the lepton-flavor final states for the VBF analysis are listed in Tables 9 and 10, for several representative Higgs boson mass hypotheses. For a Higgs boson with m H = 125 GeV, a few signal events are expected to be observed with a signal-to-background ratio of about one. The contribution to the VBF selection from gluon fusion Higgs boson production after all selection requirements is approximately 20% of the total signal yield [87]. Figure 13 shows the comparison of m ℓℓ between the prediction and the data for a Higgs boson mass of 125 GeV after the selection for the shape-based analysis. The 95% CL observed and median expected upper limits on the production cross section of the H → WW process are shown in Fig. 14. Limits are reported for both counting and shape-based analyses. The observed (expected) signal significance for the shape-based approach is 1.3 (2.1) standard deviations for a SM Higgs boson with mass of 125 GeV. The observed signal strength for this mass is σ/σ SM = 0.62 +0.58 −0.47 . A summary of the results for m H = 125 GeV is shown in Table 11.

The two-jet VH tag
Table 9: Signal prediction, observed number of events in data, and background estimates for √ s = 7 TeV after applying the H → WW VBF tag counting analysis selection requirements and the requirements used for the shape-based approach (eµ final state only). The combined statistical, experimental, and theoretical systematic uncertainties are reported. The Z/γ * → ℓℓ process includes the dimuon, dielectron and ditau final states. The VZ background denotes the contributions from WZ and ZZ processes.

Figure 14: Expected and observed 95% CL upper limits on the H → WW production cross section relative to the SM Higgs boson expectation using the counting analysis (left), and the shape-based template fit approach (right) in the 2-jet category with VBF tag. The shape-based analysis results use the one-dimensional binned template fit to the m ℓℓ distribution for the eµ final state, combined with counting analysis inputs for the ee/µµ final states.

The analysis of the associated production of a SM Higgs boson with a W or a Z boson in the dilepton final state selects events with two centrally produced (|η| < 2.5) jets from the decay of the associated vector boson. The dijet invariant mass is required to be consistent with the parent boson mass, i.e. in the range 65 GeV < m jj < 105 GeV, and the pseudorapidity separation between the two jets is required to satisfy |∆η jj | < 1.5. These requirements ensure no overlap of this selection with the VBF analysis, for which a pair of forward-backward jets is required. Additionally, for m H < (≥) 180 GeV, events are required to have 60 (70) GeV < m T < m H .
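The VH-tag requirements on the two jets and on m T can be written as a single predicate. The thresholds follow the text; the function name, the argument layout, and the example values are assumptions made for illustration.

```python
def passes_vh_tag(jet_etas, mjj, mt, mh):
    """Apply the VH tag to the two selected jets.

    jet_etas : pseudorapidities of the two jets
    mjj      : dijet invariant mass in GeV
    mt       : transverse mass in GeV
    mh       : Higgs boson mass hypothesis in GeV
    """
    if len(jet_etas) != 2:
        return False
    eta1, eta2 = jet_etas
    central = abs(eta1) < 2.5 and abs(eta2) < 2.5
    mass_window = 65.0 < mjj < 105.0
    close_in_eta = abs(eta1 - eta2) < 1.5
    # The lower m_T threshold is 60 GeV below m_H = 180 GeV and 70 GeV otherwise.
    mt_min = 60.0 if mh < 180.0 else 70.0
    return central and mass_window and close_in_eta and mt_min < mt < mh


low_mass_pass = passes_vh_tag([0.5, -0.3], mjj=85.0, mt=90.0, mh=125.0)
high_mass_fail = passes_vh_tag([0.5, -0.3], mjj=85.0, mt=65.0, mh=200.0)
```

Since the VBF tag requires |∆η jj | > 3.5 and this tag requires |∆η jj | < 1.5, no event can satisfy both.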

Analysis strategy
The default analysis in the dilepton 2-jets category with VH tag is performed using a counting analysis approach because this category is statistically limited for the current datasets and the expected signal yield is relatively small. Further m H -dependent selections are applied to suppress top-quark processes, Z/γ * → ℓℓ, and WW contamination based on m ℓℓ and the angular separation between the two leptons (∆R ℓℓ ). The lower threshold on m ℓℓ is raised to m ℓℓ > 20 GeV for m H > 135 GeV, and the upper bound is m ℓℓ < 60 GeV for m H < 180 GeV and m ℓℓ < 80 GeV for the higher Higgs boson masses. The maximum ∆R ℓℓ requirement varies between 1.5 and 2.0 from the lowest to the highest mass hypotheses tested.
As demonstrated for other analyses previously described, the sensitivity to the Higgs boson signal in this category is expected to gain from a fit to a kinematic distribution, especially when the integrated luminosity increases. The method has been tested in the eµ final state using the invariant mass of the dilepton system. The selection that is used for the counting analysis is simplified to m ℓℓ < 200 GeV and ∆R ℓℓ < 2.5 for the shape-based analysis. A total of 9 bins in m ℓℓ have been defined between the lower threshold and 200 GeV.

Results
The data yields and the expected yields for the Higgs boson signal and various backgrounds in each of the categories for the VH analysis are listed in Tables 12 and 13. For a Higgs boson with m H = 125 GeV, a few signal events are expected with a signal-to-background ratio of approximately 8%. Among the selected signal events, the contribution of the associated production mode is ∼40%, and the majority of the remaining signal originates from the gluon fusion process. The m ℓℓ distribution at √ s = 8 TeV used as an input to the template fit in the eµ final state after the corresponding selection for m H = 125 GeV is shown in Fig. 15. The shape-based analysis has been tested and compared with the default counting analysis. No shape-based analysis was developed at √ s = 7 TeV because of very limited statistics.
The 95% CL observed and median expected upper limits on the production cross section of the H → WW process are shown in Fig. 16. Limits are reported for both counting and shape-based analyses. For the latter, the different-flavor final states are combined with the same-flavor counting analysis.
The expected and observed results for the VH analysis are summarized in Table 14. The upper limit on the H → WW production cross section using this category is about five times the SM expectation, and the observed (expected) significance of the signal is 0.2 (0.6) standard deviations.

Figure 16: Expected and observed 95% CL upper limits on the H → WW production cross section relative to the SM Higgs boson expectation using the counting analysis (left), and the shape-based template fit approach (right) in the VH category. The shape-based analysis results use the one-dimensional binned template fit to the m ℓℓ distribution for the eµ final state, combined with counting analysis results for the ee/µµ final states.

Final states with three charged leptons
Events with exactly three identified charged leptons also provide sensitivity to the VH production mode. Three charged-lepton candidates with total charge equal to ±1 are required, with p T > 20 GeV for the leading lepton and p T > 10 GeV for the other leptons. Events with any further identified lepton passing the selection criteria defined in Section 4 and p T > 10 GeV are rejected. Two analyses have been developed for this topology. The first analysis selects triboson (VVV, V = W/Z) candidates in which all bosons decay leptonically, yielding an experimental signature of three isolated high-p T leptons, moderate E miss T , and little hadronic activity. The second analysis requires one opposite-sign same-flavor lepton pair compatible with a Z boson decay and two jets compatible with a hadronic W-boson decay, making the analysis sensitive to ZH production. A brief summary of the analyses in the trilepton categories is shown in Table 15.

is not used since having three leptons in the event degrades the performance of such a variable. To further suppress the top-quark background, events are rejected if there is at least one jet with p T > 40 GeV, or if the event is top-tagged as described in Section 4. The WZ → 3ℓν background is largely reduced by requiring that all the opposite-sign same-flavor (OSSF) lepton pairs have a dilepton mass at least 25 GeV away from the Z mass peak. To reject the Vγ ( * ) background, the dilepton mass of all opposite-sign lepton pairs is required to be greater than 12 GeV. In addition to all the above requirements, the signal region is defined by requiring that the smallest dilepton mass m ℓℓ is less than 100 GeV, and that the smallest distance between the opposite-sign leptons ∆R ℓ+ℓ− is less than 2.
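The lepton and jet requirements listed above for the three-lepton signal region can be combined as in the following sketch. It is an illustration only: the E miss T requirement is not included because its value is not given in this passage, the smallest dilepton mass is assumed to be taken over opposite-sign pairs, the Z boson mass is rounded to 91.19 GeV, and the function name and input layout are hypothetical.

```python
from itertools import combinations

Z_MASS = 91.19  # GeV, approximate Z boson mass


def passes_wh3l_signal_region(leptons, pair_mass, pair_dr,
                              max_jet_pt=0.0, top_tagged=False):
    """Counting-analysis signal region for three leptons (E_T^miss cut omitted).

    leptons   : list of three dicts with keys "pt", "charge", "flavor"
    pair_mass : dict mapping (i, j), i < j, to the dilepton mass in GeV
    pair_dr   : dict mapping (i, j), i < j, to the angular distance Delta R
    """
    if len(leptons) != 3 or abs(sum(l["charge"] for l in leptons)) != 1:
        return False
    pts = sorted((l["pt"] for l in leptons), reverse=True)
    if pts[0] <= 20.0 or pts[2] <= 10.0:
        return False
    if max_jet_pt > 40.0 or top_tagged:
        return False
    os_pairs = [(i, j) for i, j in combinations(range(3), 2)
                if leptons[i]["charge"] != leptons[j]["charge"]]
    for i, j in os_pairs:
        mass = pair_mass[(i, j)]
        if mass <= 12.0:
            return False  # rejects the V gamma(*) background
        same_flavor = leptons[i]["flavor"] == leptons[j]["flavor"]
        if same_flavor and abs(mass - Z_MASS) < 25.0:
            return False  # Z veto against WZ
    min_mass = min(pair_mass[p] for p in os_pairs)
    min_dr = min(pair_dr[p] for p in os_pairs)
    return min_mass < 100.0 and min_dr < 2.0


example_leptons = [
    {"pt": 30.0, "charge": +1, "flavor": "e"},
    {"pt": 20.0, "charge": -1, "flavor": "mu"},
    {"pt": 15.0, "charge": -1, "flavor": "mu"},
]
example_mass = {(0, 1): 40.0, (0, 2): 60.0, (1, 2): 90.0}
example_dr = {(0, 1): 1.0, (0, 2): 2.5, (1, 2): 3.0}
in_signal_region = passes_wh3l_signal_region(example_leptons, example_mass, example_dr)
```

For the shape-based analysis the final ∆R requirement would be dropped and the smallest ∆R value used as the discriminant instead.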
Finally, a shape-based analysis is carried out as the main analysis because of its superior sensitivity with respect to the counting analysis. In this analysis the requirement on ∆R + − is not applied, and instead that variable is used as the discriminant. Tests have shown this variable to provide the best discrimination between signal and background events, both in terms of expected limits and of expected significance.

Background estimation
There are five main background processes in this category: WZ → 3ℓν, ZZ → 4ℓ, tribosons, Zγ, and processes with non-prompt leptons. The first four contributions are estimated from simulation, with corrections from data control samples, while the non-prompt lepton background is solely evaluated from data.
The WZ → 3ℓν decay is the main background in the analysis. The overall normalization is taken from data using trilepton events, where one of the same-flavor opposite-sign lepton pairs has a mass less than 15 GeV away from the Z boson mass peak. All other selection requirements are applied, except the ∆R ℓ+ℓ− and the upper m ℓℓ requirements. The sample is completely dominated by this process, and for m H = 125 GeV less than one signal event is expected in that region. The uncertainty in the normalization, which mainly arises from the statistics of the control sample, is 5-10%.
The ZZ → 4ℓ background is reduced by the E miss T requirement and the veto of events containing a fourth lepton. The prediction from the simulation for this process is used without any further correction. The triboson background processes are also estimated with simulation.
The Zγ background is normalized in data using events in which the trilepton mass is compatible with the Z mass. The number of selected events for this background after the E miss T requirements is very small. A normalization uncertainty of 30% is assigned from studies in events with m 3ℓ compatible with m Z .
The non-prompt lepton backgrounds are estimated as explained in Section 6, with the only difference that the contributions are derived from a control sample in data in which two leptons pass the standard criteria and the third one does not, but satisfies a relaxed set of requirements (loose selection), resulting in a "two-pass and one-fail" sample. The efficiency for a jet that satisfies the loose lepton selection to pass the tight selection, ε pass , is determined using an independent dataset dominated by non-prompt leptons from multijet events. Finally, a scale factor of 0.78 ± 0.31 is obtained by comparing the prediction from this method with a trilepton data sample in which a b-tagged jet is required. This last sample is heavily enriched in top-quark processes and allows the background prediction to be calibrated. The systematic uncertainty from the efficiency determination dominates the overall uncertainty of this method, which is estimated to be 40%.
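The extrapolation from the "two-pass and one-fail" sample to the signal region can be sketched with the usual tight-to-loose weighting. This is an assumption: the details of the method are given in Section 6, outside this passage, and the sketch ignores any subtraction of prompt-lepton contamination. The function name and the efficiencies are hypothetical; only the scale factor of 0.78 is taken from the text.

```python
def nonprompt_estimate(failing_lepton_efficiencies, scale_factor=0.78):
    """Non-prompt background yield from events with one failing lepton.

    failing_lepton_efficiencies : one efficiency per control-sample event, i.e. the
        probability for the failing (loose) lepton candidate to pass the tight
        selection, evaluated for that lepton's kinematics.
    scale_factor : data-driven calibration factor applied to the prediction.
    """
    total = 0.0
    for eps in failing_lepton_efficiencies:
        if not 0.0 <= eps < 1.0:
            raise ValueError("efficiency must lie in [0, 1)")
        # Each failing lepton stands in for eps / (1 - eps) passing ones.
        total += eps / (1.0 - eps)
    return scale_factor * total


uncalibrated = nonprompt_estimate([0.2, 0.5], scale_factor=1.0)
calibrated = nonprompt_estimate([0.2, 0.5])
```

With the two illustrative efficiencies the raw prediction is 1.25 events, which the calibration factor reduces to 0.975 events.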
A summary of the estimation of the background processes in the WH → 3ℓ3ν category in cases where data events are used to estimate either the normalization or the shape of the discriminant variables is shown in Table 16.

Results
The observed number of data events and the expected number of signal and background events at different stages of the analysis are shown in Table 17. The signal contribution from WH production with H → ττ decay to the total number of expected Higgs boson events decreases from 55% to 10% in the mass range 110-130 GeV, and it is about 15% for m H = 125 GeV. The ∆R ℓ+ℓ− distributions are shown in Fig. 17.
No significant excess of events is observed with respect to the background prediction, and the 95% CL upper limits are calculated for the production cross section of the WH → 3ℓ3ν process with respect to the SM Higgs boson expectation. The expected and observed upper limits are shown in Fig. 18. Since the analysis is independent of m H , and the shape of the ∆R ℓ+ℓ− distribution has a mild dependence on m H , smooth changes are expected for different Higgs boson mass hypotheses. The observed (expected) upper limit at the 95% CL is 3.8 (3.7) times larger than the SM expectation for m H = 125 GeV for the counting analysis. For the shape-based analysis, the observed (expected) upper limit at the 95% CL is 3.3 (3.0) times larger than the SM expectation for m H = 125 GeV. A summary of the results for m H = 125 GeV is shown in Table 18.

Analysis strategy
To select ZH events, the first step is to identify the leptonic decay of the Z boson. Events are required to have one pair of opposite-sign same-flavor leptons for which |m ℓℓ − m Z | < 15 GeV. If there is more than one possible combination, the pair with an invariant mass closest to the Z mass is chosen. To reject the Vγ ( * ) background, the dilepton mass of all opposite-sign lepton pairs is required to be greater than 12 GeV. To reject possible contributions from Z bosons decaying to 4ℓ, with one of the leptons not identified, the invariant mass of the system of the three leptons is required to satisfy |m 3ℓ − m Z | ≥ 10 GeV. As one of the W bosons in this category decays hadronically, events are required to have at least two jets. The requirements described above define the preselection. The transverse component of the leptonically decaying W boson is reconstructed from the remaining lepton, which is not used to reconstruct the Z boson, and E miss T . Events are further required to have the transverse mass m ℓν T of the leptonically decaying W boson less than 85 GeV, where m ℓν T is defined as

$$ m_T^{\ell\nu} = \sqrt{(p_{T,\ell} + p_{T,\nu})^2 - (p_{x,\ell} + p_{x,\nu})^2 - (p_{y,\ell} + p_{y,\nu})^2}, $$

where the transverse momentum components of the neutrino are approximated by the transverse components of E miss T .

Table 17: Signal prediction for the SM Higgs boson with m H = 125 GeV, number of observed events in data, and estimated background at different stages of the WH → 3ℓ3ν analysis. Only statistical uncertainties in the yields are reported in the first four rows of the selection stages, while all systematic uncertainties are considered in the last row. The column labeled as "non-prompt" is the combination of the backgrounds from Z + jets and top-quark decays. ZZ, Vγ ( * ) , and triboson processes are not reported separately since they constitute a small fraction of the total background. The 3-lepton selection stage also includes the m ℓℓ > 12 GeV requirement.
Furthermore, the invariant mass of the jet pair is required to be compatible with a W decay: |m jj − m W | ≤ 60 GeV. The angle ∆φ(ℓν, jj) between the system of the lepton and the neutrino, approximated by E miss T , and the system of the two jets in the transverse plane must be smaller than 1.8 radians. The selection criteria have been optimized for the best S/ √ B using simulated samples for a SM Higgs boson signal with m H = 125 GeV.
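The Z-candidate choice and the low-mass veto described above can be sketched as follows. This is only an illustration under stated assumptions: leptons are represented as plain dictionaries with hypothetical keys (`flavor`, `charge`, `p4`), four-momenta are (E, px, py, pz) tuples in GeV, and the function names are not taken from the analysis software. The nominal Z mass value is an assumed input.

```python
import math
from itertools import combinations

M_Z = 91.1876  # GeV, nominal Z boson mass (assumed input value)

def inv_mass(*particles):
    """Invariant mass in GeV of a set of (E, px, py, pz) four-vectors."""
    e = sum(p[0] for p in particles)
    px = sum(p[1] for p in particles)
    py = sum(p[2] for p in particles)
    pz = sum(p[3] for p in particles)
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

def select_z_candidate(leptons, window=15.0, min_mll=12.0):
    """Return the index pair (i, j) of the Z candidate, or None.

    The event is rejected if any opposite-sign pair has a mass at or
    below min_mll, or if no opposite-sign same-flavor pair lies within
    the mass window around M_Z. Among valid pairs, the one closest to
    M_Z is chosen.
    """
    best = None
    for i, j in combinations(range(len(leptons)), 2):
        a, b = leptons[i], leptons[j]
        if a["charge"] * b["charge"] >= 0:
            continue  # keep only opposite-sign pairs
        mll = inv_mass(a["p4"], b["p4"])
        if mll <= min_mll:
            return None  # low-mass veto on every opposite-sign pair
        if a["flavor"] != b["flavor"]:
            continue  # Z candidates must be same-flavor
        dm = abs(mll - M_Z)
        if dm < window and (best is None or dm < best[0]):
            best = (dm, (i, j))
    return None if best is None else best[1]
```

The three-lepton mass requirement, the jet multiplicity requirement, and the later cuts on the jet pair would be applied on top of this choice; they are omitted here to keep the sketch short.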
The criteria listed above comprise the selection for both a counting and a shape-based analysis in this category. For the shape-based analysis, which achieves better expected sensitivity than the counting analysis, the transverse mass of the Higgs boson is reconstructed using the two jets, the E miss T and the lepton from the W boson decay,

$$ m_T^{\ell\nu 2j} = \sqrt{\left(\sum p_T\right)^2 - \left(\sum p_x\right)^2 - \left(\sum p_y\right)^2}, $$

where in each sum, all the final-state objects from the Higgs boson decay are included. Therefore ∑ p T is given by ∑ p T = p T,ℓ + p T,ν + p T,j1 + p T,j2 , and similarly for ∑ p x and ∑ p y . For the counting analysis, m ℓν2j T is also used with the mass-dependent selection requirements presented in Table 19.
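Both transverse-mass variables used in this category share one form: the sums of p T , p x , and p y over the chosen final-state objects are combined as (∑ p T )² − (∑ p x )² − (∑ p y )². A minimal sketch follows, assuming that the square root of this combination is what is meant, that each object is given by its transverse components, and that the neutrino is replaced by E miss T . The function name is illustrative.

```python
import math

def transverse_mass(objects):
    """Transverse mass from a list of (px, py) pairs in GeV.

    Each object's pT is computed as sqrt(px^2 + py^2). With two objects
    (lepton, missing momentum) this corresponds to the W transverse mass;
    with four (lepton, missing momentum, two jets) to the Higgs one.
    """
    sum_pt = sum(math.hypot(px, py) for px, py in objects)
    sum_px = sum(px for px, _ in objects)
    sum_py = sum(py for _, py in objects)
    # Guard against small negative values from floating-point rounding.
    return math.sqrt(max(sum_pt ** 2 - sum_px ** 2 - sum_py ** 2, 0.0))
```

For a lepton and missing momentum that are back to back with 40 GeV each, `transverse_mass([(40.0, 0.0), (-40.0, 0.0)])` gives 80 GeV, while two collinear objects give zero.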

Background estimation
Four main background processes are present in the sample after full selection: WZ, ZZ, tribosons, and processes involving non-prompt leptons. The first three contributions are estimated from simulated samples, while the last one is evaluated from data. Unlike in the case of the WH → 3ℓ3ν category, the contribution from H → ττ is negligible in this category.
The non-prompt lepton background processes are estimated as explained in Section 7.1. This kind of background arises predominantly from Z + jets production, a small contribution from top-quark production, and negligible contributions from other processes.

Results
The observed number of events and the expected number of signal and background events at different stages of the shape-based analysis are shown in Table 20. The m ℓν2j T distributions are shown in Fig. 19. The final numbers of events for the counting analysis for four different m H values at 7 and 8 TeV are presented in Table 21.

Table 20: Expected signal, number of observed events in data, and estimated background at different stages of the ZH → 3ℓν + 2 jets shape-based analysis assuming a Higgs boson mass of 125 GeV. Only statistical uncertainties in the yields are reported in the first three rows of the selection stages, while all systematic uncertainties are considered in the last one. The legend entry labeled as "non-prompt" refers to the combination of the backgrounds from Z + jets and top-quark decays.

Figure 19: The m ℓν2j T distribution after all other requirements for the ZH → 3ℓν + 2 jets analysis at 7 TeV (left), and at 8 TeV (right). The signal yield (red open histogram) is multiplied by 10 with respect to the SM expectation. The legend entry labeled as "non-prompt" is the combination of the backgrounds from Z + jets and top-quark decays.

No significant excess of events is observed with respect to the background prediction, and the 95% CL upper limits are calculated for the production cross section of the ZH → 3ℓν + 2 jets process with respect to the SM Higgs boson expectation. Four final states are taken as inputs to the combination: eee, eeµ, µµe, and µµµ. These four final states contain approximately 18%, 23%, 24%, and 35% of events in the selected sample, respectively. The upper limits at the 95% CL for both counting and shape-based analyses are shown in Fig. 20. The observed (expected) upper limit at the 95% CL is 18.7 (17.8) times larger than the SM expectation for m H = 125 GeV for the counting analysis. For the shape-based analysis, the observed (expected) upper limit at the 95% CL is 21.4 (15.9) times larger than the SM expectation for m H = 125 GeV.

Combined results
In this section, the combined results obtained using all the individual search categories described in Sections 6 and 7 are presented. The reference analysis for each individual search category, selected on the basis of the expected signal sensitivity, is used in the combination. A summary of the expected signal production mode fractions for the reference analyses for a SM Higgs boson with a mass of 125.6 GeV at √ s = 8 TeV is shown in Table 22, together with the total number of expected H → WW events at √ s = 7 and 8 TeV. The statistical methodology used in this combination is briefly described in Section 5. The Higgs boson mass hypothesis chosen to evaluate the measurements is m H = 125.6 GeV, which corresponds to the mass measurement of the observed boson from the H → ZZ → 4ℓ decay channel [104]. It is important to emphasize that these analyses depend only weakly on the Higgs boson mass.

Signal strength
The expected 95% CL upper limits on the production cross section of the H → WW process with respect to the SM prediction for each category considered in the combination are shown in Fig. 21 (top) for the Higgs boson mass range 110-600 GeV. Exclusion limits beyond 600 GeV deserve a specific study and are not addressed in this paper. The combined observed and expected 95% CL upper limits on the production cross section of the H → WW process with respect to the SM prediction are shown in Fig. 21 (bottom). Results are shown in two ways: without assumptions on the presence of a SM Higgs boson and considering the SM Higgs boson with m H = 125.6 GeV as part of the background processes. In the first case, an excess of events is observed for low m H hypotheses, which makes the observed limits much weaker than the expected ones. In particular, the observed (expected) 95% CL upper limit on the H → WW production cross section with respect to the SM prediction at m H = 125.6 GeV is 1.1 (0.3). The combination of all categories excludes a SM Higgs boson in the mass range 127-600 GeV at the 95% CL, while the expected exclusion range for the background-only hypothesis is 115-600 GeV. In the second case, to search for another excess, the 95% CL upper limits are obtained including the SM Higgs boson with m H = 125.6 GeV as a background process, and no significant excess is found anywhere. Additional Higgs bosons with SM-like properties are excluded in the mass range 114-600 GeV at the 95% confidence level when assuming that a SM Higgs boson with m H = 125.6 GeV is present in the data.
The expected significance for the SM Higgs boson signal as a function of the mass hypothesis for each category and for the combination is shown in Fig. 22 (top left). The expected and observed significances for the combination are shown in Fig. 22 (top right). The observed (expected) significance of the signal is 4.3 (5.8) standard deviations for m H = 125.6 GeV. The observed σ/σ SM as a function of the Higgs boson mass is also shown in Fig. 22 (bottom). The results for the individual categories are shown in Fig. 23. The results from all categories are consistent within the uncertainties. Figure 24 shows the confidence intervals in the two-dimensional (σ/σ SM , m H ) plane and the one-dimensional likelihood profile in m H assuming the SM cross section and branching fraction, σ/σ SM = 1, where the SM Higgs boson uncertainties in the production cross section are considered. The results are obtained with the analysis using a parametric fit to the (m R , ∆φ R ) distribution in the 0-jet and 1-jet categories of the eµ final state, as described in Section 6.2. The likelihood curve at σ/σ SM = 1 yields a best-fit mass of 125.5 +3.6 −3.8 GeV. Furthermore, without the constraint on σ/σ SM , the best-fit mass is at 128.2 +6.6 −5.3 GeV. The uncertainty in the best-fit mass value is consistent with the expected resolution of the signal and the observed significance.

Couplings
The primary production mechanism contributing to the total cross section for the SM Higgs boson is the ggH process, with a smaller fraction of the cross section coming from VBF and VH production. Separating the ggH process from the other contributions is particularly relevant to explore the Higgs boson couplings, since in the first case the coupling to the fermions of the virtual loop is involved, while in the others tree-level couplings to vector bosons play a role. The likelihood profiles for the signal strength modifiers associated with production modes dominated by couplings to fermions (µ ggH ) and vector bosons (µ VBF,VH ) are shown at the 68% and 95% CL in Fig. 25. The expected and observed likelihood profiles for m H = 125.6 GeV for the three production modes, ggH, VBF, and VH, are shown separately in Fig. 26.
A way to verify the theory prediction is to compare the Higgs boson coupling constants to fermions and electroweak vector bosons with the SM expectation [36]. Two coupling modifiers κ V and κ f are assigned to vector and fermion vertices, respectively. They are then used to scale the expected product of cross section and branching fraction to match the observed signal yields in the data:

$$ \sigma \times \mathrm{BR}(X \to H \to WW) = \kappa_i^2 \, \frac{\kappa_V^2}{\kappa_H^2} \, \sigma_{\mathrm{SM}} \times \mathrm{BR}_{\mathrm{SM}}(X \to H \to WW), $$

where κ H is the total width modifier, which is a function of κ V and κ f . The κ i modifier is κ f for the ggH process and κ V for the VBF and VH processes. The assumption is made that only SM fields contribute to the total width. In the context of this analysis the branching fraction is always scaled by κ 2 V /κ 2 H ; the only direct coupling of the Higgs boson to fermions occurs in the gluon fusion process, whose strength is then parametrized by κ f . The two-dimensional likelihoods of the κ V and κ f parameters, for both the observed value and the SM expectation, are shown in Fig. 27 (left).

Figure 26 (caption fragment): In each case, the modifiers for the other production modes are profiled. The crossings with the horizontal line at −2∆ ln L = 1 (3.84) define the 68% (95%) CL interval.
An alternative general scenario can be obtained by allowing for non-vanishing Higgs boson decays beyond the SM (BR BSM ), while at the same time constraining the fit to κ V ≤ 1, which is well-motivated by the electroweak symmetry breaking, with κ 2 H = κ 2 H (SM)/(1 − BR BSM ). The likelihood scan distribution versus BR BSM is shown in Fig. 27 (right) computed for this scenario. With these assumptions, an observed (expected) upper limit on BR BSM at the 95% CL is set at 0.86 (0.75) using the H → WW decay channel alone. This limit can be interpreted as, e.g., an indirect limit on invisible Higgs boson decays.
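The way the coupling modifiers enter the expected yield can be illustrated with a short sketch. It assumes that the production cross section scales with κ i ² (κ f for ggH, κ V for VBF and VH), that the branching fraction scales with κ V ²/κ H ², and that κ H ² is enlarged by 1/(1 − BR BSM ) when decays beyond the SM are allowed. The SM width modifier κ H ²(SM) as a function of κ V and κ f is not spelled out in this section, so it is taken here as an input. The function name and arguments are hypothetical.

```python
def signal_scale(kappa_i, kappa_v, kappa_h2_sm, br_bsm=0.0):
    """Scale factor applied to the SM expectation for sigma x BR(H -> WW).

    kappa_i     : production modifier (kappa_f for ggH, kappa_V for VBF/VH)
    kappa_v     : vector boson coupling modifier (enters the WW decay)
    kappa_h2_sm : squared total-width modifier from SM decays only (input)
    br_bsm      : branching fraction to beyond-SM final states, in [0, 1)
    """
    if not 0.0 <= br_bsm < 1.0:
        raise ValueError("br_bsm must satisfy 0 <= br_bsm < 1")
    # Allowing BSM decays increases the total width by 1 / (1 - BR_BSM).
    kappa_h2 = kappa_h2_sm / (1.0 - br_bsm)
    return kappa_i ** 2 * kappa_v ** 2 / kappa_h2
```

With all modifiers equal to one, the scale is one, and a BR BSM of 0.5 halves the expected H → WW yield, which is why an upper bound on κ V allows the observed yield to constrain BR BSM .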

Spin and parity
The different-flavor 0-jet and 1-jet categories are used to distinguish between a 0 + boson like the SM Higgs boson and a 2 + min boson or a pseudoscalar 0 − boson. The 2 + min signal templates for the gg → X and qq → X processes, and the 0 − signal template for the gg → X process, are obtained from JHUGEN.
The results for the 2 + min case are shown as a function of the qq → X component, f qq . The yields of the gg → X and qq → X processes are nominally taken from the simulated samples assuming the SM Higgs boson cross section. A signal-plus-background model is built for each hypothesis, based on two-dimensional templates in m T and m , using the same bin widths and data selection as for the low m H case described in Section 6.2. For the SM Higgs boson case, the signal templates derived from POWHEG include the gluon fusion, VBF, and VH production modes. The background templates are the same as in the SM Higgs boson search analysis. The two-dimensional (m T , m ) distributions for the 0 + and 2 + min hypotheses are shown in Fig. 3 for the 0-jet category and in Fig. 4 for the 1-jet category for the 8 TeV analysis. The distribution of the two variables and the correlation between them clearly separates the two spin hypotheses, which are related to the different ν masses and azimuthal angle distributions [19].
For each hypothesis a binned maximum likelihood (L) fit is performed, to simultaneously extract the signal strength and background contributions. This likelihood fit model is the same as in the SM Higgs boson search. Fits are performed for both models, and the likelihoods are calculated with the signal rates allowed to float independently for each signal type. The test statistic, q = −2 ln(L J P /L 0 + ), where L 0 + and L J P are the best-fit likelihood values for the SM Higgs boson and the alternative hypothesis, respectively, is then used to quantify the consistency of the two models with data. The expected separation between the two hypotheses, defined as the median of q expected under the J P hypothesis, is quoted for two scenarios: when events are generated with the a priori expectation for the signal yields (σ/σ SM ≡ 1) and when the signal strength is determined from the fit to data (σ/σ SM ≈ 0.75).
The distributions of q for the 0 + and 2 + min hypotheses at m H = 125.6 GeV for the two scenarios above and assuming f qq = 0% or f qq = 100% are shown in Fig. 28. Assuming σ/σ SM = 1 for both hypotheses, the median test statistic for the 0 + and 2 + min hypotheses, as well as its observed value, as a function of f qq of the 2 + min particle is shown in Fig. 29 (left). The same results using the σ/σ SM value determined from the fit to data are shown in Fig. 29 (right). In all cases the data favor the SM hypothesis with respect to the 2 + min hypothesis. The alternative hypothesis 2 + min is excluded at an 83.7% (99.8%) CL or higher for f qq = 0% (100%) when the σ/σ SM value determined from the fit to data is used.
The same procedure described above is applied to perform a test of hypotheses between a 0 + boson like the SM Higgs boson and a pseudoscalar 0 − boson. The average separation between the two hypotheses is about one standard deviation, as shown in Fig. 30. The alternative hypothesis 0 − is disfavored with a CL s value of 34.7% when the σ/σ SM value determined from the fit to data is used. A summary of the models used in the analysis of the spin and parity hypotheses, J P , is shown in Table 23 together with the expected and observed separation J P /0 + .

Figure 29: Median test statistic for the 0 + and 2 + min hypotheses, as a function of f qq of the 2 + min particle, assuming σ/σ SM = 1 (left) and using the σ/σ SM value determined from the fit to data (right). The observed values are also reported in the second case.

Summary
A search for the SM Higgs boson decaying to a W-boson pair at the LHC has been reported. The event samples used in the analysis correspond to an integrated luminosity of 4.9 fb −1 and 19.4 fb −1 collected by the CMS detector in pp collisions at √ s = 7 and 8 TeV, respectively. The WW candidates are selected in events with exactly two or three charged leptons. The analysis has been performed in the Higgs boson mass range 110-600 GeV. An excess of events is observed above background, consistent with the expectations from the SM Higgs boson of mass around 125 GeV. The probability to observe an excess equal or larger than the one seen, under the background-only hypothesis, corresponds to a significance of 4.3 standard deviations for m H = 125.6 GeV. The observed σ/σ SM value for m H = 125.6 GeV is 0.72 +0.20 −0.18 . The spin-parity J P = 0 + hypothesis is favored against a narrow resonance with J P = 2 + or J P = 0 − that decays to a W-boson pair. This result provides strong evidence for a Higgs-like boson decaying to a W-boson pair.

A Measurement of the Wγ * cross section scale factor
The Wγ * electroweak process is included in standard CMS simulations as a part of the WZ process using MADGRAPH. Nevertheless the low-mass dilepton region is not properly covered since the standard simulations have a generator-level requirement at m γ * > 12 GeV and there could be a significant rate of events below that threshold passing the selection criteria described in Section 4. Since the WZ and Wγ * processes may contribute as background to the Higgs boson signal whenever one of the three leptons in the final state is not selected, the low mass part of Wγ * background has been simulated using MADGRAPH, requiring two leptons each with p T > 5 GeV and no restrictions on the third one. Electron and muon masses have been taken into account to properly simulate the kinematic cut-offs. The key point is to observe the process in data and validate the simulation. In particular, the cross section of the process needs to be measured to have a reliable prediction for the background outside the control region.
The cases where the virtual photon decays into a pair of electrons or muons have both been considered. The first is characterized by a cross section that is about three times larger than the latter, since the production threshold, defined by m , is lower. In both cases, at least one of the two leptons is soft, with an average p T of ∼5 GeV. In the ± e + e − case the way of mimicking the signal is similar to that of the Wγ background, with the photon converting in the material close to the interaction vertex, making the leptons look as though they were produced promptly. For the ± µ + µ − final state, the low p T of the softest muon often prevents it from reaching the muon detector and being correctly identified.
To measure the production rate of Wγ * in data, the ℓ ± µ + µ − final state has been studied, since the large background from multijet production makes it difficult to extract the Wγ * signal in the ℓ ± e + e − case. A region that has a high purity of Wγ * events is defined using the following selection criteria:
• the muons associated with the virtual photon need to have opposite signs. In the 3µ final state, the opposite-sign pair with the lowest mass is assumed to originate from the γ * ;
• m µ ± µ ∓ < 12 GeV is required;
• since events have two muons very close to each other, the muon isolation is redefined to exclude muons from the isolation energy calculation;
• to suppress the top-quark background, events with more than two reconstructed jets are rejected, and events with at least one jet will be rejected if that jet is b-tagged;
• to suppress the multijet background, the minimum transverse mass of each lepton and E miss T must be larger than 25 GeV, and the transverse mass of the lepton associated with the W boson and E miss T must be larger than 45 GeV;
• the J/ψ meson decays are rejected by requiring |m µ ± µ ∓ − m J/ψ | > 0.1 GeV. There is no need to apply a requirement against Upsilon decays due to the very small cross section.
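The kinematic part of these criteria can be written as a compact selection function. The sketch below is an assumption-laden illustration: the event is a plain dictionary with hypothetical keys holding precomputed quantities, the J/ψ mass is an assumed input value, and the redefined muon isolation is not modeled.

```python
M_JPSI = 3.0969  # GeV, J/psi mass (assumed input value)

def gamma_star_mass(muon_pairs):
    """Mass of the lowest-mass opposite-sign muon pair, or None.

    muon_pairs: list of (mass, charge1, charge2) for all muon pairs.
    """
    os_masses = [m for m, q1, q2 in muon_pairs if q1 * q2 < 0]
    return min(os_masses) if os_masses else None

def passes_wgstar_selection(event):
    """Apply the mass, jet, and transverse-mass requirements listed above."""
    m = gamma_star_mass(event["muon_pairs"])
    if m is None or m >= 12.0:
        return False  # need an opposite-sign pair with mass below 12 GeV
    if abs(m - M_JPSI) <= 0.1:
        return False  # J/psi veto
    if event["n_jets"] > 2 or event["n_btagged_jets"] > 0:
        return False  # top-quark suppression
    if min(event["mt_each_lepton"]) <= 25.0:
        return False  # multijet suppression, every lepton with E_T^miss
    if event["mt_w_lepton"] <= 45.0:
        return False  # multijet suppression, lepton assigned to the W
    return True
```

Keeping the pair choice in its own function makes the 3µ ambiguity explicit: the same-sign combination is never considered, and the lighter of the two opposite-sign pairs is taken as the γ * candidate.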
The contribution from other background processes is rather small. The only process which is not completely negligible is W + jets, as shown in Fig. 31.
The measured K-factor with respect to the LO cross section is around 1.5, consistent with observations involving other electroweak processes computed at LO. This gives further confidence in the accuracy of the simulation. Some disagreement is observed between data and simulation in the virtual photon mass shape, due to the mismodeling of the reconstruction efficiency of close-by muons at very low p T . To account for this difference in the normalization measurement, the K-factor has been computed in different regions of the mass spectrum and compared to that obtained from the full range. The same analysis is performed in four independent categories: events with m µ ± µ ∓ < 2 GeV and 2 ≤ m µ ± µ ∓ < 12 GeV, in both the e ± µ + µ − and µ ± µ + µ − final states. The average spread is taken as systematic uncertainty, leading to a K-factor value of 1.5 ± 0.5.

B Estimation of the Wγ background template shapes
In the dilepton final states, the Wγ background normalization is taken from simulated samples, while the distributions of the final discriminant variables are taken from data. To obtain the shapes, a sample of events with a lepton and an identified photon is used. For the photon the same counting selection as applied in ref. [13] is used. The ratio of the photon-to-lepton identification efficiency as a function of the photon η and p T is used to properly weight the lepton-photon event sample. The possible background contamination from non-prompt photons or leptons shows a negligible effect on the shape of the distributions relevant for the analysis. The m ℓℓ and m T distributions for the Wγ process in events at the dilepton selection level as described in Section 4 for simulated events and from a sample with a lepton and a photon are shown in Fig. 32. The lepton-photon sample has about 200 times more events than the simulated sample. Good agreement between the distributions is observed.

C Estimation of the Drell-Yan background in the same-flavor dilepton final states
A method based on measurements in data is used to estimate the Z/γ * → ℓℓ contributions in the same-flavor ℓ + ℓ − final states. The expected contributions from Z/γ * → ℓℓ events outside a region around the Z mass in data can be estimated by counting the number of events near the Z mass region in data, subtracting from it the non-Z contributions, and scaling it by a ratio R out/in defined as the ratio of the numbers of events outside and inside the Z mass region in the simulation. The Z mass region is defined as |m ℓℓ − m Z | < 7.5 GeV. Such a tight window is chosen to reduce the non-Z contributions from top-quark and multi-boson backgrounds. The non-Z contributions close to the Z mass region in data are estimated from the number of events in the e ± µ ∓ final state, N eµ in , applying a correction factor k ee/µµ that accounts for the difference in selection efficiency between electrons and muons. The R out/in factor can be estimated both from simulated events and data. In simulation it is defined as the ratio N MC out /N MC in . The number of Drell-Yan events in the signal region is therefore:

$$ N_{\mathrm{out}}^{\ell\ell,\mathrm{exp}} = R_{\mathrm{out/in}}^{\ell\ell} \left( N_{\mathrm{in}}^{\ell\ell} - \tfrac{1}{2}\, N_{\mathrm{in}}^{e\mu}\, k_{\ell\ell} \right), $$

where $k_{ee} = \sqrt{N_{\mathrm{in}}^{ee,\mathrm{loose}} / N_{\mathrm{in}}^{\mu\mu,\mathrm{loose}}}$ for Z/γ * → ee and $k_{\mu\mu} = \sqrt{N_{\mathrm{in}}^{\mu\mu,\mathrm{loose}} / N_{\mathrm{in}}^{ee,\mathrm{loose}}}$ for Z/γ * → µµ. The factor 1/2 comes from the relative branching fraction between the ℓℓ and eµ final states. In the k ℓℓ calculation, the selection on the missing transverse energy is loosened to increase the available number of events under the Z peak. The value of k ee is about 0.8, with a very weak dependence on both the center-of-mass energy and the jet category.
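The arithmetic of this estimate is simple enough to sketch directly. The code below assumes that the efficiency correction k is the square root of the ratio of loosely selected same-flavor yields under the Z peak (the square root is an assumption, since the typeset expression is not legible in this text), and that the non-Z contribution is half the eµ yield times k. Function and argument names are illustrative.

```python
import math

def efficiency_correction(n_in_loose_this, n_in_loose_other):
    """k factor for one flavor, e.g. k_ee from (N_ee_loose, N_mumu_loose)."""
    return math.sqrt(n_in_loose_this / n_in_loose_other)

def drell_yan_outside(n_in_ll, n_in_emu, k_ll, r_out_in):
    """Expected Drell-Yan yield outside the Z mass window.

    n_in_ll  : same-flavor events in data inside the Z window
    n_in_emu : e-mu events in data inside the Z window
    k_ll     : electron/muon efficiency correction for this flavor
    r_out_in : ratio of yields outside to inside the window
    """
    non_z_inside = 0.5 * n_in_emu * k_ll  # flavor-symmetric backgrounds
    return r_out_in * (n_in_ll - non_z_inside)
```

As a purely numerical illustration with made-up inputs, 1000 same-flavor events and 100 eµ events inside the window, k = 0.8 and R out/in = 0.1 would give 0.1 × (1000 − 40) = 96 expected events outside the window.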
The ZZ and WZ (ZV) processes contribute to the events in the m control region dominated by the Drell-Yan. The contribution from ZV becomes comparable to that of Z/γ * → after a tight E miss∠ T selection, since those events contain genuine E miss T for which the detector simulation is reliable. The expected ZV peaking contribution is subtracted from the yield in the Z peak using the simulation. The ZV events without E miss T requirements are suppressed by the same large factor as the Drell-Yan ones, and therefore their contribution at the level of the final selection is as negligible as it would be in the yield at the Z peak without E miss T requirement.
When considering the full selection the Drell-Yan and ZV components allow for the extrapolation from control region to signal region to be different for the two processes.
This Z/γ * → estimation method relies on the assumption that the dependence of the ratio R out/in on the E miss T requirement is relatively flat. On the other hand, the value of R out/in changes as a consequence of the different kinematic requirements applied to select the Higgs boson signal regions for different Higgs boson mass hypotheses. Therefore R out/in is evaluated applying selection requirements close to the full Higgs boson selections: all requirements are applied except for variables depending on E miss T . As no statistically significant difference is observed between the ee and µµ final states, both of them are combined.
The R out/in value is cross-checked in data as well. After the full selection, and after all efficiency corrections, background processes contribute equally to ee, eµ, µe, and µµ final states. On the other hand, Drell-Yan only contributes to the ee and µµ final states. Therefore the eµ and µe contributions can be subtracted from the ee and µµ samples to obtain an estimate of the Drell-Yan background. The R out/in values as a function of the multivariate Drell-Yan output variable, described in Section 4, in the 0-jet and 1-jet categories for the m H = 125 GeV counting analysis at √ s = 8 TeV are shown in Fig. 33.

D Estimation of top-quark backgrounds in the dilepton final states
In the dilepton analysis, the top-quark-induced background originates from tt and tW processes [105], the latter being especially important in the 0-jet category. A consistent theoretical description of the two processes at higher orders is not straightforward to attain as already at NLO some tW diagrams coincide with LO tt ones. The simulated samples used in the analysis exploit an approach recently proposed, which addresses the overlap by discarding the common diagrams from the tW process either at the amplitude level ("diagram removal") or at the cross section level ("diagram subtraction"). The former is considered the default scheme, whereas the latter is used as a cross-check.

Figure 33 (caption fragment): High output values are signal-like events, while low output values are more likely to be Drell-Yan events. The vertical dashed line indicates the minimum threshold on the discriminant value used to select events for the analysis, which is 0.88 for the 0-jet and 0.84 for the 1-jet category. The dependence of the R out/in ratio on the Drell-Yan discriminant value and the agreement between the data and the simulation are studied in the regions below this threshold.
The top-quark background is estimated at the WW selection level where a common scale factor for the tt and tW simulated samples is computed. Once properly normalized, those samples are used to predict the corresponding yields after the mass-dependent Higgs boson selection requirements in the counting analyses and to produce the templates in the shape-based analyses.
The procedure for top-quark background estimation can be summarized as follows. The top-quark background is suppressed using a top-tagging veto. If the tagging efficiency is known, the top-quark background can be estimated as:

$$ N_{\text{not-tagged}} = N_{\text{tagged}} \times \frac{1 - \varepsilon_{\text{top-tagged}}}{\varepsilon_{\text{top-tagged}}}, $$

where N not-tagged is the estimated number of top-quark events in the signal region that pass the veto, N tagged is the number of top-quark events that are top-tagged and ε top-tagged is the top-tagging efficiency as measured in a control region dominated by top-quark events. For the evaluation of N tagged and ε top-tagged , non-top-quark backgrounds are properly subtracted using the estimates depending on the jet category. The systematic uncertainty in the top-quark background estimation is due to the uncertainty in non-top-quark background contributions and the statistical uncertainty in the efficiency measurement. The actual implementation of the estimation method depends on the jet category, and is detailed below.
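The extrapolation from tagged to non-tagged events follows from the definition of the efficiency: if a fraction ε of top-quark events is tagged, the untagged remainder is the tagged yield times (1 − ε)/ε. A minimal sketch, with hypothetical names and with the subtraction of non-top-quark backgrounds made explicit as an argument:

```python
def top_background_not_tagged(n_tagged_data, n_tagged_other_bkg, eff_top_tagged):
    """Estimate the top-quark yield that survives the top-tagging veto.

    n_tagged_data      : top-tagged events observed in data
    n_tagged_other_bkg : estimated non-top-quark events among them
    eff_top_tagged     : top-tagging efficiency, in (0, 1]
    """
    if not 0.0 < eff_top_tagged <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    n_tagged_top = n_tagged_data - n_tagged_other_bkg
    return n_tagged_top * (1.0 - eff_top_tagged) / eff_top_tagged
```

For example, with made-up inputs of 90 tagged events, 10 of which are attributed to other backgrounds, and an efficiency of 0.8, the estimate is 80 × 0.2 / 0.8 = 20 events passing the veto.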

D.1 Method for the 0-jet category
Rejection for the top-quark background is achieved by top-tagging of events via the identification of a low-p T b-tagged jet or a soft-muon as defined in Section 4. The estimation of this background relies on the measurement of the top-tagging efficiency in data.
In the 0-jet category, the key ingredient for the top-quark background estimation is that tt events are characterized by two b-jets with p T below 30 GeV, while tW events have one low-p T b-jet. Nevertheless a fraction x of tW events contains two bottom-quark jets and these events are effectively indistinguishable from tt. The procedure described in the following steps properly accounts for this feature:
• First, the top-tagging efficiency for one "top-taggable" leg (ε data 1-leg ) is computed. A region enriched in top-quark background events is defined requiring exactly one b-tagged jet with p T > 30 GeV; this is the denominator. Events in this sample but with an additional b-tagged jet with 10 GeV < p T < 30 GeV or one soft-muon define the numerator. The ratio of the yields in the numerator and denominator provides ε data 1-leg . This efficiency is computed for tt only; i.e., non-top-quark backgrounds and tW yields are subtracted from the measured data in the control region. The tW yield is estimated from the simulation, which is normalized accordingly, using the predictions previously evaluated from the 1-jet category.
• The overall top-tagging efficiency, ε data top-tagged , is defined to account for the fraction x of tW events that look like tt, that is with two top-taggable legs:

$$ \varepsilon_{\text{top-tagged}}^{\text{data}} = \left[ f_{t\bar{t}}^{\text{MC}} + x \left(1 - f_{t\bar{t}}^{\text{MC}}\right) \right] \left[ 1 - \left(1 - \varepsilon_{\text{1-leg}}^{\text{data}}\right)^2 \right] + \left(1 - f_{t\bar{t}}^{\text{MC}}\right) (1 - x)\, \varepsilon_{\text{1-leg}}^{\text{data}}, $$

where the first term accounts for events with two taggable legs and the second term for events with one taggable leg. The f MC tt factor represents the fraction of tt events with respect to the total tt + tW and it is determined from simulation in the 0-jet category at the WW selection level, without applying the top-quark veto requirements. The fraction x matches the value of ε 1-leg estimated from the tW simulation. This is considered a good approximation because ε 1-leg is the fraction of events with one b-tagged jet with p T larger than 30 GeV (the first "top-taggable" leg) out of all events with a top-tagged leg (a b-tagged jet below 30 GeV or a soft-muon).
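• Finally, a dedicated control region is defined in the 0-jet category by requiring top-tagged events. The data yields in this region, corrected for the contamination from other backgrounds, are then used together with the top-tagging efficiency to predict the top-quark background:
$$N^{\text{WW region}}_{\text{not-tagged}} = N^{\text{control region}}_{\text{tagged}} \times \frac{1 - \epsilon^{\text{data}}_{\text{top-tagged}}}{\epsilon^{\text{data}}_{\text{top-tagged}}}.$$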
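The three steps above can be sketched numerically as follows. This is a minimal illustration, not the analysis code. It assumes that events with two taggable legs are tagged when at least one leg is tagged, with probability 1 − (1 − ε)², and that the untagged yield follows from the tagged yield via the factor (1 − ε)/ε. All function names and input yields are hypothetical.

```python
def one_leg_efficiency(n_num, n_den, bkg_num=0.0, bkg_den=0.0):
    """One-leg top-tagging efficiency from control-region yields.

    bkg_num and bkg_den are the tW plus non-top yields to subtract
    from the numerator and denominator (hypothetical inputs).
    """
    den = n_den - bkg_den
    if den <= 0:
        raise ValueError("denominator must be positive after subtraction")
    return (n_num - bkg_num) / den


def top_tagged_efficiency(eps_1leg, f_tt, x):
    """Overall efficiency, assuming the two-term combination described in the text."""
    # Events with two taggable legs: ttbar plus the fraction x of tW.
    two_leg_fraction = f_tt + x * (1.0 - f_tt)
    # Remaining tW events have a single taggable leg.
    one_leg_fraction = (1.0 - f_tt) * (1.0 - x)
    # A two-leg event is tagged if at least one of its legs is tagged.
    p_two_leg = 1.0 - (1.0 - eps_1leg) ** 2
    return two_leg_fraction * p_two_leg + one_leg_fraction * eps_1leg


def untagged_yield(n_tagged, n_other_bkg, eps):
    """Predicted top-quark yield failing the tag (assumed (1 - eps)/eps scaling)."""
    if not 0.0 < eps <= 1.0:
        raise ValueError("efficiency must lie in (0, 1]")
    return (n_tagged - n_other_bkg) * (1.0 - eps) / eps


if __name__ == "__main__":
    # Illustrative numbers only; they are not taken from the analysis.
    eps_1leg = one_leg_efficiency(600.0, 1100.0, bkg_num=100.0, bkg_den=100.0)
    eps_top = top_tagged_efficiency(eps_1leg, f_tt=0.8, x=0.5)
    print(eps_1leg, eps_top, untagged_yield(290.0, 0.0, eps_top))
```

With these illustrative inputs the one-leg efficiency is 0.5, the overall efficiency is 0.725, and a tagged yield of 290 events extrapolates to about 110 untagged top-quark events.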

D.2 Method for the 1-jet category
To measure the top-tagging efficiency in the 1-jet category, top-quark events with two reconstructed jets are used as the control sample. The top-tagging efficiency for the highest-$p_{\mathrm{T}}$ jet is approximately the same in the 1-jet and 2-jet categories. It is therefore measured in the 2-jet category, where the second jet is required to be b-tagged in order to increase the top-quark purity. The residual number of top-quark events in the 1-jet category is then given by
$$N^{\text{1-jet}}_{\text{not-tagged}} = N^{\text{1-jet}}_{\text{tagged}} \times \frac{1 - \epsilon_{\text{highest-}p_{\mathrm{T}}\text{-jet}}}{\epsilon_{\text{highest-}p_{\mathrm{T}}\text{-jet}}},$$
where $N^{\text{1-jet}}_{\text{tagged}}$ is the number of events where the counted jet is tagged and none of the other non-counted jets are tagged, and $\epsilon_{\text{highest-}p_{\mathrm{T}}\text{-jet}}$ is the top-tagging efficiency for the highest-$p_{\mathrm{T}}$ jet measured in the 2-jet category. A closure test, performed by applying this procedure to simulated events and comparing the estimate with the simulated yield, agrees to within 2%.
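The 1-jet extrapolation and its closure test can be illustrated with a short sketch. It assumes the residual yield scales as (1 − ε)/ε with the efficiency measured in the 2-jet control sample, and that closure is quantified as the relative difference between the predicted and simulated untagged yields. The function names and event counts are hypothetical.

```python
def highest_pt_jet_efficiency(n_2jet_tagged, n_2jet_total):
    """Tagging efficiency of the leading jet, measured in a 2-jet control sample."""
    if n_2jet_total <= 0:
        raise ValueError("control sample must not be empty")
    return n_2jet_tagged / n_2jet_total


def residual_top_1jet(n_1jet_tagged, eps):
    """Predicted number of untagged top-quark events in the 1-jet category."""
    if not 0.0 < eps <= 1.0:
        raise ValueError("efficiency must lie in (0, 1]")
    return n_1jet_tagged * (1.0 - eps) / eps


def closure(predicted, simulated):
    """Relative difference between the prediction and the simulated yield."""
    return (predicted - simulated) / simulated


if __name__ == "__main__":
    # Illustrative counts only; they are not taken from the analysis.
    eps = highest_pt_jet_efficiency(800.0, 1000.0)
    predicted = residual_top_1jet(4000.0, eps)
    print(eps, predicted, closure(predicted, 1015.0))
```

For these illustrative counts the efficiency is 0.8, the prediction is about 1000 events, and the closure is about −1.5%, which would satisfy a 2% criterion.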
The scale factor is derived in a region that differs slightly from the signal region, and it is then applied consistently to the yield from simulated samples in the signal region. The difference lies in the soft-muon selection. In the signal region, events with soft-muons are always rejected, whereas in the 1-jet top-quark background estimation soft-muons are allowed inside the leading jet. The same holds in the top-veto region, in the top-tag region, and in the efficiency measurement. The reason is the correlation between soft-muons and b-tagging: when a soft-muon is present in a jet, the b-tagging efficiency of that jet is slightly higher. To avoid this correlation, the top-quark background is estimated without any requirement on soft-muons close to the jet.
The $m_{\ell\ell}$ and $m_{\mathrm{T}}$ distributions in the 1-jet category for top-tagged events in the different-flavor final state at the WW selection level for the $\sqrt{s} = 8$ TeV data sample are shown in Fig. 35.

D.3 Method for the 2-jet category
Estimation of the top-quark background in the 2-jet categories is complicated by the additional requirements involved in tagging VBF and VH events, because these requirements greatly reduce the data sample.
The method employed measures the top-tagging efficiency for the most central jet in the event as a function of its η in an inclusive top-quark-enriched control sample, and then applies that rate to fully selected events where the most central jet is top-tagged. In this way the possible kinematical differences between the control and signal regions are taken into account. Therefore, the residual number of top-quark events in the 2-jet category after applying the selection is given by,