Search for the standard model Higgs boson produced in association with W and Z bosons in pp collisions at sqrt(s) = 7 TeV

A search for the Higgs boson produced in association with a W or Z boson in proton-proton collisions at a center-of-mass energy of 7 TeV is performed with the CMS detector at the LHC using the full 2011 data sample, corresponding to an integrated luminosity of 5 inverse femtobarns. Higgs boson decay modes to tau tau and WW are explored by selecting events with three or four leptons in the final state. No excess above background expectations is observed, resulting in exclusion limits on the product of Higgs associated production cross section and decay branching fraction for Higgs boson masses between 110 and 200 GeV in these channels. Combining these results with other CMS associated production searches using the same dataset in the H to gamma gamma and H to b b-bar decay modes, an associated Higgs boson production cross section 3.3 times the standard model expectation or larger is ruled out at the 95% confidence level for a Higgs boson mass of 125 GeV.


Introduction
Spontaneous electroweak symmetry breaking is introduced in the standard model (SM) [1][2][3] to give mass to the vector bosons (W± and Z) that mediate weak interactions, while keeping the photon, which mediates electromagnetic interactions, massless. This mechanism [4][5][6][7][8][9] results in a single scalar in the SM, the Higgs boson. While the mass of the Higgs boson is a free parameter in the SM, its couplings to the massive vector bosons, Yukawa couplings to fermions, decay branching fractions, and production cross sections in proton-proton collisions are defined and well understood theoretically [10]. Gluon fusion (GF), weak vector boson fusion (VBF), associated production (AP) with weak bosons, and associated production with a tt̄ pair (tt̄H) are the four most important Higgs boson production mechanisms at the Large Hadron Collider (LHC). Although the cross section for AP is an order of magnitude lower than that of the GF mechanism, the presence of isolated high-momentum leptons originating from W and Z decays suppresses the backgrounds dramatically, making these channels viable for searches for the Higgs boson.
Direct searches at the Large Electron-Positron Collider (LEP) have excluded a Higgs boson with a mass m_H < 114.4 GeV at 95% confidence level (CL) [11]. The ATLAS experiment has excluded the SM Higgs boson in the mass ranges 111-122 and 131-559 GeV [12], and the CMS experiment in the mass ranges 110-121.5 [13] and 127-600 GeV [14]. Both experiments have reported the observation of a new boson with a mass near 125 GeV [12,13], predominantly in channels sensitive to Higgs bosons decaying to photon or Z boson pairs. Tevatron experiments have reported an excess of events in the bb̄ final state in the mass range 120-135 GeV [15].
This paper reports a search for the SM Higgs boson produced in association with a W boson (WH channel) or a Z boson (ZH channel). The search uses a data sample of proton-proton collisions at √s = 7 TeV recorded by the Compact Muon Solenoid (CMS) [16] experiment at the LHC. The data were collected in 2011 and correspond to an integrated luminosity of 5.00 ± 0.11 fb⁻¹ [17]. Throughout this document, the expression "light lepton," or symbol ℓ, will refer to an electron or muon, the symbol τ_h to a hadronically decaying tau, and the symbol L to an e, µ, or τ_h. The search for WH production is performed in three-lepton (3L) events in four final states with three electrons or muons (3ℓ): eee, eeµ, eµµ, and µµµ, and two final states that have a hadronic decay of a tau (2ℓτ_h): eµτ_h and µµτ_h. The search for ZH production is performed in four-lepton (4L) events with a pair of electrons or muons consistent with the decay of a Z boson, and a Higgs boson candidate with one of the following final states: eµ, eτ_h, µτ_h, or τ_hτ_h. These final states can be produced by two Higgs boson decay modes: decays to a pair of W bosons (H → W⁺W⁻) that both decay to leptons, and decays to a pair of taus (H → τ⁺τ⁻). The contribution of the H → ZZ decay mode is negligible.
While the sensitivity to a Higgs boson of the AP search presented here is lower than previously published results dominated by the GF and VBF production mechanisms, the final states used in this search are essential for determining if the recently observed boson at 125 GeV is consistent with the Higgs boson predicted by the SM. The Tevatron excess has been observed in the associated production H → bb̄ channel [15]. No evidence for associated Higgs boson production has been observed at the CMS and ATLAS experiments [12,13,18]. Furthermore, the exclusive measurement of all three production processes (GF, AP, and VBF) using the H → τ⁺τ⁻ decay mode will be critical to determine the structure of the Higgs boson couplings [19], as the H → τ⁺τ⁻ decay mode is the only fermionic decay mode that is experimentally sensitive to both Yukawa coupling (GF) and gauge coupling (AP and VBF) production processes. The fermionic H → bb̄ decay mode is not experimentally accessible in the GF production mechanism due to the overwhelming multijet background.
parton distribution functions. While the next-to-leading-order (NLO) calculations are used for background cross sections, the cross sections used for the Higgs boson signal samples are computed at next-to-NLO [10]. For all processes, the detector response is simulated using a detailed description of the CMS detector, based on the GEANT4 package [34]. The simulations include pileup interactions matching the distribution of the number of such interactions observed in data.

Trigger and event selection
Candidate signal events are recorded if they pass a trigger requiring the presence of a high-p_T electron pair, muon pair, or electron-muon pair. The leading and subleading triggering lepton candidates are required to have p_T > 17 GeV and p_T > 8 GeV, respectively. Offline, electron and muon candidates are subjected to standardized quality criteria described in Ref. [35] and Refs. [36,37], respectively, to ensure high efficiency and precision. In the 3L channels, the electron candidate is subjected to a multivariate selection exploiting the correlations among electron observables [38] to reduce the rate of quark or gluon jets misidentified as electrons. Three (four) charged-lepton candidates with total charge ±1 (0) are required for the 3L (4L) channels. The two triggering light leptons are required to have p_T > 20 GeV and p_T > 10 GeV, respectively. Non-triggering e and µ candidates are required to have p_T > 10 GeV. The minimum p_T of τ_h candidates is 20 GeV. Electron, muon, and τ_h candidates are required to originate from the primary vertex of the event, which is chosen as the vertex with the highest Σ p_T², where the sum is made using the tracks associated with the vertex. In the 4L channels, two leptons are required to be compatible with the decay of a Z boson, having the same flavor, opposite charge, and invariant mass within 20 GeV of the mass of the Z boson.
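The Z-boson compatibility requirement of the 4L channels can be illustrated with a minimal Python sketch. The dictionary layout of a lepton, the function names, the massless-lepton approximation, the numerical Z mass, and the choice of keeping the pair closest to the Z mass are assumptions made for this example; they are not taken from the analysis software.

```python
import math
from itertools import combinations

Z_MASS = 91.19  # GeV; approximate value, assumed for illustration


def invariant_mass(l1, l2):
    """Invariant mass of two leptons, treated as massless, from pT, eta, phi."""
    m2 = 2.0 * l1["pt"] * l2["pt"] * (
        math.cosh(l1["eta"] - l2["eta"]) - math.cos(l1["phi"] - l2["phi"])
    )
    return math.sqrt(max(m2, 0.0))


def find_z_candidate(leptons, window=20.0):
    """Return (lepton1, lepton2, mass) for the same-flavor, opposite-charge
    light-lepton pair closest to the Z mass within the window, else None."""
    best = None
    for a, b in combinations(leptons, 2):
        if a["flavor"] not in ("e", "mu") or a["flavor"] != b["flavor"]:
            continue  # only same-flavor electron or muon pairs
        if a["charge"] + b["charge"] != 0:
            continue  # opposite charge required
        mass = invariant_mass(a, b)
        if abs(mass - Z_MASS) >= window:
            continue
        if best is None or abs(mass - Z_MASS) < abs(best[2] - Z_MASS):
            best = (a, b, mass)
    return best
```

With two back-to-back muons of p_T = 45.6 GeV at η = 0, the function returns a candidate with a mass of about 91.2 GeV; a same-charge pair or an eµ pair yields no candidate.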
Leptons from the Higgs or vector-boson decays are typically isolated from the rest of the event activity, in contrast to background from jets, which are immersed in considerable hadronic activity. For each lepton candidate a cone defined by $\Delta R \equiv \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2}$, where η is the pseudorapidity and φ is the azimuthal angle in radians, is constructed around the lepton direction at the event vertex. The size of the cone is 0.4 for e and µ candidates, and 0.5 for τ_h candidates in the 2ℓτ_h and 4L channels. In the 3ℓ channels a smaller ∆R = 0.3 cone is used. An isolation variable is constructed from the scalar sum of the transverse energy of all charged and neutral reconstructed particles contained within the cone, excluding the contribution from the lepton candidate itself. The contributions of charged particles coming from pileup interactions longitudinally displaced from the primary event vertex are excluded from the isolation variable. In the 2ℓτ_h and 4L channels, the neutral contribution to the isolation variable from the pileup is estimated using the energy deposited by tracks from pileup vertices which point into the isolation cone, and is subtracted from the isolation variable. In the 3ℓ channels, the neutral contribution from pileup, which is typically composed of many low-p_T particle candidates, is mitigated by excluding neutral particle candidates with p_T < 1 GeV from the isolation variable calculation.
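The cone-based isolation sum described above can be sketched as follows. This is a simplified illustration, not the CMS implementation: the particle dictionaries, the `from_pv` flag, and the use of p_T in place of transverse energy are assumptions, and the track-based subtraction of neutral pileup used in the 2ℓτ_h and 4L channels is not modelled. Only the neutral p_T threshold of the 3ℓ channels is represented, through the hypothetical `min_neutral_pt` parameter.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into [-pi, pi)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi


def delta_r(eta1, phi1, eta2, phi2):
    """Angular separation: sqrt(d_eta^2 + d_phi^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


def isolation_sum(lepton, particles, cone=0.4, min_neutral_pt=0.0):
    """Scalar pT sum of particles inside the cone around the lepton.

    Charged particles not associated with the primary vertex are skipped,
    neutral particles below min_neutral_pt are skipped, and the lepton
    itself is excluded from the sum.
    """
    total = 0.0
    for p in particles:
        if p is lepton:
            continue
        if p["charged"] and not p["from_pv"]:
            continue  # charged pileup contribution
        if not p["charged"] and p["pt"] < min_neutral_pt:
            continue  # soft neutral candidates, typical of pileup
        if delta_r(lepton["eta"], lepton["phi"], p["eta"], p["phi"]) < cone:
            total += p["pt"]
    return total
```

A 3ℓ-style call would use `cone=0.3, min_neutral_pt=1.0`, while the default arguments correspond to the 0.4 cone quoted for e and µ candidates in the other channels.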
For a Higgs boson mass of 125 GeV, the H → WW → ℓℓ branching fraction is approximately 1.8 times larger than the H → ττ → ℓℓ branching fraction [10]. Accordingly, the expected signal yield in the 3ℓ channel is dominated by the H → WW decay. Conversely, the H → ττ → ℓτ_h decays dominate the signal yield in the 2ℓτ_h and 4L channels, as their branching fraction is 3.6 times larger than the H → WW → ℓτ_h branching fraction.
When the Higgs boson mass is above approximately 140 GeV, the H → WW decay dominates in all channels. The topological event selections are optimized for the WWW final states in the 3ℓ channels, and for the H → ττ final state in the 2ℓτ_h and 4L channels. In all channels, top-quark background events are suppressed by vetoing events containing jets with p_T > 20 GeV that are identified as coming from b quarks [39,40]. Events with additional isolated leptons (e, µ, or τ_h candidates) are vetoed. In the 3L channels, this requirement removes diboson ZZ → 4ℓ background events. The lepton veto ensures that each channel is exclusive of all other channels presented in this paper and of the published CMS H → ZZ → 4ℓ analysis [41].
In the 3ℓ channel, the dominant WZ → 3ℓν background is reduced by rejecting events with a same-flavor opposite-charge lepton pair with an invariant mass within 25 GeV of the Z-boson mass (m_Z). Events are rejected if there is a jet with E_T > 40 GeV to remove tt̄ background events, which typically contain multiple high-p_T jets. In WH → WWW events, the neutrinos associated with the decays of the W bosons escape detection, resulting in large E_T^miss. Drell-Yan background events are expected to have low E_T^miss. To mitigate degradation of the E_T^miss resolution due to pileup, the minimum of two different observables is defined as the E_T^miss. The first includes all particle-flow (PF) candidates of the event in the computation of E_T^miss, while the second uses only the charged PF particle candidates associated with the primary vertex. To improve rejection of background events with E_T^miss associated with poorly reconstructed leptons, the "projected" E_T^miss [42] is used. This projected E_T^miss is defined as the component of E_T^miss transverse to the direction of the closest lepton if it is closer than π/2 in azimuthal angle, and the full E_T^miss otherwise. The use of both E_T^miss definitions exploits the presence of a correlation between the two observables in signal events with genuine E_T^miss and its absence otherwise. Events in the 3ℓ channel are required to have projected E_T^miss above 40 GeV. To further reject WZ background events, the constituents of at least one opposite-charge any-flavor (OCAF) lepton pair must be separated by less than 2 in ∆R. Finally, the smallest OCAF pair mass must be above 12 GeV and below 100 GeV to suppress Wγ and WZ events, respectively.
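The projected missing transverse energy lends itself to a short worked example. The sketch below follows the definition in the text: the component transverse to the closest lepton is used when that lepton lies within π/2 in azimuth, and the full value is used otherwise. Taking the minimum of the projected values from the two computations (all candidates, and charged candidates from the primary vertex) is one possible reading of the text and is an assumption here, as are the function names and the (magnitude, φ) tuple layout.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into [-pi, pi)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi


def projected_met(met, met_phi, lepton_phis):
    """Projected missing ET with respect to the closest lepton in azimuth."""
    dphi_min = min(abs(delta_phi(met_phi, phi)) for phi in lepton_phis)
    if dphi_min < math.pi / 2.0:
        # Component of the missing ET transverse to the closest lepton.
        return met * math.sin(dphi_min)
    return met


def min_projected_met(full_met, charged_met, lepton_phis):
    """Minimum of the projected values from the two missing-ET computations.

    full_met and charged_met are (magnitude, phi) tuples.
    """
    return min(
        projected_met(full_met[0], full_met[1], lepton_phis),
        projected_met(charged_met[0], charged_met[1], lepton_phis),
    )


def passes_met_requirement(full_met, charged_met, lepton_phis, threshold=40.0):
    """3-lepton channel requirement: projected missing ET above 40 GeV."""
    return min_projected_met(full_met, charged_met, lepton_phis) > threshold
```

An event with a mismeasured lepton tends to have missing energy aligned with that lepton, so the transverse component is small and the event fails the requirement, whereas genuine neutrinos from W decays are less likely to point along a lepton.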
In the 2ℓτ_h channels, the dominant backgrounds are Z, W, and tt̄ events with an additional quark or gluon jet incorrectly identified as an e, µ, or τ_h. The probability for a quark or gluon jet to pass the τ_h identification (misidentified τ_h) is 10 to 100 times greater than the probability for a jet to pass the e or µ identification and isolation requirements. To remove the large Z/γ* → ℓ⁺ℓ⁻ + misidentified τ_h and tt̄ backgrounds, the light leptons eµ (µµ) are required to have the same charge in the eµτ_h (µµτ_h) channel. The variable L_T, defined as the scalar sum of the transverse energy of the three lepton candidates in the event, is required to be larger than 80 GeV. This requirement is effective in rejecting some of the background coming from the semi-leptonic decays of heavy quarks, which has a softer p_T spectrum.
The largest background in the 4L channels is the irreducible diboson ZZ background. The dominant reducible backgrounds in the 4L channels are Z + 2 jet events, where both jets are misidentified as leptons, and WZ events with one additional misidentified jet. These backgrounds are highly suppressed by the lepton identification and isolation requirements. There is an additional non-negligible contribution from tt̄ → ℓ⁺νℓ⁻ν̄bb̄ events, which is suppressed by the lepton identification, isolation, and the requirement of a Z-boson candidate present in the event.
The resulting signal efficiencies after all selections vary between 0.1% and 12%, depending on production mode, decay channel, and Higgs boson mass, and are given in Table 1. The performance of the 3ℓ Z-boson mass and minimum ∆R requirements, and of the eµτ_h and µµτ_h L_T selections, is illustrated in Fig. 1.
The event selections used in the H → γγ and H → bb̄ channels are described in detail elsewhere [20,21]. Briefly, AP H → γγ candidate events are selected by requiring the presence of two high-p_T photon candidates and an isolated electron or muon. Events in the AP H → bb̄ analysis are selected by requiring two jets identified as coming from b quarks and a vector boson candidate with high p_T. The vector boson candidate can decay into one light lepton, two
Table 1: Efficiency for signal events to pass the selections in each channel for the different Higgs boson production and decay modes. The efficiency is defined with respect to WH and ZH events in which the W or Z boson decays to final states containing an e, a µ, or a τ. The residual corrections described in Section 5 are applied, and the uncertainties correspond to the combined statistical and systematic uncertainties; theoretical uncertainties are not included. The uncertainty on the efficiency is dominated by the systematic (statistical) uncertainty for the H → ττ (H → W⁺W⁻) decay in the 2ℓτ_h and 4L channels, with the reverse being true in the 3ℓ channels.

Background estimation
A combination of methods using data control samples and detailed studies with simulated events is used to estimate residual background contributions after selection. There are two background categories: irreducible diboson backgrounds, and events with at least one nonprompt lepton. The irreducible diboson backgrounds consist of WZ and ZZ events with the same number of isolated prompt leptons as the signal processes, and Zγ events with an asymmetric photon conversion. The WZ and ZZ backgrounds are estimated using simulated samples, and are scaled by a residual correction factor obtained by comparing the observed data in diboson-enriched sidebands with the prediction from simulation.
The non-prompt lepton backgrounds arise from decays of charm and beauty quarks and hadrons misidentified as leptons. The non-prompt backgrounds are evaluated using data with the "misidentification rate method". The misidentification probabilities as a function of candidate p_T and η, f(p_T, η), for non-prompt lepton candidates (e, µ, or τ_h) to pass the final identification and isolation criteria are measured in independent, highly pure control samples of multijet, W → µν + jet, and Z → µµ + jet events. The control samples are exclusive of the signal sample due to different final-state topology requirements. To minimize possible biases, the same trigger, kinematic, and quality criteria used in the final analysis are applied to the control samples. Sidebands are defined for each channel, where all selection criteria are satisfied, with the exception that the final identification or isolation criterion is not satisfied for one or more of the final-state lepton candidates. The sidebands are dominated by the non-prompt backgrounds. The number of non-prompt background events in the final selection is estimated by weighting each observed non-prompt lepton candidate in the sideband by its corrected probability f(p_T, η)/(1 − f(p_T, η)) to pass the final identification and isolation criteria. The estimate of the non-prompt yield in the final selection is computed using all sideband events where any two light-lepton candidates pass all requirements and the third candidate fails the isolation requirement. In the 2ℓτ_h channels, the backgrounds with a misidentified τ_h and two genuine prompt light leptons (eµ or µµ) are negligible, due to the requirement that the two light leptons have the same charge. Accordingly, the misidentified-τ_h sideband is ignored in these channels.
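The sideband weighting can be sketched in a few lines of Python. The binning, the numerical misidentification rates, and the function names below are hypothetical placeholders chosen only to make the example runnable; the measured f(p_T, η) values are not reproduced here.

```python
from bisect import bisect_right

# Hypothetical lower bin edges and rates, for illustration only.
PT_EDGES = [10.0, 20.0, 40.0]   # GeV
ETA_EDGES = [0.0, 1.5]          # |eta|
MISID_RATE = [
    [0.20, 0.10],  # 10 <= pT < 20
    [0.15, 0.08],  # 20 <= pT < 40
    [0.10, 0.05],  # pT >= 40
]


def misid_rate(pt, eta):
    """Look up f(pT, eta) in the binned table; underflow uses the first bin."""
    i = max(bisect_right(PT_EDGES, pt) - 1, 0)
    j = max(bisect_right(ETA_EDGES, abs(eta)) - 1, 0)
    return MISID_RATE[i][j]


def sideband_weight(f):
    """Weight f / (1 - f) applied to a candidate failing the final criteria."""
    if not 0.0 <= f < 1.0:
        raise ValueError("misidentification probability must be in [0, 1)")
    return f / (1.0 - f)


def nonprompt_estimate(failing_candidates):
    """Sum of weights over (pT, eta) of failing candidates in the sideband."""
    return sum(sideband_weight(misid_rate(pt, eta)) for pt, eta in failing_candidates)
```

The factor 1/(1 − f) accounts for the fact that the sideband contains only the fraction (1 − f) of non-prompt candidates that fail the criteria, so each failing candidate stands in for f/(1 − f) passing ones.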
Background processes with more than one non-prompt lepton, such as multijet events, W → τν + 2 jet in the 2ℓτ_h channels, or Z + 2 jet in the 4L channels, are counted twice by this method since they are present in both sidebands. The double-counting is corrected using a high-purity control region with two non-prompt leptons selected by requiring two lepton candidates to fail the isolation requirement simultaneously. The observed events in the sideband are weighted by the corrected probability
$$\frac{f_1 f_2}{(1 - f_1)(1 - f_2)}$$
that both candidates will pass the final identification and isolation requirements, where f_1 and f_2 are the misidentification probabilities for the leading and subleading lepton candidates, respectively; the weighted events are an independent estimate of the quantity that was double-counted. The double-counted events are removed from the total background estimate by subtracting the independent estimate of the background with two misidentified leptons.
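The double-counting correction can be illustrated with a minimal sketch. It assumes that the double-fail weight is the product of the two single-candidate weights, f_1 f_2 / [(1 − f_1)(1 − f_2)], and that the correction is a plain subtraction of the double-fail estimate from the single-fail estimate, as the text describes. The function names and the input layout are hypothetical.

```python
def single_fail_weight(f):
    """Weight for an event with one candidate failing the final criteria."""
    return f / (1.0 - f)


def double_fail_weight(f1, f2):
    """Assumed weight for an event with two failing candidates."""
    return (f1 * f2) / ((1.0 - f1) * (1.0 - f2))


def corrected_nonprompt_yield(single_fail_rates, double_fail_rate_pairs):
    """Single-fail sideband estimate minus the double-fail estimate.

    single_fail_rates: misidentification probability of the failing
        candidate, one entry per single-fail sideband event.
    double_fail_rate_pairs: (f1, f2) per event in the control region where
        two candidates fail simultaneously.
    """
    single = sum(single_fail_weight(f) for f in single_fail_rates)
    double = sum(double_fail_weight(f1, f2) for f1, f2 in double_fail_rate_pairs)
    return single - double
```

For example, three single-fail events with f = 0.2, 0.2, and 0.5 give 0.25 + 0.25 + 1.0 = 1.5, one double-fail event with (0.2, 0.5) gives 0.25, and the corrected estimate is 1.25 events.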
In the 3L channels, the irreducible WZ background normalization is estimated in data using a control sample of observed events with three light leptons where one of the same-flavor opposite-charge lepton pairs is compatible with a Z boson using a ±15 GeV mass window. The control sample is completely dominated by WZ events. The same trigger and lepton identification requirements described in Section 3 are applied. The ZZ background is largely reduced by the veto of events containing an additional e, µ, or τ_h candidate. The theoretical NLO calculation [43] is used as the normalization of the ZZ background. The Zγ background, where the γ is misidentified as an electron through an asymmetric conversion, is estimated from simulation. In the 3ℓ channels the expected contribution from this background is negligible after the E_T^miss requirement, and it is highly suppressed due to the small branching fraction in the τ_h channels.
In the 4L channels, WZ events have at least one non-prompt lepton and are estimated using the misidentification-rate method described above. The dominant background comes from irreducible ZZ events. The number of ZZ background events $N^{\mathrm{est}}_{ZZ}$ is estimated by scaling the observed inclusive Z yield $N^{\mathrm{obs}}_{Z}$ by the expected ratio of ZZ and Z production:
$$N^{\mathrm{est}}_{ZZ} = N^{\mathrm{obs}}_{Z} \, \frac{\sigma^{\mathrm{SM}}_{ZZ} \, A_{ZZ}}{\sigma^{\mathrm{SM}}_{Z} \, A_{Z}},$$
where $\sigma^{\mathrm{SM}}_{ZZ}$ [43] and $\sigma^{\mathrm{SM}}_{Z}$ are the theoretical SM cross sections, and $A_{ZZ}$ and $A_{Z}$ are the acceptances to pass all event selections for the ZZ and Z processes, respectively. The acceptances A are estimated using MC simulation. The Zγ background is negligible in the 4L channels.
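The scaling of the observed Z yield can be written as a one-line function. The sketch assumes that the estimate is the observed Z yield multiplied by the ratio of cross section times acceptance for ZZ and Z production, which is how the text describes it; the function name and all numerical inputs in the usage note are invented placeholders, not values from the analysis.

```python
def estimate_zz_yield(n_z_obs, sigma_zz, sigma_z, acc_zz, acc_z):
    """Scale the observed inclusive Z yield by the expected ZZ/Z ratio.

    The cross sections only need a common unit, since they enter as a ratio;
    the acceptances are the simulated probabilities to pass all selections.
    """
    if sigma_z <= 0.0 or acc_z <= 0.0:
        raise ValueError("Z cross section and acceptance must be positive")
    return n_z_obs * (sigma_zz * acc_zz) / (sigma_z * acc_z)
```

With placeholder inputs of 10^6 observed Z events, a cross-section ratio of 6/30000, and acceptances of 0.1 (ZZ) and 0.4 (Z), the function returns 50 events. Normalizing to the observed Z yield has the practical advantage that uncertainties common to both processes, such as the integrated luminosity, largely cancel in the ratio.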

Efficiencies and systematic uncertainties
The trigger, identification, and isolation efficiencies for electrons and muons are measured with data using the "tag-and-probe" technique [38] in Z → ℓℓ events. The τ_h identification efficiency is measured with an uncertainty of 6% using the tag-and-probe technique in Z → ττ → µτ_h events [25]. Efficiencies for the Higgs boson signal and WZ, ZZ, and Zγ diboson samples are estimated using MC simulation, and residual differences between the lepton efficiencies in the simulation and data are corrected by scaling the simulation to match the efficiency measured in data. The uncertainty on the residual correction is taken as a systematic uncertainty in the final result. The uncertainty on the b-tagging efficiency is 6% [21]. Uncertainties on the jet energy scale and E_T^miss have been evaluated in Z + jet and γ + jet events [30], and are propagated to systematic uncertainties on the final yields. The uncertainty due to the pileup description is evaluated by varying the distribution of the estimated number of expected pileup interactions per event in data, and is 1% or less. There is a 2.2% uncertainty [17] on the total integrated luminosity of the collected data sample.

Two theoretical systematic uncertainties on the overall signal yield are considered. The uncertainty on the QCD factorization and renormalization scales affects the expected signal cross section and, in the 3ℓ channel, the efficiency of the jet veto. The effects of variations in the parton distribution functions, the value of α_s, and higher-order corrections are propagated to the efficiency of the signal selection using the PDF4LHC prescription [44][45][46][47][48].
The methods to estimate the different backgrounds are explained in Section 4. For the 3L channels, the associated uncertainty on the diboson backgrounds is 12% and 4% for the WZ and ZZ components, respectively. In the 4L channels, the theoretical uncertainty of 10% on the ZZ production cross section [10] dominates the uncertainty on the estimate of the ZZ background. The uncertainty on the estimate of the non-prompt lepton backgrounds is 30% and is dominated by uncertainties in the measurement of the misidentification rate. The final estimate of the non-prompt backgrounds has an additional systematic uncertainty due to the limited number of observed events with leptons failing the isolation requirements. In the eµτ_h and µµτ_h mass spectra, a shape uncertainty [49] is added for each bin in the spectra, corresponding to the statistical uncertainty of the control region bin used to compute the non-prompt background estimate.

Results
After all selections, a total of 29 events are observed, while 33.5 ± 4.3 are expected from the background. The numbers of observed and expected background events are enumerated for each channel in Table 2. The observed data are consistent with the expected yield from the backgrounds. The efficiencies for signal events to pass all selections are detailed for each channel and Higgs boson mass, production mechanism, and decay mode in Table 1. The efficiencies are defined with respect to events where all W and Z bosons decay to leptons (excluding Z → νν decays).
Table 2: Observed number of events and expected number of signal and background (bkg) events for the different channels. The uncertainties correspond to the combined statistical and systematic uncertainty. The second and third columns give the expected yield of a Higgs boson signal (m_H = 120 GeV) from the H → ττ and H → WW decays, respectively. The theoretical uncertainties on the signal yields are not included.
In the 2ℓτ_h channels, it is not possible to definitively assign the same-charge electrons or muons to either the W or the Higgs boson candidate. However, as the signal is dominated by H → ττ decays, the final-state light leptons produced in the decays of the τ leptons have a softer p_T spectrum than light leptons from W → ℓν decays, as they are associated with two neutrinos instead of one. Accordingly, we define the subleading light lepton and τ_h as the Higgs boson candidate. The invariant mass of the Higgs boson candidate is shown for the final selected events in the 2ℓτ_h and 4L channels in

Limits on SM Higgs boson production
In the searches presented in this paper, the observed events show no evidence for the presence of a Higgs boson signal, and we set 95% CL upper bounds on the Higgs boson associated production cross section. To obtain exclusion limits we use the CL_s method [50][51][52] based on a binned likelihood of the invariant mass spectrum in the eµτ_h and µµτ_h channels (Fig. 2), and the number of observed and expected events in the 3ℓ and 4L channels. The non-prompt background mass spectra for the 2ℓτ_h channels have a shape uncertainty for each bin in the spectra. Systematic uncertainties are represented in the limit computation by nuisance parameters using a log-normal constraint. Correlated uncertainties among channels are represented by common nuisance parameters. The nuisance parameters are varied from one pseudoexperiment to the next in the calculation of the CL_s test statistic. Figure 3 shows the observed and median expected 95% CL upper limits on SM Higgs boson production set by this analysis for each channel individually and for the combination of all three. The limit is expressed in terms of the ratio of the Higgs boson cross section times the relevant branching fractions to that predicted in the SM, σ/σ_SM. The two bands give the variation around the median expected limit by one and two standard deviations. We set a 95% CL upper limit on σ/σ_SM in the range 3.1-9.1.
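The idea behind the CL_s criterion can be illustrated with a deliberately simplified single-bin counting experiment. The sketch below is not the procedure used for the limits in this paper: it has no binned mass spectrum, no nuisance parameters, and no pseudoexperiments, and it uses the observed event count itself as the test statistic. The function names are hypothetical. It computes CL_s = CL_{s+b}/CL_b from Poisson probabilities and finds the signal yield at which CL_s falls to 0.05.

```python
import math


def poisson_cdf(n, mu):
    """P(N <= n) for a Poisson distribution with mean mu."""
    term = math.exp(-mu)
    total = term
    for k in range(1, n + 1):
        term *= mu / k
        total += term
    return total


def cls(n_obs, s, b):
    """CL_s = P(N <= n_obs | s + b) / P(N <= n_obs | b)."""
    return poisson_cdf(n_obs, s + b) / poisson_cdf(n_obs, b)


def upper_limit(n_obs, b, cl=0.95, tol=1e-6):
    """Signal yield at which CL_s drops to 1 - cl, found by bisection."""
    target = 1.0 - cl
    lo, hi = 0.0, 1.0
    while cls(n_obs, hi, b) > target:
        hi *= 2.0  # expand the bracket until the signal is excluded
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cls(n_obs, mid, b) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Dividing the resulting yield by the yield expected for the SM cross section gives a limit on σ/σ_SM. The division by CL_b is what protects against excluding signals to which the search has little sensitivity: with zero observed events the limit is −ln(0.05) ≈ 3.0 signal events regardless of the expected background.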
We additionally combine the searches presented here with the CMS AP Higgs boson searches, using the same dataset, in the H → γγ [20] and H → bb̄ [21] decay modes. The H → γγ and H → bb̄ searches are included in the limit combination for Higgs boson masses below 150 GeV and 135 GeV, respectively. The treatment of systematic uncertainties in these channels is similar to that described in Section 5. The potential contributions of the VBF and GF SM Higgs boson production mechanisms to these analyses are negligible. The associated tt̄H production mechanism contributes approximately 5% and 14% of the expected signal yield in the 4L and AP H → γγ channels, respectively. The contributions from tt̄H to the other channels are negligible. The limits for each sub-channel and for the combination of all CMS AP searches are shown in Fig. 4. The full combination excludes, at 95% CL, the associated production of SM Higgs bosons at 2.1-3.7 times the SM prediction for Higgs boson masses below 170 GeV. The observed and expected limits for a Higgs boson mass of 125 GeV are enumerated for the full combination and for each exclusive sub-channel in Table 3.

Summary
A search for the standard model Higgs boson, produced in association with a W or Z boson, has been described. The search is conducted using final states with three or four isolated leptons in the entire 2011 CMS dataset. The analysis is sensitive to associated production where the Higgs boson decays into either a τ pair or W-boson pair. A total of 29 events are observed, and are compatible with the background prediction. Upper limits of about 2.6-9 times greater than the predicted value are set at 95% CL for the product of the SM Higgs boson associated production cross section and decay branching fraction in the mass range 110 < m_H < 200 GeV. The searches presented in this paper are combined with two other CMS associated production
Figure 4: At left, the observed and expected limits, at 95% CL, on SM Higgs boson production combining the AP searches presented in this paper with the previously published AP H → γγ [20] and H → bb̄ [21] searches. At right, the exclusive observed and expected limits (indicated by the solid and dashed lines respectively) are shown for each sub-channel.