Search for a heavy Higgs boson decaying to a pair of W bosons in proton-proton collisions at $$ \sqrt{s} $$ = 13 TeV

Abstract
A search for a heavy Higgs boson in the mass range from 0.2 to 3.0 TeV, decaying to a pair of W bosons, is presented. The analysis is based on proton-proton collisions at $$ \sqrt{s} $$ = 13 TeV recorded by the CMS experiment at the LHC in 2016, corresponding to an integrated luminosity of 35.9 fb$^{-1}$. The W boson pair decays are reconstructed in the 2ℓ2ν and ℓν2q final states (with ℓ = e or μ). Both gluon fusion and vector boson fusion production of the signal are considered. Interference effects between the signal and background are also taken into account. The observed data are consistent with the standard model (SM) expectation. Combined upper limits at 95% confidence level on the product of the cross section and branching fraction exclude a heavy Higgs boson with SM-like couplings and decays up to 1870 GeV. Exclusion limits are also set in the context of a number of two-Higgs-doublet model formulations, further reducing the allowed parameter space for SM extensions.


Introduction
The discovery of the standard model (SM) Higgs boson, with a mass close to 125 GeV, by the CERN LHC experiments ATLAS and CMS in 2012 [1][2][3] represents a major advancement in particle physics. Studies of the new particle have so far shown consistency with the SM Higgs mechanism predictions [4][5][6][7][8][9][10][11][12][13][14][15]. Throughout this paper, the observed SM Higgs boson is denoted as h(125). In order to determine whether the SM gives a complete description of the Higgs sector, precise measurements of the h(125) coupling strengths, CP structure and kinematic distributions are required [16][17][18][19][20]. A complementary strategy involves the search for an additional Higgs boson, denoted X, whose existence would prove the presence of beyond the SM (BSM) physics in the form of a non minimal Higgs sector [21,22]. The search for an additional scalar resonance in the full mass range accessible at the LHC remains one of the main objectives of the experimental community.
The search for a high-mass Higgs boson has been performed at ATLAS [23][24][25][26] and CMS [27,28] in a number of final states, using proton-proton (pp) collisions at centre-of-mass energies ($\sqrt{s}$) of 7, 8 and 13 TeV, with no significant excess observed. For Higgs boson masses above 200 GeV one of the most sensitive channels is the decay to a pair of W bosons [22]. In this analysis, a search is performed in the fully leptonic, 2ℓ2ν, and semileptonic, ℓν2q, WW decay channels (with ℓ = e or µ) using pp collisions recorded at $\sqrt{s}$ = 13 TeV by the CMS experiment in 2016, corresponding to an integrated luminosity of 35.9 fb$^{-1}$.
The fully leptonic channel has a clear signature of two isolated leptons and missing transverse momentum ($p_\mathrm{T}^\mathrm{miss}$), due to the neutrinos escaping detection. For the semileptonic channel, the leptonically decaying boson is reconstructed as a single isolated lepton and $p_\mathrm{T}^\mathrm{miss}$. The hadronically decaying boson may be sufficiently boosted that its decay products are contained in a single merged jet. Jet substructure techniques are used to identify merged jets with two well-defined subjets and to determine the merged jet mass, helping to discriminate vector boson hadronic decays from other jets. When the W boson hadronic decay products are resolved, it may be reconstructed using two quark jets (a dijet). The search is performed in a wide mass range from 0.2 up to 3.0 TeV. Events are categorized to enhance the sensitivity to the gluon fusion (ggF) and vector boson fusion (VBF) Higgs boson production mechanisms.
The signal extraction procedure and the systematic uncertainties affecting the analysis are presented in Section 8; the results are presented in Section 9. Finally, results are summarized in Section 10.

The CMS detector
The CMS detector, described in detail in Ref. [33], is a multipurpose apparatus designed to study high transverse momentum ($p_\mathrm{T}$) physics processes in pp and heavy-ion collisions. A superconducting solenoid occupies its central region, providing a magnetic field of 3.8 T parallel to the beam direction. Charged-particle trajectories are measured by the silicon pixel and strip trackers, which cover a pseudorapidity region of |η| < 2.5. A crystal electromagnetic calorimeter (ECAL) and a brass and scintillator hadron calorimeter surround the tracking volume and cover |η| < 3. The steel and quartz-fiber Cherenkov hadron forward (HF) calorimeter extends the coverage to |η| < 5. The muon system consists of gas-ionization detectors embedded in the steel flux return yoke outside the solenoid, and covers |η| < 2.4. The first level of the CMS trigger system [34], composed of custom hardware processors, is designed to select the most interesting events in less than 4 µs, using information from the calorimeters and muon detectors. The high-level trigger processor farm further reduces the event rate to 1 kHz before data storage.

Data and simulated samples
The events used to study the ℓν2q final state are selected by high-level trigger algorithms that require the presence of one electron with $p_\mathrm{T}$ > 25 GeV and |η| < 2.1 passing tight identification and isolation requirements, or one muon with $p_\mathrm{T}$ > 24 GeV and |η| < 2.4 passing loose identification and isolation requirements. The trigger efficiency for ℓν2q signal events passing the offline event selection is about 93%. Both single-lepton and dilepton triggers are used to select events to study the 2ℓ2ν final state. In addition to the single-lepton triggers described, the 2ℓ2ν final state events are also selected by a trigger which requires one electron outside the central region (2.1 < |η| < 2.5) with $p_\mathrm{T}$ > 27 GeV. The dilepton triggers require the presence of two leptons passing relatively loose identification and isolation requirements. For the dielectron (dimuon) trigger, the $p_\mathrm{T}$ thresholds are 23 (17) GeV for the leading and 12 (8) GeV for the subleading electrons (muons). For the different-flavour dilepton trigger, the $p_\mathrm{T}$ thresholds are either 8 GeV for the muon and 23 GeV for the electron, or 23 GeV for the muon and 12 GeV for the electron. The overall trigger efficiency for the combination of the single-lepton and dilepton triggers for 2ℓ2ν signal events passing the offline event selection is larger than 99%.
Several event generators are used to optimize the analysis and estimate the yields of signal and background events, as well as the associated systematic uncertainties. The heavy Higgs boson signal samples are generated in the ggF and VBF production modes at next-to-leading order (NLO) in quantum chromodynamics (QCD) using POWHEG v2 [35][36][37][38][39], for a number of masses ranging from 0.2 to 3.0 TeV. The resonance width is set according to the SM Higgs boson expectation for signal masses up to 1 TeV. For signal masses higher than 1 TeV the width is set to half the resonance mass, which approximately corresponds to the SM Higgs boson prediction at 1 TeV. The decay of the signal to a pair of W bosons is simulated with JHUGEN v6.2.8 [40,41]. The simulated signal samples are normalized using cross sections and decay rates computed by the LHC Higgs Cross Section Working Group [42].
The W+jets process is produced at NLO with the MADGRAPH5_aMC@NLO v2.2.2 event generator [43], using the FxFx merging scheme [44] between the jets from matrix element calculations
If more than one vertex is reconstructed, the vertex with the largest value of summed physics-object $p_\mathrm{T}^2$ is taken to be the primary pp interaction vertex. The physics objects are those returned by a jet finding algorithm [64,65] applied to all charged tracks assigned to the vertex, and the associated missing transverse momentum, computed as the negative vectorial sum of the $p_\mathrm{T}$ of those jets.
Electrons are reconstructed from a combination of the deposited energy of the ECAL clusters associated with the track reconstructed from the measurements determined by the inner tracker, and the energy sum of all photons spatially compatible with being bremsstrahlung from the electron track [66]. The electron candidates are required to have |η| < 2.5. Additional requirements are applied to reject electrons originating from photon conversions in the tracker material or jets mis-reconstructed as electrons. Electron identification criteria rely on observables sensitive to the bremsstrahlung along the electron trajectory, the geometrical and momentum-energy matching between the electron trajectory and the associated supercluster, as well as ECAL shower shape observables and compatibility with the primary vertex.
Muon candidates are reconstructed by combining charged tracks in the muon detector with tracks reconstructed in the central tracking system [67]. They are required to have |η| < 2.4. Identification criteria based on the number of hits in the tracker and muon systems, the fit quality of the muon track, and the consistency of the trajectory with the primary vertex, are imposed on the muon candidates to reduce the misidentification rate.
Prompt leptons from electroweak interactions are usually isolated, whereas misidentified leptons and leptons from jets are often accompanied by charged or neutral particles, and can arise from a secondary vertex. Therefore leptons are required to be isolated from hadronic activity by requiring that the sum of the $p_\mathrm{T}$ of charged hadrons associated with the primary vertex, and the $p_\mathrm{T}$ of neutral hadrons and photons, in a cone around the lepton of radius $\Delta R = \sqrt{(\Delta\phi)^2 + (\Delta\eta)^2} = 0.4$ (where φ is the azimuthal angle in radians), is below a certain fraction of the lepton $p_\mathrm{T}$. To mitigate the effect of pileup on the isolation variable, a correction based on the mean event energy density [68] is applied.
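As an illustration of the cone-based isolation described above, the following Python sketch sums the transverse momenta of particles within ΔR < 0.4 of a lepton and divides by the lepton transverse momentum. The dictionary layout, the function names, and the `pileup_pt` argument (a stand-in for the energy-density-based pileup correction, subtracted here from the neutral component) are assumptions made for this example and are not taken from the analysis software.

```python
import math

def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into [-pi, pi)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance sqrt(deta^2 + dphi^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))

def relative_isolation(lepton, charged, neutral, cone=0.4, pileup_pt=0.0):
    """Relative isolation of a lepton.

    lepton, and each entry of charged/neutral, is a dict with keys
    'pt', 'eta', 'phi' (hypothetical layout). pileup_pt is an assumed
    estimate of the pileup contribution to the neutral sum.
    """
    def cone_sum(particles):
        return sum(
            p["pt"] for p in particles
            if delta_r(lepton["eta"], lepton["phi"], p["eta"], p["phi"]) < cone
        )

    # The neutral component is corrected for pileup and clipped at zero.
    neutral_sum = max(cone_sum(neutral) - pileup_pt, 0.0)
    return (cone_sum(charged) + neutral_sum) / lepton["pt"]
```

A lepton would then be kept if the returned value is below the chosen working-point threshold, which the text does not specify.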
The jet reconstruction uses all PF candidates, except those charged candidates that are not associated with the primary vertex. This requirement mitigates the effect of pileup for |η| < 2.5. Particle candidates are clustered using the anti-$k_\mathrm{T}$ algorithm [64,65] with a distance parameter of 0.4 (AK4) or 0.8 (AK8). To reduce the residual pileup contamination from neutral PF candidates, a correction based on jet median area subtraction [68] is applied. The jet energy is calibrated using both simulation and data following the technique described in Ref. [69]. Only AK4 jets with $p_\mathrm{T}$ > 30 GeV (20 GeV for b quark jets) and |η| < 4.7 (2.4 for b quark jets) are considered. The AK8 jets are required to have $p_\mathrm{T}$ > 200 GeV and |η| < 2.4. Those AK4 (AK8) jets which overlap with a well-identified and isolated lepton within a distance of ΔR = 0.4 (0.8) are ignored.
The vector $\vec{p}_\mathrm{T}^{\,\mathrm{miss}}$, whose magnitude is the $p_\mathrm{T}^\mathrm{miss}$ in the event, is computed as the negative vectorial sum in the transverse plane of the momenta of all PF candidates. The $\vec{p}_\mathrm{T}^{\,\mathrm{miss}}$ is modified to account for the corrections to the energy scale of the jets described above.
A jet grooming procedure, which removes contributions from soft radiation and additional interactions, is used on the AK8 jets to help identify and discriminate between jets from Lorentz-boosted hadronic W boson decays and jets from quarks and gluons. First, the pileup mitigation corrections provided by the pileup per particle identification (PUPPI) algorithm [70] are applied. The jets are then groomed by means of a modified mass drop algorithm [71,72], known as the soft-drop algorithm [73], with parameters β = 0, $z_\mathrm{cut}$ = 0.1 and $R_0$ = 0.8. The soft-drop mass ($m_\mathrm{J}$) used in the ℓν2q analysis is computed from the sum of the four-momenta of the jet constituents passing the grooming algorithm.
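The core of the soft-drop grooming can be sketched with its acceptance condition alone. At each declustering step the softer of the two subjets is kept only if its momentum fraction z exceeds $z_\mathrm{cut} (\Delta R_{12}/R_0)^\beta$; with β = 0 this reduces to z > 0.1. The function name and inputs below are hypothetical, and the reclustering and iterative declustering of the jet are not shown.

```python
def passes_soft_drop(pt1, pt2, delta_r12, z_cut=0.1, beta=0.0, r0=0.8):
    """Soft-drop condition for one declustering step.

    pt1, pt2: transverse momenta of the two subjets.
    delta_r12: angular distance between the two subjets.
    Defaults follow the parameters quoted in the text.
    """
    z = min(pt1, pt2) / (pt1 + pt2)
    return z > z_cut * (delta_r12 / r0) ** beta
```

If the condition fails, the softer subjet is discarded and the procedure continues on the harder one; the groomed mass is computed from the constituents that survive.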
Discrimination between AK8 jets originating from W boson decays and those originating from gluons and quarks is also achieved by using the N-subjettiness jet substructure variable [74]. This observable exploits the distribution of the jet constituents found in the proximity of the subjet axes to determine if the jet can be effectively subdivided into a number N of subjets. The generic N-subjettiness variable $\tau_N$ is defined using the $p_\mathrm{T}$-weighted sum of the angular distance $\Delta R_{N,k}$ of the jet constituents k with respect to the axis of the Nth subjet:
$$ \tau_N = \frac{1}{d_0} \sum_k p_{\mathrm{T},k} \, \min\left( \Delta R_{1,k}, \Delta R_{2,k}, \ldots, \Delta R_{N,k} \right). $$
The normalization factor $d_0$ is defined as $d_0 = \sum_k p_{\mathrm{T},k} R_0$, with $R_0$ being the clustering parameter of the original jet. The variable which best discriminates W boson jets from those coming from quarks and gluons is the ratio of the 2- to 1-subjettiness: $\tau_{21} = \tau_2/\tau_1$. The $\tau_{21}$ observable is calculated for the jet after applying the PUPPI algorithm corrections for pileup mitigation.
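A minimal Python sketch of the N-subjettiness calculation follows, assuming the standard form in which each constituent contributes its transverse momentum times the distance to the nearest subjet axis, normalized by $d_0$. The subjet axes are taken as given inputs (finding them is a separate step not shown), and the tuple layout and function names are assumptions for this example.

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance with the azimuthal difference wrapped into [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)

def n_subjettiness(constituents, axes, r0=0.8):
    """tau_N for a jet.

    constituents: list of (pt, eta, phi) tuples.
    axes: list of N (eta, phi) subjet axes (assumed to be provided).
    r0: clustering distance parameter of the original jet.
    """
    d0 = sum(pt for pt, _, _ in constituents) * r0
    weighted = sum(
        pt * min(delta_r(eta, phi, a_eta, a_phi) for a_eta, a_phi in axes)
        for pt, eta, phi in constituents
    )
    return weighted / d0

def tau21(constituents, one_axis, two_axes, r0=0.8):
    """Ratio tau_2 / tau_1 used to tag two-prong jets."""
    return n_subjettiness(constituents, two_axes, r0) / n_subjettiness(
        constituents, one_axis, r0
    )
```

A jet whose constituents cluster tightly around two axes gives a small $\tau_2$ relative to $\tau_1$, and hence a low $\tau_{21}$, which is the behaviour expected for a boosted hadronic W decay.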
To identify jets coming from b quarks, a multivariate b tagging algorithm [75] and the combined secondary vertex algorithm [75] are used in the 2ℓ2ν and ℓν2q analyses, respectively. In both cases, the chosen working point corresponds to about 80% efficiency for genuine b quark jets and to a mistagging rate of about 10% for light-flavour or gluon jets, and of about 40% for c quark jets.
For each event in the fully leptonic channel, at least two high-$p_\mathrm{T}$ lepton candidates originating from the primary vertex are required. Opposite-charge dielectron pairs, dimuon pairs and electron-muon (eµ) pairs are accepted. In the semileptonic channel, at least one high-$p_\mathrm{T}$ lepton candidate, and two AK4 jets or one AK8 jet, originating from the primary vertex are required.

Signal models
A signal interpretation in terms of a heavy Higgs boson with SM-like couplings and decays is implemented in this analysis. Both the ggF and VBF production mechanisms are considered. Due to the large expected width of the X resonance at high mass, its interference with the WW continuum and the h(125) off-shell tail becomes significant [29]. The MELA matrix-element package [18,40,41], based on JHUGEN for Higgs bosons, and on MCFM for the continuum WW background, has been used to estimate the interference of high-mass X resonances with the WW continuum and the h(125). The two sources of interference have opposite signs and partially cancel, with the size of the cancellation depending on the signal mass. Figure 1 displays the generator-level mass distribution of a ggF-produced 700 GeV signal and the effects of interference with the gg → WW continuum and gg → h(125) off-shell tail. The interference effect is taken into account for both the ggF and VBF production mechanisms. A parameter $f_\mathrm{VBF}$, which is the fraction of the VBF production cross section with respect to the total cross section, is included in the model and a number of hypotheses are investigated.
The two-Higgs-doublet model (2HDM) has two important free parameters, α and tan β, which are the mixing angle and the ratio of the vacuum expectation values of the two Higgs doublets, respectively. The quantity cos(β − α) is also of interest, as the coupling of the heavy Higgs boson H to two vector bosons is proportional to this factor. In the alignment limit, which occurs at cos(β − α) = 0, the properties of h approach those of the SM Higgs boson, while the decay of H to vector bosons becomes heavily suppressed. Based on the constraints given by the measurements of the h(125) couplings, the largest possible deviations of cos(β − α) from 0 allowed are approximately 0.3 and 0.1 for the Type-I and -II scenarios, respectively [76,77]. Therefore the value of cos(β − α) has been fixed to 0.1 for the 2HDM scenarios considered here.
In this way the measured properties of the h(125) are incorporated into the definition of the scenarios while still allowing for a non-negligible branching fraction for H to vector bosons. In the limit that $m_\mathrm{A} \gg m_\mathrm{Z}$, the masses of the H, A, and H± bosons become approximately degenerate. For simplicity it is assumed that $m_\mathrm{H} = m_\mathrm{A} = m_{\mathrm{H}^\pm}$ for the 2HDM scenarios considered. The width of H has a dependence on tan β, with relatively large widths predicted in comparison to both the SM widths and the experimental resolution for tan β below ≈0.2 and $m_\mathrm{H}$ above ≈400 GeV. However, for the majority of the phase space explored the SM width assumption gives a reasonable approximation of the 2HDM predictions.
The minimal supersymmetric standard model (MSSM) [78,79], which incorporates a Type-II 2HDM, is also considered. At tree level, the whole phenomenology can be described using just two parameters. By convention, these parameters are chosen to be tan β and $m_\mathrm{A}$, the mass of the pseudoscalar Higgs boson. Beyond the tree level, the MSSM Higgs sector depends on additional parameters which enter via higher-order corrections in perturbation theory, and which are usually fixed to values motivated by experimental constraints and theoretical assumptions. The $m_\mathrm{h}^{\mathrm{mod}+}$ [80] and hMSSM [81][82][83][84] benchmark scenarios are defined by setting these parameters such that a wide range of the $m_\mathrm{A}$-tan β parameter space is compatible with the h(125) mass and production rate measurements at ATLAS and CMS. For the $M_\mathrm{h}^{125}$, $M_\mathrm{h}^{125}$(alignment), $M_\mathrm{h}^{125}(\tilde{\chi})$, and $M_\mathrm{h}^{125}(\tilde{\tau})$ benchmark scenarios, a significant portion of the parameter space is consistent with the h(125) measurements and with limits from searches for supersymmetry particles and additional Higgs bosons at ATLAS and CMS using pp collisions at $\sqrt{s}$ = 7, 8, and 13 TeV [85]. The assumption of a SM width is a reasonable approximation for the MSSM scenarios considered, with relatively small widths predicted with respect to the experimental resolution for the majority of the phase space explored.
Model predictions for the MSSM scenarios are provided by the LHC Higgs Cross Section Working Group [42]. The ggF cross sections have been computed with SUSHI [86,87], which includes NLO QCD corrections [88], NNLO QCD corrections for the top quark contribution in the effective theory of a heavy top quark [89][90][91] and electroweak effects by light quarks [92,93]. For most of the scenarios considered, NLO supersymmetric-QCD corrections [94][95][96][97] in expansions of heavy SUSY masses are also included in SUSHI. The masses, mixing angles, and the effective Yukawa couplings of the Higgs bosons for all scenarios except the hMSSM are calculated with FEYNHIGGS [98][99][100][101][102][103][104]. The branching fractions for the hMSSM scenario are obtained with HDECAY [105][106][107], while for all other scenarios the branching fractions are obtained from a combination of FEYNHIGGS, HDECAY and PROPHECY4F [108,109]. The results for the general 2HDM interpretation are obtained using the ggF cross sections computed with SUSHI and the branching fractions with 2HDMC [110]. These calculations are compatible with the results from HIGLU [111] and HDECAY within the uncertainties [112]. The VBF cross sections are approximated using the SM Higgs boson production cross sections for VBF, which are provided for different masses by the LHC Higgs Cross Section Working Group [42], multiplied by $\cos^2(\beta - \alpha)$.

Selection and categorization
At $\sqrt{s}$ = 13 TeV, the ggF cross section for the h(125) is almost one order of magnitude larger than that for VBF production [42]. However, the ggF cross section decreases with $m_\mathrm{X}$ while the VBF/ggF cross section ratio increases, meaning that the VBF production mechanism becomes more important at higher masses. The main feature distinguishing the two production mechanisms is the presence of associated forward jets for VBF production. A categorization of events based on both the kinematic properties of associated jets and matrix element techniques is employed to optimize the signal sensitivity. Events with a VBF topology are selected by requiring the presence of two associated jets with an invariant mass of at least 500 GeV and a pseudorapidity separation Δη greater than 3.5.
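The VBF topology requirement can be sketched in Python as below. Jets are represented as hypothetical `(pt, eta, phi[, mass])` tuples, and the function names are illustrative only; the two thresholds are the ones quoted in the text.

```python
import math

def four_vector(pt, eta, phi, mass=0.0):
    """Return (E, px, py, pz) from collider coordinates."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return energy, px, py, pz

def invariant_mass(jet1, jet2):
    """Invariant mass of the two-jet system."""
    e, px, py, pz = (a + b for a, b in zip(four_vector(*jet1), four_vector(*jet2)))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

def passes_vbf(jet1, jet2, min_mjj=500.0, min_deta=3.5):
    """Dijet mass of at least 500 GeV and |delta eta| greater than 3.5."""
    deta = abs(jet1[1] - jet2[1])
    return invariant_mass(jet1, jet2) >= min_mjj and deta > min_deta
```

For example, two 100 GeV jets at η = ±2.5 in opposite hemispheres satisfy both requirements, while a pair of central jets with the same dijet mass would fail the Δη condition.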

X → 2ℓ2ν
The 2ℓ2ν analysis selects two oppositely charged leptons in the same- and different-flavour final states. To suppress the background from nonprompt leptons arising from W+jets production, both leptons must be well identified and isolated. Events are categorized according to the lepton flavour composition and the number of AK4 jets with $p_\mathrm{T}$ > 30 GeV. To suppress the top quark background, events are required to have no b-tagged AK4 jets with $p_\mathrm{T}$ > 20 GeV. The final discriminating variable is the reconstructable mass $m_\mathrm{reco}$, which is built from quantities including the dilepton four-momentum. This variable is chosen for its effectiveness in discriminating between signal and background, and between different signal mass hypotheses.

Different-flavour final state
For the different-flavour eµ channel, one of the two leptons is required to have $p_\mathrm{T}$ > 25 GeV and the other is required to have $p_\mathrm{T}$ > 20 GeV. To suppress background processes with three or more leptons in the final state, such as ZZ, WZ, or triboson production, events with an additional identified and isolated lepton with $p_\mathrm{T}$ > 10 GeV are rejected. The dilepton invariant mass $m_{\ell\ell}$ is required to be higher than 50 GeV to reduce the h(125) contamination. Due to the presence of neutrinos in the final state of interest, only events with $p_\mathrm{T}^\mathrm{miss}$ > 20 GeV are considered. The DY → τ+τ− background is suppressed by requiring that the dilepton transverse momentum $p_\mathrm{T}^{\ell\ell}$ is above 30 GeV and the X transverse mass $m_\mathrm{T}$ is above 60 GeV, where $m_\mathrm{T} = \sqrt{2 p_\mathrm{T}^{\ell\ell} p_\mathrm{T}^\mathrm{miss} (1 - \cos\Delta\phi)}$ and Δφ is the azimuthal angle between $\vec{p}_\mathrm{T}^{\,\mathrm{miss}}$ and $\vec{p}_\mathrm{T}^{\,\ell\ell}$. Finally, motivated by the high mass of the signals under investigation, the condition $m_\mathrm{reco}$ > 100 GeV must be satisfied.
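The transverse mass used in this selection can be computed as in the Python sketch below, which also applies the two requirements quoted for suppressing DY → τ+τ−. The function and argument names are assumptions made for illustration.

```python
import math

def transverse_mass(pt_vis, phi_vis, pt_miss, phi_miss):
    """m_T = sqrt(2 * pT_vis * pT_miss * (1 - cos(dphi))).

    pt_vis/phi_vis describe the visible system (here the dilepton pair),
    pt_miss/phi_miss the missing transverse momentum.
    """
    arg = 2.0 * pt_vis * pt_miss * (1.0 - math.cos(phi_vis - phi_miss))
    return math.sqrt(max(arg, 0.0))

def passes_dy_tautau_suppression(pt_ll, phi_ll, pt_miss, phi_miss,
                                 min_pt_ll=30.0, min_mt=60.0):
    """Dilepton pT above 30 GeV and transverse mass above 60 GeV."""
    mt = transverse_mass(pt_ll, phi_ll, pt_miss, phi_miss)
    return pt_ll > min_pt_ll and mt > min_mt
```

The transverse mass is largest when the visible system and the missing transverse momentum are back to back, and vanishes when they are collinear, which is why low values are characteristic of DY → τ+τ− events with soft, collinear neutrinos.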
In this channel four exclusive jet categories are defined: a zero-jet, one-jet, two-jet and VBF category. The last category requires the presence of exactly two jets which satisfy the VBF selection criteria. Dijet events failing these criteria enter the two-jet category. Figure 2 displays the $m_\mathrm{reco}$ distributions for events passing the 2ℓ2ν different-flavour selection in the four exclusive jet categories.

Same-flavour final state
For the same-flavour e+e− and µ+µ− channels, both leptons are required to have $p_\mathrm{T}$ > 20 GeV. Events with an additional identified and isolated lepton with $p_\mathrm{T}$ > 10 GeV are rejected. The background rejection requirements described for the eµ channel are also applied in these channels. To suppress the large DY → e+e− and DY → µ+µ− backgrounds, only those events with two jets satisfying the VBF selection criteria are considered. To further reduce this background, the $m_{\ell\ell}$ and $p_\mathrm{T}^\mathrm{miss}$ requirements are raised to 120 and 50 GeV, respectively. Figure 2 displays the $m_\mathrm{reco}$ distributions for events passing the 2ℓ2ν same-flavour selection.

X → ℓν2q
In the ℓν2q analysis, the W → ℓν candidates are reconstructed by combining the $p_\mathrm{T}^\mathrm{miss}$ and a lepton which has $p_\mathrm{T}$ > 30 GeV and |η| < 2.1 (2.4) for electrons (muons). Those events containing additional electrons (muons) with $p_\mathrm{T}$ > 15 (10) GeV passing loose identification requirements are rejected. The $p_\mathrm{T}^\mathrm{miss}$ is considered as an estimate of the neutrino $p_\mathrm{T}$, with the longitudinal component $p_z$ of the neutrino momentum estimated by imposing a W boson mass constraint on the ℓν system and solving the corresponding quadratic equation. The solution with the smallest magnitude of neutrino $p_z$ is chosen. When a real solution is not found, only the real part is considered. The W → qq candidates are reconstructed as either high-$p_\mathrm{T}$ merged jets or as resolved low-$p_\mathrm{T}$ jet pairs. A W boson mass window selection is applied to suppress the W+jets background. If an additional b-tagged AK4 jet with $p_\mathrm{T}$ > 20 GeV is present, the event is rejected to suppress the top quark background. The W → ℓν and W → qq decay candidates are combined into WW resonance candidates. The final discriminating variable is the invariant mass of the WW system, $m_\mathrm{WW}$.
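The neutrino $p_z$ reconstruction described above can be sketched in Python as follows. The sketch assumes massless lepton and neutrino and a nominal W boson mass of 80.4 GeV (a value chosen for illustration, not quoted in the text); the function name and argument layout are hypothetical. Writing μ = m_W²/2 + p_T(lepton)·p_T(miss), the mass constraint gives p_z² − 2a p_z + b = 0 with a = μ p_z(lepton)/p_T(lepton)² and b = (E(lepton)² p_T(miss)² − μ²)/p_T(lepton)².

```python
import math

def neutrino_pz(lep_px, lep_py, lep_pz, met_px, met_py, m_w=80.4):
    """Longitudinal neutrino momentum from a W mass constraint.

    Returns the real solution with the smallest |pz|; if the
    discriminant is negative, returns the real part of the complex
    solutions. m_w is an assumed nominal W mass in GeV.
    """
    lep_pt2 = lep_px ** 2 + lep_py ** 2
    lep_e2 = lep_pt2 + lep_pz ** 2          # massless lepton
    met_pt2 = met_px ** 2 + met_py ** 2

    mu = 0.5 * m_w ** 2 + lep_px * met_px + lep_py * met_py
    a = mu * lep_pz / lep_pt2
    b = (lep_e2 * met_pt2 - mu ** 2) / lep_pt2

    disc = a * a - b
    if disc < 0.0:
        return a                             # real part only
    root = math.sqrt(disc)
    candidates = (a - root, a + root)
    return min(candidates, key=abs)
```

Negative discriminants arise when the measured transverse mass of the lepton and $p_\mathrm{T}^\mathrm{miss}$ exceeds the assumed W mass, typically because of resolution effects.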
Events are categorized based on the tagging of VBF and ggF production mechanisms. A VBF category is defined by requiring two additional AK4 jets satisfying the VBF selection criteria. Those events failing the VBF selection are considered for the ggF category. The tagging of ggF candidates is achieved using a kinematic discriminant based on the angular distributions of the X candidate decay products. This is implemented with MELA which uses JHUGEN and MCFM matrix elements to calculate probabilities for an event to come from either signal or background, respectively. A WW resonance candidate is considered ggF-tagged if the kinematic discriminant is greater than 0.5. Those events with WW resonance candidates failing this requirement enter the untagged category, resulting in three production mechanism categories.

Boosted final state
For the boosted final state, an AK8 jet with $m_\mathrm{J}$ in the mass window 65 < $m_\mathrm{J}$ < 105 GeV is required. To suppress the background from nonprompt leptons in QCD multijet events, only events with $p_\mathrm{T}^\mathrm{miss}$ > 40 GeV are considered. For heavy-resonance decays the $p_\mathrm{T}$ of each W candidate is expected to be roughly half of the resonance mass. Therefore both the leptonic and hadronic W candidates must satisfy the condition $p_\mathrm{T}^\mathrm{W}/m_\mathrm{WW}$ > 0.4. Finally, to identify boosted W candidates (boosted W tagging) the N-subjettiness ratio $\tau_{21}$ is required to be less than 0.4. The $m_\mathrm{WW}$ distributions for events passing the ℓν2q boosted selection in the three production categories are shown in Fig. 3.

Resolved final state
For events that do not contain a boosted W-tagged jet with $m_\mathrm{J}$ > 40 GeV, a resolved hadronic W boson decay reconstruction is attempted using two AK4 jets with $p_\mathrm{T}$ > 30 GeV and |η| < 2.4. In events with more than two jets the selection of the dijet pair is performed by means of a kinematic fit [113]. For each dijet pair the kinematic fit algorithm constrains the jet four-momenta, assuming the dijet invariant mass is that of the W boson, and assigns a $\chi^2$ according to the goodness of the fit. The dijet pair with the smallest $\chi^2$ is chosen as the hadronic W candidate. The invariant mass of the dijet system must be in the mass window 65 < $m_{jj}$ < 105 GeV. To suppress the background from nonprompt leptons in QCD multijet events, it is required that $p_\mathrm{T}^\mathrm{miss}$ > 30 GeV and that the leptonic W candidate transverse mass $m_\mathrm{T}$ is above 50 GeV, where $m_\mathrm{T} = \sqrt{2 p_\mathrm{T}^{\ell} p_\mathrm{T}^\mathrm{miss} (1 - \cos\Delta\phi)}$ and Δφ is the azimuthal angle between $\vec{p}_\mathrm{T}^{\,\mathrm{miss}}$ and the lepton transverse momentum $\vec{p}_\mathrm{T}^{\,\ell}$. The leptonic and hadronic W candidates must also satisfy the condition $p_\mathrm{T}^\mathrm{W}/m_\mathrm{WW}$ > 0.35. Further reduction in the QCD multijet background is achieved by requiring that the X transverse mass m

Background estimation
The dominant backgrounds are modeled via simulation that has been reweighted to account for known discrepancies between data and simulated events. Corrections associated with the description in simulation of the trigger efficiencies, as well as the efficiency for electron and muon reconstruction, identification, and isolation, are extracted from events with leptonic Z boson decays using a "tag-and-probe" technique [114]. The b tagging efficiency is measured using data samples enriched in b quark jets, and corresponding corrections for the simulation are derived [115]. For the ℓν2q boosted category, corrections are applied to the W tagging efficiency and the $m_\mathrm{J}$ scale and resolution of W-tagged jets. These corrections have been measured in an almost pure sample of semileptonic $t\bar{t}$ events, where boosted W bosons produced in the top quark decays are separated from the combinatorial $t\bar{t}$ background by means of a simultaneous fit to $m_\mathrm{J}$ [116]. For the normalization of the major backgrounds, data-driven estimates using control regions are employed.

X → 2ℓ2ν
The main background processes contributing to the 2ℓ2ν final state are from nonresonant WW and top quark production. The nonresonant WW background populates the entire phase space in $m_\mathrm{reco}$ while the high-mass signal contribution is concentrated at high values of this variable. Therefore, this background is estimated directly in the final fit to the data by allowing the WW normalization to float freely and independently in each category.
The estimation of the top quark background is performed using a top quark enriched data control region, defined by inverting the b jet veto requirement. It is used to constrain the top quark background normalization, which is allowed to float freely in the final fit to the data. The estimation is performed separately for each of the different- and same-flavour categories. The $m_\mathrm{reco}$ distributions in the top quark control regions of each of the different-flavour categories are shown in Fig. 4. The expected backgrounds before fitting the data are shown; good agreement between the top quark background predictions and the data is observed.
The DY process is a significant source of background in the same-flavour categories. A subleading source of background in the different-flavour categories comes from DY → τ+τ−, where each τ decays leptonically. In the final fit to the data, the DY normalization is also allowed to float freely and independently in each category, and is constrained using control regions which are defined using modified signal region selections. For the different-flavour channel, a DY control region is defined for each jet category by inverting the signal region $m_\mathrm{T}$ selection, requiring $m_\mathrm{T}$ < 60 GeV. The invariant mass of the two leptons is restricted to the interval between 50 and 80 GeV to reduce contributions from nonprompt leptons and from top quark processes. For the same-flavour channels, the control regions are defined by changing the signal region $m_{\ell\ell}$ selection to require 70 < $m_{\ell\ell}$ < 120 GeV. Discrepancies are observed between the $p_\mathrm{T}^\mathrm{miss}$ distributions in data and simulation for the same-flavour control regions. A linear $p_\mathrm{T}^\mathrm{miss}$ correction is derived for the simulation by fitting the ratio between the data, with minor backgrounds subtracted, and the DY prediction. The $m_\mathrm{reco}$ distributions in the DY control regions of each of the same-flavour categories are shown in Fig. 4. The expected backgrounds before fitting the data are shown; good agreement between the DY background predictions and the data is observed.
The instrumental background arising from nonprompt leptons in W+jets production is estimated to be between 2 and 8% of the total background. An estimate is done in a control region that uses looser lepton identification criteria with relaxed isolation requirements. The probability for a jet that satisfies the loose lepton requirements to also satisfy the standard selection is determined using dijet events. Similarly, the efficiency for a prompt lepton that satisfies the loose lepton identification requirements to also satisfy the standard selection is determined using DY events. These efficiencies are then used to weight the data events with the probability for the event to contain a nonprompt lepton and the relative probability for the candidates in this event to also satisfy the standard selection. Other subleading backgrounds, such as WZ, ZZ, and triboson production, are estimated from simulation.

X → ℓν2q
The main backgrounds for the ℓν2q analysis are from W+jets and top quark production, with subdominant contributions from diboson, DY, and QCD multijet production.
The majority of the events passing the ℓν2q selection come from W+jets and top quark production. The W+jets and top quark background normalizations are estimated using two control regions in data. A top quark enriched data control region is defined by reversing the b jet veto, that is, by requiring events to contain an additional b-tagged jet. Additionally, a sideband control region, with a similar background composition to that of the signal region, is defined by adapting the hadronic W candidate mass requirements of the signal region selection. In the boosted (resolved) category $m_\mathrm{J}$ ($m_{jj}$) is required to be outside the W boson mass window (65-105 GeV) and within the range 40 < $m_\mathrm{J}$ ($m_{jj}$) < 250 GeV. In the final fit to the data, the normalizations of both the W+jets and top quark backgrounds are allowed to float freely, with the observed yields in the control regions used to constrain the normalizations. This background estimation procedure is applied independently in each category.
The contamination from diboson events represents 6 and 3% of the total background in the boosted and resolved categories, respectively. Production of WW, WZ, and ZZ through $q\bar{q}$ annihilation is estimated directly from simulation, while the gg → WW and qq → qqWW backgrounds are estimated through the reweighting of signal samples using MELA.
The DY contamination is suppressed due to the second-lepton veto. It is estimated directly from simulation and represents between 1 and 2% of the total background.
Contamination from nonprompt leptons in QCD multijet production is estimated from simulation to be between 1 and 2% of the total background. The contribution from this source is largely suppressed due to the W candidate $p_T$, transverse mass, and substructure requirements. The QCD multijet enriched samples are defined through a reversal of these requirements, allowing a test of the multijet simulation. The resolved selection is altered by requiring $m_T < 50$ GeV, $m_T^{jj} < 60$ GeV, and $p_T^W/m_{WW} < 0.35$, while for the boosted selection it is required that $m_T < 50$ GeV, $\tau_{21} > 0.4$, and $p_T^W/m_{WW} < 0.4$. The QCD multijet contamination levels attained are 35 and 14% in the boosted and resolved categories, respectively. After subtracting the estimated prompt-lepton backgrounds, the predicted number of QCD multijet events in each category is found to agree with the data within 3%, with the statistical uncertainties of the order of 10%.
To help verify the background estimation procedure, a fit is performed to the $m_{WW}$ distributions in the sideband allowing the W+jets and top quark background normalizations to float freely. The observed yield in the top quark control region is included in the fit to help constrain the top quark background normalization. Figure 5 shows the result of the fit to the sideband $m_{WW}$ distributions for the boosted and resolved categories. A good level of agreement between data and the background predictions is observed.

Signal extraction and systematic uncertainties
The methodology used to interpret the data and to combine the results from independent categories has been developed by the ATLAS and CMS Collaborations in the context of the LHC Higgs Combination Group. A general description of the method can be found in Refs. [117][118][119].
The signal extraction procedure is based on a combined binned maximum likelihood fit of the discriminant distributions with signal and background templates, performed simultaneously in all the ℓν2q and 2ℓ2ν signal region categories. Signal templates for both the ggF and VBF production modes are included in the fit, with a number of hypotheses for $f_{VBF}$ considered. The various control regions used to constrain the dominant backgrounds are included in the form of single bins, representing the number of events in each control region. The dominant background normalizations are initially unconstrained and are determined during the fit. After fitting the data, the uncertainties on the WW, top quark, and DY background normalizations in the 2ℓ2ν categories are in the range 6-45%, 3-5%, and 5-20%, respectively. In the ℓν2q categories, the corresponding uncertainties on the W+jets and top quark background normalizations are in the range 7-10% and 4-20%, respectively. The remaining systematic uncertainties are represented by individual nuisance parameters, with a log-normal model used for normalization uncertainties and a Gaussian model used for shape uncertainties. For each source of uncertainty, the correlations between different categories, and different signal and background processes, are taken into account. Uncertainties arising from the limited number of events in the MC simulated samples are included for each bin of the discriminant distributions, in each category independently, following the Barlow-Beeston approach [120]. Depending on the category, the statistical uncertainties due to the MC simulated sample sizes on the background and signal normalizations are in the range 1-8%.
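The structure of such a fit can be illustrated with a toy negative log-likelihood containing a binned signal region, a single-bin control region, one freely floating background normalization, and one log-normal nuisance parameter. This is a deliberately reduced sketch and not the likelihood used in the analysis: the templates, yields, parameter names, and the value `kappa=1.025` are invented for illustration, and shape uncertainties and the Barlow-Beeston terms are omitted.

```python
import math


def poisson_nll(observed, expected):
    """Negative log of the Poisson probability, dropping the constant ln(n!)."""
    return expected - observed * math.log(expected)


def toy_nll(mu, bkg_norm, theta, data_sr, sig_sr, bkg_sr, data_cr, bkg_cr,
            kappa=1.025):
    """NLL for a binned signal region plus a single-bin control region.

    mu       : signal strength (parameter of interest)
    bkg_norm : freely floating background normalization, shared by SR and CR
    theta    : nuisance parameter with a unit Gaussian constraint; it scales
               the signal by kappa**theta (log-normal normalization model)
    """
    lognormal = kappa ** theta
    nll = 0.5 * theta ** 2  # Gaussian constraint term for theta
    for n, s, b in zip(data_sr, sig_sr, bkg_sr):
        nll += poisson_nll(n, mu * lognormal * s + bkg_norm * b)
    nll += poisson_nll(data_cr, bkg_norm * bkg_cr)  # control region as one bin
    return nll


# Hypothetical templates and pseudo-data (background-only, scaled by 1.2).
sig = [1.0, 4.0, 2.0]
bkg = [50.0, 30.0, 10.0]
data_sr = [60, 36, 12]
data_cr = 1200
bkg_cr = 1000.0

# Coarse scan of the background normalization at mu = 0, theta = 0.
grid = [0.8 + 0.01 * i for i in range(81)]
best = min(grid, key=lambda k: toy_nll(0.0, k, 0.0, data_sr, sig, bkg,
                                       data_cr, bkg_cr))
print(round(best, 2))
```

Because the control region bin carries far more events than the signal region, it dominates the determination of `bkg_norm` in this toy, which is the role the single-bin control regions play in the fit described above.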
The theoretical sources of uncertainty considered include the effect of PDFs and the strong coupling constant $\alpha_S$, and the effect of missing higher-order corrections via variations of the renormalization and factorization scales. Acceptance uncertainties are evaluated for signal and background by varying the PDFs and $\alpha_S$ within their uncertainties [121], and by varying the factorization and renormalization scales up and down by a factor of two [122]. Depending on the process and the category, the PDF uncertainties in the signal and background yields amount to 1-7%, while those of the renormalization and factorization scales are within 1-18%. The PDF uncertainties and the renormalization and factorization scale uncertainties in the signal cross section, computed by the LHC Higgs Cross Section Working Group [42], are also considered and amount to 2-16% and 0.2-9%, respectively, depending on the resonance mass and production mechanism.
Effects due to experimental uncertainties are studied by applying a scaling and/or smearing of certain variables of the physics objects in the simulation, followed by a recalculation of all the correlated variables. The uncertainty in the measured luminosity is 2.5% for data collected during 2016 [123]. The trigger efficiency uncertainties are approximately 1 and 2% for the ℓν2q and 2ℓ2ν final states, respectively. Lepton reconstruction and identification efficiency uncertainties vary between 1 and 3%, while the muon momentum and electron energy scale uncertainties amount to 0.1-1.0% each. Depending on the process and the category, the jet energy scale uncertainties are in the range 1-10%. The $p_T^{miss}$ uncertainty is taken into account by propagating the corresponding uncertainties in the leptons and jets and amounts to 0.1-1%. The scale factors correcting the b tagging efficiency and mistag rate are varied within their uncertainties, with resulting uncertainties of 0.1-5% depending on the process and the category. This systematic uncertainty affects the top quark control regions and the signal regions in an anticorrelated way.
In addition, for each final state there are channel-specific uncertainties which are now discussed.

X → 2ℓ2ν
A conservative 30% uncertainty in the normalization of the instrumental background arising from nonprompt leptons in W+jets production is estimated by varying the jet $p_T$ threshold in the dijet control sample used in the background prediction procedure, and from propagation of the statistical uncertainties in the measured lepton misidentification probabilities. Uncertainties of 3-10% due to the $p_T^{WW}$ reweighting are evaluated by varying the factorization and renormalization scales up and down by a factor of two, and by varying the resummation scale. The UE uncertainty for the WW background is estimated by comparing two different UE tunes, while the PS modeling uncertainty is estimated by comparing samples interfaced with different PS models, as described in Section 3. The combined effect is evaluated to be 5-10%. A dedicated nuisance parameter for the linear $p_T^{miss}$ correction in the same-flavour DY control region is introduced. The uncertainty is 0.2-1%, estimated with the maximum and minimum best fit lines of the linear fit used to derive the correction. The categorization of events based on jet multiplicity introduces additional signal uncertainties related to higher-order corrections. These uncertainties are associated with the ggF production mode and are evaluated independently following the method described in Ref. [124] and are about 5% for the 0-jet, 10% for the 1-jet, and 20% for the 2-jet and VBF categories.

X → ℓν2q
The diboson and DY production cross sections are each assigned an uncertainty of 10% based on the level of agreement between theoretical predictions and cross section measurements at CMS using 13 TeV data [125,126]. An uncertainty of 10% in the normalization of the background arising from nonprompt leptons in QCD multijet production is assigned based on the observed level of agreement between data and simulation in QCD multijet enriched samples. The impact of the jet energy resolution uncertainty is about 0.3-2%, depending on the process and the category. For W-tagged jets the $m_J$ scale and resolution uncertainties are evaluated to be 0.1-1 and 2-5%, respectively. The $\tau_{21}$ scale factor correcting the boosted W tagging efficiency has an associated uncertainty of 6%. Since this is measured in $t\bar{t}$ events using jets with a typical $p_T$ of 200 GeV, an uncertainty of 1-13% in the extrapolation to the higher-$p_T$ regime of the high-mass signal is also included.
A summary of the systematic uncertainties included for the ℓν2q and 2ℓ2ν final states is shown in Table 1.

Results
No evidence for an excess of events with respect to the SM predictions is observed. Upper exclusion limits at 95% confidence level (CL) on the X cross section times branching fraction of the decay to two W bosons are evaluated for masses between 0.2 and 3.0 TeV using the asymptotic modified frequentist method (CL$_s$) [117][118][119]. A number of hypotheses for $f_{VBF}$ have been investigated by setting this fraction to the SM value, by allowing it to float, and by setting $f_{VBF}$ = 0 and 1. The expected and observed exclusion limits for the full combination of the 2ℓ2ν and ℓν2q analyses are shown in Fig. 6. For signals below ≈800 GeV, the sensitivity of the 2ℓ2ν final state is dominated by the different-flavour channel, while at higher masses the same- and different-flavour channels have similar sensitivities. For the ℓν2q final state, the sensitivity is dominated by the boosted channel for signals above ≈400 GeV, while at lower masses the resolved channel dominates. Comparing the two final states, the 2ℓ2ν sensitivity is dominant up to ≈400 GeV, while at higher masses the ℓν2q final state is more sensitive by a factor of approximately two. Comparing the excluded cross section values to the expectations from theoretical calculations, an X signal is excluded up to 1870 (1370) GeV with $f_{VBF}$ set to the SM value ($f_{VBF}$ allowed to float). An X signal is excluded up to 1060 GeV for the $f_{VBF}$ = 0 hypothesis, while the mass ranges 200-245 and 380-1840 GeV are excluded for the $f_{VBF}$ = 1 hypothesis.
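The idea behind the CL$_s$ criterion can be shown with a single-bin counting experiment, where the observed event count itself serves as the test statistic. This is a simplified sketch only: the limits in this paper are obtained with the asymptotic method and a likelihood-based test statistic over many bins and nuisance parameters, and the function names and inputs below are illustrative assumptions.

```python
import math


def poisson_cdf(n_obs, mean):
    """P(N <= n_obs) for a Poisson distribution with the given mean."""
    term = math.exp(-mean)
    total = term
    for k in range(1, n_obs + 1):
        term *= mean / k
        total += term
    return total


def cls(signal, background, n_obs):
    """CLs = CL(s+b) / CL(b) for a single-bin counting experiment."""
    return poisson_cdf(n_obs, signal + background) / poisson_cdf(n_obs, background)


def upper_limit(background, n_obs, alpha=0.05, tol=1e-6):
    """Signal yield at which CLs falls to alpha, found by bisection."""
    lo, hi = 0.0, 1.0
    while cls(hi, background, n_obs) > alpha:  # expand until the limit is bracketed
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cls(mid, background, n_obs) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical example: 3 expected background events, 0 observed.
print(round(upper_limit(background=3.0, n_obs=0), 3))
```

For zero observed events the ratio reduces to exp(-s), so the 95% CL limit is -ln(0.05) ≈ 3.0 signal events regardless of the background expectation. Dividing by CL(b) is what prevents a downward background fluctuation from excluding signals to which the search has no sensitivity.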
Table 1: Summary of systematic uncertainties, quoted in percent, affecting the normalization of the background and signal samples. The uncertainties on the WW, top quark and DY (W+jets and top quark) background estimates in the 2ℓ2ν (ℓν2q) categories have been determined during the fit to the data. The numbers shown as ranges represent the uncertainties for different processes and categories. Missing values represent uncertainties either estimated to be negligible (<0.1%), or not applicable in a specific channel. Those systematic uncertainties found to affect the shape of kinematic distributions are labeled with *.
Exclusion limits are also set for neutral heavy Higgs bosons in the context of Type-I and Type-II 2HDMs, with the assumptions that $m_H = m_A = m_{H^\pm}$ and $\cos(\beta - \alpha) = 0.1$. Figure 7 shows the expected and observed exclusion limits in the $m_H$-$\tan\beta$ plane. The dashed lines mark the expected limits while the dark and bright gray bands indicate the 68 and 95% CL uncertainties, respectively. The observed exclusion contours are indicated by the blue areas. In both scenarios, the observed exclusion contours reach $m_H$ values of ≈800 GeV, while the maximum $\tan\beta$ value excluded is ≈3. Figure 8 shows the expected and observed exclusion limits for the $m_h^{\mathrm{mod}+}$ and the hMSSM scenarios. The maximum $\tan\beta$ value excluded for both scenarios is ≈9, while the maximum value of $m_A$ excluded is ≈430 GeV. The exclusion of the regions at low values of $m_A$ and $\tan\beta$ complements the exclusion limits set by the MSSM H → τ+τ− analyses from ATLAS and CMS using 13 TeV data [127,128], which have reduced sensitivity in these regions. Figure 9 shows the expected and observed exclusion limits for the $M_h^{125}$, $M_h^{125}$(alignment), $M_h^{125}(\tilde{\chi})$, and $M_h^{125}(\tilde{\tau})$ scenarios. Low values of $m_A$ and $\tan\beta$ are also excluded for these scenarios. The observed exclusion contours reach $m_A$ values of ≈400 GeV, while the maximum $\tan\beta$ values excluded are in the range 5-9.
These results further reduce the allowed parameter space for extensions of the SM.

Summary
A search for a heavy Higgs boson decaying to a pair of W bosons in the mass range from 0.2 to 3.0 TeV has been presented. The data analysed were collected by the CMS experiment at the LHC in 2016, corresponding to an integrated luminosity of 35.9 fb$^{-1}$ at $\sqrt{s} = 13$ TeV. The W boson pair decays are reconstructed in the 2ℓ2ν and ℓν2q final states. Both gluon fusion and vector boson fusion production of the signal are considered, with a number of hypotheses for their relative contributions investigated. Interference effects between the signal and background are also taken into account. Dedicated event categorizations based on both the kinematic properties of associated jets and matrix element techniques are employed to optimize the signal sensitivity. No evidence for an excess of events with respect to the standard model (SM) predictions is observed. Combined upper limits at 95% confidence level on the product of the cross section and branching fraction exclude a heavy Higgs boson with SM-like couplings and decays up to 1870 GeV. Exclusion limits are also set in the context of a number of two-Higgs-doublet model formulations, further reducing the allowed parameter space for extensions of the SM.

Acknowledgments
We congratulate our colleagues in the CERN accelerator departments for the excellent performance of the LHC and thank the technical and administrative staffs at CERN and at other CMS institutes for their contributions to the success of the CMS effort. In addition, we gratefully acknowledge the computing centres and personnel of the Worldwide LHC Computing Grid for delivering so effectively the computing infrastructure essential to our analyses. Finally, we acknowledge the enduring support for the construction and operation of the LHC and the CMS detector provided by the following funding agencies:

[5] ATLAS Collaboration, "Combined measurements of Higgs boson production and decay using up to 80 fb$^{-1}$ of proton-proton collision data at $\sqrt{s} = 13$ TeV collected with the ATLAS experiment", (2019). arXiv:1909.02845. Accepted by PRD.