1 Introduction

The discovery of a Higgs boson [1–3], together with the absence of any experimental hint of physics beyond the standard model (SM) at the Large Hadron Collider (LHC), has had a major impact on proposed theoretical models for new physics. All measurements of the recently observed 125 \(\text {GeV}\) boson to date indicate compatibility with the SM Higgs boson, but the associated uncertainties are large, and the possibility for non-SM properties remains. Moreover, although additional SM-like Higgs bosons have been excluded over a wide range of masses, additional Higgs bosons with exotic decay modes remain a possibility.

Invisible Higgs boson decays are possible in a wide range of models, for example through decays to neutralinos in supersymmetric models [4], or graviscalars in models with extra dimensions [5, 6]. In general, interactions of the Higgs boson with the unknown dark matter (DM) sector may introduce invisible decay modes, and bounds on these decays can constrain DM models. In so-called “Higgs-portal” models of DM interactions [7–9], the Higgs boson takes the role of mediator between the SM particles and the DM particle. Recent theories proposing that the Higgs boson played a central role in the evolution of the early universe [10] provide further motivation to understand the relationship between the Higgs boson and DM.

Indirect constraints on non-SM decay modes of the recently observed Higgs boson have been inferred from the visible SM decay modes by including an additional non-SM partial width term in the combined fit to the data [3]. The resulting upper limit on the non-SM branching fraction is 0.89, at 95 % confidence level (CL). Direct searches for invisible Higgs boson decays, \(\mathrm {H}(\text {inv})\), are possible by requiring that the Higgs boson recoils against a visible system. Such searches were performed at LEP [11–13], using the ZH associated production mode. They excluded at 95 % CL an invisible Higgs boson of mass smaller than 105 \(\text {GeV}\) and produced with a cross section higher than 0.2 times the standard model ZH cross section. Phenomenological studies of hadron collider searches for \(\mathrm {H}(\text {inv})\) have considered all production mechanisms [14–20]. Recently, the ATLAS Collaboration reported a search for invisible decays of a Higgs boson produced in association with a Z boson that decays to leptons [21], placing an upper limit on the invisible Higgs boson branching fraction of 0.75 at 95 % CL for \(m_{\mathrm {H}}=125.5\) \(\text {GeV}\). The ATLAS Collaboration also searched for an invisibly decaying Higgs boson in association with either a W or Z boson decaying hadronically [22].

Here we report searches for \(\mathrm {H}(\text {inv})\) in the ZH mode, where the Z boson decays to leptons or a \(\mathrm {b}\overline{\mathrm {b}}\) quark pair, and the first search for \(\mathrm {H}(\text {inv})\) in the vector boson fusion (VBF) production mode, where the Higgs boson is produced in association with two quarks, as shown in Fig. 1 (left). Although the VBF signal benefits from a relatively large SM cross section, the final state of two jets plus missing transverse energy (\(E_{\mathrm {T}}^{\text {miss}}\)) suffers from large backgrounds. However, the backgrounds can be controlled by utilizing the distinct topology of the VBF process, in which the two jets are produced in a forward/backward configuration, with large invariant mass, and are well separated in rapidity. In addition, hadronic activity in the rapidity gap between the two scattered quarks is reduced, due to the absence of color flow in the VBF process. The ZH signal, shown in Fig. 1 (center) and (right), provides a complementary search to the VBF analysis. Despite a lower SM production cross section, the final state of a Z boson with large \(E_{\mathrm {T}}^{\text {miss}}\) provides a clear topology with much lower backgrounds. We maximize the sensitivity of the search by including decays of the Z boson to leptons and \(\mathrm {b}\overline{\mathrm {b}}\) quark pairs, which we refer to as \(\mathrm {Z}(\ell \ell )\mathrm {H}(\text {inv})\), and \(\mathrm {Z}(\mathrm {b}\overline{\mathrm {b}})\mathrm {H}(\text {inv})\), respectively, where \(\ell \) represents either an electron or a muon. The Higgs boson production modes we consider here rely only on the Higgs boson coupling to the electroweak vector bosons. New physics that introduces invisible decays of the Higgs boson may also modify these couplings.

Fig. 1 The Feynman diagrams for Higgs production in the VBF (left), \(\mathrm {Z}(\ell \ell )\mathrm {H}\) (center) and \(\mathrm {Z}(\mathrm {b}\overline{\mathrm {b}})\mathrm {H}\) (right) channels. The Higgs boson is assumed to decay invisibly

In the following sections of this article, we present a brief overview of the Compact Muon Solenoid (CMS) experimental apparatus, physics object reconstruction and datasets in Sects. 2 to 4, followed by a description of the event selection and background estimation for each of the three search channels in Sects. 5 to 7. We then present the results of the searches, and their combination, as upper limits on the production cross section times invisible branching fraction in Sect. 8. In Sect. 9 we interpret these cross section upper limits in terms of a Higgs-portal model of dark matter interactions, and we summarize our conclusions in Sect. 10.

2 The CMS apparatus

The central feature of the CMS apparatus is a superconducting solenoid of 6 \(\text {m}\) internal diameter, providing a magnetic field of 3.8 \(\text {T}\). Within the volume of the superconducting solenoid are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass-scintillator hadron calorimeter, each composed of barrel and endcap detectors. Muons are measured with detection planes made using three technologies: drift tubes, cathode strip chambers, and resistive-plate chambers, embedded in the steel flux-return yoke outside the solenoid. Extensive forward calorimetry complements the coverage provided by the barrel and endcap detectors. Data are selected online using a two-level trigger system. The first level, consisting of custom-made hardware processors, selects events in less than 1 \(\upmu \text {s}\), while the high-level trigger processor farm further decreases the event rate from around 100 \(\text {kHz}\) to a few hundred Hz before data storage. The CMS experiment uses a right-handed coordinate system, with the origin at the nominal interaction point, the \(x\) axis pointing to the center of the LHC, the \(y\) axis pointing up (perpendicular to the LHC plane), and the \(z\) axis along the counterclockwise-beam direction. The polar angle \(\theta \) is measured from the positive \(z\) axis and the azimuthal angle \(\phi \) is measured in the \(x\)–\(y\) plane. The pseudorapidity, \(\eta \), is defined as \(- \ln [\tan (\theta /2)]\). A more detailed description of the CMS apparatus can be found in Ref. [23].
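As an illustration of the coordinate conventions above, the following minimal Python sketch evaluates the pseudorapidity from the polar angle. The helper name `pseudorapidity` is hypothetical and is not taken from any CMS software.

```python
import math

def pseudorapidity(theta):
    """Return eta = -ln[tan(theta/2)] for a polar angle theta in radians.

    The polar angle is measured from the positive z axis, so theta must
    lie strictly between 0 and pi.
    """
    if not 0.0 < theta < math.pi:
        raise ValueError("theta must lie strictly between 0 and pi")
    return -math.log(math.tan(theta / 2.0))

# A particle emitted perpendicular to the beam has eta close to zero;
# particles close to the +z (-z) direction have large positive (negative) eta.
eta_central = pseudorapidity(math.pi / 2.0)
eta_forward = pseudorapidity(0.02)
```

With this definition, directions that are mirror images about the transverse plane have pseudorapidities of equal magnitude and opposite sign.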

3 Data samples and Monte Carlo simulation

The analyses presented here all use the 8 \(\text {TeV}\) data sample collected by the CMS Collaboration during 2012, corresponding to an integrated luminosity of 19.5 \(\text {fb}^{-1}\) in the VBF channel, 19.7 \(\text {fb}^{-1}\) in the \(\mathrm {Z}(\ell \ell )\mathrm {H}(\text {inv})\) channel, and 18.9 \(\text {fb}^{-1}\) in the \(\mathrm {Z}(\mathrm {b}\overline{\mathrm {b}})\mathrm {H}(\text {inv})\) channel. The \(\mathrm {Z}(\ell \ell )\mathrm {H}(\text {inv})\) channel also uses the 7 \(\text {TeV}\) dataset collected during 2011, corresponding to 4.9 \(\text {fb}^{-1}\). The uncertainty assigned to the luminosity measurement is 2.6 % (2.2 %) at \(\sqrt{s}=8\) (7) \(\text {TeV}\) [24]. Backgrounds arising from sources other than pp collisions are suppressed using a set of filters that remove events due to anomalous calorimeter signals, beam halo identified in the muon endcaps, inoperable calorimeter cells, and tracking failures. We further require a well reconstructed vertex within the interaction region: \({|z |}<24\) \(\text {cm}\), \(r<2\) \(\text {cm}\), where \(r=\sqrt{x^{2}+y^2}\).

The VBF signal is simulated using the powheg 2.0 event generator [25–31], while the \(\mathrm {Z}(\ell \ell )\mathrm {H}(\text {inv})\) and \(\mathrm {Z}(\mathrm {b}\overline{\mathrm {b}})\mathrm {H}(\text {inv})\) signals are simulated with pythia 6.4.26 [32]. The background processes are simulated using MadGraph 5.1.1 [33], with the exception of some minor backgrounds: specifically, the \(\mathrm {V}\mathrm {H}(\mathrm {b}\overline{\mathrm {b}})\) background to the \(\mathrm {Z}(\mathrm {b}\overline{\mathrm {b}})\mathrm {H}(\text {inv})\) analysis is simulated with powheg 2.0, the diboson backgrounds in the VBF analysis are simulated with pythia 6.4.26, and the single-top-quark backgrounds in the VBF and \(\mathrm {Z}(\ell \ell )\mathrm {H}(\text {inv})\) analyses use powheg 1.0. The QCD multijet background is simulated with pythia 6.4.26. All samples use the leading-order CTEQ6L1 parton distribution functions (PDFs) [34], apart from the \(\mathrm {V}\mathrm {H}(\mathrm {b}\overline{\mathrm {b}})\) powheg samples, which use the next-to-leading-order (NLO) CTEQ6M PDFs [34]. Where yields are estimated directly from Monte Carlo (MC) simulation, the PDF uncertainty is estimated using the PDF4LHC prescription [35, 36]. For all MC samples, the detector response is simulated using a detailed description of the CMS detector based on the Geant4 package [37]. Minimum bias events are superimposed on the generated events to simulate the effect of multiple pp interactions per bunch crossing (pileup). Simulated events are weighted such that the distribution of the number of pileup interactions reproduces that observed in data. The mean number of pileup interactions per bunch crossing was approximately 9 in 2011, and 21 in 2012. Additional weights are applied to simulated events to ensure that the trigger efficiency, lepton identification efficiency, and b-tagging efficiency match measurements from data.
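The pileup reweighting described above can be sketched as a ratio of normalized distributions. This is a minimal illustration and not the CMS implementation: the binned-ratio approach, the function name `pileup_weights`, and the toy histograms are all assumptions made for the example.

```python
def pileup_weights(data_counts, mc_counts):
    """Per-bin weights that map the simulated pileup distribution onto data.

    Both inputs are histograms of the number of pileup interactions with
    identical binning. Each weight is (normalized data) / (normalized MC);
    bins that are empty in simulation receive weight zero.
    """
    data_total = float(sum(data_counts))
    mc_total = float(sum(mc_counts))
    weights = []
    for d, m in zip(data_counts, mc_counts):
        if m == 0:
            weights.append(0.0)
        else:
            weights.append((d / data_total) / (m / mc_total))
    return weights

# Toy histograms (invented numbers, for illustration only).
data_counts = [10, 30, 40, 20]
mc_counts = [25, 25, 25, 25]
weights = pileup_weights(data_counts, mc_counts)
```

Each simulated event would then carry the weight of the bin matching its number of pileup interactions, so that the weighted simulated distribution has the same shape as the data.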

The VBF and ZH production cross sections are taken from Refs. [38, 39]. The ZH searches are performed in the boosted regime, where the Higgs boson has significant transverse momentum (\({p_{\mathrm {T}}}\)), and thus, potential differences in the \({p_{\mathrm {T}}}\) spectrum of the \({\mathrm {Z}}\) and Higgs bosons between data and MC generators could introduce systematic effects in the signal acceptance and efficiency estimates. Two sets of calculations are available that estimate the NLO electroweak corrections [40–42] and next-to-next-to-leading order (NNLO) QCD [43] corrections to vector boson plus Higgs boson production in the boosted regime. Both sets of corrections are applied to the signal MC samples. For VH production, the estimated uncertainty arising from the NLO electroweak corrections is 2 %, and from the NNLO QCD corrections is 5 %. In addition, we include NNLO electroweak corrections [44] to the \({\mathrm {Z}}{\mathrm {Z}}\) and \({\mathrm {W}}{\mathrm {Z}}\) background processes as a function of the \({p_{\mathrm {T}}}\) of the \({\mathrm {Z}}\) boson.

4 Event reconstruction

The reconstructed interaction vertex with the largest value of \(\sum _i {{p_{\mathrm {T}}}}_i^2\), where \({{p_{\mathrm {T}}}}_i\) is the transverse momentum of the \(i\)th track associated with the vertex, is selected as the primary event vertex. This vertex is used as the reference vertex for all relevant objects in the event, which are reconstructed with a particle-flow algorithm [45, 46]. The pileup interactions affect jet momentum reconstruction, missing transverse energy reconstruction, lepton isolation, and \(\mathrm {b}\)-tagging efficiencies. To mitigate these effects, all charged hadrons that do not originate from the primary interaction are identified by a particle-flow-based algorithm and removed from consideration in the event. In addition, following Ref. [47], the average neutral energy density from pileup interactions is evaluated on an event-by-event basis from particle-flow objects and used to compute a correction to the reconstructed jets in the event and to the summed energy in the isolation cones used for leptons.
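The primary-vertex choice described above amounts to a simple maximization, sketched here in Python. The data layout (each vertex represented by a list of track \({p_{\mathrm {T}}}\) values in GeV) and the function name `select_primary_vertex` are assumptions made for this illustration.

```python
def select_primary_vertex(vertices):
    """Return the index of the vertex with the largest sum of track pT^2.

    `vertices` is a list in which each entry is the list of transverse
    momenta (in GeV) of the tracks associated with that vertex.
    """
    if not vertices:
        raise ValueError("no reconstructed vertices in the event")
    return max(range(len(vertices)),
               key=lambda i: sum(pt * pt for pt in vertices[i]))

# Toy event: the sums of pT^2 are 4, 9 and 8 GeV^2, so vertex 1 is chosen
# even though it has the fewest tracks.
toy_vertices = [[1.0, 1.0, 1.0, 1.0], [3.0], [2.0, 2.0]]
primary_index = select_primary_vertex(toy_vertices)
```

Because the sum is quadratic in \({p_{\mathrm {T}}}\), a vertex with a few hard tracks is preferred over one with many soft tracks, which is the behavior expected for separating the hard scatter from pileup.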

Muons are reconstructed in the pseudorapidity range \({|\eta |}\!<\!2.4\). Two muon reconstruction algorithms are used [48]: one in which tracks in the silicon tracker are matched to signals in the muon detectors, and another in which a global track fit is performed using hits in both the tracker and muon detectors. The muon candidates used in the analysis are required to be successfully reconstructed by both algorithms. The efficiency to reconstruct a muon of \({p_{\mathrm {T}}}\!\!>\!5\) \(\text {GeV}\) is larger than 95 %, while the probability to misidentify a hadron as a muon is below 0.1 %. Further identification criteria are imposed on the muon candidates to reduce the fraction of tracks misidentified as muons. These include the number of measurements in the tracker and in the muon systems, the fit quality of the global muon track and its consistency with the primary vertex.

Electron reconstruction requires the matching of an energy cluster in the ECAL with a track in the silicon tracker [49]. Electron identification relies on a multivariate technique that combines observables sensitive to the amount of bremsstrahlung along the electron trajectory, the geometrical and momentum matching between the electron trajectory and associated clusters, as well as shower-shape observables. Additional requirements are imposed to remove electrons produced by photon conversions. In this analysis, electrons are considered in the pseudorapidity range \({|\eta |} < 2.5\), excluding the \(1.44\!<\!{|\eta |}\!<\! 1.57\) transition region between the ECAL barrel and endcap, where electron reconstruction is suboptimal.

Jets are reconstructed from particle-flow objects using the anti-\({k_{\mathrm {T}}}\) clustering algorithm [50], with a distance parameter of 0.5, as implemented in the fastjet package [51, 52]. Jets are found over the full calorimeter acceptance, \({|\eta |} < 5\). Jet energy corrections are applied as a function of the pseudorapidity and transverse momentum of the jet [53]. Jets resulting from pileup interactions are removed using a boosted decision tree (BDT), implemented in the TMVA package [54], with the following input variables: momentum and spatial distribution of the jet particles, charged- and neutral-particle multiplicities, and consistency of the charged hadrons within the jet with the primary vertex. The missing transverse momentum vector is calculated as the negative of the vectorial sum of the transverse momenta of all particle-flow objects identified in the event, and the magnitude of this vector is referred to as \(E_{\mathrm {T}}^{\text {miss}}\) in the rest of this article.
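The definition of the missing transverse momentum given above can be written out explicitly. The sketch below assumes that each particle-flow object is represented by a hypothetical `(pt, phi)` tuple, with \({p_{\mathrm {T}}}\) in GeV and \(\phi \) in radians; it is an illustration of the definition only.

```python
import math

def missing_transverse_momentum(particles):
    """Return (magnitude, azimuth) of the missing transverse momentum.

    The vector is the negative of the vectorial sum of the transverse
    momenta of all objects; its magnitude corresponds to E_T^miss.
    """
    px = -sum(pt * math.cos(phi) for pt, phi in particles)
    py = -sum(pt * math.sin(phi) for pt, phi in particles)
    return math.hypot(px, py), math.atan2(py, px)

# Toy event with two objects at right angles: the visible transverse
# momentum is (30, 40) GeV, so the missing momentum has magnitude 50 GeV
# and points into the opposite quadrant.
toy_particles = [(30.0, 0.0), (40.0, math.pi / 2.0)]
met, met_phi = missing_transverse_momentum(toy_particles)
```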

Jets that originate from the hadronization of \(\mathrm {b}\) quarks are referred to as “\(\mathrm {b}\) jets”. The CSV \(\mathrm {b}\)-tagging algorithm [55] is used to identify such jets. The algorithm combines the information about track impact parameters and secondary vertices within jets in a likelihood discriminant to provide separation between \(\mathrm {b}\) jets and jets originating from light quarks, gluons, or charm quarks. The output of this CSV discriminant has values between zero and one; a jet with a CSV value above a certain threshold is referred to as being “\(\mathrm {b}\) tagged”. The efficiency to tag \(\mathrm {b}\) jets and the rate of misidentification of non-\(\mathrm {b}\) jets depend on the threshold chosen, and are typically parameterized as a function of the \({p_{\mathrm {T}}}\) and \(\eta \) of the jets. These performance measurements are obtained directly from data in samples that can be enriched in \(\mathrm {b}\) jets, such as \({\mathrm {t}\overline{\mathrm {t}}}\) and multijet events (where, for example, requiring the presence of a muon in the jets enhances the heavy-flavor content of the events). Several thresholds for the CSV output discriminant are used in this analysis. Depending on the threshold used, the efficiency to tag jets originating from \(\mathrm {b}\) quarks is in the range 50–75 %, and the probabilities to incorrectly tag jets originating from \(\mathrm {c}\) quarks, and from light quarks or gluons, as \(\mathrm {b}\) jets are 5–25 % and 0.15–3.0 %, respectively.

5 Search for \(\mathrm {H}(\text {inv})\) in vector boson fusion

5.1 Search strategy

In the VBF mode, the Higgs boson is produced in association with two final-state quark jets separated by a large rapidity gap, and having high invariant mass. Loosely following the selection criteria discussed in Ref. [16], we select final states with two jets and large missing transverse energy and utilize the distinct topology of the VBF jets to discriminate the invisible Higgs boson signal from background.

The dominant backgrounds in this channel result from \({\mathrm {Z}}(\nu \nu )\text {+jets}\), and \({\mathrm {W}}(\ell \nu )\text {+jets}\), where the charged lepton is not identified. These backgrounds are estimated using control regions with a \({\mathrm {Z}}\) or \({\mathrm {W}}\) boson decaying to well identified charged leptons, in association with the same dijet topology used for the signal region. We then extrapolate from the control regions to the signal region using factors obtained from MC simulation. The background due to QCD multijet processes, where the \(E_{\mathrm {T}}^{\text {miss}}\) arises from mismeasurement, is also estimated from data. Minor SM backgrounds, arising from \({\mathrm {t}\overline{\mathrm {t}}}\), single-top, diboson, and Drell–Yan\((\ell \ell )\text {+jets}\) processes are estimated from MC simulation.

We use the observed yield in the signal region, together with the estimated background, to perform a single-bin counting experiment.

5.2 Event selection

We use events collected with a trigger that requires \(E_{\mathrm {T}}^{\text {miss}}>65\) \(\text {GeV}\), in association with a pair of jets with \({p_{\mathrm {T}}}^\mathrm {j1}, {p_{\mathrm {T}}}^\mathrm {j2} >40\,\text {GeV}\), in a VBF-like topology. The jets are required to be in opposite forward/backward halves of the detector, well separated in pseudorapidity (\(\Delta \eta _\mathrm {jj}= {| \eta _\mathrm {j1} - \eta _\mathrm {j2} |} > 3.5\)), and with high invariant mass (\({M_\mathrm {jj}}>800\) \(\text {GeV}\)). For robustness against pileup, any pair of jets satisfying these criteria is accepted by the trigger. At the trigger level, the \(E_{\mathrm {T}}^{\text {miss}}\) calculation does not include muons, allowing control samples of \({\mathrm {W}}(\mu \nu )\text {+jets}\) and \({\mathrm {Z}}(\mu \mu )\text {+jets}\) events to be taken on the same trigger. The trigger efficiency is measured in events recorded on a single-muon trigger, as a function of \({p_{\mathrm {T}}}^\mathrm {j2}\) (since the leading jet, j1, is effectively always above threshold for the regions considered), \({M_\mathrm {jj}}\), and \(E_{\mathrm {T}}^{\text {miss}}\), and the measured efficiency is applied to all MC samples.

The offline selection then proceeds as follows. We reject backgrounds from \({\mathrm {Z}}\) and W bosons by vetoing any event with an identified electron [49] or muon [56] with \({p_{\mathrm {T}}}>10\) \(\text {GeV}\). The VBF tag jet pair is then identified as the leading jet pair. This pair is required to pass tightened versions of the trigger selection, specifically \({p_{\mathrm {T}}}^\mathrm {j1},{p_{\mathrm {T}}}^\mathrm {j2} > 50\) \(\text {GeV}\), \({|\eta |} < 4.7\), \(\eta _\mathrm {j1} \cdot \eta _\mathrm {j2} < 0\) (jets in opposite halves of the detector), \(\Delta \eta _\mathrm {jj}>4.2\), and \({M_\mathrm {jj}}>1{,}100\) \(\text {GeV}\). The missing-energy requirement is \(E_{\mathrm {T}}^{\text {miss}}> 130\) \(\text {GeV}\). Multijet backgrounds are reduced to a low level by requiring the azimuthal separation between the tag jets to be small, \(\Delta \phi _\mathrm {jj}< 1.0\) radians, since the background peaks at \(\Delta \phi _\mathrm {jj}= \pi \) radians while the signal is roughly flat in \(\Delta \phi _\mathrm {jj}\). Finally, we apply a central-jet veto (CJV) to any event that has an additional jet with \({p_{\mathrm {T}}}>30\) \(\text {GeV}\) and pseudorapidity between those of the two tag jets.
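The offline requirements listed above can be collected into a single predicate, sketched below in Python. This is an illustration under stated assumptions and not the analysis code: jets are represented by hypothetical dictionaries with keys `pt`, `eta`, and `phi`; the dijet mass is computed in the massless-jet approximation, \(M_\mathrm {jj}^2 = 2 p_\mathrm {T}^\mathrm {j1} p_\mathrm {T}^\mathrm {j2} (\cosh \Delta \eta _\mathrm {jj} - \cos \Delta \phi _\mathrm {jj})\); and the requirement that the tag jets lie in opposite halves of the detector is expressed as a negative product of their pseudorapidities.

```python
import math

def dijet_mass(j1, j2):
    """Invariant mass of two jets, treating both as massless (approximation)."""
    deta = j1["eta"] - j2["eta"]
    dphi = j1["phi"] - j2["phi"]
    m2 = 2.0 * j1["pt"] * j2["pt"] * (math.cosh(deta) - math.cos(dphi))
    return math.sqrt(max(m2, 0.0))

def delta_phi(phi1, phi2):
    """Azimuthal separation folded into the range [0, pi]."""
    dphi = abs(phi1 - phi2) % (2.0 * math.pi)
    return 2.0 * math.pi - dphi if dphi > math.pi else dphi

def passes_vbf_selection(jets, met, n_leptons):
    """Apply the offline VBF requirements quoted in the text.

    jets      : list of dicts with 'pt' (GeV), 'eta', 'phi'
    met       : missing transverse energy in GeV
    n_leptons : number of identified electrons or muons with pT > 10 GeV
    """
    if n_leptons > 0 or len(jets) < 2 or met <= 130.0:
        return False
    ordered = sorted(jets, key=lambda j: j["pt"], reverse=True)
    j1, j2 = ordered[0], ordered[1]  # tag jets = leading jet pair
    if min(j1["pt"], j2["pt"]) <= 50.0:
        return False
    if max(abs(j1["eta"]), abs(j2["eta"])) >= 4.7:
        return False
    if j1["eta"] * j2["eta"] >= 0.0:  # must be in opposite halves
        return False
    if abs(j1["eta"] - j2["eta"]) <= 4.2:
        return False
    if dijet_mass(j1, j2) <= 1100.0:
        return False
    if delta_phi(j1["phi"], j2["phi"]) >= 1.0:
        return False
    # Central-jet veto: no extra jet with pT > 30 GeV between the tag jets.
    eta_lo, eta_hi = sorted((j1["eta"], j2["eta"]))
    for jet in ordered[2:]:
        if jet["pt"] > 30.0 and eta_lo < jet["eta"] < eta_hi:
            return False
    return True

# Toy event (invented numbers): two forward/backward jets and large MET.
toy_jets = [
    {"pt": 150.0, "eta": 2.5, "phi": 0.2},
    {"pt": 100.0, "eta": -2.5, "phi": 0.5},
]
toy_pass = passes_vbf_selection(toy_jets, met=200.0, n_leptons=0)
```

Adding a third jet with \({p_{\mathrm {T}}}>30\) \(\text {GeV}\) at central pseudorapidity to the toy event would cause it to fail the central-jet veto.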

The lepton and central jet veto thresholds are set to low values at which reconstruction is known to be reliable, while the remaining thresholds are determined by optimizing the selection to give the best signal significance, calculated using a profile likelihood method that incorporates all systematic uncertainties, for a Higgs boson with \(m_{\mathrm {H}}=125\) \(\text {GeV}\) and 100 % invisible branching fraction. The thresholds on jet \({p_{\mathrm {T}}}\), \({M_\mathrm {jj}}\), and \(E_{\mathrm {T}}^{\text {miss}}\) are constrained to be above the point where the trigger is 95 % efficient. This constraint effectively determines the jet \({p_{\mathrm {T}}}\)  and \(E_{\mathrm {T}}^{\text {miss}}\) thresholds, since signal significance only worsens when these thresholds are raised above this point. Distributions of \({M_\mathrm {jj}}\), \(\Delta \eta _\mathrm {jj}\), \(\Delta \phi _\mathrm {jj}\), and central jet \({p_{\mathrm {T}}}\) in background and signal MC simulation are shown in Fig. 2, along with the thresholds applied after optimization of the selection.

Fig. 2 Distributions of \({M_\mathrm {jj}}\), \(\Delta \eta _\mathrm {jj}\) (top left), \(\Delta \phi _\mathrm {jj}\) (top right), and central jet \({p_{\mathrm {T}}}\) (bottom right) in background and signal MC simulation. The distributions are shown after requiring two jets with \({p_{\mathrm {T}}}^\mathrm {j1},{p_{\mathrm {T}}}^\mathrm {j2} > 50\) \(\text {GeV}\), \({|\eta |} < 4.7\), \(\eta _\mathrm {j1} \cdot \eta _\mathrm {j2} < 0\), \({M_\mathrm {jj}}>150\) \(\text {GeV}\), and \(E_{\mathrm {T}}^{\text {miss}}> 130\) \(\text {GeV}\). The arrows correspond to the thresholds applied for the final selection, after optimization

After all selection requirements, a hypothetical signal equivalent to a 125 \(\text {GeV}\) Higgs boson with \(\mathcal {B}(\mathrm {H}\rightarrow \text {inv})=100\) %, produced via the VBF process with SM couplings, is reconstructed with an efficiency of \((6.8 \pm 0.3) \times 10^{-3}\), corresponding to a yield of \(210 \pm 29\text {(syst)}\) events. The requirements on the VBF tag jet \({p_{\mathrm {T}}}\) and topology, \({M_\mathrm {jj}}\), and \(E_{\mathrm {T}}^{\text {miss}}\) are all correlated and affect the signal efficiency by comparable amounts. A small signal yield from the gluon-fusion process is also expected, where the VBF requirements may be satisfied by initial-state radiation. Based on powheg simulation, we estimate this to be \(14 \pm 10\text {(syst)}\) events.

5.3 Background estimation

The \({\mathrm {Z}}(\nu \nu )\text {+jets}\) background is estimated from data using observable \({\mathrm {Z}}(\mu \mu )\) decays. We define a \({\mathrm {Z}}\) control region as for the signal region, with the following changes to the event selection: the lepton veto is replaced with a requirement of an oppositely charged pair of well reconstructed and isolated muons each with \({p_{\mathrm {T}}}> 20 \) \(\text {GeV}\), and invariant mass \(60 < M_{\mu \mu }<120\) \(\text {GeV}\), a veto is applied on any additional leptons with \({p_{\mathrm {T}}}>10\) \(\text {GeV}\), and the \(E_{\mathrm {T}}^{\text {miss}}\) is recomputed after removing the muons from the \({\mathrm {Z}}\) boson decay. The number of \({\mathrm {Z}}(\nu \nu )\) events in the signal region is then predicted using:

$$\begin{aligned} N^\mathrm {s}_{\nu \nu } = (N^\mathrm {c}_{\mu \mu \text {obs}} - N^\mathrm {c}_\text {bkg}) \cdot \frac{\sigma ({\mathrm {Z}}\rightarrow \nu \nu )}{\sigma ({\mathrm {Z}}/\gamma ^{*} \rightarrow \mu \mu )} \cdot \frac{\varepsilon ^\mathrm {s}_{{\mathrm {Z}}\mathrm {MC}}}{\varepsilon ^\mathrm {c}_{{\mathrm {Z}}\mathrm {MC}}}. \end{aligned}$$
(1)

The ratio of cross sections, \(\sigma ({\mathrm {Z}}\rightarrow \nu \nu ) / \sigma ({\mathrm {Z}}/\gamma ^{*} \rightarrow \mu \mu ) = 5.651 \pm 0.023\text {(syst)}\), is calculated with mcfm [57] for \(m_{{\mathrm {Z}}/\gamma ^{*}}>50\) \(\text {GeV}\), the mass range of the MC sample. The selection efficiencies in the signal region, \(\varepsilon ^\mathrm {s}_{{\mathrm {Z}}\mathrm {MC}} = (1.65 \pm 0.27\text {(syst)}) \times 10^{-6}\), and the control region, \(\varepsilon ^\mathrm {c}_{{\mathrm {Z}}\mathrm {MC}}=(1.11 \pm 0.17\text {(syst)}) \times 10^{-6}\), are estimated from DY(\(\ell \ell \))+jets simulation, ignoring the muons when computing the efficiency in the signal region. The observed yield in the control region is \(N^\mathrm {c}_{\mu \mu \text {obs}} = 12\) events. The background in the control region, estimated from \({\mathrm {t}\overline{\mathrm {t}}}\), diboson and single-top MC samples, is \(N^\mathrm {c}_\text {bkg}=0.23 \pm 0.15\text {(syst)}\) events. The resulting estimate of the \({\mathrm {Z}}(\nu \nu )\) background in the signal region is \(99 \pm 29\text {(stat)}\pm 25\text {(syst)}\) events. The sources of systematic uncertainty in the background estimates are described in Sect. 5.4. Figure 3 shows the \(E_{\mathrm {T}}^{\text {miss}}\) and dijet invariant mass, \({M_\mathrm {jj}}\), distributions with a relaxed set of criteria for the \({\mathrm {Z}}\) control region, with \({M_\mathrm {jj}}>1{,}000\) \(\text {GeV}\) and no requirements on \(\Delta \eta _\mathrm {jj}\), \(\Delta \phi _\mathrm {jj}\), or CJV. In this figure, the simulated background is normalized to the data. It should be noted that our estimates of the dominant V+jets background are insensitive to the overall normalization of the simulation, which cancels in the ratio.
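As a numerical check, the central values quoted above can be substituted into Eq. (1). The statistical uncertainty is approximated here by the Poisson uncertainty \(\sqrt{N^\mathrm {c}_{\mu \mu \text {obs}}}\) on the control-region yield, which is an assumption of this worked example and not necessarily the procedure used in the analysis:

$$\begin{aligned} N^\mathrm {s}_{\nu \nu } \approx (12 - 0.23) \times 5.651 \times \frac{1.65}{1.11} \approx 99, \qquad \delta N^\mathrm {s}_{\nu \nu }\text {(stat)} \approx \sqrt{12} \times 5.651 \times \frac{1.65}{1.11} \approx 29. \end{aligned}$$

Both values agree with the quoted estimate within rounding. The relative statistical uncertainty of about 29 % follows directly from the small control-region yield of 12 events.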

Fig. 3 The \(E_{\mathrm {T}}^{\text {miss}}\) (top) and \({M_\mathrm {jj}}\) (bottom) distributions in the relaxed \({\mathrm {Z}}\) control region of the VBF search, with no requirements on \(\Delta \eta _\mathrm {jj}\), \(\Delta \phi _\mathrm {jj}\), or CJV, and with the \({M_\mathrm {jj}}\) requirement relaxed to 1,000 \(\text {GeV}\). The simulated background from different processes is shown cumulatively, and normalized to the data, with its systematic uncertainty shown as a hatched region. The lower panels show the ratio of data to the simulated background, again with the systematic uncertainty shown as a hatched region

The \({\mathrm {W}}(\mathrm {e}\nu )\text {+jets}\) and \({\mathrm {W}}(\mu \nu )\text {+jets}\) backgrounds are estimated from single-lepton control samples. We define \({\mathrm {W}}(\mu \nu )\) and \({\mathrm {W}}(\mathrm {e}\nu )\) control regions in a similar way to the \({\mathrm {Z}}\) boson background. In the \({\mathrm {W}}(\mu \nu )\) region, the lepton veto is replaced with a single \(\mu \) requirement and a veto on any additional leptons, and the \(E_{\mathrm {T}}^{\text {miss}}\) is recomputed after removing the muon from the \({\mathrm {W}}\) boson decay. The \({\mathrm {W}}(\mathrm {e}\nu )\) region is defined similarly, with a single electron requirement and additional lepton veto, but here the \(E_{\mathrm {T}}^{\text {miss}}\) is not recomputed, since the electron energy is already included in the \(E_{\mathrm {T}}^{\text {miss}}\) at trigger level. The number of \({\mathrm {W}}(\ell \nu )\) (where \({\ell =\mathrm {e},\mu }\)) events in the signal region, \(N^\mathrm {s}_{\ell }\) is then estimated using:

$$\begin{aligned} N^\mathrm {s}_\ell = (N^\mathrm {c}_{\ell \text {obs}} - N^\mathrm {c}_\text {bkg}) \cdot \frac{N^\mathrm {s}_{{\mathrm {W}}\mathrm {MC}}}{N^\mathrm {c}_{{\mathrm {W}}\mathrm {MC}}}, \end{aligned}$$
(2)

where \(N^\mathrm {s}_{{\mathrm {W}}\mathrm {MC}}\) and \(N^\mathrm {c}_{{\mathrm {W}}\mathrm {MC}}\) are the number of events in the signal and control regions in the \({\mathrm {W}}(\ell \nu )\text {+jets}\) MC simulation. The ratio \(N^\mathrm {s}_{{\mathrm {W}}\mathrm {MC}}/N^\mathrm {c}_{{\mathrm {W}}\mathrm {MC}}\) is equal to \(0.347 \pm 0.045\text {(syst)}\) for \({\mathrm {W}}(\mu \nu )\) and \(1.08 \pm 0.21\text {(syst)}\) for \({\mathrm {W}}(\mathrm {e}\nu )\). In the \({\mathrm {W}}(\mu \nu )\) control region the observed yield is 223 events, with a background of \(30.4 \pm 7.0\text {(syst)}\) events. The observed yield in the \({\mathrm {W}}(\mathrm {e}\nu )\) control region is 65 events, with a background of \(7.1 \pm 4.7\text {(syst)}\) events. The \({\mathrm {W}}(\mu \nu )\) background in the signal region is then estimated to be \(66.8 \pm 5.2\text {(stat)}\pm 15.7\text {(syst)}\) events, and the \({\mathrm {W}}(\mathrm {e}\nu )\) background to be \(62.7 \pm 8.7\text {(stat)}\pm 18.1\text {(syst)}\) events.
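The extrapolation of Eq. (2) can be reproduced from the numbers quoted above. The following Python sketch uses a hypothetical helper, `w_background_estimate`, and approximates the statistical uncertainty by the Poisson uncertainty on the observed control-region yield; the latter is an assumption of this illustration.

```python
import math

def w_background_estimate(n_obs, n_bkg, transfer_factor):
    """Signal-region estimate from a control region, following Eq. (2).

    n_obs           : observed yield in the control region
    n_bkg           : estimated non-W background in the control region
    transfer_factor : ratio of simulated signal-region to control-region yields
    Returns (central value, statistical uncertainty), with the statistical
    uncertainty taken as sqrt(n_obs) scaled by the transfer factor.
    """
    central = (n_obs - n_bkg) * transfer_factor
    stat = math.sqrt(n_obs) * transfer_factor
    return central, stat

# Inputs quoted in the text for the two single-lepton control regions.
w_mu, w_mu_stat = w_background_estimate(223, 30.4, 0.347)
w_e, w_e_stat = w_background_estimate(65, 7.1, 1.08)
```

With the rounded inputs given in the text, the results lie close to the quoted \(66.8 \pm 5.2\text {(stat)}\) and \(62.7 \pm 8.7\text {(stat)}\) events; small differences in the last digit are expected from the rounding of the transfer factors.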

The background arising from \({\mathrm {W}}(\tau \nu )\text {+jets}\), where the tau lepton decays hadronically (\(\mathrm {\tau }_\mathrm {h}\)) is estimated using a slightly different method, since a tau lepton veto is not applied in the invisible Higgs boson signal selection. Hadronically decaying taus are reconstructed using the “hadron plus strips” algorithm [58]. This uses charged hadrons and neutral electromagnetic objects (photons) to reconstruct hadronic tau decay modes with one or three charged particles, in the range \({|\eta |} < 2.3\). A control region is defined, requiring one hadronic tau with \({p_{\mathrm {T}}}>20\) \(\text {GeV}\) and \({|\eta |}<2.3\), no additional leptons, and the remaining signal region selection. However, in the \({\mathrm {W}}(\mathrm {\tau }_\mathrm {h}\nu )\) control region, the CJV is not applied in order to increase the yield. The number of \({\mathrm {W}}(\mathrm {\tau }_\mathrm {h}\nu )\) events in the signal region, \(N^\mathrm {s}_{\mathrm {\tau }_\mathrm {h}}\), is then estimated from the control region in the same way as the \({\mathrm {W}}(\mu \nu )\) and \({\mathrm {W}}(\mathrm {e}\nu )\) backgrounds. A yield of 32 events is observed in the control region, with the background estimated from the MC simulation to be \(15.2 \pm 3.6\text {(syst)}\) events, giving an estimate of the \({\mathrm {W}}(\mathrm {\tau }_\mathrm {h}\nu )\) background in the signal region of \(53 \pm 18\text {(stat)}\pm 18\text {(syst)}\) events.

In order to cross-check the backgrounds from V+jets processes (where V represents either a \({\mathrm {W}}\) or a \({\mathrm {Z}}\) boson), which dominate in the signal region, the \({\mathrm {W}}(\mu \nu )\) control region and MC simulation are used to compute yields in other control regions. For example, the yield in the \({\mathrm {Z}}(\mu \mu )\) region is given by:

$$\begin{aligned} N^\mathrm {c}_{\mu \mu } = (N^\mathrm {c}_{\mu \text {obs}} - N^\mathrm {c}_\text {bkg}) \cdot \frac{N^\mathrm {c}_{{\mathrm {Z}}\mathrm {MC}}}{N^\mathrm {c}_{{\mathrm {W}}\mathrm {MC}}}. \end{aligned}$$
(3)

Similar expressions are used to estimate yields in the \({\mathrm {W}}(\mathrm {e}\nu )\) and \({\mathrm {W}}(\mathrm {\tau }_\mathrm {h}\nu )\) control regions. In all cases, the predictions from data agree with the observed yield within the uncertainty.

The QCD multijet background in the signal region is estimated using the fractions of events passing the \(E_{\mathrm {T}}^{\text {miss}}\) and CJV requirements. We define regions A, B, C, and D as follows, after the full remaining selection:

  • A: fail \(E_{\mathrm {T}}^{\text {miss}}\) selection, fail CJV selection;

  • B: pass \(E_{\mathrm {T}}^{\text {miss}}\) selection, fail CJV selection;

  • C: fail \(E_{\mathrm {T}}^{\text {miss}}\) selection, pass CJV selection;

  • D: pass \(E_{\mathrm {T}}^{\text {miss}}\) selection, pass CJV selection.

We estimate the QCD multijet component in regions A, B, and C from data, after subtracting the electroweak backgrounds using estimations from simulation. The QCD multijet component in the signal region D can then be estimated using \(N_\mathrm {D} = N_\mathrm {B} N_\mathrm {C} / N_\mathrm {A}\), where \(N_{i}\) is the number of events in region \(i\). This method is based on the assumption that the \(E_{\mathrm {T}}^{\text {miss}}\) and the CJV are uncorrelated, which has been checked by comparing the \(E_{\mathrm {T}}^{\text {miss}}\) distribution, below the 130 \(\text {GeV}\) threshold, in events passing and failing the CJV. The maximum difference in the \(E_{\mathrm {T}}^{\text {miss}}\) distribution between these two samples is 40 %, which is assigned as a systematic uncertainty of the method. We predict the QCD background in the signal region to be \(30.9 \pm 4.8\text {(stat)}\pm 23.0\text {(syst)}\) events. Furthermore, the method is tested on a high-statistics sample with selections equivalent to those in the signal region, but dominated by QCD multijet events, obtained by changing the \(\Delta \phi _\mathrm {jj}\) requirement to \(\Delta \phi _\mathrm {jj}>2.6\) radians. In this sample, we observe \(2{,}551 \pm 57\text {(stat)}\) events in the pseudo-signal region after subtraction of backgrounds, which are estimated from MC simulation. The QCD multijet component is predicted to be \(2{,}959 \pm 58\text {(stat)}\), which is compatible with the observation within the systematic uncertainty. To give further confidence in this estimate, we perform a cross-check using an ABCD method based on the \(E_{\mathrm {T}}^{\text {miss}}\) and \(\Delta \phi _\mathrm {jj}\) variables, which gives a prediction consistent with the main method.
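The ABCD extrapolation described above can be sketched in a few lines of Python. The yields in regions A, B, and C are not quoted in the text, so the numbers below are invented toy values; the function name `abcd_estimate` and the Poisson error propagation are likewise assumptions of this illustration.

```python
import math

def abcd_estimate(n_a, n_b, n_c):
    """Predict the yield in region D as N_B * N_C / N_A.

    The inputs are background-subtracted yields in regions A, B, and C.
    The returned statistical uncertainty assumes independent Poisson
    fluctuations in the three regions (an approximation that ignores the
    uncertainty on the subtracted backgrounds).
    """
    if n_a <= 0 or n_b <= 0 or n_c <= 0:
        raise ValueError("all control-region yields must be positive")
    n_d = n_b * n_c / n_a
    rel_stat = math.sqrt(1.0 / n_a + 1.0 / n_b + 1.0 / n_c)
    return n_d, n_d * rel_stat

# Toy yields (invented): A fails both requirements, B passes only the
# missing-energy requirement, C passes only the central-jet veto.
toy_d, toy_d_stat = abcd_estimate(n_a=400.0, n_b=100.0, n_c=40.0)
```

The estimate is only as good as the assumption that the two variables are uncorrelated, which is why the text assigns the observed 40 % shape difference as a systematic uncertainty.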

The remaining SM backgrounds in the signal region—due to \({\mathrm {t}\overline{\mathrm {t}}}\), single-top, VV and DY(\(\ell \ell \))+jets—are estimated from MC simulation to be \(20.0 ^{+6.0}_{-8.2}\text {(syst)}\) events. The total expected background is \(332 \pm 36\text {(stat)}\pm 45\text {(syst)}\). The background estimates are summarised in Table 1 along with the expected yield for a signal with \(m_{\mathrm {H}}=125\) \(\text {GeV}\) and \(\mathcal {B}(\mathrm {H}\rightarrow \text {inv})=100\) %.

Table 1 Summary of the estimated number of background and signal events, together with the observed yield, in the VBF search signal region. The signal yield is given for \(m_{\mathrm {H}}=125\) \(\text {GeV}\) and \(\mathcal {B}(\mathrm {H}\rightarrow \text {inv})=100\) %

5.4 Systematic uncertainty

The V+jets background estimates are affected by large statistical uncertainties, ranging from 5 to 30 %, due to the limited size of the control samples in data. The systematic uncertainty in the V+jets background estimates is dominated by the statistical uncertainty in the MC samples used to calculate the control-to-signal region translation factors. Additional important uncertainties arise due to jet and \(E_{\mathrm {T}}^{\text {miss}}\) energy scale and resolution. These are estimated by varying the scales and resolutions associated with jets and unclustered energy within their uncertainties and recomputing the \(E_{\mathrm {T}}^{\text {miss}}\), resulting in a 13 % systematic uncertainty in the signal acceptance; 7–15 % in the V+jets background estimates; and 60 % uncertainty in the QCD multijet background estimate. We assign a further 40 % uncertainty to the QCD background estimate, as described in Sect. 5.3. Although the uncertainty on the QCD background is large, it is a small component of the total background. Small uncertainties in the muon and electron efficiency arise from uncertainties on the scale factors used to correct MC simulation to data, mentioned in Sect. 3. For the minor backgrounds estimated from MC, the dominant uncertainties are those associated with the cross sections, which are set according to the corresponding CMS cross section measurements, and the jet/\(E_{\mathrm {T}}^{\text {miss}}\) scale uncertainties. We consider theoretical uncertainties in the vector boson fusion signal yield resulting from PDF uncertainties and factorization and renormalization scale uncertainties. The uncertainty in the gluon fusion signal yield is dominated by MC modelling of initial-state radiation, amongst other effects, and is estimated to be 60 % by comparing different MC generators. This has a modest overall effect since the gluon fusion yield is small. These uncertainties are summarized in Table 2, where they are quoted with respect to the total background or signal yield. The combined effect of all background uncertainties results in a relative increase of about 65 % in the expected upper limit on the \(\mathcal {B}(\mathrm {H}\rightarrow \text {inv})\).

Table 2 Summary of the uncertainties in the total background and signal yields in the VBF channel. All uncertainties affect the normalization of the yield, and are quoted as the change in the total background or signal estimate, when each systematic effect is varied according to its uncertainties. The signal uncertainties are given for \(m_{\mathrm {H}}=125\) \(\text {GeV}\) and \(\mathcal {B}(\mathrm {H}\rightarrow \text {inv})=100\) %

5.5 Results

As shown in Table 1, we observe 390 events in the signal region in data, compatible with the background-only prediction. Figure 4 shows the \(E_{\mathrm {T}}^{\text {miss}}\) and \({M_\mathrm {jj}}\) distributions in data and simulated backgrounds in the signal region. The simulated V+jets backgrounds shown in this figure are normalized to the estimates from data given in Table 1.
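A back-of-the-envelope way to gauge this compatibility is to compare the excess over the expected background with the total uncertainty. The sketch below adds the statistical and systematic background uncertainties and the Poisson variance of the expected count in quadrature. This Gaussian approximation is an assumption of the example and is not the statistical procedure used to set the limits.

```python
import math

observed = 390.0
expected = 332.0
bkg_stat, bkg_syst = 36.0, 45.0

# Gaussian approximation: background uncertainties plus Poisson variance
# of the expected yield, all added in quadrature.
sigma = math.sqrt(bkg_stat**2 + bkg_syst**2 + expected)
pull = (observed - expected) / sigma
```

Under these assumptions the excess of 58 events corresponds to a pull of just under one standard deviation.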

Fig. 4

The \(E_{\mathrm {T}}^{\text {miss}}\) (top) and \({M_\mathrm {jj}}\) (bottom) distributions in data and MC after the full selection in the VBF search signal region. The simulated background from different processes is normalized to the estimates obtained from control samples in data, and shown cumulatively, with the total systematic uncertainty shown as a hatched region. Note that the QCD multijet background is not shown due to limited MC statistics, which results in a small apparent discrepancy between data and the backgrounds shown at low values of \(E_{\mathrm {T}}^{\text {miss}}\) and \({M_\mathrm {jj}}\). The cumulative effect of a signal from a Higgs boson with SM VBF production cross section, \(m_{\mathrm {H}}=125\) \(\text {GeV}\) and \(\mathcal {B}(\mathrm {H}\rightarrow \text {inv})\) = 100 % is also shown

6 Search for \(\mathrm {Z}(\ell \ell )\mathrm {H}(\text {inv})\)

6.1 Search strategy

The final state in the \(\mathrm {Z}(\ell \ell )\mathrm {H}(\text {inv})\) channel consists of a pair of high-\({p_{\mathrm {T}}}\) isolated leptons from the \({\mathrm {Z}}\) boson decay, high \(E_{\mathrm {T}}^{\text {miss}}\) from the undetectable Higgs boson decay products, and limited jet activity. Since the signal cross section is orders of magnitude lower than those for inclusive DY+jets, \({\mathrm {W}}\text {+jets}\), and \({\mathrm {t}\overline{\mathrm {t}}}\), stringent requirements are needed to isolate the signal. We apply an event selection that is optimized for \(m_{\mathrm {H}}=125\) \(\text {GeV}\) while still being suitable for the other Higgs boson mass values considered. After this selection, the dominant backgrounds arise from \({\mathrm {Z}}{\mathrm {Z}}\) and \({\mathrm {W}}{\mathrm {Z}}\) processes, which are modelled using MC simulation. Smaller background contributions, from DY+jets, \({\mathrm {t}\overline{\mathrm {t}}}\), \({\mathrm {W}}{\mathrm {W}}\), and \({\mathrm {W}}\)+jets, are modelled using control regions in data. For each value of the Higgs boson mass, the final background and signal yields used to calculate limits are obtained from a fit to the two-dimensional distribution of the transverse mass, \({m_\mathrm {T}}\), of the dilepton-\(E_{\mathrm {T}}^{\text {miss}}\) system, and the azimuthal separation of the two leptons.

6.2 Event selection

We use dielectron and dimuon triggers with \({p_{\mathrm {T}}}>17\) \(\text {GeV}\) (\({p_{\mathrm {T}}}\!>\!8\) \(\text {GeV}\)) thresholds for the leading (subleading) lepton, together with single-muon triggers that allow recovery of some residual trigger inefficiencies. For data taken during periods when the instantaneous luminosity was low enough to allow it, we also use a dimuon trigger with a \({p_{\mathrm {T}}}>7\) \(\text {GeV}\) threshold for each muon.

The offline selection starts by requiring two well-identified, isolated leptons of the same flavor and opposite sign (\(\mathrm {e}^+\mathrm {e}^-\) or \(\mathrm {\mu ^+}\mathrm {\mu ^-}\)), each with \({p_{\mathrm {T}}}> 20\,\text {GeV}\). The invariant mass of the pair must be within \(\pm \)15 \(\text {GeV}\) of the \(\mathrm {Z}\) boson mass. To reduce the large potential background from DY(\(\ell \ell \))+jets events, where the \(E_{\mathrm {T}}^{\text {miss}}\) arises from mismeasurement, any event containing two or more jets with \({p_{\mathrm {T}}}>30\,\text {GeV}\) is rejected. The remaining zero- and one-jet samples are treated separately in the analysis because of their significantly different signal-to-background ratios.

The top-quark background is further suppressed by rejecting events containing a bottom-quark decay identified either by the presence of a soft muon or by the CSV b-tagging algorithm described in Sect. 2. The tagged b jet is required to have \({p_{\mathrm {T}}}>\)20 \(\text {GeV}\) and to be reconstructed within the tracker acceptance volume (i.e. \({|\eta |} < 2.5\)). The soft muon is required to have \({p_{\mathrm {T}}}> 3\,\text {GeV}\).

To reduce the \({\mathrm {W}}\mathrm {Z}\) background in which both bosons decay leptonically, events containing additional electrons or muons with \({p_{\mathrm {T}}}> 10\,\text {GeV}\) are rejected. After all selection requirements, most of the remaining \({\mathrm {W}}\mathrm {Z}\) background is from the decay mode \(({\mathrm {W}}\rightarrow \tau \nu )({\mathrm {Z}}\rightarrow \ell \ell )\).

The remaining event selection uses three variables: \(E_{\mathrm {T}}^{\text {miss}}\), \(\Delta \phi ({\ell \ell ,E_{\mathrm {T}}^{\text {miss}}})\), and \(|E_{\mathrm {T}}^{\text {miss}}-{p_{\mathrm {T}}}^{\ell \ell }|/{p_{\mathrm {T}}}^{\ell \ell }\), where \({p_{\mathrm {T}}}^{\ell \ell }\) is the transverse momentum of the dilepton system. The last two variables effectively suppress reducible background processes like DY(\(\ell \ell \))+jets and top-quark production. We optimized the selection criteria applied to these variables in order to obtain the best expected exclusion limits at 95 % CL for \(m_{\mathrm {H}}=125\) \(\text {GeV}\). For each possible set of selections, we repeat the full analysis, including the shape fits described in Sect. 6.5 below, the estimation of backgrounds from control data samples, and the systematic uncertainties. The final selection criteria obtained after optimization are: \(E_{\mathrm {T}}^{\text {miss}}> 120\,\text {GeV}\), \(\Delta \phi ({\ell \ell ,E_{\mathrm {T}}^{\text {miss}}}) > 2.7\) and \({|E_{\mathrm {T}}^{\text {miss}}-{p_{\mathrm {T}}}^{\ell \ell } |}/{p_{\mathrm {T}}}^{\ell \ell } < 0.25\). The efficiency of the full selection for the \(\mathrm {Z}(\ell \ell )\mathrm {H}(\text {inv})\) signal at \(m_{\mathrm {H}}=125\) \(\text {GeV}\) is 5.6 %, estimated from MC simulation.
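The kinematic part of this selection can be summarised in a few lines of Python. This is only a sketch: the function and argument names are hypothetical, the value used for the \(\mathrm {Z}\) boson mass is an assumed constant, and the lepton identification, b-tag, soft-muon and additional-lepton vetoes described above are not reproduced.

```python
import math

Z_MASS = 91.1876  # GeV; assumed reference value for this sketch

def delta_phi(phi1, phi2):
    """Absolute azimuthal separation folded into [0, pi]."""
    d = abs(phi1 - phi2) % (2.0 * math.pi)
    return 2.0 * math.pi - d if d > math.pi else d

def passes_zll_selection(m_ll, pt_ll, phi_ll, met, met_phi, n_jets):
    """Apply the mass window, jet veto and the three final requirements.

    n_jets is the number of jets with pT > 30 GeV; all momenta in GeV.
    """
    if abs(m_ll - Z_MASS) > 15.0:          # Z mass window
        return False
    if n_jets >= 2:                        # only 0- and 1-jet events are kept
        return False
    if met <= 120.0:                       # missing transverse energy
        return False
    if delta_phi(phi_ll, met_phi) <= 2.7:  # dilepton back-to-back with MET
        return False
    return abs(met - pt_ll) / pt_ll < 0.25  # MET / dilepton pT balance
```

For example, `passes_zll_selection(91.0, 140.0, 0.0, 150.0, 3.0, 0)` returns `True`, while lowering the missing transverse energy to 100 GeV makes the same event fail.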

6.3 Background estimation

After the full selection, the dominant backgrounds arise from WZ and ZZ processes, which are modeled using MC simulation. The pre-fit normalization of these backgrounds is obtained from their respective NLO cross sections computed with MCFM.

The DY(\(\ell \ell \))+jets background is modeled from an orthogonal control sample of events with a single isolated photon produced in association with jets (\(\mathrm {\gamma }+\text {jets}\)). This choice has the advantage of providing a large statistics sample, which resembles \(\mathrm {Z}\) boson production in all important aspects: production mechanism, underlying event conditions, pileup scenario, and hadronic recoil [59]. The kinematic distributions and overall normalization of the \(\mathrm {\gamma }+\text {jets}\) events are matched to \(\mathrm {Z}(\ell \ell )+\text {jets}\) in data through event weights, determined as a function of the \(\mathrm {Z}\) boson \({p_{\mathrm {T}}}\) measured from data. This procedure takes into account the dependence of the \(E_{\mathrm {T}}^{\text {miss}}\) on the associated hadronic activity.
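The reweighting idea can be sketched as a ratio of binned \({p_{\mathrm {T}}}\) spectra. The helper names, the binning, and the use of simple per-bin ratios are assumptions made for illustration; the actual weights used in the analysis may be derived differently.

```python
from bisect import bisect_right

def binned_counts(values, edges, weights=None):
    """Histogram values into half-open bins [edge_i, edge_i+1); others are dropped."""
    counts = [0.0] * (len(edges) - 1)
    if weights is None:
        weights = [1.0] * len(values)
    for v, w in zip(values, weights):
        i = bisect_right(edges, v) - 1
        if 0 <= i < len(counts):
            counts[i] += w
    return counts

def pt_weights(z_pt, gamma_pt, edges):
    """Per-bin weights N_Z / N_gamma mapping the photon pT spectrum onto the Z one."""
    n_z = binned_counts(z_pt, edges)
    n_g = binned_counts(gamma_pt, edges)
    return [z / g if g > 0 else 0.0 for z, g in zip(n_z, n_g)]

def event_weight(pt, edges, weights):
    """Weight for a single photon event, looked up by its pT bin."""
    i = bisect_right(edges, pt) - 1
    return weights[i] if 0 <= i < len(weights) else 0.0

# Toy inputs (hypothetical pT values in GeV).
edges = [0.0, 50.0, 100.0]
z_pt = [10.0, 60.0, 70.0, 80.0]
gamma_pt = [5.0, 15.0, 25.0, 35.0, 70.0, 80.0]
weights = pt_weights(z_pt, gamma_pt, edges)
reweighted = binned_counts(
    gamma_pt, edges, [event_weight(p, edges, weights) for p in gamma_pt]
)
```

By construction, the reweighted photon spectrum reproduces both the shape and the normalization of the \(\mathrm {Z}\) spectrum in every populated bin.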

Further discrepancies can arise from differences in the pileup distribution of the \(\mathrm {\gamma }+\text {jets}\) sample, because the photon data were collected with triggers whose prescales varied as a function of photon threshold and data-taking period. These are taken into account by further weighting events in the control sample according to the distribution of the number of reconstructed vertices in the signal sample. The electroweak backgrounds to the control sample, involving photons and neutrinos, are subtracted using predictions from MC simulation.

This procedure yields an accurate model of the \(E_{\mathrm {T}}^{\text {miss}}\) distribution in DY(\(\ell \ell \))+jets events, as shown in Fig. 5 (left), which compares the \(E_{\mathrm {T}}^{\text {miss}}\) distribution of the weighted \(\mathrm {\gamma }\text {+jets}\) events, summed with other backgrounds, to the \(E_{\mathrm {T}}^{\text {miss}}\) distribution of the dilepton events in data. Figure 5 also compares the distributions of (center) \(\Delta \phi (\ell \ell ,E_{\mathrm {T}}^{\text {miss}})\) and (right) \(|E_{\mathrm {T}}^{\text {miss}}-{p_{\mathrm {T}}}^{\ell \ell }|/{p_{\mathrm {T}}}^{\ell \ell }\) obtained from this background model to the same distributions in the dilepton sample. The difference between data and background predictions is less than 10 % in these distributions, which is negligible compared to the estimated systematic uncertainties after the final selection. The uncertainties in the electroweak background to the photon control sample yield a 100 % uncertainty in the normalization of the residual DY(\(\ell \ell \))+jets background. However, since the Drell–Yan background after the full selection is very small, the large uncertainty has negligible impact on the final results.

Fig. 5

The distributions of \(E_{\mathrm {T}}^{\text {miss}}\) (left), \(\Delta \phi ({\ell \ell ,E_{\mathrm {T}}^{\text {miss}}})\) (center), and \(|E_{\mathrm {T}}^{\text {miss}}-{p_{\mathrm {T}}}^{\ell \ell }|/{p_{\mathrm {T}}}^{\ell \ell }\) (right) in data compared to the estimated background from simulation (\({\mathrm {W}}\) \({\mathrm {Z}}\) and \(\mathrm {Z}\mathrm {Z}\)) or data (all other channels), before the optimization of the selection. The expected distributions from different background processes are displayed cumulatively, while a signal corresponding to \(m_{\mathrm {H}}=125\) \(\text {GeV}\) and \(\mathcal {B}(\mathrm {H}\rightarrow \text {inv})\) = 100 % is superimposed separately. The arrows correspond to the cuts applied for the final selection as described at the end of Sect. 6.2. The statistical uncertainty in the background estimate is shown as a hatched region. The plots show the electron and muon channels combined. The lower panels show the ratio of data to the simulated background, again with the statistical uncertainty in the background shown as a hatched region

The remaining background processes do not involve \({\mathrm {Z}}\) boson production, and are referred to as non-resonant backgrounds. Such backgrounds arise mainly from leptonic \({\mathrm {W}}\) boson decays in \({\mathrm {t}\overline{\mathrm {t}}}\), \(\mathrm {t}{\mathrm {W}}\) decays and \({\mathrm {W}}{\mathrm {W}}\) events. Also included in the estimate of non-resonant backgrounds are small contributions from single-top-quark events produced from \(s\)- and \(t\)-channel processes, \({\mathrm {W}}\text {+jets}\) production, and \({\mathrm {Z}}\rightarrow \mathrm {\tau }\mathrm {\tau }\) events in which \(\mathrm {\tau }\) leptons produce electrons/muons and \(E_{\mathrm {T}}^{\text {miss}}\).

We estimate these backgrounds using a control sample in data, consisting of events with opposite-charge different-flavor dilepton pairs (\(\mathrm {e}^{\pm }\mathrm {\mu }^{\mp }\)) that otherwise pass the full selection. The backgrounds in the \(\mathrm {e}^+\mathrm {e}^-\) and \(\mathrm {\mu ^+}\mathrm {\mu ^-}\) final states are then estimated by applying scale factors (\(\alpha _{\mathrm {e}\mathrm {e}}\), \(\alpha _{\mathrm {\mu }\mathrm {\mu }}\)) to the number of events in the control sample, \(N_{\mathrm {e}\mathrm {\mu }}\):

$$\begin{aligned} N_{\mathrm {e}\mathrm {e}} = \alpha _{\mathrm {e}\mathrm {e}} \times N_{\mathrm {e}\mathrm {\mu }}, \qquad N_{\mathrm {\mu }\mathrm {\mu }} = \alpha _{\mathrm {\mu }\mathrm {\mu }} \times N_{\mathrm {e}\mathrm {\mu }}. \end{aligned}$$
(4)

We compute the two factors \(\alpha _{\mathrm {e}\mathrm {e}}\) and \(\alpha _{\mathrm {\mu }\mathrm {\mu }}\) in the sidebands (SB) of the \(\mathrm {Z}\) peak (\(40< m_{\ell \ell }<70\) \(\text {GeV}\) and \(110< m_{\ell \ell }< 200\) \(\text {GeV}\)) by using the following relations:

$$\begin{aligned} \alpha _{\mathrm {e}\mathrm {e}} = \frac{N_{\mathrm {e}\mathrm {e}}^\mathrm {SB}}{N_{\mathrm {e}\mathrm {\mu }}^\mathrm {SB}}, \qquad \alpha _{\mathrm {\mu }\mathrm {\mu }} = \frac{N_{\mathrm {\mu }\mathrm {\mu }}^\mathrm {SB}}{N_{\mathrm {e}\mathrm {\mu }}^\mathrm {SB}}, \end{aligned}$$
(5)

where \(N_{\mathrm {e}\mathrm {e}}^\mathrm {SB}\), \(N_{\mathrm {\mu }\mathrm {\mu }}^\mathrm {SB}\), and \(N_{\mathrm {e}\mathrm {\mu }}^\mathrm {SB}\) are the numbers of events in the \(\mathrm {Z}\) sidebands counted in a top-quark-enriched sample of \(\mathrm {e}^+\mathrm {e}^-\), \(\mathrm {\mu ^+}\mathrm {\mu ^-}\), and \(\mathrm {e}^{\pm }\mathrm {\mu }^{\mp }\) final states, respectively. The requirements for this sample are \(E_{\mathrm {T}}^{\text {miss}}>65\) \(\text {GeV}\), \({p_{\mathrm {T}}}^{\ell \ell } >50\) \(\text {GeV}\), \(0.4 <E_{\mathrm {T}}^{\text {miss}}/{p_{\mathrm {T}}}^{\ell \ell } < 1.8\), and a \(\mathrm {b}\)-tagged jet. The kinematic requirements are looser than in the signal region, in order to reduce the statistical uncertainties in the scale factors. The measured values of these factors with the corresponding statistical uncertainties are \(\alpha _{\mathrm {e}\mathrm {e}}^{7\,\text {TeV}} = 0.42 \pm 0.04\), \(\alpha _{\mathrm {\mu }\mathrm {\mu }}^{7\,\text {TeV}}= 0.64 \pm 0.06\) and \(\alpha _{\mathrm {e}\mathrm {e}}^{8\,\text {TeV}} = 0.43 \pm 0.02\), \(\alpha _{\mathrm {\mu }\mathrm {\mu }}^{8\,\text {TeV}} = 0.69 \pm 0.03\). The validity of the procedure for computing the scale factors is checked by closure tests on simulated samples. This method accounts for possible differences in probability for electrons and muons to pass the trigger and selection requirements. We also cross-check the method by calculating \(\alpha _{\mathrm {e}\mathrm {e}}\) and \(\alpha _{\mathrm {\mu }\mathrm {\mu }}\) from the \(\mathrm {Z}\) peak region as follows:

$$\begin{aligned} \alpha _{\mathrm {e}\mathrm {e}} = \frac{1}{2}\sqrt{\frac{N_{\mathrm {e}\mathrm {e}}^\text {peak}}{N_{\mathrm {\mu }\mathrm {\mu }}^\text {peak}}}, \qquad \alpha _{\mathrm {\mu }\mathrm {\mu }} = \frac{1}{2}\sqrt{\frac{N_{\mathrm {\mu }\mathrm {\mu }}^\text {peak}}{N_{\mathrm {e}\mathrm {e}}^\text {peak}}}, \end{aligned}$$
(6)

where \(N_{\mathrm {e}\mathrm {e}}^\text {peak}\) and \(N_{\mathrm {\mu }\mathrm {\mu }}^\text {peak}\) are the numbers of dielectron and dimuon events in a \(\mathrm {Z}\) control sample. This method takes advantage of the equality between the production rates for \({\mathrm {Z}}\rightarrow \mathrm {e}\mathrm {e}\) and \({\mathrm {Z}}\rightarrow \mathrm {\mu }\mathrm {\mu }\) and equates the ratio of observed dilepton counts to the square of the ratio of efficiencies. From the comparison of methods and the closure tests, we derive an uncertainty of 25 % on the normalization of the non-resonant background in addition to the contribution from the statistical uncertainties on the control samples. The backgrounds in the signal region, estimated using the methods described above, are shown in Table 3, along with the expected yield for a signal with \(m_{\mathrm {H}}=125\) \(\text {GeV}\) and 100 % invisible branching fraction.
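One way to see why Eq. (6) has this form is to assume that the per-lepton efficiencies factorize and that the non-resonant background is flavor symmetric. Then \(N_{\mathrm {e}\mathrm {e}} \propto \epsilon _\mathrm {e}^2\), \(N_{\mathrm {\mu }\mathrm {\mu }} \propto \epsilon _\mathrm {\mu }^2\), and \(N_{\mathrm {e}\mathrm {\mu }} \propto 2\epsilon _\mathrm {e}\epsilon _\mathrm {\mu }\), so that \(N_{\mathrm {e}\mathrm {e}}/N_{\mathrm {e}\mathrm {\mu }} = \epsilon _\mathrm {e}/(2\epsilon _\mathrm {\mu })\). The Python sketch below implements Eqs. (4)-(6) under these assumptions. The function names and the toy counts are hypothetical, and the statistical uncertainty assumes independent Poisson counts.

```python
import math

def alpha_sideband(n_ll_sb, n_emu_sb):
    """Eq. (5): scale factor and its statistical uncertainty from sideband counts."""
    alpha = n_ll_sb / n_emu_sb
    err = alpha * math.sqrt(1.0 / n_ll_sb + 1.0 / n_emu_sb)
    return alpha, err

def alpha_peak(n_same_peak, n_other_peak):
    """Eq. (6): 0.5 * sqrt(N_same / N_other) from Z-peak counts."""
    return 0.5 * math.sqrt(n_same_peak / n_other_peak)

def nonresonant_estimate(alpha, n_emu):
    """Eq. (4): same-flavor non-resonant yield from the e-mu control sample."""
    return alpha * n_emu

# Toy check with assumed efficiencies eps_e = 0.6 and eps_mu = 0.8:
# N_ee is proportional to 0.36 and N_mumu to 0.64.
a_ee = alpha_peak(360.0, 640.0)    # expected eps_e / (2 eps_mu) = 0.375
a_mumu = alpha_peak(640.0, 360.0)  # expected eps_mu / (2 eps_e) = 0.6667
```

In the peak method the product \(\alpha _{\mathrm {e}\mathrm {e}}\alpha _{\mathrm {\mu }\mathrm {\mu }}\) equals 1/4 by construction, whereas the sideband method of Eq. (5) measures the two factors independently.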

Table 3 Observed yields, background estimates and signal predictions at \(\sqrt{s}=7\) \(\text {TeV}\) and 8 \(\text {TeV}\) in the \(\mathrm {Z}(\ell \ell )\mathrm {H}(\text {inv})\) channel. The signal yields are given for \(m_{\mathrm {H}}=125\) \(\text {GeV}\) and \(\mathcal {B}(\mathrm {H}\rightarrow \text {inv})=100\) %

6.4 Systematic uncertainty

Table 4 lists the systematic uncertainties affecting this search. The most important uncertainties are those associated with theory, affecting both the signal acceptance and the dominant \({\mathrm {W}}{\mathrm {Z}}\) and \({\mathrm {Z}}{\mathrm {Z}}\) backgrounds. The uncertainties arising from missing higher-order QCD corrections are estimated by scaling the renormalization and factorization scales up and down by a factor of two, while those associated with PDFs are estimated using the PDF4LHC prescription [35, 36].

Table 4 Summary of systematic uncertainties in the \(\mathrm {Z}(\ell \ell )\mathrm {H}(\text {inv})\) channel. The numbers indicate the change in the total background estimate or in the total signal acceptance when each systematic effect is varied according to its uncertainties. Those uncertainties designated as “Norm.” only affect the normalization of the contributions, while those designated “Shape” also affect the shapes of the \({m_\mathrm {T}}\) and/or \(\Delta \phi (\ell \ell )\) distributions. In the case of shape variations, the numbers indicate the range of changes across the bins of the distributions. Signal uncertainties are quoted for \(m_{\mathrm {H}}=125\) \(\text {GeV}\) and \(\mathcal {B}(\mathrm {H}\rightarrow \text {inv})=100\,\%\)

The uncertainties related to jet and \(E_{\mathrm {T}}^{\text {miss}}\) energy scale and resolution, lepton \({p_{\mathrm {T}}}\) scale, and reconstruction efficiency affect the signal and all backgrounds, and are estimated as for the search in the VBF mode (see Sect. 5.4).

Uncertainties of approximately \(100\) %, which are derived from the data by comparing different estimation methods and conducting closure tests, are assigned to the non-resonant backgrounds. Due to the small size of the control samples, the relative uncertainties are large, but the absolute contribution of these backgrounds is small.

The combined signal efficiency uncertainty is estimated to be about 12 %, and the total uncertainty in the background estimations is about 15 %, dominated by the theoretical uncertainties mentioned above. The combined effect of all systematic uncertainties results in a relative increase of about 35 % in the expected upper limit on the \(\mathcal {B}(\mathrm {H}\rightarrow \text {inv})\).

6.5 Results

As shown in Table 3, the total number of observed events is 134 with an estimated background of about 138 events, while the expected signal yield is 35 events. The final limits on a signal are determined using a profile likelihood fit to the normalizations and the shapes of selected distributions in the signal region. For the 8 \(\text {TeV}\) data, we use the two-dimensional distribution of the azimuthal dilepton separation (\(\Delta \phi _{\ell \ell }\)) and the \({m_\mathrm {T}}\) of the dilepton-\(E_{\mathrm {T}}^{\text {miss}}\) system. For the 7 \(\text {TeV}\) data, because of the smaller number of events in the control samples, we use a one-dimensional fit to \({m_\mathrm {T}}\) alone. The expected ratio of signal to background increases at high values of \({m_\mathrm {T}}\) and low values of \(\Delta \phi _{\ell \ell }\), giving the shape analysis greater sensitivity than a limit obtained from event counts alone. The transverse mass \({m_\mathrm {T}}\) is given by the formula

$$\begin{aligned} {m_\mathrm {T}}= \sqrt{2 {p_{\mathrm {T}}}^{\ell \ell } E_{\mathrm {T}}^{\text {miss}}[ 1-\cos \Delta \phi ({\ell \ell ,E_{\mathrm {T}}^{\text {miss}}})]}. \end{aligned}$$
(7)

This definition of \({m_\mathrm {T}}\), which treats both the lepton pair and the recoiling undetected system as massless, is found to yield the best separation between the signal and the backgrounds from \({\mathrm {W}}{\mathrm {W}}\), \({\mathrm {W}}\mathrm {Z}\), and \(\mathrm {Z}\mathrm {Z}\).
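Eq. (7) translates directly into code. The function name and argument names below are hypothetical; momenta are in \(\text {GeV}\) and the angle in radians.

```python
import math

def transverse_mass(pt_ll, met, dphi):
    """Eq. (7): m_T of the dilepton-MET system, with both systems treated as massless."""
    return math.sqrt(2.0 * pt_ll * met * (1.0 - math.cos(dphi)))
```

For a back-to-back configuration (\(\Delta \phi = \pi \)) with \({p_{\mathrm {T}}}^{\ell \ell } = E_{\mathrm {T}}^{\text {miss}} = 100\) \(\text {GeV}\), this gives \({m_\mathrm {T}} = 200\) \(\text {GeV}\), and \({m_\mathrm {T}}\) vanishes when the two systems are collinear.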

The two center-of-mass energies (7 and 8 \(\text {TeV}\)), two lepton flavors (\(\mathrm {e}\) and \(\mathrm {\mu }\)), and two jet multiplicities (0 and 1), define eight disjoint samples that are treated separately in the likelihood calculation. The shapes and normalizations of the signal and of each background component are allowed to vary within their uncertainties, and correlations in the sources of systematic uncertainty are taken into account. The \({m_\mathrm {T}}\) distribution in the 7 \(\text {TeV}\) data, and the \(\Delta \phi _{\ell \ell }\) distribution in the 8 \(\text {TeV}\) data, in the signal region are shown in Fig. 6 for illustration. As can be seen, the observed data are consistent with the predicted backgrounds.
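The bookkeeping of the eight samples can be illustrated as follows. The labels and the dictionary layout are hypothetical; only the category structure and the choice of fit observables per center-of-mass energy are taken from the text.

```python
from itertools import product

ENERGIES = ("7TeV", "8TeV")
FLAVORS = ("ee", "mumu")
JET_BINS = (0, 1)

def fit_observables(energy):
    """Observables entering the shape fit: 1D at 7 TeV, 2D at 8 TeV."""
    return ("mT",) if energy == "7TeV" else ("mT", "dphi_ll")

# One entry per disjoint sample, keyed by (energy, flavor, jet multiplicity).
categories = {
    (e, f, j): fit_observables(e)
    for e, f, j in product(ENERGIES, FLAVORS, JET_BINS)
}
```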

Fig. 6

Distributions used for setting limits in the \(\mathrm {Z}(\ell \ell )\mathrm {H}(\text {inv})\) analysis. The expected distributions from different background processes are displayed cumulatively, while a signal corresponding to \(m_{\mathrm {H}}=125\) \(\text {GeV}\) and \(\mathcal {B}(\mathrm {H}\rightarrow \text {inv})\) = 100 % is superimposed separately. The total statistical and systematic uncertainty in the total background is shown as a hatched region. The limits for 7 \(\text {TeV}\) use the shape of the \({m_\mathrm {T}}\) distribution (left) while the limits for 8 \(\text {TeV}\) use both the \({m_\mathrm {T}}\) (center) and \(\Delta \phi _{\ell \ell }\) (right) shapes. The distributions are shown with electron and muon channels and 0- and 1-jet channels combined

7 Search for \(\mathrm {Z}(\mathrm {b}\overline{\mathrm {b}})\mathrm {H}(\text {inv})\)

7.1 Search strategy

The \(\mathrm {Z}(\mathrm {b}\overline{\mathrm {b}})\mathrm {H}(\text {inv})\) search closely follows the strategy of the CMS search for SM \(\mathrm {Z}(\nu \overline{\nu })\mathrm {H}(\mathrm {b}\overline{\mathrm {b}})\) [60], sharing the same \(E_{\mathrm {T}}^{\text {miss}}+\mathrm {b}\overline{\mathrm {b}}\) final state, though the \(\mathrm {b}\overline{\mathrm {b}}\) resonances have different masses. The event selection requires large \(E_{\mathrm {T}}^{\text {miss}}\), equivalent to the boost of the Higgs boson [61], and a jet pair consistent with a \(\mathrm {Z}\rightarrow \mathrm {b}\overline{\mathrm {b}}\) decay. The signal yield after the final selection is estimated using a BDT trained on simulated background and signal MC samples, by fitting the BDT output for background and signal to that obtained from data.

The backgrounds in this channel arise from production of W and \({\mathrm {Z}}\) bosons in association with jets (V+jets), \({\mathrm {t}\overline{\mathrm {t}}}\), single-top-quark, diboson (VV), and QCD multijet production. The SM Higgs process, \(\mathrm {Z}(\nu \overline{\nu })\mathrm {H}(\mathrm {b}\overline{\mathrm {b}})\), has a negligible effect on this search, due to the different mass of the \(\mathrm {b}\overline{\mathrm {b}}\) resonance and good di-jet mass resolution, which is about 10 %. The \(\mathrm {Z}(\nu \overline{\nu })\mathrm {H}(\mathrm {b}\overline{\mathrm {b}})\) process is therefore treated as an independent background process.

Since the VV production cross section is only a small factor larger than that of standard model VH, and given the nearly identical final state for VZ with \({\mathrm {Z}}(\mathrm {b}\overline{\mathrm {b}})\), the VV process has been used as a benchmark to validate the search strategy used here [60].

7.2 Trigger

A suite of four \(E_{\mathrm {T}}^{\text {miss}}\) triggers is used for this search, due to the challenge of maintaining acceptance as the instantaneous luminosity increases. A trigger with \(E_{\mathrm {T}}^{\text {miss}}\! >\!150\) \(\text {GeV}\) is used for the full 8 \(\text {TeV}\) data set. To increase acceptance at lower \(E_{\mathrm {T}}^{\text {miss}}\), we also use triggers requiring jets in addition to \(E_{\mathrm {T}}^{\text {miss}}\). For the early data-taking period, a trigger requiring \(E_{\mathrm {T}}^{\text {miss}}\!>\!80\) \(\text {GeV}\) together with two jets with \({|\eta |}\!<\!2.5\) and \({p_{\mathrm {T}}}\!>\!30\) \(\text {GeV}\) was used. However, as the average instantaneous luminosity reached \(3\times 10^{33}\text {cm}^{-2}\,\text {s}^{-1}\), this was replaced with a trigger requiring \(E_{\mathrm {T}}^{\text {miss}}\!>\!100\) \(\text {GeV}\), two jets with individual \({p_{\mathrm {T}}}\) above 60 and 25 \(\text {GeV}\) respectively, the vector sum of the two jet \({p_{\mathrm {T}}}\) to be above 100 \(\text {GeV}\), and finally a veto on any jet with \({p_{\mathrm {T}}}\!>\!40\) \(\text {GeV}\) and closer than 0.5 radians in \(\phi \) to the \(E_{\mathrm {T}}^{\text {miss}}\) direction. Finally, a trigger was used that requires \(E_{\mathrm {T}}^{\text {miss}}>80\) \(\text {GeV}\), together with two jets having \({|\eta |}\!<\!2.5\) and \({p_{\mathrm {T}}}\!>\!20\) \(\text {GeV}\) or \({p_{\mathrm {T}}}\!>\!30\) \(\text {GeV}\), depending on the luminosity conditions, and at least one of the jets tagged by the online CSV b-tagging algorithm [55].

For \(\mathrm {Z}(\mathrm {b}\overline{\mathrm {b}})\mathrm {H}(\text {inv})\) events with \(E_{\mathrm {T}}^{\text {miss}}>170\) \(\text {GeV}\), the combined trigger efficiency is near 100 % with respect to the offline event reconstruction and selection, described in the next section. For events with \(E_{\mathrm {T}}^{\text {miss}}\) between 130 and 170 \(\text {GeV}\) (100 and 130 \(\text {GeV}\)) the corresponding efficiency is about 98 % (85 %).

7.3 Event selection

The event selection in this channel is designed to enhance heavy-flavor production and a Higgs boson with high Lorentz boost, with reasonable kinematic thresholds consistent with the trigger selection, and to provide sufficient statistics to perform the BDT training properly. The event selection is summarized in Table 5. Backgrounds to the signal are substantially reduced by a large \(E_{\mathrm {T}}^{\text {miss}}\) requirement. In this regime, where the Higgs boson has substantial boost, the Z and Higgs bosons are separated by a large azimuthal opening angle; we therefore require \(\Delta \phi (\mathrm {Z},\mathrm {H})>2.0\) radians. We define “low”, “intermediate”, and “high” \(E_{\mathrm {T}}^{\text {miss}}\) regions to have \(100<E_{\mathrm {T}}^{\text {miss}}<130\) \(\text {GeV}\), \(130<E_{\mathrm {T}}^{\text {miss}}<170\) \(\text {GeV}\), and \(E_{\mathrm {T}}^{\text {miss}}>170\) \(\text {GeV}\), respectively.
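The three \(E_{\mathrm {T}}^{\text {miss}}\) regions and the approximate trigger efficiencies quoted in Sect. 7.2 can be encoded as a small lookup. The function name, the handling of events exactly on a boundary, and the rounding of "near 100 %" to 1.0 are assumptions of this sketch.

```python
def met_region(met):
    """Return the MET region name (MET in GeV), or None below the 100 GeV threshold.

    Events exactly on a boundary fall into the lower region here; the text
    quotes strict inequalities, so this choice is an assumption.
    """
    if met > 170.0:
        return "high"
    if met > 130.0:
        return "intermediate"
    if met > 100.0:
        return "low"
    return None

# Approximate combined trigger efficiencies per region, as quoted in Sect. 7.2.
TRIGGER_EFFICIENCY = {"low": 0.85, "intermediate": 0.98, "high": 1.00}
```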

Table 5 Selection criteria for the \(\mathrm {Z}(\mathrm {b}\overline{\mathrm {b}})\mathrm {H}(\text {inv})\) search, in the 3 \(E_{\mathrm {T}}^{\text {miss}}\) regions. The variables used are either described in the text or in Table 6

The QCD multijet background is reduced to negligible levels by imposing three requirements which ensure that the \(E_{\mathrm {T}}^{\text {miss}}\) does not originate from mismeasured jets. First, we cut on the azimuthal separation, \({\Delta \phi ({E_{\mathrm {T}}^{\text {miss}}},\mathrm {j})}\), between the \(E_{\mathrm {T}}^{\text {miss}}\) direction and the closest jet with \({|\eta |}<2.5\) and \({p_{\mathrm {T}}}\) \(>25\) \(\text {GeV}\). For the high-\(E_{\mathrm {T}}^{\text {miss}}\) region we require \({\Delta \phi ({E_{\mathrm {T}}^{\text {miss}}},\mathrm {j})}>0.5\) radians, while for the intermediate- and low-\(E_{\mathrm {T}}^{\text {miss}}\) regions this requirement is increased to \({\Delta \phi ({E_{\mathrm {T}}^{\text {miss}}},\mathrm {j})}>0.7\) radians. Second, we calculate the \(E_{\mathrm {T}}^{\text {miss}}\) from charged tracks only, using tracks originating from the primary vertex with \({p_{\mathrm {T}}}\) \(>0.5\) \(\text {GeV}\) and \({|\eta |}<2.5\), and require that its separation in azimuth from the standard \(E_{\mathrm {T}}^{\text {miss}}\) satisfy \({\Delta \phi ({E_{\mathrm {T}}^{\text {miss}}},{E_{\mathrm {T}}^{\text {miss}}}{}_\text {trk})}<0.5\) radians. Third, in the low-\(E_{\mathrm {T}}^{\text {miss}}\) region only, we require the \(E_{\mathrm {T}}^{\text {miss}}\) significance, defined as the ratio of the \(E_{\mathrm {T}}^{\text {miss}}\) and the square root of the scalar sum of transverse energy of all particle-flow objects, to be greater than three.
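A minimal sketch of these three requirements is given below. The function and argument names are hypothetical, the region labels follow the low, intermediate and high \(E_{\mathrm {T}}^{\text {miss}}\) categories, and the inputs (jet azimuths already restricted to \({|\eta |}<2.5\) and \({p_{\mathrm {T}}}>25\) \(\text {GeV}\), track-based \(E_{\mathrm {T}}^{\text {miss}}\) direction, scalar sum of transverse energy) are assumed to be computed upstream.

```python
import math

def delta_phi(phi1, phi2):
    """Absolute azimuthal separation folded into [0, pi]."""
    d = abs(phi1 - phi2) % (2.0 * math.pi)
    return 2.0 * math.pi - d if d > math.pi else d

def met_significance(met, sum_et):
    """MET divided by the square root of the scalar sum of transverse energy."""
    return met / math.sqrt(sum_et)

def passes_qcd_rejection(region, met, met_phi, jet_phis, trk_met_phi, sum_et):
    """Apply the three anti-QCD requirements for region 'low', 'intermediate' or 'high'."""
    # 1) MET must not point along the closest selected jet.
    min_dphi_cut = 0.5 if region == "high" else 0.7
    if jet_phis and min(delta_phi(met_phi, p) for p in jet_phis) <= min_dphi_cut:
        return False
    # 2) Track-based MET must be aligned with the standard MET.
    if delta_phi(met_phi, trk_met_phi) >= 0.5:
        return False
    # 3) MET significance requirement, applied in the low region only.
    if region == "low" and met_significance(met, sum_et) <= 3.0:
        return False
    return True
```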

To reduce the \({\mathrm {t}\overline{\mathrm {t}}}\) and WZ backgrounds, events with isolated leptons with \({p_{\mathrm {T}}}\) \(>15\) \(\text {GeV}\) are rejected.

The \({\mathrm {Z}}\) boson candidate is defined to be the pair of central (\({|\eta |}<2.5\)) jets, above minimum \({p_{\mathrm {T}}}\) thresholds given in Table 5, that has the greatest vector sum of transverse momenta, \({p_{\mathrm {T}}}^\mathrm {jj}\). Each event is required to pass minimum requirements on \({p_{\mathrm {T}}}^\mathrm {jj}\) as well as the invariant mass of the jet pair, \({M_\mathrm {jj}}\). In the low-\(E_{\mathrm {T}}^{\text {miss}}\) category, events with two or more jets in addition to this pair are vetoed. Each jet of the \({\mathrm {Z}}\) boson candidate pair is required to be tagged by the CSV algorithm. Separate thresholds are applied to the jets with higher (\(\mathrm {CSV}_{\mathrm {max}}\)) and lower (\(\mathrm {CSV}_{\text {min}}\)) values of the CSV discriminator. The background from V+jets and VV processes is reduced significantly through b tagging, leaving the background in the signal region dominated by sub-processes where the two jets originate from genuine b quarks.
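The choice of the \({\mathrm {Z}}\) boson candidate can be sketched as a search over jet pairs. The jet representation (dictionaries with `pt`, `eta`, `phi`) and the single `min_pt` threshold are hypothetical placeholders; the actual thresholds are those of Table 5, and the b-tagging requirements are not included here.

```python
import math
from itertools import combinations

def dijet_pt(j1, j2):
    """Magnitude of the vector sum of two jets' transverse momenta."""
    px = j1["pt"] * math.cos(j1["phi"]) + j2["pt"] * math.cos(j2["phi"])
    py = j1["pt"] * math.sin(j1["phi"]) + j2["pt"] * math.sin(j2["phi"])
    return math.hypot(px, py)

def select_z_candidate(jets, min_pt=30.0):
    """Return the central jet pair with the largest vector-sum pT, or None."""
    central = [j for j in jets if abs(j["eta"]) < 2.5 and j["pt"] > min_pt]
    if len(central) < 2:
        return None
    return max(combinations(central, 2), key=lambda pair: dijet_pt(*pair))

# Toy event: the 200 GeV jet is outside the central region and is ignored.
jets = [
    {"pt": 100.0, "eta": 0.5, "phi": 0.0},
    {"pt": 80.0, "eta": -1.0, "phi": 0.3},
    {"pt": 90.0, "eta": 1.2, "phi": 3.0},
    {"pt": 200.0, "eta": 3.0, "phi": 0.1},
]
pair = select_z_candidate(jets)
```

In this toy event the two nearly collinear central jets are chosen, since their transverse momenta add up rather than cancel.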

The Z boson mass resolution is improved by roughly 10 % by applying regression techniques similar to those used by the CDF Collaboration [62] and in the \(\mathrm {V}\mathrm {H}(\mathrm {b}\overline{\mathrm {b}})\) search by the CMS Collaboration [60]. This results in a resolution of approximately 10 %, after all event selection criteria are applied, with a few percent bias on the mass.

The selection is optimized to give the best signal significance for a signal with \(m_{\mathrm {H}}=125\) \(\text {GeV}\) and \(\mathcal {B}(\mathrm {H}\rightarrow \text {inv})=100\) %. After all selection criteria, the efficiency for such a signal is 4.8 %, while for the most sensitive region of the BDT distribution, defined in Sect. 7.5, it is 1.75 %. The effect of the selection on signal and background can be seen in Fig. 7, which shows the \({M_\mathrm {jj}}\) and CSV\(_{\mathrm {min}}\) distributions after all other selection requirements.

Fig. 7 Distributions of \({M_\mathrm {jj}}\) (top) and CSV\(_{\mathrm {min}}\) (bottom) in the high-\(E_{\mathrm {T}}^{\text {miss}}\) category of the \(\mathrm {Z}(\mathrm {b}\overline{\mathrm {b}})\mathrm {H}(\text {inv})\) search, after all other selection requirements. The simulated background contributions are displayed cumulatively, and the uncertainty in the total background is shown as a hatched region. The arrows correspond to the cuts applied for the final selection as described in Table 5. The panels below both distributions show the ratio of observed data to expected background events

As mentioned above, a BDT is used in the final stage of the analysis to discriminate signal from backgrounds. The BDT is trained using simulated samples for signal and all background processes after the full selection described above. The training is performed separately for each Higgs boson mass hypothesis; the hypotheses cover the range \(105 \le m_{\mathrm {H}}\le 145\)  \(\text {GeV}\) in 10  \(\text {GeV}\) steps. The set of input variables to the BDT is chosen by iterative optimization from a larger number of potentially discriminating variables, and is listed in Table 6.

Table 6 Input variables to the \(\mathrm {Z}(\mathrm {b}\overline{\mathrm {b}})\mathrm {H}(\text {inv})\) BDT

7.4 Background estimation

All backgrounds are modeled using MC simulation. Control regions in data are used to validate the simulated distributions used as input to the BDT. These control regions are also used to obtain scale factors to correct the pre-fit normalizations of the dominant \({\mathrm {Z}}\)+jets, \({\mathrm {W}}\)+jets and \({\mathrm {t}\overline{\mathrm {t}}}\) backgrounds. We use the same control regions as defined in Ref. [60] for the \(\mathrm {Z}(\nu \overline{\nu })\mathrm {H}(\mathrm {b}\overline{\mathrm {b}})\) search. For \({\mathrm {W}}\) backgrounds, the control region is defined using the same kinematic selection as the signal region apart from the lepton veto, which is inverted. For \({\mathrm {Z}}\) backgrounds we require a mass veto around the Higgs boson mass hypothesis. In addition we split the \({\mathrm {Z}}\) and \({\mathrm {W}}\) backgrounds into heavy-flavor enriched regions, by requiring the same b-tag as the signal region, and light-flavor enriched regions, by inverting the b-tag definition of the signal region. For the \({\mathrm {t}\overline{\mathrm {t}}}\) background, the control region is defined by inverting the lepton veto and additional jet criteria, with respect to the signal region definition.

To obtain the scale factors by which the simulated event yields are adjusted, a set of binned likelihood fits is performed to the CSV\(_{\mathrm {min}}\) distributions of events in the control regions. These fits are done simultaneously in all control regions, and the normalization of each background process is allowed to vary independently. Fits to several other variables are also performed, to verify consistency. The scale factors account not only for cross section discrepancies, but also for residual differences in physics object selection. For the \({\mathrm {Z}}\) and \({\mathrm {W}}\) backgrounds, separate sets of scale factors are obtained for each process according to how many of the two jets selected in the \({\mathrm {Z}}\) boson reconstruction originate from a b quark. These are labelled: V+udscg for the case where neither jet originates from a b quark, V+b for the case where only one of the jets is from a b quark, and V+bb for the case where both jets originate from b quarks. The scale factors obtained are all close to and compatible with unity, except for the V+b background, where the scale factor is closer to 2, as seen in Ref. [60].
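The flavor labelling and the way a fitted scale factor rescales a simulated yield can be illustrated with a few lines of Python. The function names are hypothetical, and the scale-factor values in the usage note are placeholders chosen only to reflect the qualitative statement above (close to unity, except V+b closer to 2); they are not the fitted values.

```python
def vjets_flavour_label(n_b_jets):
    """Label a V+jets event by how many of the two Z-candidate jets
    originate from a b quark (0, 1 or 2)."""
    labels = {0: "V+udscg", 1: "V+b", 2: "V+bb"}
    if n_b_jets not in labels:
        raise ValueError("n_b_jets must be 0, 1 or 2")
    return labels[n_b_jets]


def scaled_yield(mc_yield, n_b_jets, scale_factors):
    """Simulated yield multiplied by the scale factor of its flavor class."""
    return mc_yield * scale_factors[vjets_flavour_label(n_b_jets)]
```

With placeholder factors such as `{"V+udscg": 1.0, "V+b": 2.0, "V+bb": 1.0}`, a simulated V+b yield of 10 events would be corrected to 20, while the other two classes would be left unchanged.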

Table 7 shows the expected signal and background yields, estimated from MC simulation as described above. Figure 8 shows the distributions of the CSV b-tag discriminant and the dijet \({p_{\mathrm {T}}}\) in the \(\mathrm {Z}\mathrm {+}\mathrm {b}\overline{\mathrm {b}}\) and \({\mathrm {W}}\mathrm {+}\mathrm {b}\overline{\mathrm {b}}\) enriched regions, respectively. The high-\(E_{\mathrm {T}}^{\text {miss}}\) category is shown, after the data/MC scale factors are applied.

Fig. 8 Distributions in the high-\(E_{\mathrm {T}}^{\text {miss}}\) category of the \(\mathrm {Z}(\mathrm {b}\overline{\mathrm {b}})\mathrm {H}(\text {inv})\) search: second best CSV among the dijet daughters in the \(\mathrm {Z}\mathrm {+}\mathrm {b}\overline{\mathrm {b}}\) enriched region (top), and dijet \({p_{\mathrm {T}}}\) in the \({\mathrm {W}}\mathrm {+}\mathrm {b}\overline{\mathrm {b}}\) enriched region (bottom). The simulated background contributions are displayed cumulatively, and the uncertainty in the total background is shown as a hatched region. The panels below both distributions show the ratio of observed data to expected background events. An overflow bin is displayed in the bottom plot

Table 7 Background estimates and signal predictions, together with the observed yields in data, for the most sensitive region in the \(\mathrm {Z}(\mathrm {b}\overline{\mathrm {b}})\mathrm {H}(\text {inv})\) BDT analysis. The signal predictions are given for \(m_{\mathrm {H}}=125\) \(\text {GeV}\) and \(\mathcal {B}(\mathrm {H}\rightarrow \text {inv})=100\) %

7.5 Systematic uncertainty

Table 8 lists the uncertainties considered in this channel. The values quoted are for the most sensitive region of the analysis (\(\mathrm {S}/\mathrm {B} >3.5\,\%\)), which corresponds to requirements on the BDT output of \(>\)0.8, \(>\)0.7, and \(>\)0.2 in the low, intermediate, and high-\(E_{\mathrm {T}}^{\text {miss}}\) categories, respectively.

Table 8 Summary of the uncertainties in the \(\mathrm {Z}(\mathrm {b}\overline{\mathrm {b}})\mathrm {H}(\text {inv})\) channel. The numbers indicate the change in the total background estimate or in the total signal acceptance when each systematic effect is varied according to its uncertainties. Those uncertainties designated as “Norm.” only affect the normalization of the contributions, while those designated “Shape” also affect the shapes of the BDT output. In the case of shape variations, the numbers indicate the range of changes across the bins of the distributions. Signal uncertainties are quoted for \(m_{\mathrm {H}}=125\) \(\text {GeV}\) and \(\mathcal {B}(\mathrm {H}\rightarrow \text {inv})=100\) %. Due to correlations, the total systematic uncertainty is less than the sum in quadrature of the individual uncertainties. The effect is evaluated in the most sensitive region of the BDT output

Important theoretical uncertainties arise in the signal yield estimation from factorization and renormalization scales, as well as PDF uncertainties, and are estimated as for the \(\mathrm {Z}(\ell \ell )\mathrm {H}(\text {inv})\) and VBF searches. In addition, uncertainties arising from the QCD NNLO and electroweak NLO corrections discussed in Sect. 3 are included.

The background estimates are unaffected by theoretical uncertainties, since they are corrected using data/MC scale factors, as discussed in Sect. 7.4. However, uncertainties in the background normalization arising from the scale factors themselves are accounted for, by propagating other systematic uncertainties (jet energy scale, jet energy resolution, b tagging efficiency) to the control regions and repeating the fit procedure. Cross section uncertainties of 15 % each are assigned to the single-top-quark backgrounds in the t- and tW-channels, resulting in approximately 1 % uncertainty in the sum of all backgrounds. For the diboson backgrounds, a 7 % cross section uncertainty is assigned, consistent with the CMS measurement of this process [64], which results in an uncertainty of approximately 4 % in the total background.

As indicated in Table 8, uncertainties affecting the shape of the BDT output are also considered: trigger efficiency, jet energy scale and resolution, unclustered energy, b-tagging efficiency, MC event statistics, lepton momentum scale and pileup. The jet energy scale and resolution uncertainties are estimated as for the \(\mathrm {Z}(\ell \ell )\mathrm {H}(\text {inv})\) search, resulting in yield uncertainties of 2–4 % and 4–6 %, respectively. The uncertainty associated with b tagging is taken from the uncertainty in the weights applied to MC simulation, mentioned in Sect. 4. The measured uncertainties for the b-tagging scale factors are: 3 % per b tag, 6 % per charm tag, and 15 % per mistagged jet, originating from gluons and light u, d, s quarks [55]. These translate into yield uncertainties in the 3–5 % range, depending on the channel and the specific process. The shape of the BDT output distribution is also affected by the shape of the CSV distribution, and is therefore recomputed as the CSV distribution is varied within its uncertainties. The shape uncertainty due to MC modelling of backgrounds is estimated by comparing MadGraph and HERWIG++ results for the V+jets backgrounds, and comparing MadGraph with POWHEG for \({\mathrm {t}\overline{\mathrm {t}}}\).

The combined effect of all systematic uncertainties is a relative increase of about 20 % in the expected upper limit on \(\mathcal {B}(\mathrm {H}\rightarrow \text {inv})\).

7.6 Results

The numbers of events observed in data are shown alongside the background estimates in Table 7, for the most sensitive regions of the analysis as defined in the previous section. The BDT output distributions of the three \(E_{\mathrm {T}}^{\text {miss}}\) categories are shown in Fig. 9. In the \(\mathrm {Z}(\mathrm {b}\overline{\mathrm {b}})\mathrm {H}(\text {inv})\) search, limits are determined using a fit to the BDT output distribution. This is performed separately for each Higgs boson mass hypothesis, every 10 \(\text {GeV}\) in the range 105–145 \(\text {GeV}\). In the fit, the shape and normalization for signal and each background component are allowed to vary within the systematic and statistical uncertainties described in Sect. 7.5. These uncertainties are treated as nuisance parameters in the fit, with appropriate correlations taken into account. All nuisance parameters, including the scale factors described in Sect. 7.4, are adjusted by the fit.

Fig. 9 Distributions of the \(\mathrm {Z}(\mathrm {b}\overline{\mathrm {b}})\mathrm {H}(\text {inv})\) BDT output in the high-\(E_{\mathrm {T}}^{\text {miss}}\) bin (left), intermediate-\(E_{\mathrm {T}}^{\text {miss}}\) bin (center), and low-\(E_{\mathrm {T}}^{\text {miss}}\) bin (right) after all selection criteria have been applied. The simulated background contributions are displayed cumulatively, while a signal corresponding to \(m_{\mathrm {H}}=125\) \(\text {GeV}\) and \(\mathcal {B}(\mathrm {H}\rightarrow \text {inv})\) = 100 % is superimposed. The uncertainty in the background is shown as a hatched region. The panels below each distribution show the ratio of observed data to expected background events. These distributions are used to extract 95 % CL upper limits on the signal

8 Cross section limits

No evidence for a signal is observed in any of the three searches. We set 95 % CL upper limits on the Higgs boson production cross section times invisible branching fraction, \(\sigma \cdot \mathcal {B}(\mathrm {H}\rightarrow \text {inv})\), for the VBF and ZH production modes separately. Limits are calculated using a \({\mathrm {CL}_\mathrm {s}}\) method [65, 66], based on asymptotic formulae from Ref. [67], following the standard CMS Higgs boson searches combination technique [3, 68]. Systematic uncertainties are incorporated as nuisance parameters and treated according to the frequentist paradigm described in Ref. [68]. We also present 95 % CL limits on the Higgs boson production cross section times invisible branching fraction normalised to the SM production cross section [38, 39], which we denote \(\xi = \sigma \cdot \mathcal {B}(\mathrm {H}\rightarrow \text {inv})/ \sigma _\mathrm {SM}\). We present limits on \(\xi \) for the VBF and ZH modes separately and from the combination of all channels. It should be noted that the assumption of SM production cross sections is an arbitrary choice, as a sizeable invisible width would indicate physics beyond the SM, which may also modify the production cross section. However, an alternative choice of model for Higgs boson production would essentially scale the limits and provide no further information.

Under the assumption of SM production cross sections and acceptances, we may interpret limits on \(\xi \) as limits on the invisible branching fraction of the 125 \(\text {GeV}\) Higgs boson.

Figure 10 (\(\text {top}\)) shows the observed and median expected 95 % CL limits on the Higgs boson production cross section times invisible branching fraction, as a function of the Higgs boson mass, for the VBF production mode. Figure 10 (\(\text {bottom}\)) shows the corresponding limit on \(\xi \). Assuming the SM VBF production cross section and acceptance, this corresponds to an observed (expected) upper limit on \(\mathcal {B}(\mathrm {H}\rightarrow \text {inv})\) of 0.65 (0.49) for \(m_{\mathrm {H}}=125\) \(\text {GeV}\).

Fig. 10 Expected and observed 95 % CL upper limits on the VBF production cross section times invisible branching fraction (top), and normalized to the SM Higgs boson VBF production cross section (bottom)

The 95 % CL observed and median expected upper limits on the Higgs boson production cross section times invisible branching fraction for the ZH production mode are shown in Fig. 11 (\(\text {top}\)). As for the VBF search, limits on \(\xi \) are also shown, in Fig. 11 (\(\text {bottom}\)). For a Higgs boson with \(m_{\mathrm {H}}= 125\,\text {GeV}\), the observed (expected) upper limit on \(\xi \) obtained from the \(\mathrm {Z}(\ell \ell )\mathrm {H}(\text {inv})\) search alone is 0.83 (0.86), and from the \(\mathrm {Z}(\mathrm {b}\overline{\mathrm {b}})\mathrm {H}(\text {inv})\) search alone is 1.82 (1.99). Assuming the SM production cross section and acceptance, we interpret these results as an observed (expected) 95 % CL upper limit on \(\mathcal {B}(\mathrm {H}\rightarrow \text {inv})\) of 0.81 (0.83) for \(m_{\mathrm {H}}= 125\,\text {GeV}\).

Fig. 11 Expected and observed 95 % CL upper limits on the ZH production cross section times invisible branching fraction (top), and normalized to the SM Higgs boson ZH production cross section (bottom)

By assuming production cross sections as for the SM Higgs boson, the results of the three individual searches may be combined and interpreted as a limit on the invisible branching fraction of the 125 \(\text {GeV}\) Higgs boson. The statistical combination fully accounts for correlations between nuisance parameters in the individual searches. The most important correlated uncertainties are, in decreasing order of importance: the jet energy scale uncertainty; the signal uncertainties due to PDF and renormalization/factorization scale variations; the total integrated luminosity uncertainty; the lepton momentum scale uncertainties; the jet energy resolution uncertainty; and the \(E_{\mathrm {T}}^{\text {miss}}\) energy scale and resolution uncertainties. The resulting 95 % CL limit on \(\xi \) is shown in Fig. 12 and summarised in Table 9. Assuming the SM production cross section and acceptance, the 95 % CL observed upper limit on the invisible branching fraction for \(m_{\mathrm {H}}= 125\,\text {GeV}\) is 0.58, with an expected limit of 0.44. The corresponding observed (expected) upper limit at 90 % CL is 0.51 (0.38). These limits significantly improve on the indirect 95 % CL limit of \(\mathcal {B}(\mathrm {H}\rightarrow \text {inv})<0.89\) obtained from visible decays [3].

Fig. 12 Expected and observed 95 % CL upper limits on \(\sigma \cdot \mathcal {B}(\mathrm {H}\rightarrow \text {inv})/ \sigma _\mathrm {SM}\)

Table 9 Summary of 95 % CL upper limits on \(\sigma \cdot \mathcal {B}(\mathrm {H}\rightarrow \text {inv})/ \sigma _\mathrm {SM}\) obtained from the VBF search, the combined ZH searches, and the combination of all three searches

9 Dark matter interactions

We now interpret the experimental upper limit on \(\mathcal {B}(\mathrm {H}\rightarrow \text {inv})\), under the assumption of SM production cross section, in the context of a Higgs-portal model of DM interactions [7–9]. In these models, a hidden sector can provide viable stable DM particles with direct renormalizable couplings to the Higgs sector of the SM. In direct detection experiments, the elastic interaction between DM and nuclei, mediated by Higgs boson exchange, results in a nuclear recoil that can be interpreted in terms of the DM mass, \(M_\chi \), and the DM-nucleon cross section. If the DM candidate has a mass below \(m_{\mathrm {H}}/2\), the invisible Higgs boson decay width, \(\varGamma _{\text {inv}}\), can be directly translated to the spin-independent DM-nucleon elastic cross section, as follows for scalar (S), vector (V), and fermionic (f) DM, respectively [8]:

$$\begin{aligned}&\sigma ^\mathrm {SI}_{\mathrm {S}-\mathrm {N}} = \frac{4\varGamma _{\text {inv}}}{m_{\mathrm {H}}^3v^2\beta } \frac{m_\mathrm {N}^4f_\mathrm {N}^2}{(M_\chi +m_\mathrm {N})^2},\end{aligned}$$
(8)
$$\begin{aligned}&\sigma ^\mathrm {SI}_{\mathrm {V}-\mathrm {N}} = \frac{16\varGamma _{\text {inv}}M_\chi ^4}{m_{\mathrm {H}}^3 v^2 \beta (m_{\mathrm {H}}^4-4M_\chi ^2 m_{\mathrm {H}}^2+12M_\chi ^4)} \frac{m_\mathrm {N}^4f_\mathrm {N}^2}{(M_\chi +m_\mathrm {N})^2},\end{aligned}$$
(9)
$$\begin{aligned}&\sigma ^\mathrm {SI}_{\mathrm {f}-\mathrm {N}} = \frac{8\varGamma _{\text {inv}}M_\chi ^2}{m_{\mathrm {H}}^5v^2\beta ^3}\frac{m_\mathrm {N}^4f_\mathrm {N}^2}{(M_\chi +m_\mathrm {N})^2}. \end{aligned}$$
(10)

Here, \(m_\mathrm {N}\) represents the nucleon mass, taken as the average of proton and neutron masses, 0.939 \(\text {GeV}\), while \(\sqrt{2}v\) is the Higgs vacuum expectation value of 246 \(\text {GeV}\), and \(\beta =\sqrt{1-4M^2_\chi /{m_{\mathrm {H}}}^2}\). The dimensionless quantity \(f_\mathrm {N}\) [8] parameterizes the Higgs-nucleon coupling; we take the central value of \(f_\mathrm {N}=0.326\) from a lattice calculation [69], while we use results from the MILC Collaboration [70] for the minimum (0.260) and maximum (0.629) values. We convert the invisible branching fraction to the invisible width using \(\mathcal {B}(\mathrm {H}\rightarrow \text {inv})= \varGamma _{\text {inv}}/ (\varGamma _\mathrm {SM} + \varGamma _{\text {inv}})\), where \(\varGamma _\mathrm {SM}= 4.07\) \(\text {MeV}\).
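As an illustration of how Eqs. (8)–(10) are evaluated, the following Python sketch converts a branching fraction limit into a DM-nucleon cross section. The function and constant names are introduced here for illustration only. The inputs (\(m_\mathrm {N}\), \(\sqrt{2}v\), \(f_\mathrm {N}\), \(\varGamma _\mathrm {SM}\)) are those quoted above, and the conversion \(1\,\text {GeV}^{-2} \approx 3.894\times 10^{-28}\,\text {cm}^2\) is the standard natural-units factor \((\hbar c)^2\), which is not stated in the text.

```python
import math

M_H = 125.0                  # Higgs boson mass hypothesis, GeV
GAMMA_SM = 4.07e-3           # SM Higgs boson width, GeV (4.07 MeV)
M_N = 0.939                  # average nucleon mass, GeV
V = 246.0 / math.sqrt(2.0)   # v, defined through sqrt(2) v = 246 GeV
GEV2_TO_CM2 = 3.8938e-28     # (hbar c)^2: 1 GeV^-2 expressed in cm^2


def invisible_width(br_inv, gamma_sm=GAMMA_SM):
    """Invert B = Gamma_inv / (Gamma_SM + Gamma_inv) for Gamma_inv (GeV)."""
    if not 0.0 <= br_inv < 1.0:
        raise ValueError("branching fraction must lie in [0, 1)")
    return br_inv * gamma_sm / (1.0 - br_inv)


def sigma_si(kind, m_chi, br_inv, f_n=0.326, m_h=M_H):
    """Spin-independent DM-nucleon cross section in cm^2, Eqs. (8)-(10).

    kind is "scalar", "vector" or "fermion"; m_chi is the DM mass in GeV.
    """
    if not 0.0 < m_chi < m_h / 2.0:
        raise ValueError("formulas apply only for 0 < M_chi < m_H / 2")
    gamma_inv = invisible_width(br_inv)
    beta = math.sqrt(1.0 - 4.0 * m_chi**2 / m_h**2)
    # Factor common to all three expressions.
    nucleon = M_N**4 * f_n**2 / (m_chi + M_N) ** 2
    if kind == "scalar":
        prefactor = 4.0 * gamma_inv / (m_h**3 * V**2 * beta)
    elif kind == "vector":
        poly = m_h**4 - 4.0 * m_chi**2 * m_h**2 + 12.0 * m_chi**4
        prefactor = 16.0 * gamma_inv * m_chi**4 / (m_h**3 * V**2 * beta * poly)
    elif kind == "fermion":
        prefactor = 8.0 * gamma_inv * m_chi**2 / (m_h**5 * V**2 * beta**3)
    else:
        raise ValueError("kind must be 'scalar', 'vector' or 'fermion'")
    return prefactor * nucleon * GEV2_TO_CM2
```

Calling, for instance, `sigma_si("scalar", 10.0, 0.51)` evaluates Eq. (8) for a 10 GeV scalar DM candidate at the 90 % CL branching fraction limit; repeating the call with `f_n=0.260` and `f_n=0.629` gives the band associated with the Higgs-nucleon coupling, since the cross section scales as \(f_\mathrm {N}^2\).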

Figure 13 shows upper limits at 90 % CL on the DM-nucleon cross section as a function of the DM mass, derived from the experimental upper limit on \(\mathcal {B}(\mathrm {H}\rightarrow \text {inv})\) for \(m_{\mathrm {H}}=125\) \(\text {GeV}\), in the scenarios where the DM candidate is a scalar, a vector, or a Majorana fermion.

Fig. 13 Upper limits on the spin-independent DM-nucleon cross section \(\sigma ^\mathrm {SI}_{\chi -\mathrm {N}}\) in Higgs-portal models, derived for \(m_{\mathrm {H}}= 125\,\text {GeV}\) and \(\mathcal {B}(\mathrm {H}\rightarrow \text {inv})< 0.51\) at 90 % CL, as a function of the DM mass. Limits are shown separately for scalar, vector and fermion DM. The solid lines represent the central value of the Higgs-nucleon coupling, which enters as a parameter, and is taken from a lattice calculation, while the dashed and dot-dashed lines represent lower and upper bounds on this parameter. Other experimental results are shown for comparison, from the CRESST [71], XENON10 [72], XENON100 [73], DAMA/LIBRA [74, 75], CoGeNT [76], CDMS II [77], COUPP [78], and LUX [79] Collaborations

10 Summary

A search for invisible decays of Higgs bosons has been performed, using the vector boson fusion and associated ZH production modes, with \({\mathrm {Z}}\rightarrow \ell \ell \) or \({\mathrm {Z}}\rightarrow \mathrm {b}\overline{\mathrm {b}}\). No evidence for a signal is observed in any channel. Using a \({\mathrm {CL}_\mathrm {s}}\) method, upper limits are placed on the Higgs boson production cross section times invisible branching fraction, for the VBF and ZH channels separately and combined. These results improve the exclusion in terms of \(\sigma \cdot \mathcal {B}(\mathrm {H}\rightarrow \text {inv})/ \sigma _\mathrm {SM}\) for \(m_{\mathrm {H}}>113\) \(\text {GeV}\) with respect to the limits obtained at LEP [11]. By assuming standard model production cross sections, and combining all channels, the upper limit on the invisible branching fraction of a Higgs boson for \(m_{\mathrm {H}}=125\) \(\text {GeV}\) is found to be 0.58, with an expected limit of 0.44, at 95 % confidence level. These limits assume the signal acceptance of a SM Higgs boson, and are more stringent than the indirect limits obtained from visible Higgs boson decays. Finally, the result is interpreted in a Higgs-portal model of dark matter [9]. Strong limits, beyond those from direct searches, are obtained on the dark matter-nucleon cross section for light dark matter.