1 Introduction

The discovery of a Higgs boson (\({\mathrm{H}}_{\mathrm{}}^{\mathrm{}}\)) [1,2,3] by the ATLAS and CMS Collaborations at the CERN LHC, with properties consistent with expectations from the standard model (SM) of particle physics, has emphasized the hierarchy problem of the SM. In the SM, the measured \({\mathrm{H}}_{\mathrm{}}^{\mathrm{}}\) mass of 125\(\,\text {Ge}\text {V}\)  [4, 5], given its fundamental scalar nature [6, 7], requires extreme fine tuning of quantum corrections, suggesting that the SM may be incomplete. Many different exotic models, such as the little Higgs [8,9,10] and composite Higgs [11,12,13] models, predict the existence of new resonances decaying to a vector boson (\({{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} = {{\mathrm{W}}_{\mathrm{}}^{\mathrm{}}}, {{\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}} \)) and a Higgs boson [14,15,16,17,18].

Heavy vector triplet (HVT) models [19] introduce new heavy vector bosons (\({\mathrm{{{\mathrm{W}}_{\mathrm{}}^{\mathrm{}}}}}_{\mathrm{}}^{\mathrm{\prime }}\), \({\mathrm{{{\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}}}}_{\mathrm{}}^{\mathrm{\prime }}\)) that couple to the Higgs and SM gauge bosons with the parameters \(c_\text {H}\) and \(g_\text {V}\), and to the fermions via the combination \((g^2/g_\text {V}) c_\text {F} \), where \(c_\text {F}\) is the fermion coupling and \(g\) is the SM \(\text {SU}(2)_\text {L}\) gauge coupling. The HVT couplings are expected to be of order unity in most models. Three benchmark models, denoted as models A, B, and C, are considered in this paper.

In model A, the coupling strengths to fermions and gauge bosons are comparable and the heavy resonances decay predominantly to fermions, as is the case in some extensions of the SM gauge group [20]. In model B, the fermionic couplings are suppressed, as in composite Higgs models. In model C, the fermionic couplings are set to zero, so the resonances are produced only through vector boson fusion (VBF) and decay exclusively to a pair of SM bosons. The parameters used for model A are \(g_\text {V} = 1\), \(c_\text {H} = -0.556\), and \(c_\text {F} = -1.316\); for model B, \(g_\text {V} = 3\), \(c_\text {H} = -0.976\), and \(c_\text {F} = 1.024\); and for model C, \(g_\text {V} = 1\), \(c_\text {H} = 1\), \(c_\text {F} = 0\).
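As a rough illustration of how the benchmark parameters translate into the fermion coupling combination \((g^2/g_\text {V}) c_\text {F} \), the sketch below evaluates it for the three models. The numerical value used for \(g\) (0.65) and all function and variable names are assumptions made for this example only.

```python
# Benchmark parameters quoted in the text: model -> (g_V, c_H, c_F).
HVT_MODELS = {
    "A": (1.0, -0.556, -1.316),
    "B": (3.0, -0.976, 1.024),
    "C": (1.0, 1.0, 0.0),
}

# Assumed, illustrative value of the SM SU(2)_L gauge coupling g.
G_SU2 = 0.65


def fermion_coupling(g_v, c_f, g=G_SU2):
    """Return the combination (g^2 / g_V) * c_F that governs the coupling to fermions."""
    return (g ** 2 / g_v) * c_f


if __name__ == "__main__":
    for name, (g_v, c_h, c_f) in HVT_MODELS.items():
        print(f"model {name}: (g^2/g_V) c_F = {fermion_coupling(g_v, c_f):+.3f}")
```

Under this assumption the combination is larger in magnitude in model A than in model B and vanishes in model C, in line with the qualitative description above.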

Previous searches for a heavy resonance decaying to a Higgs boson and a vector boson have been carried out at \(\sqrt{s}= 13\,\text {Te}\text {V} \) in the semileptonic final state [14, 15, 21] and in the fully hadronic final state [22,23,24] by the CMS and ATLAS Collaborations. The most stringent lower limit on the \({\mathrm{{{\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}}}}_{\mathrm{}}^{\mathrm{\prime }}\) mass at 95% confidence level using the semileptonic (fully hadronic) final state is 2.65 (2.2)\(\,\text {Te}\text {V}\) in HVT model A and 2.83 (2.65)\(\,\text {Te}\text {V}\) in HVT model B [15, 24].

This paper describes a search for a heavy resonance (denoted as \({\mathrm{X}}_{\mathrm{}}^{\mathrm{}}\) for the reconstructed quantity and \({\mathrm{{{\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}}}}_{\mathrm{}}^{\mathrm{\prime }}\) for the particle predicted by the theory) decaying to a \({\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}\) boson and a Higgs boson. The \({\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}\) boson is identified via a pair of electrons or muons, or a large amount of missing transverse momentum (\({\vec p}_{\mathrm {T}}^{\text {miss}}\)) measured in the detector due to the presence of at least two neutrinos. The Higgs boson is identified via its hadronic decays, either directly to a pair of heavy quarks, or via cascade decays dominated by \({{\mathrm{W}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{W}}_{\mathrm{}}^{\mathrm{}}} \) and \({{\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}} \). We explore the regime where the Higgs boson has a large Lorentz boost and is reconstructed as a single, large-radius jet, referred to as \(j_{{{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}}}\), with characteristic substructure and identified via its mass and possible presence of \({\mathrm{b}}_{\mathrm{}}^{\mathrm{}}\) quark subjets. If a heavy resonance couples exclusively to the SM bosons, it can be produced dominantly through VBF. Dedicated categories are defined in order to enhance the sensitivity to this production mode, exploiting the presence of two jets with large transverse momenta (\(p_{\mathrm {T}}\)) in the forward region of the detector, which are remnants of the initial-state quarks participating in the VBF interaction. The Feynman diagrams for the signal processes are depicted in Fig. 1.

Fig. 1

The leading order Feynman diagrams of the heavy resonance \({\mathrm{{{\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}}}}_{\mathrm{}}^{\mathrm{\prime }}\) production through \({{\mathrm{q}}_{\mathrm{}}^{\mathrm{}}} {{\overline{{{{\mathrm{q}}_{\mathrm{}}^{\mathrm{}}}}}}} \) annihilation (upper) and vector boson fusion (lower), decaying to a \({\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}\) boson (\({\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}\)) and a Higgs boson (\({\mathrm{H}}_{\mathrm{}}^{\mathrm{}}\))

The search is performed by examining the distribution of the reconstructed mass (\(m_{{{\mathrm{X}}_{\mathrm{}}^{\mathrm{}}}}\)) or transverse mass (\(m_{{{\mathrm{X}}_{\mathrm{}}^{\mathrm{}}}}^{\text {T}}\)) of the heavy resonance for a localized excess of events. The main background normalization is determined from data in sideband regions (SBs) of the \(j_{{{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}}}\) mass distribution, and extrapolated to the signal region (SR) through analytical functions derived from simulation.

2 The CMS detector

The CMS detector features a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter, each composed of a barrel and two endcap sections. These detectors reside within a superconducting solenoid, which provides a magnetic field of 3.8\(\,\text {T}\). Forward calorimeters extend the pseudorapidity \(\eta \) coverage up to \(|\eta | < 5.2\). Muons are measured in gas-ionization detectors embedded in the steel flux-return yoke outside the solenoid. A detailed description of the CMS detector, together with a definition of the coordinate system and the kinematic variables, can be found in Ref. [25].

Events of interest are selected using a two-tiered trigger system [26]. The first level, composed of custom hardware processors, uses information from the calorimeters and muon detectors to select events at a rate of around 100\(\,\text {kHz}\) within a fixed time interval of about 4\(\,\upmu \text {s}\). The second level, known as the high-level trigger (HLT), consists of a farm of processors running a version of the full event reconstruction software optimized for fast processing, and reduces the event rate to around 1\(\,\text {kHz}\) before data storage.

3 Data and simulated samples

The data samples used in this search were collected during the period 2016–2018, with the CMS detector at the LHC in proton–proton (\({{\mathrm{p}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{p}}_{\mathrm{}}^{\mathrm{}}} \)) collisions at a center-of-mass energy of 13\(\,\text {Te}\text {V}\), resulting in a combined integrated luminosity of 137\(\,\text {fb}^{-1}\).

The signal samples are generated at leading order (LO) through \({{\mathrm{q}}_{\mathrm{}}^{\mathrm{}}} {{\overline{{{{\mathrm{q}}_{\mathrm{}}^{\mathrm{}}}}}}} \) annihilation, taking the cross sections from HVT models A and B [19], or through VBF with the cross section from HVT model C, using the MadGraph 5_amc@nlo 2.4.2 [27] generator and the MLM matching scheme [28]. Different hypotheses for the heavy resonance mass in the range of 800–5000\(\,\text {Ge}\text {V}\) are considered, with the natural width of the resonance being negligible compared to the 4% detector resolution (the narrow-width approximation). The heavy resonance is forced to decay to a \({\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}\) boson and a Higgs boson, with the former decaying into a pair of charged leptons (\(\ell = {{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} \) or \({\upmu {}{}} \)) or neutrinos, including cascade decays involving tau leptons. There is no restriction on the decay channels for the Higgs boson and its decay particles, which decay according to the SM branching fractions.

The SM background for this search is dominated by \({{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} \text {+jets}\) production, with the \({\mathrm{V}}_{\mathrm{}}^{\mathrm{}}\) boson decaying as \({{\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}} \rightarrow \nu \nu \), \({{\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}} \rightarrow {{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} \bar{{{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}}},{\upmu {}{}} \bar{{\upmu {}{}}},{\uptau {}{}} \bar{{\uptau {}{}}}\), or \({{\mathrm{W}}_{\mathrm{}}^{\mathrm{}}} \rightarrow {{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} \nu ,{\upmu {}{}} \nu ,{\uptau {}{}} \nu \). The \({{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} \text {+jets}\) background sample is produced with the MadGraph 5_amc@nlo generator at LO. The sample is further normalized to account for next-to-LO (NLO) in electroweak (EW) and next-to-NLO (NNLO) in quantum chromodynamics (QCD) corrections to the cross section from Ref. [29]. The top quark pair (\({\mathrm{t}}_{\mathrm{}}^{\mathrm{}}\) \(\overline{{{{\mathrm{t}}_{\mathrm{}}^{\mathrm{}}}}}\)) and single top quark t-channel and \({{\mathrm{t}}_{\mathrm{}}^{\mathrm{}}}{{\mathrm{W}}_{\mathrm{}}^{\mathrm{}}} \) production are generated at NLO in QCD with the powheg 2.0 generator [30,31,32,33,34,35]. The \({\mathrm{t}}_{\mathrm{}}^{\mathrm{}}\) \(\overline{{{{\mathrm{t}}_{\mathrm{}}^{\mathrm{}}}}}\) samples are normalized to the cross section computed with Top++ 2.0 [36] at NNLO in QCD with next-to-next-to-leading logarithmic soft gluon resummation accuracy. The single top quark s-channel, \({{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} \), and \({{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}} \) samples are simulated at NLO in QCD with the MadGraph 5_amc@nlo generator.

The NNPDF 3.0 [37] set of parton distribution functions (PDF) is used to simulate the hard process in all simulated samples for the 2016 data and the NNPDF 3.1 [38] set is used for 2017 and 2018. Parton showering and hadronization processes are performed with pythia 8.226 [39] with the CUETP8M1 [40, 41] underlying event tune for 2016, and pythia 8.230 with the CP5 [42] event tune for 2017 and 2018. The CUETP8M2 underlying event tune [43] is used to simulate \({\mathrm{t}}_{\mathrm{}}^{\mathrm{}}\) \(\overline{{{{\mathrm{t}}_{\mathrm{}}^{\mathrm{}}}}}\) production for 2016 samples. The CMS detector response simulation is performed with Geant4  [44]. Simulated samples are reconstructed with the same software as used for collision data. The data samples contain additional \({{\mathrm{p}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{p}}_{\mathrm{}}^{\mathrm{}}} \) interactions in the same or nearby bunch crossings (pileup). The simulated pileup description is reweighted to match the distribution of the pileup multiplicity measured in data.

4 Event reconstruction

Events in the CMS detector are reconstructed using the particle-flow (PF) algorithm [45], which combines information from all subdetectors in order to reconstruct stable particles (muons, electrons, photons, neutral and charged hadrons). Jets are reconstructed from PF candidates clustered with the anti-\(k_{\mathrm {T}}\) algorithm [46], with a distance parameter of 0.4 (AK4 jets) or 0.8 (AK8 jets), using the FastJet 3.0 package [47, 48]. Several vertices are reconstructed per bunch crossing. The candidate vertex with the largest value of summed physics-object \(p_{\mathrm {T}} ^2\) is taken to be the primary \({{\mathrm{p}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{p}}_{\mathrm{}}^{\mathrm{}}} \) interaction vertex. Here the physics objects are the AK4 jets, clustered using the jet finding algorithms with the tracks assigned to candidate vertices as inputs, and the associated \({\vec p}_{\mathrm {T}}^{\text {miss}}\) taken as the negative vector \(p_{\mathrm {T}}\) sum of those jets. Two different methods to remove contributions from pileup are used: for the AK4 jets, pileup is accounted for via the charged-hadron subtraction algorithm [49] in conjunction with the jet area method [50], while for the AK8 jets the pileup-per-particle identification algorithm [51] is employed. The jet energy resolution, after the application of corrections to the jet energy, is 4% at 1\(\,\text {Te}\text {V}\) [52]. For the AK4 jets, \(p_{\mathrm {T}} > 30\,\text {Ge}\text {V} \) and \(|\eta | < 2.4\) are required, and jets within a cone of \(\varDelta R(j,\ell )=\sqrt{\smash [b]{\varDelta \eta (j,\ell )^2+\varDelta \phi (j,\ell )^2}}<0.4\) around isolated leptons are removed, where \(\phi \) is the azimuthal angle. The AK8 jets must satisfy \(p_{\mathrm {T}} > 200\,\text {Ge}\text {V} \) and \(|\eta | < 2.4\). The vector \({\vec p}_{\mathrm {T}}^{\text {miss}}\) is computed as the negative vector \(p_{\mathrm {T}}\) sum of all the PF candidates in an event. 
The \({\vec p}_{\mathrm {T}}^{\text {miss}}\) is corrected for adjustments to the energy scale of the reconstructed AK4 jets in the event, and its magnitude is denoted as \(p_{\mathrm {T}} ^\text {miss}\)  [53]. The observable \(H_{\mathrm {T}}^{\text {miss}}\) is defined as the magnitude of the vector \(p_{\mathrm {T}}\) sum of all AK4 jets with \(p_{\mathrm {T}} > 30\,\text {Ge}\text {V} \) and \(|\eta | < 3.0\).
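The angular separation \(\varDelta R\) and the removal of jets overlapping with isolated leptons can be sketched as follows. This is a minimal illustration, reading the requirement as discarding jets that lie within \(\varDelta R = 0.4\) of an isolated lepton; the dictionary layout of the objects and the function names are assumptions, not the reconstruction software used in the analysis.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal difference phi1 - phi2 wrapped into the interval (-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)  # now in [0, 2*pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    return dphi


def delta_r(eta1, phi1, eta2, phi2):
    """Separation sqrt(d_eta^2 + d_phi^2) in the eta-phi plane."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


def clean_jets(jets, leptons, min_dr=0.4):
    """Keep only jets farther than min_dr from every isolated lepton."""
    return [
        jet for jet in jets
        if all(delta_r(jet["eta"], jet["phi"], lep["eta"], lep["phi"]) > min_dr
               for lep in leptons)
    ]
```

The wrap-around in `delta_phi` matters for objects on either side of \(\phi = \pm \pi \), which would otherwise appear far apart.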

For each AK8 jet a groomed jet mass (\(m_j\)) is calculated, after applying a modified mass-drop algorithm [54, 55]. The mass-drop algorithm used here is known as the soft-drop algorithm [56], with parameters \(\beta =0\), \(z_\text {cut}=0.1\), and \(R_0 = 0.8\). Subjets are obtained by reverting the last step of the jet clustering and selecting the two with the highest \(p_{\mathrm {T}}\). The groomed jet mass is calibrated in a \({\mathrm{t}}_{\mathrm{}}^{\mathrm{}}\) \(\overline{{{{\mathrm{t}}_{\mathrm{}}^{\mathrm{}}}}}\) sample enriched in hadronically decaying \({\mathrm{W}}_{\mathrm{}}^{\mathrm{}}\) bosons [57].
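For orientation, the soft-drop condition applied at each declustering step, \(\min (p_{\mathrm {T},1}, p_{\mathrm {T},2})/(p_{\mathrm {T},1}+p_{\mathrm {T},2}) > z_\text {cut}\,(\varDelta R_{12}/R_0)^{\beta }\), can be written as a short check. The sketch below uses the parameters quoted above as defaults; the function name and interface are illustrative assumptions and do not reproduce the FastJet implementation.

```python
def passes_soft_drop(pt1, pt2, delta_r12, z_cut=0.1, beta=0.0, r0=0.8):
    """Return True if two subjets satisfy the soft-drop condition.

    pt1, pt2  : transverse momenta of the two branches of a declustering step
    delta_r12 : their separation in the eta-phi plane
    """
    z = min(pt1, pt2) / (pt1 + pt2)  # momentum fraction carried by the softer branch
    return z > z_cut * (delta_r12 / r0) ** beta
```

With \(\beta = 0\) the angular factor is unity, so the condition reduces to requiring that the softer branch carry more than 10% of the summed \(p_{\mathrm {T}}\), which is the modified mass-drop limit mentioned in the text.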

The identification of jets that originate from \({\mathrm{b}}_{\mathrm{}}^{\mathrm{}}\) quarks is performed with the DeepCSV algorithm [58], which is based on a deep neural network with information on tracks and secondary vertices associated with the jet as inputs. The DeepCSV algorithm is applied to AK4 jets and the two highest \(p_{\mathrm {T}}\) AK8 subjets. A jet is considered as \({\mathrm{b}}_{\mathrm{}}^{\mathrm{}}\) tagged if the output discriminator value is larger than a defined threshold, corresponding to a 75% \({\mathrm{b}}_{\mathrm{}}^{\mathrm{}}\) tagging efficiency with a probability for mistagging jets originating from the hadronization of gluons or \({\mathrm{u}}_{\mathrm{}}^{\mathrm{}}\)/\({\mathrm{d}}_{\mathrm{}}^{\mathrm{}}\)/\({\mathrm{s}}_{\mathrm{}}^{\mathrm{}}\) quarks of about 3%. The simulated samples are reweighted to account for small differences in the \({\mathrm{b}}_{\mathrm{}}^{\mathrm{}}\) tagging efficiency from values obtained in data.

Electrons are reconstructed from ECAL energy deposits in the range \(|\eta |<2.5\) that are matched to tracks reconstructed in the silicon tracker. The electrons are identified taking into account the distribution of energy deposited along the electron trajectory, the direction and momentum of the track, and its compatibility with the primary vertex [59]. Electrons are required to pass an isolation requirement. The isolation is defined as the \(p_{\mathrm {T}}\) sum of all particles within a cone of \(\varDelta R = 0.3\) around the electron track, after the contributions from the electron itself, other nearby electrons, and pileup are removed. The electron reconstruction efficiency is larger than 88%.

Muons are reconstructed within the acceptance of \(|\eta | < 2.4\) by matching tracks in the silicon tracker and charge deposits (hits) in the muon spectrometer. Muon candidates are identified via selection criteria based on the compatibility of tracks reconstructed from only silicon tracker information with tracks reconstructed from a combination of the hits in both the tracker and muon detector. Additional requirements are based on the compatibility of the trajectory with the primary vertex, and on the number of hits observed in the tracker and muon systems. Muons are required to be isolated by imposing a limit on the \(p_{\mathrm {T}}\) sum of all the reconstructed tracks within a cone \(\varDelta R = 0.4\) around the muon direction, excluding the tracks attributed to muons, divided by the muon \(p_{\mathrm {T}}\). The efficiency to reconstruct and identify muons is larger than 96% [60].

Hadronically decaying \(\tau \) leptons (\({\uptau {}{}} _\mathrm {h}\)) are reconstructed by combining one or three charged particles with up to two neutral pion candidates. The selection criteria for the \({\uptau {}{}} _\mathrm {h}\) candidates, which are used to veto various backgrounds, are \(p_{\mathrm {T}} > 18\,\text {Ge}\text {V} \), \(|\eta | < 2.3\), and \(\varDelta R > 0.4\), where \(\varDelta R\) is a candidate’s separation from isolated electrons and muons in the event [61].

5 Event selection

Events are divided into categories depending on the number and flavor of the reconstructed leptons, the number of \({\mathrm{b}}_{\mathrm{}}^{\mathrm{}}\)-tagged subjets of the Higgs candidate jet (\(j_{{{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}}}\)), and the presence of forward jets consistent with originating from VBF processes. In total, 12 categories are defined and listed in Table 1.

Table 1 List of the 12 event categories used in the analysis

The highest \(p_{\mathrm {T}}\) AK8 jet in the event is assigned to \(j_{{{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}}}\), and is required to have a transverse momentum \(p_{\mathrm {T}} ^{{{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}}}> 200\,\text {Ge}\text {V} \) and \(|\eta | < 2.4\). This is the correct jet choice in 96% of the simulated signal events. The minimal separation between \(j_{{{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}}}\) and isolated leptons from the \({\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}\) boson decay is required to satisfy \(\varDelta R(j_{{{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}}},\ell ) > 0.8\). The mass of the \(j_{{{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}}}\) jet is required to be compatible with the \({\mathrm{H}}_{\mathrm{}}^{\mathrm{}}\) mass (\(105<m_{j_{{{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}}}} <135\,\text {Ge}\text {V} \)). It can have 0, 1, or 2 subjets that pass the \({\mathrm{b}}_{\mathrm{}}^{\mathrm{}}\) tagging selection. If both subjets are \({\mathrm{b}}_{\mathrm{}}^{\mathrm{}}\) tagged, the event belongs to the 2\({\mathrm{b}}_{\mathrm{}}^{\mathrm{}}\) tag category, otherwise it is assigned to the \(\le \)1\({\mathrm{b}}_{\mathrm{}}^{\mathrm{}}\) tag category.

The \(0\ell \) categories require \(p_{\mathrm {T}} ^\text {miss} > 250\,\text {Ge}\text {V} \), originating from the Lorentz-boosted \({\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}\) boson decaying to two neutrinos, which leave the detector unobserved. Data are collected using trigger selections that require \(p_{\mathrm {T}} ^\text {miss} > 110\,\text {Ge}\text {V} \), calculated with or without considering muons, or \(H_{\mathrm {T}}^{\text {miss}} > 110\,\text {Ge}\text {V} \). The minimal azimuthal angular separation between all AK4 jets and the \({\vec p}_{\mathrm {T}}^{\text {miss}}\) vector has to satisfy \(\varDelta \phi (j,{\vec p}_{\mathrm {T}}^{\text {miss}}) > 0.5\) in order to suppress multijet production. The azimuthal angular separation between \(j_{{{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}}}\) and \({\vec p}_{\mathrm {T}}^{\text {miss}}\) must satisfy \(\varDelta \phi (j_{{{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}}},{\vec p}_{\mathrm {T}}^{\text {miss}}) > 2\). Events arising from detector noise are removed by requiring that the fractional contribution of charged hadron candidates to the \({\mathrm{H}}_{\mathrm{}}^{\mathrm{}}\) momentum be larger than 0.1, and the ratio \(p_{\mathrm {T}} ^\text {miss}/p_{\mathrm {T}} ^{{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}} \) be larger than 0.6. Events with isolated leptons with \(p_{\mathrm {T}} > 10\,\text {Ge}\text {V} \) or hadronically decaying \(\tau \) leptons with \(p_{\mathrm {T}} > 18\) \(\,\text {Ge}\text {V}\) are removed in order to reduce the contribution from other SM processes. The \({\mathrm{t}}_{\mathrm{}}^{\mathrm{}}\) \(\overline{{{{\mathrm{t}}_{\mathrm{}}^{\mathrm{}}}}}\) contribution is reduced by removing events with an additional \({\mathrm{b}}_{\mathrm{}}^{\mathrm{}}\)-tagged AK4 jet not overlapping with \(j_{{{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}}}\) such that \(\varDelta R(j,j_{{{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}}})>1.2\) is satisfied. 
Since the resonance mass cannot be reconstructed because of the presence of undetected decay products, the \(j_{{{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}}}\) momentum and the \({\vec p}_{\mathrm {T}}^{\text {miss}}\) are used to compute the transverse mass \(m_{{{\mathrm{X}}_{\mathrm{}}^{\mathrm{}}}}^{\text {T}} = \sqrt{\smash [b]{2p_{\mathrm {T}} ^\text {miss} p_{\mathrm {T}} ^{{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}} (1-\cos \varDelta \phi ({\vec p}_{\mathrm {T}}^{\text {miss}},{\vec p}_{\mathrm {T}} ^{{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}}))}}\). In the VBF category, the condition \(|\eta _{j_{{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}}} |<1.1\) is applied on the \(j_{{{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}}} \) to reduce the contribution of events where the measured \(m_{{{\mathrm{X}}_{\mathrm{}}^{\mathrm{}}}}^{\text {T}} \) is significantly below \(m_{{{\mathrm{{{\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}}}}_{\mathrm{}}^{\mathrm{\prime }}}} \).
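The transverse mass defined above depends only on the magnitudes of \({\vec p}_{\mathrm {T}}^{\text {miss}}\) and the \(j_{{{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}}}\) transverse momentum and on the azimuthal angle between them. A minimal sketch of the formula follows; the function and argument names are chosen for this example.

```python
import math


def transverse_mass(pt_miss, phi_miss, pt_h, phi_h):
    """m_T = sqrt(2 * pT_miss * pT_H * (1 - cos(delta_phi)))."""
    dphi = phi_miss - phi_h  # cos() is periodic, so no wrapping is needed
    return math.sqrt(2.0 * pt_miss * pt_h * (1.0 - math.cos(dphi)))
```

For a back-to-back configuration with \(p_{\mathrm {T}} ^\text {miss} = p_{\mathrm {T}} ^{{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}} = 500\,\text {Ge}\text {V} \), this gives \(m_{{{\mathrm{X}}_{\mathrm{}}^{\mathrm{}}}}^{\text {T}} = 1000\,\text {Ge}\text {V} \), while it vanishes when the two vectors are collinear.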

For the 2\({\mathrm{e}}_{\mathrm{}}^{\mathrm{}}\) categories, data are collected using an electron trigger that requires either an isolated electron with \(p_{\mathrm {T}} > 35\,\text {Ge}\text {V} \) or a nonisolated electron with \(p_{\mathrm {T}} > 115\,\text {Ge}\text {V} \). In the 2\(\upmu \) categories, a muon trigger that requires a nonisolated muon with \(p_{\mathrm {T}} > 50\,\text {Ge}\text {V} \) is used to collect data. For both the 2\({\mathrm{e}}_{\mathrm{}}^{\mathrm{}}\) and 2\(\upmu \) categories, the two selected leptons must have opposite charge, \(p_{\mathrm {T}} > 55\) and 20\(\,\text {Ge}\text {V}\), respectively, and should be isolated from other activity in the event, except for each other. The \({\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}\) boson candidates are required to have a dilepton invariant mass in the range 70–110\(\,\text {Ge}\text {V}\), and \(p_{\mathrm {T}} > 200\,\text {Ge}\text {V} \). The \({\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}\) boson mass window is large compared with the dilepton mass resolution, which is 3 (4)% for an electron (muon) pair. A more stringent selection would decrease both the signal and the \({{\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}} \text {+jets}\) background selection efficiency by the same amount, thus reducing the signal sensitivity. The separation between the \({\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}\) boson candidate and \(j_{{{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}}}\) is required to be \(\varDelta R(j_{{{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}}},Z) > 2\) for all categories, and \(|\varDelta \eta (j_{{{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}}},Z) | < 1.7\) additionally for the non-VBF categories, to further reduce the \({{\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}} \text {+jets}\) background.

Candidate VBF events are selected in both the \(0\ell \) and \(2\ell \) categories by requiring two additional AK4 jets (j) with \(|\eta _j |<5\) that satisfy \(\varDelta R(j,j_{{{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}}})>1.2\) in order to avoid overlap with the \(j_{{{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}}}\), have \(\eta _j\) values of opposite sign, a dijet mass \(m_{jj} > 500\,\text {Ge}\text {V} \), and that satisfy a separation \(\varDelta \eta _{jj} > 4\). The two AK4 jets with the highest dijet mass are selected.
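One possible way to implement the VBF jet-pair requirements is sketched below. It assumes that the input jets, given as hypothetical `(pt, eta, phi)` tuples and treated as massless, already satisfy \(|\eta _j |<5\) and \(\varDelta R(j,j_{{{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}}})>1.2\), and that the highest-mass pair is chosen among the pairs passing the other requirements; the ordering of these steps in the analysis is not specified in the text.

```python
import math
from itertools import combinations


def four_vector(pt, eta, phi, mass=0.0):
    """Return (E, px, py, pz) for a jet given in (pt, eta, phi) coordinates."""
    px, py, pz = pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return energy, px, py, pz


def dijet_mass(jet1, jet2):
    """Invariant mass of the two-jet system."""
    e, px, py, pz = (a + b for a, b in zip(four_vector(*jet1), four_vector(*jet2)))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


def select_vbf_pair(jets, min_mjj=500.0, min_deta=4.0):
    """Return the passing jet pair with the highest dijet mass, or None."""
    best_pair, best_mjj = None, min_mjj
    for jet1, jet2 in combinations(jets, 2):
        if jet1[1] * jet2[1] >= 0.0:            # eta values must have opposite sign
            continue
        if abs(jet1[1] - jet2[1]) <= min_deta:  # pseudorapidity separation > 4
            continue
        mjj = dijet_mass(jet1, jet2)
        if mjj > best_mjj:                      # dijet mass > 500 GeV, keep the largest
            best_pair, best_mjj = (jet1, jet2), mjj
    return best_pair
```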

A further requirement is to have either \(m_{{{\mathrm{X}}_{\mathrm{}}^{\mathrm{}}}}\) or \(m_{{{\mathrm{X}}_{\mathrm{}}^{\mathrm{}}}}^{\text {T}}\) larger than 1200\(\,\text {Ge}\text {V}\) for the \(\le \)1\({\mathrm{b}}_{\mathrm{}}^{\mathrm{}}\) tag, non-VBF categories, and larger than 750\(\,\text {Ge}\text {V}\) for the other categories to ensure the smoothness of the background model. The product of the signal geometrical acceptance and the selection efficiency, reported in Fig. 2, is calculated for the \(0\ell \) category with the denominator being the \({\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}\) decay to neutrinos, and for the \(2\ell \) categories with the denominator being the \({\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}\) decay to electrons, muons and tau leptons.

Fig. 2

The product of signal acceptance and efficiency in the \(0\ell \) (left column) and \(2\ell \) (right column) categories for the signal produced via \({{\mathrm{q}}_{\mathrm{}}^{\mathrm{}}} {{\overline{{{{\mathrm{q}}_{\mathrm{}}^{\mathrm{}}}}}}} \) annihilation (upper row) and vector boson fusion (lower row)

6 Background estimation and signal modeling

The most important SM background is vector boson production in association with \({\mathrm{b}}_{\mathrm{}}^{\mathrm{}}\)-tagged jets (\({{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} \text {+jets}\)). The \({{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} \text {+jets}\) background is estimated using control samples in data to reduce the dependence on simulation. Minor SM backgrounds are \({\mathrm{t}}_{\mathrm{}}^{\mathrm{}}\) \(\overline{{{{\mathrm{t}}_{\mathrm{}}^{\mathrm{}}}}}\) and single top quark processes, SM diboson production (\({{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} \)), and SM \({\mathrm{H}}_{\mathrm{}}^{\mathrm{}}\) production in association with a vector boson (\({{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}} \)), all of which are estimated based on simulation. The SM \({{\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}} \) production is considered as a background in this analysis. However, this process can be distinguished from the signal because of the non-resonant distribution in the \({{\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}} \) invariant mass and by the softer \(p_{\mathrm {T}}\) spectra of the \({\mathrm{H}}_{\mathrm{}}^{\mathrm{}}\) and \({\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}\) bosons. The jet mass distribution is split into a signal-enriched region (SR) with \(105<m_{j_{{{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}}}} <135\,\text {Ge}\text {V} \), and low-mass and high-mass sidebands (SB) with \(30<m_{j_{{{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}}}} <65\,\text {Ge}\text {V} \) (LSB) and \(135<m_{j_{{{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}}}} <250\,\text {Ge}\text {V} \) (HSB), respectively. 
The jet mass range \(65<m_{j_{{{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}}}} <105\,\text {Ge}\text {V} \), a region enriched with boosted vector bosons (VR), is excluded and kept blinded in order to avoid potential contamination from a \({{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} \) resonant signal, which is the subject of dedicated searches [16, 62, 63]. The background estimation consists of two separate steps to determine, first, the number of events and, second, the distribution of the main background in the SR.
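The partition of the \(j_{{{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}}}\) mass axis can be summarized with a small lookup. In this sketch the region labels follow the text, while the treatment of the boundaries (lower edge inclusive, upper edge exclusive) and the function name are assumptions.

```python
# (label, lower edge, upper edge) in GeV, as quoted in the text.
JET_MASS_REGIONS = [
    ("LSB", 30.0, 65.0),    # low-mass sideband
    ("VR", 65.0, 105.0),    # vector boson region, blinded and excluded from the fit
    ("SR", 105.0, 135.0),   # signal region
    ("HSB", 135.0, 250.0),  # high-mass sideband
]


def jet_mass_region(m_j):
    """Return the region label for a groomed jet mass in GeV, or None if outside 30-250 GeV."""
    for label, low, high in JET_MASS_REGIONS:
        if low <= m_j < high:
            return label
    return None
```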

Table 2 Scale factors derived for the normalization of the \({\mathrm{t}}_{\mathrm{}}^{\mathrm{}}\) \(\overline{{{{\mathrm{t}}_{\mathrm{}}^{\mathrm{}}}}}\) and single top quark backgrounds for different event categories. Uncertainties due to the limited size of the event samples (stat.) and systematic effects (syst.) are reported as well. The scale factors of the 2\({\mathrm{e}}_{\mathrm{}}^{\mathrm{}}\) and 2\(\upmu \) categories are derived using the \(1{{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} 1{\upmu {}{}} \) top quark control region as described in the text

6.1 Background normalization

The three groups of backgrounds (\({{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} \text {+jets}\), \({\mathrm{t}}_{\mathrm{}}^{\mathrm{}}\) \(\overline{{{{\mathrm{t}}_{\mathrm{}}^{\mathrm{}}}}}\) and single top quark, and \({{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} \) and \({{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}} \)) are considered separately, since each group has different physical properties leading to a different shape of the jet mass distribution. An appropriate analytical function is chosen to describe the background in each case. For the \({{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} \text {+jets}\) background, the Higgs candidate jet mass distribution falls smoothly with no peaks; therefore, Chebyshev polynomials of order 1–4 are chosen to model the distribution observed in data. The \({{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} \) and \({{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}} \) backgrounds have two peaks in the jet mass distribution, corresponding to the \({\mathrm{W}}_{\mathrm{}}^{\mathrm{}}\) and \({\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}\) bosons, and the \({{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}} \) background has an additional peak due to the Higgs boson. 
The \({\mathrm{t}}_{\mathrm{}}^{\mathrm{}}\) \(\overline{{{{\mathrm{t}}_{\mathrm{}}^{\mathrm{}}}}}\) and single top quark backgrounds are considered together, because they both have two peaks corresponding to \({{\mathrm{W}}_{\mathrm{}}^{\mathrm{}}} \rightarrow {{\mathrm{q}}_{\mathrm{}}^{\mathrm{}}} {{\overline{{{{\mathrm{q}}_{\mathrm{}}^{\mathrm{}}}}}}} '\) decays and all-hadronic top quark decays \({{\mathrm{t}}_{\mathrm{}}^{\mathrm{}}}\rightarrow {{\mathrm{W}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{b}}_{\mathrm{}}^{\mathrm{}}} \rightarrow {{\mathrm{q}}_{\mathrm{}}^{\mathrm{}}} {{\overline{{{{\mathrm{q}}_{\mathrm{}}^{\mathrm{}}}}}}} '{{\mathrm{b}}_{\mathrm{}}^{\mathrm{}}} \).

Fig. 3

Fit to the \(m_{j_{{{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}}}}\) distribution in data in the 2\({\mathrm{b}}_{\mathrm{}}^{\mathrm{}}\) tag (left column) and \(\le \)1\({\mathrm{b}}_{\mathrm{}}^{\mathrm{}}\) tag (right column) non-VBF categories, for \(0\ell \) (upper row), 2\({\mathrm{e}}_{\mathrm{}}^{\mathrm{}}\) (middle row), and 2\(\upmu \) (lower row). The shaded bands around the total background estimate represent the uncertainty from the fit to data in the jet mass SBs. The observed data are indicated by black markers. The vertical shaded band indicates the VR region, which is blinded and not used in the fit to avoid potential contamination from \({{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} \) resonant signals. The dashed vertical lines separate the LSB, VR, SR, and HSB. The bottom panel shows \((N^{\text {data}}-N^{\text {bkg}})/\sigma \) for each bin, where \(\sigma \) is the statistical uncertainty in data. In the \(\le \)1\({\mathrm{b}}_{\mathrm{}}^{\mathrm{}}\) tag, non-VBF categories, \(m_{{{\mathrm{X}}_{\mathrm{}}^{\mathrm{}}}}\) or \(m_{{{\mathrm{X}}_{\mathrm{}}^{\mathrm{}}}}^{\text {T}}\) are required to be larger than 1200\(\,\text {Ge}\text {V}\) to ensure the smoothness of the background model

Fig. 4

Fit to the \(m_{j_{{{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}}}}\) distribution in data in the 2\({\mathrm{b}}_{\mathrm{}}^{\mathrm{}}\) tag (left column) and \(\le \)1\({\mathrm{b}}_{\mathrm{}}^{\mathrm{}}\) tag (right column) VBF categories, for \(0\ell \) (upper row), 2\({\mathrm{e}}_{\mathrm{}}^{\mathrm{}}\) (middle row), and 2\(\upmu \) (lower row). The shaded bands around the total background estimate represent the uncertainty from the fit to data in the jet mass SBs. The observed data are indicated by black markers. The vertical shaded band indicates the VR region, which is blinded and not used in the fit to avoid potential contamination from \({{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} \) resonant signals. The dashed vertical lines separate the LSB, VR, SR, and HSB. The bottom panel shows \((N^{\text {data}}-N^{\text {bkg}})/\sigma \) for each bin, where \(\sigma \) is the statistical uncertainty in data

The normalization of the simulated top quark background is corrected with a scale factor (SF) determined in high-purity top quark control regions. In the \(0\ell \) category, the control region is defined by inverting the veto on the additional \({\mathrm{b}}_{\mathrm{}}^{\mathrm{}}\)-tagged AK4 jet. In the \(2\ell \) categories, control region data are collected using the same trigger as for the 2\({\mathrm{e}}_{\mathrm{}}^{\mathrm{}}\) signal region, with a requirement that lepton flavors and charges are different, resulting in a \(1{{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} 1{\upmu {}{}} \) region, where the leptons must have a combined invariant mass \(m_{{{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} {\upmu {}{}}} > 110\)\(\,\text {Ge}\text {V}\) and a vector sum \(p_{\mathrm {T}} ^{{{\mathrm{e}}_{\mathrm{}}^{\mathrm{}}} {\upmu {}{}}} > 120\)\(\,\text {Ge}\text {V}\). Multiplicative SFs are calculated as the ratio of the event yields in data and simulation and are applied to the simulated samples in the SR. The uncertainties in the top quark SFs originate from the limited event count in the top quark control region and the extrapolation from the top quark control region to the SR. The systematic uncertainty in the \(0\ell \) category is derived by varying the \({\mathrm{b}}_{\mathrm{}}^{\mathrm{}}\) tagging SF. For the \(2\ell \) categories, the uncertainties in the electron and muon identification are taken into account. The electron and muon trigger uncertainties only affect the 2\(\upmu \) and not the 2\({\mathrm{e}}_{\mathrm{}}^{\mathrm{}}\) category because the electron trigger is used to provide the control region while the muon trigger is used to select the signal region. A normalization uncertainty is applied to the VBF categories to account for the limited event counts in these control regions. The normalization uncertainty is taken as the deviation of the top quark SF from unity, as shown in Table 2.
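As an illustration of the SF definition above, the following minimal sketch computes a data-to-simulation ratio in a control region, its statistical uncertainty from the data count, and the deviation from unity that is used as the normalization uncertainty in the VBF categories. The function name and all yields are hypothetical placeholders, not values from this analysis.

```python
import math


def top_scale_factor(n_data, n_sim):
    """Return (SF, statistical uncertainty) for a top quark control region.

    SF is the ratio of the data yield to the simulated yield. The statistical
    uncertainty here only propagates the Poisson uncertainty of the data count
    (a simplifying assumption of this sketch).
    """
    if n_sim <= 0:
        raise ValueError("simulated yield must be positive")
    sf = n_data / n_sim
    stat = math.sqrt(n_data) / n_sim
    return sf, stat


# Hypothetical control region yields (illustration only).
sf, stat = top_scale_factor(n_data=900, n_sim=1000.0)

# The SF multiplies the simulated top quark yield in the SR.
sr_top_simulated = 50.0  # hypothetical
sr_top_corrected = sr_top_simulated * sf

# Normalization uncertainty taken as the deviation of the SF from unity.
vbf_norm_unc = abs(sf - 1.0)
```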

The background model, composed of the sum of the \({{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} \text {+jets}\), \({\mathrm{t}}_{\mathrm{}}^{\mathrm{}}\) \(\overline{{{{\mathrm{t}}_{\mathrm{}}^{\mathrm{}}}}}\) and single top quark, and the \({{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} \) and \({{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}} \) templates, is fitted to the SBs of the jet mass distribution in data. The analytical function parameters and the normalization of the top quark and \({{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} \) backgrounds are fixed from the fit to simulation, but the shape parameters of the \({{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} \text {+jets}\) background are not. The number of parameters for the fit to data is determined by a Fisher F-test [64]. The number of expected events is derived from the integral of the fitted model in the SR. The choice of the \({{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} \text {+jets}\) fit function induces a systematic uncertainty, which is evaluated by fitting the \({{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} \text {+jets}\) background shape with an alternative function, consisting of the sum of an exponential and a Gaussian function, and taking the difference between the integrals of the two fit models in the SR as the uncertainty. Figures 3 and 4 show the fits to the jet mass in the different categories. Table 3 summarizes the expected background yield in the SR.
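The alternative-function uncertainty described above can be sketched as follows. This is only an illustration: the nominal function (a plain exponential), all parameter values, and the SR window boundaries are hypothetical placeholders, and only the form of the alternative function (exponential plus Gaussian) is taken from the text.

```python
import math


def integrate(f, lo, hi, n=2000):
    """Composite trapezoidal rule for the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h


def nominal(m):
    """Hypothetical nominal V+jets jet mass shape (events per GeV)."""
    return 100.0 * math.exp(-0.02 * m)


def alternative(m):
    """Exponential plus Gaussian, with invented parameter values."""
    gauss = 4.0 * math.exp(-0.5 * ((m - 90.0) / 15.0) ** 2)
    return 95.0 * math.exp(-0.02 * m) + gauss


SR_WINDOW = (105.0, 135.0)  # placeholder jet mass window in GeV

n_nominal = integrate(nominal, *SR_WINDOW)
n_alternative = integrate(alternative, *SR_WINDOW)

# The difference of the two SR integrals is taken as the systematic uncertainty.
alt_uncertainty = abs(n_alternative - n_nominal)
```

In a real fit both functions would first be fitted to the SB data; here their parameters are fixed by hand to keep the sketch self-contained.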

Table 3 The expected and observed numbers of background events in the signal region for all event categories. The \({{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} \text {+jets}\) background uncertainties originate from the variation of the parameters within the fit uncertainties (fit) and the difference between the nominal and alternative function choice for the fit to \(m_{j_{{{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}}}}\) (alt). The \({\mathrm{t}}_{\mathrm{}}^{\mathrm{}}\) \(\overline{{{{\mathrm{t}}_{\mathrm{}}^{\mathrm{}}}}}\) and single top quark uncertainties arise from the \(m_{j_{{{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}}}}\) modeling, the statistical component of the top quark SF uncertainties, and the extrapolation uncertainty from the control region to the SR. The \({{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} \) and \({{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}} \) normalization uncertainties come from the \(m_{j_{{{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}}}}\) modeling
Fig. 5

Distributions in data in the 2\({\mathrm{b}}_{\mathrm{}}^{\mathrm{}}\) tag (left column) and \(\le \)1\({\mathrm{b}}_{\mathrm{}}^{\mathrm{}}\) tag (right column) non-VBF categories, of \(m_{{{\mathrm{X}}_{\mathrm{}}^{\mathrm{}}}}^{\text {T}}\) for \(0\ell \) (upper row), and \(m_{{{\mathrm{X}}_{\mathrm{}}^{\mathrm{}}}}\) for 2\({\mathrm{e}}_{\mathrm{}}^{\mathrm{}}\) (middle row), and 2\(\upmu \) (lower row). The distributions are shown up to 4000\(\,\text {Ge}\text {V}\), which corresponds to the event with the highest \(m_{{{\mathrm{X}}_{\mathrm{}}^{\mathrm{}}}}\) or \(m_{{{\mathrm{X}}_{\mathrm{}}^{\mathrm{}}}}^{\text {T}}\) observed in the SR. The shaded bands represent the uncertainty from the background estimation. The observed data are represented by black markers, and the potential contribution of a resonance produced in the context of the HVT model B at \(m_{{{\mathrm{{{\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}}}}_{\mathrm{}}^{\mathrm{\prime }}}} =2000\,\text {Ge}\text {V} \) is shown as a dotted red line. The bottom panel shows \((N^{\text {data}}-N^{\text {bkg}})/\sigma \) for each bin, where \(\sigma \) is the statistical uncertainty in data

6.2 Background distribution

The \(m_{{{\mathrm{X}}_{\mathrm{}}^{\mathrm{}}}}\) and \(m_{{{\mathrm{X}}_{\mathrm{}}^{\mathrm{}}}}^{\text {T}}\) distributions are estimated using the data in the jet mass SBs. An \(\alpha \) function is then defined as the ratio of the two functions describing the simulated \(m_{{{\mathrm{X}}_{\mathrm{}}^{\mathrm{}}}}\) (or \(m_{{{\mathrm{X}}_{\mathrm{}}^{\mathrm{}}}}^{\text {T}}\)) shape in the SR and SB region of the \({{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} \text {+jets}\) background:

$$\begin{aligned} \alpha (m) = \frac{N_{\text {SR}}^{{{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} \text {+jets}}(m)}{N_{\text {SB}}^{{{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} \text {+jets}}(m)}, \end{aligned}$$
(1)

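where each \(N(m)\) denotes the function describing the simulated \({{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} \text {+jets}\) shape in the region given by its subscript, and m represents either \(m_{{{\mathrm{X}}_{\mathrm{}}^{\mathrm{}}}}\) or \(m_{{{\mathrm{X}}_{\mathrm{}}^{\mathrm{}}}}^{\text {T}}\). The functions are normalized to the number of events derived in Sect. 6.1 and shown in Table 3.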

The \({{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} \text {+jets}\) background shape in the SR is thus estimated as the product of \(\alpha (m)\) and the shape in the data SBs after subtracting the corresponding top quark and \({{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} \) contributions:

$$\begin{aligned} N_{\text {SR}}^{{{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} \text {+jets}}(m) = \left[ N_{\text {SB}}^{\text {data}}(m) - N_{\text {SB}}^{\text {top}}(m) - N_{\text {SB}}^{\text {VV}}(m) \right] \alpha (m). \end{aligned}$$
(2)

Finally, the expected number of background events in the SR is derived by adding the top quark and \({{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} \) contributions to the \({{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} \text {+jets}\) background distribution and taking the \({{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} \text {+jets}\) normalization from the fit to data in the jet mass SBs:

$$\begin{aligned} N_{\text {SR}}^{\text {bkg}}(m) = N_{\text {SR}}^{{{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} \text {+jets}}(m) + N_{\text {SR}}^{\text {top}}(m) + N_{\text {SR}}^{\text {VV}}(m). \end{aligned}$$
(3)

The observed data, along with the expected backgrounds, are reported for each category in Figs. 5 and 6.
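The chain of Eqs. (1)–(3) can be illustrated with a minimal binned sketch. The analysis itself uses fitted functions of m rather than histograms, so the binning, the function names, and all yields below are assumptions made purely for illustration.

```python
def alpha_ratio(sim_vjets_sr, sim_vjets_sb):
    """Eq. (1): bin-by-bin ratio of simulated V+jets shapes in SR and SB."""
    return [sr / sb if sb > 0 else 0.0
            for sr, sb in zip(sim_vjets_sr, sim_vjets_sb)]


def vjets_in_sr(data_sb, top_sb, vv_sb, alpha):
    """Eq. (2): SB data minus top and VV contributions, scaled by alpha."""
    return [(d - t - v) * a
            for d, t, v, a in zip(data_sb, top_sb, vv_sb, alpha)]


def total_background(vjets_sr, top_sr, vv_sr):
    """Eq. (3): sum of V+jets, top quark, and VV contributions in the SR."""
    return [vj + t + v for vj, t, v in zip(vjets_sr, top_sr, vv_sr)]


# Hypothetical yields in three bins of m (illustration only).
alpha = alpha_ratio(sim_vjets_sr=[50.0, 20.0, 5.0],
                    sim_vjets_sb=[100.0, 40.0, 10.0])
vjets_sr = vjets_in_sr(data_sb=[120.0, 50.0, 12.0],
                       top_sb=[15.0, 8.0, 1.0],
                       vv_sb=[5.0, 2.0, 1.0],
                       alpha=alpha)
bkg_sr = total_background(vjets_sr,
                          top_sr=[10.0, 4.0, 1.0],
                          vv_sr=[3.0, 1.0, 0.5])
```

Because the same simulation enters both the numerator and the denominator of the ratio, mismodeling that is common to the SB and SR largely cancels, which is the motivation for transferring the SB data shape rather than using the simulated SR shape directly.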

The background estimation method is validated by splitting the LSB into two regions: \(30<m_{j_{{{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}}}} <50\,\text {Ge}\text {V} \) and \(50<m_{j_{{{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}}}} <65\,\text {Ge}\text {V} \). The first region is used as a new LSB and the second as a proxy for the SR. The data yields and distributions are found to be compatible with the expectation in all categories.

Fig. 6

Distributions in data in the 2\({\mathrm{b}}_{\mathrm{}}^{\mathrm{}}\) tag (left column) and \(\le \)1\({\mathrm{b}}_{\mathrm{}}^{\mathrm{}}\) tag (right column) VBF categories, of \(m_{{{\mathrm{X}}_{\mathrm{}}^{\mathrm{}}}}^{\text {T}}\) for \(0\ell \) (upper row), and \(m_{{{\mathrm{X}}_{\mathrm{}}^{\mathrm{}}}}\) for 2\({\mathrm{e}}_{\mathrm{}}^{\mathrm{}}\) (middle row), and 2\(\upmu \) (lower row). The distributions are shown up to 4000\(\,\text {Ge}\text {V}\), which corresponds to the event with the highest \(m_{{{\mathrm{X}}_{\mathrm{}}^{\mathrm{}}}}\) or \(m_{{{\mathrm{X}}_{\mathrm{}}^{\mathrm{}}}}^{\text {T}}\) observed in the SR. The shaded bands represent the uncertainty from the background estimation. The observed data are represented by black markers, and the potential contribution of a resonance produced in the context of the HVT model C at \(m_{{{\mathrm{{{\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}}}}_{\mathrm{}}^{\mathrm{\prime }}}} =2000\,\text {Ge}\text {V} \) is shown as a dotted red line. The bottom panel shows \((N^{\text {data}}-N^{\text {bkg}})/\sigma \) for each bin, where \(\sigma \) is the statistical uncertainty in data

6.3 Signal modeling

In order to build a template for the signal extraction, the simulated signal mass points are fitted in the SR with the Crystal Ball function [65], which consists of a Gaussian core and a power-law function that describes the low-end tail below a certain threshold. The parameterization for intermediate mass points is determined by linearly interpolating the shape parameters derived by fitting the generated mass points.
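A minimal sketch of the signal template construction is given below. It implements the standard (unnormalized) Crystal Ball shape with a low-side power-law tail and a linear interpolation of the shape parameters between two generated mass points. The parameter names, the helper `interpolate_params`, and all numerical values are hypothetical and are not taken from the fits in this analysis.

```python
import math
from bisect import bisect_right


def crystal_ball(x, mean, sigma, alpha, n):
    """Unnormalized Crystal Ball: Gaussian core, power-law low-side tail."""
    t = (x - mean) / sigma
    a = abs(alpha)
    if t > -a:
        return math.exp(-0.5 * t * t)
    # Coefficients chosen so the function is continuous at t = -a.
    big_a = (n / a) ** n * math.exp(-0.5 * a * a)
    big_b = n / a - a
    return big_a * (big_b - t) ** (-n)


def interpolate_params(mass, points):
    """Linearly interpolate shape parameters between fitted mass points.

    `points` maps each generated mass to a dict of fitted parameters.
    """
    masses = sorted(points)
    if not masses[0] <= mass <= masses[-1]:
        raise ValueError("mass outside the range of generated points")
    i = min(bisect_right(masses, mass), len(masses) - 1)
    lo, hi = masses[i - 1], masses[i]
    w = (mass - lo) / (hi - lo)
    return {k: (1.0 - w) * points[lo][k] + w * points[hi][k]
            for k in points[lo]}


# Hypothetical fitted parameters at two generated mass points (GeV).
fitted = {
    2000.0: {"mean": 2000.0, "sigma": 80.0, "alpha": 1.5, "n": 2.0},
    3000.0: {"mean": 3000.0, "sigma": 120.0, "alpha": 1.5, "n": 2.0},
}
params_2500 = interpolate_params(2500.0, fitted)
peak_value = crystal_ball(2500.0, **params_2500)
```

Interpolating the parameters rather than the template values keeps the intermediate shapes within the same functional family as the fitted ones.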

7 Systematic uncertainties

The systematic uncertainty in the \({{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} \text {+jets}\) background is dominated by the statistical uncertainty of the number of data events in the SBs. The systematic uncertainties in the shape of the \({{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} \text {+jets}\) background are estimated from the covariance matrix of the simultaneous fit of the \(m_{{{\mathrm{X}}_{\mathrm{}}^{\mathrm{}}}}^{\text {T}}\) and \(m_{{{\mathrm{X}}_{\mathrm{}}^{\mathrm{}}}}\) distributions in data in the SBs, and in simulated \({{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} \text {+jets}\) background events in the signal and SB regions. Most of the effect of the uncertainties is correlated among the SB and SR, and cancels out in the \(\alpha \) ratio. The \({\mathrm{t}}_{\mathrm{}}^{\mathrm{}}\) \(\overline{{{{\mathrm{t}}_{\mathrm{}}^{\mathrm{}}}}}\) and \({{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} \) background shape uncertainties are propagated from the covariance matrix of the fit to the simulation in the SR. The statistical treatment is consistent with Ref. [16].

The uncertainty in the top quark background normalization originates from a limited event count in data and simulated event samples in the control regions, and from the variations on the requirements of lepton selection, \({\mathrm{b}}_{\mathrm{}}^{\mathrm{}}\) tagging SFs, and the VBF selection used to select events in the control region. The uncertainties are reported in Table 2. The uncertainties in the trigger, identification, and isolation efficiencies of leptons affect the normalization and shape of the simulated signal and diboson background. The uncertainties are evaluated by moving the SFs, derived as the efficiency in data over the efficiency in simulation, up and down by one standard deviation, and amount to 1–7%.

The lepton scale and resolution affect both shape and normalization of the signal, leading to an uncertainty of 1–3%. The uncertainty from the effect of the \(p_{\mathrm {T}} ^\text {miss}\) scale and resolution on the normalization of the signal and \({{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} \), \({{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}} \) background is 1%. The jet energy scale and resolution uncertainties amount to a 1% systematic uncertainty in the normalization and a shape variation in the distribution of the signal and diboson background events. The uncertainty in the jet mass scale (resolution) adds a contribution of 0.6 (9.0)% to the uncertainty in the signal and the diboson background normalization. The jet mass scale and resolution depend on the choice of the parton shower model, which affects the Higgs boson tagging and leads to an additional uncertainty of 6% in the signal normalization. The uncertainty was evaluated by using herwig++ 2.7.1 [66] as an alternative showering algorithm. The impact of the \({\mathrm{b}}_{\mathrm{}}^{\mathrm{}}\) tagging systematic uncertainty on the signal efficiency depends on the mass of the resonance and ranges from 4 to 15% for the 2\({\mathrm{b}}_{\mathrm{}}^{\mathrm{}}\) tag categories and from 1 to 6% for the \(\le \)1\({\mathrm{b}}_{\mathrm{}}^{\mathrm{}}\) tag categories. The uncertainty is treated as anti-correlated between the two \({\mathrm{b}}_{\mathrm{}}^{\mathrm{}}\) tag categories.

The event yields and acceptances are affected by the choice of the parton distribution functions (PDFs) and the QCD factorization and renormalization scale uncertainties. The effects of the PDF choice on the acceptance and normalization of the \({\mathrm{{{\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}}}}_{\mathrm{}}^{\mathrm{\prime }}\) signal are derived according to the PDF4LHC recommendations [67] and amount to 0.5% in the acceptance and 8–30% in the normalization of the signal, 0.2% in the acceptance and 4.7% in the normalization of the \({{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} \), \({{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}} \) background, and 0.1% in the acceptance and 0.1% in the normalization of the \({\mathrm{t}}_{\mathrm{}}^{\mathrm{}}\) \(\overline{{{{\mathrm{t}}_{\mathrm{}}^{\mathrm{}}}}}\) background. The factorization and renormalization scale uncertainties are 3–15% for the signal, depending on the resonance mass, 18.9% for the \({{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} \), \({{\mathrm{V}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}} \) background, and 1% for the extrapolation of the top quark SFs to the SR.

The darkening of ECAL crystals, due to radiation damage, leads to a gradual timing shift, which was not properly propagated to the level 1 trigger for 2016 and 2017 [68]. This effect is accounted for by adding a 1% systematic uncertainty in the signal normalization. Additional systematic uncertainties come from estimations of the pileup contribution and the integrated luminosity [69,70,71]. A list of all systematic uncertainties is given in Table 4.

Table 4 Summary of systematic uncertainties for the background and signal samples. The entries labeled with \(\dagger \) are also propagated to the shapes of the distributions. Uncertainties marked with \(\ddagger \) impact the signal cross section. Uncertainties in the same line are treated as correlated. All uncertainties except for that in the integrated luminosity are considered correlated across the three years of data taking

8 Results

Results are obtained from a combined profile likelihood fit to the unbinned \(m_{{{\mathrm{X}}_{\mathrm{}}^{\mathrm{}}}}^{\text {T}}\) and \(m_{{{\mathrm{X}}_{\mathrm{}}^{\mathrm{}}}}\) distributions of signal and background, shown in Figs. 5 and 6. Systematic uncertainties are treated as nuisance parameters and are profiled in the statistical interpretation [72,73,74]. The uncertainties in the signal normalization that are derived from the signal cross section are not profiled in the likelihood, and are reported separately as the uncertainty band of the theoretical cross section. The statistical methods, including the treatment of the nuisance parameters, are described in more detail in Ref. [16].

The background-only hypothesis is tested against a hypothesis that also includes a \({{\mathrm{{{\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}}}}_{\mathrm{}}^{\mathrm{\prime }}} \rightarrow {{\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}} \) signal in all categories. A modified frequentist method is used to determine 95% confidence level (CL) upper limits on the product of cross section and branching fraction as a function of \(m_{{{\mathrm{X}}_{\mathrm{}}^{\mathrm{}}}}\), in which the distribution of the profile likelihood test statistic is derived using an asymptotic approximation [75].
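For orientation, the sketch below evaluates the modified frequentist quantity CLs in the commonly quoted asymptotic form, in which the p-values follow from the observed test statistic and its value on the background-only Asimov data set. This form is an assumption of the sketch and holds only in the regime where the asymptotic formulae for one-sided upper-limit test statistics apply; the exact test statistic used in the analysis is the one defined in the cited references. The function names and the inputs are hypothetical.

```python
import math


def norm_cdf(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def cls_asymptotic(q_obs, q_asimov):
    """Asymptotic CLs = CL_{s+b} / CL_b for a one-sided upper-limit test.

    q_obs:    observed value of the test statistic for a given signal strength
    q_asimov: value of the same statistic on the background-only Asimov data
    """
    sq_obs = math.sqrt(max(q_obs, 0.0))
    sq_asimov = math.sqrt(max(q_asimov, 0.0))
    cl_sb = 1.0 - norm_cdf(sq_obs)
    cl_b = norm_cdf(sq_asimov - sq_obs)
    return cl_sb / cl_b


# A signal strength is excluded at 95% CL when CLs falls below 0.05.
cls_value = cls_asymptotic(q_obs=4.0, q_asimov=4.0)  # hypothetical inputs
excluded = cls_value < 0.05
```

The upper limit is then the signal strength at which CLs equals 0.05, found by scanning or root finding over the signal strength.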

The exclusion limits on the product of resonance cross section and branching fraction \(\mathcal {B}({{\mathrm{{{\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}}}}_{\mathrm{}}^{\mathrm{\prime }}} \rightarrow {{\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}} )\) are reported as a function of the resonance mass in Fig. 7 for all categories, separately for the non-VBF and the VBF signals. The \(2\ell \) categories dominate the sensitivity for heavy resonance masses smaller than 1\(\,\text {Te}\text {V}\) because of the smaller backgrounds combined with the better experimental resolution; at larger masses, the \(0\ell \) categories are more sensitive thanks to the larger branching fraction of the \({\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}\) boson to neutrinos. The exclusion limits are shown up to 4.6\(\,\text {Te}\text {V}\), which corresponds to the event with the highest \(m_{{{\mathrm{X}}_{\mathrm{}}^{\mathrm{}}}}\) or \(m_{{{\mathrm{X}}_{\mathrm{}}^{\mathrm{}}}}^{\text {T}}\) observed either in the SB or SR.

Fig. 7

Observed and expected 95% CL upper limit on \(\sigma \mathcal {B}({{\mathrm{{{\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}}}}_{\mathrm{}}^{\mathrm{\prime }}} \rightarrow {{\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}} {{\mathrm{H}}_{\mathrm{}}^{\mathrm{}}} )\) with all categories combined, for the non-VBF signal (upper) and VBF signal (lower), including all statistical and systematic uncertainties. The inner green band and the outer yellow band indicate the regions containing 68 and 95%, respectively, of the distribution of expected limits under the background-only hypothesis. The solid curves and their shaded areas correspond to the product of the cross section and the branching fractions predicted by the HVT models A and B (upper) and HVT model C (lower), and their relative uncertainties. The CMS search for a heavy resonance using 2016 data and the same final state [14] is shown as a comparison

The largest excess for the non-VBF signal, corresponding to a local significance of 3 standard deviations, is observed at \(m_{{{\mathrm{X}}_{\mathrm{}}^{\mathrm{}}}} =1\,\text {Te}\text {V} \). A \({\mathrm{{{\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}}}}_{\mathrm{}}^{\mathrm{\prime }}\) boson with a mass smaller than 3.5\(\,\text {Te}\text {V}\) is excluded at 95% CL in HVT model A, and a \({\mathrm{{{\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}}}}_{\mathrm{}}^{\mathrm{\prime }}\) boson with a mass smaller than 3.7\(\,\text {Te}\text {V}\) is excluded in model B. The upper limit of the excluded mass range is increased by 0.85 (0.87)\(\,\text {Te}\text {V}\) and 1.3 (1.4)\(\,\text {Te}\text {V}\) in HVT model A (model B) compared to searches using 2016 data and the same final state by the ATLAS and CMS Collaborations, respectively [14, 15]. If the \({\mathrm{{{\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}}}}_{\mathrm{}}^{\mathrm{\prime }}\) couples only to the SM bosons and is produced exclusively through VBF as in HVT model C, the data set analyzed is not large enough to exclude any range of mass. Upper limits on the product of the cross section and branching fraction are set between 23 and 0.3\(\,\text {fb}\) for a \({\mathrm{{{\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}}}}_{\mathrm{}}^{\mathrm{\prime }}\) mass between 0.8 and 4.6\(\,\text {Te}\text {V}\), respectively.

The exclusion limit of the non-VBF signal shown in Fig. 7 (upper) can be interpreted as a limit in the space of the HVT model parameters [\(g_\text {V} c_\text {H} \), \(g^2c_\text {F}/g_\text {V} \)]. Combining all categories, the excluded region in such a parameter space for narrow resonances is shown in Fig. 8. The region of parameter space where the natural resonance width is larger than the typical experimental resolution of 4%, for which the narrow width assumption is not valid, is shaded.
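To make the parameter plane concrete, the following sketch places the benchmark models A and B in the [\(g_\text {V} c_\text {H} \), \(g^2c_\text {F}/g_\text {V} \)] coordinates, using the benchmark values of \(g_\text {V}\), \(c_\text {H}\), and \(c_\text {F}\) quoted in the introduction. The numerical value \(g \approx 0.65\) for the SM \(\text {SU}(2)_\text {L}\) gauge coupling is an assumption of this sketch, and the function name is hypothetical.

```python
G_SU2 = 0.65  # assumed approximate value of the SM SU(2)_L gauge coupling


def hvt_coordinates(g_v, c_h, c_f, g=G_SU2):
    """Return (g_V * c_H, g^2 * c_F / g_V) for a set of HVT parameters."""
    return g_v * c_h, g * g * c_f / g_v


# Benchmark parameters (g_V, c_H, c_F) for models A and B.
BENCHMARKS = {
    "A": (1.0, -0.556, -1.316),
    "B": (3.0, -0.976, 1.024),
}

points = {name: hvt_coordinates(*pars) for name, pars in BENCHMARKS.items()}
for name, (x, y) in points.items():
    print(f"model {name}: g_V c_H = {x:.3f}, g^2 c_F / g_V = {y:.3f}")
```

Under this assumed value of g, model A lies near the diagonal of the plane, with comparable boson and fermion coordinates, while model B has a much larger boson coordinate than fermion coordinate.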

Fig. 8

Observed exclusion limit in the space of the HVT model parameters [\(g_\text {V} c_\text {H} \), \(g^2c_\text {F}/g_\text {V} \)], described in the text, for three different mass hypotheses of 2.0, 3.0, and 4.0\(\,\text {Te}\text {V}\) for the non-VBF signal. The shaded bands indicate the side of each contour that is excluded. The benchmark scenarios corresponding to HVT models A and B are represented by a purple cross and a red point, respectively. The region of the parameter space where the natural resonance width (\(\varGamma _{Z'}\)) is larger than the typical experimental resolution of 4%, for which the narrow-width approximation is not valid, is shaded in grey

9 Summary

A search for a heavy resonance with a mass between 0.8 and 5.0\(\,\text {Te}\text {V}\), decaying to a \({\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}\) boson and a Higgs boson, has been described. The data samples were collected by the CMS experiment in the period 2016–2018 at \(\sqrt{s}=13\,\text {Te}\text {V} \) and correspond to an integrated luminosity of 137\(\,\text {fb}^{-1}\). In the final states explored, the \({\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}\) boson decays leptonically, resulting in events with either zero or two electrons or muons. Higgs bosons with a large Lorentz boost are reconstructed via their decays to hadrons. For models with a narrow spin-1 resonance, a new heavy vector boson \({\mathrm{{{\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}}}}_{\mathrm{}}^{\mathrm{\prime }}\) with mass below 3.5 and 3.7\(\,\text {Te}\text {V}\) is excluded at 95% confidence level in models where the heavy vector boson couples predominantly to fermions and bosons, respectively. These are the most stringent limits placed on the heavy vector triplet \({\mathrm{{{\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}}}}_{\mathrm{}}^{\mathrm{\prime }}\) model to date. If the heavy vector boson couples exclusively to standard model bosons, upper limits on the product of the cross section and branching fraction are set between 23 and 0.3\(\,\text {fb}\) for a \({\mathrm{{{\mathrm{Z}}_{\mathrm{}}^{\mathrm{}}}}}_{\mathrm{}}^{\mathrm{\prime }}\) mass between 0.8 and 4.6\(\,\text {Te}\text {V}\), respectively. This is the first limit set on a heavy vector boson coupling exclusively to standard model bosons in its production and decay.