1 Introduction

The discovery of a Higgs boson at the CERN LHC [1, 2] and the measurement of its mass, spin, parity, and couplings [3, 4] raise the question of whether the Higgs boson sector consists of only one scalar doublet, which results in a single physical Higgs boson as expected in the standard model (SM), or whether additional bosons are involved in electroweak (EW) symmetry breaking.

The two-Higgs-doublet model (2HDM) [5] extends the SM Higgs boson sector by introducing a second scalar doublet. The 2HDM is incorporated in supersymmetric models [6] and axion models [7], and may introduce additional sources of explicit or spontaneous CP violation that explain the baryon asymmetry of the universe [8]. Various formulations of the 2HDM predict different couplings of the two doublets to right-handed quarks and charged leptons: in the Type-I formulation, all fermions couple to only one Higgs doublet; in the Type-II formulation, the up-type quarks couple to a different doublet than the down-type quarks and leptons; in the “lepton-specific” formulation, the quarks couple to one of the Higgs doublets and the leptons couple to the other; and in the “flipped” formulation, the up-type fermions and leptons couple to one of the Higgs doublets, while the down-type quarks couple to the other.

The two Higgs doublets entail the presence of five physical states: two neutral and CP-even bosons (h and H, the latter being more massive), a neutral and CP-odd boson (\(\text {A}\)), and two charged scalar bosons (\(\text {H}^\pm \)). The model has two free parameters, \(\alpha \) and \(\tan \beta \), which are the mixing angle and the ratio of the vacuum expectation values of the two Higgs doublets, respectively. If \(\tan \beta \lesssim 5\), the dominant \(\text {A}\) boson production process is gluon–gluon fusion; otherwise, associated production with a b quark-antiquark pair becomes significant. The diagrams of the two production modes are shown in Fig. 1. At small \(\tan \beta \) values the heavy pseudoscalar boson \(\text {A}\) may decay with a large branching fraction to a Z and an h boson, if kinematically allowed [5]. These models can be probed either with indirect searches, by measuring the cross section and couplings of the SM Higgs boson [9], or by performing a direct search for an \(\text {A}\) boson.

Fig. 1 Representative Feynman diagrams of the production in the 2HDM of a pseudoscalar \(\text {A}\) boson via gluon–gluon fusion (upper) and in association with b quarks (lower)

This paper describes a search for a heavy pseudoscalar \(\text {A}\) boson that decays to a Z  and an h  boson, both on-shell, with the Z  boson decaying to \(\ell ^+\ell ^-\) (\(\ell \) being an electron or a muon) or to a pair of neutrinos, and the h  boson to \(\text {b}\bar{\text {b}}\). The h  boson is assumed to be the 125\(\,\text {GeV}\) boson discovered at the LHC. In this search, the candidate \(\text {A}\) boson is reconstructed from the invariant mass of the visible decay products in events when the Z  boson decays to charged leptons, or is inferred through a partial reconstruction of the mass using quantities measured in the transverse plane when the Z  boson decays to neutrinos. The signal would emerge as a peak above the SM continuum of the four-body invariant mass (\(m_{\text {Z}\text {h}}\)) spectrum for the former decay mode and the transverse mass (\(m_{\text {Z}\text {h}}^{\text {T}}\)) for the latter. The signal sensitivity is maximized by exploiting the known value of the h  boson mass to rescale the jet momenta and significantly improve the \(m_{\text {Z}\text {h}}\) resolution. In addition, selections based on multivariate discriminators, exploiting event variables such as angular distributions, are used to optimize the signal efficiency and background rejection. This search is particularly sensitive to a pseudoscalar \(\text {A}\) boson with a mass smaller than twice the top quark mass and for small \(\tan \beta \) values. In this region of the 2HDM parameter space, the \(\text {A}\) boson cross section is larger than 1 pb, and the \(\text {A}\) boson decays predominantly to \(\text {Z}\text {h}\) [5].

With respect to the CMS search performed at \(\sqrt{s}=8\,\text {Te}\text {V} \) [10], this analysis benefits from the increased center-of-mass energy and integrated luminosity, includes final states with invisible decays of the Z  boson, increases the sensitivity to b  quark associated production, and extends the \(\text {A}\) boson mass (\(m_{\text {A}}\)) range from 600 to 1000\(\,\text {GeV}\). At larger \(m_{\text {A}}\), the angular separation between the b  quarks becomes small, and the Higgs boson is reconstructed as a single large-cone jet; the corresponding CMS analysis presents limits on the 2HDM from 800\(\,\text {GeV}\) to 2\(\,\text {Te}\text {V}\)  [11]. The ATLAS Collaboration has published a search probing \(\text {Z}\text {h}\) resonances with similar event selections based on a comparable data set, observing a mild excess near 440\(\,\text {GeV}\) in categories with additional b  quarks [12].

2 The CMS detector

A detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in Ref. [13].

The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter (HCAL), each composed of a barrel and two endcap sections. Forward calorimeters extend the pseudorapidity coverage provided by the barrel and endcap detectors. Muons are detected in gas-ionization chambers embedded in the steel flux-return yoke outside the solenoid.

The silicon tracker measures charged particles within the pseudorapidity range \(|\eta | < 2.5\). It consists of 1440 silicon pixel and 15,148 silicon strip detector modules. For nonisolated particles with transverse momenta of \(1< p_{\mathrm {T}} < 10\,\text {GeV} \) and \(|\eta | < 1.4\), the track resolutions are typically 1.5% in \(p_{\mathrm {T}}\) and 25–90 (45–150)\(\,\upmu \text {m}\) in the transverse (longitudinal) impact parameter [14]. The ECAL provides coverage up to \(|\eta | < 3.0\), and the energy resolution for unconverted or late-converting electrons and photons in the barrel section is about 1% for particles that have energies in the range of tens of \(\,\text {GeV}\). The dielectron mass resolution for \(\text {Z}\rightarrow \hbox {e}^{+}\hbox {e}^{-}\) decays when both electrons are in the ECAL barrel is 1.9%, and is 2.9% when both electrons are in the endcaps [15]. The muon detectors covering the range \(|\eta |< 2.4\) make use of three different technologies: drift tubes, cathode strip chambers, and resistive-plate chambers. Combining muon tracks with matching tracks measured in the silicon tracker results in a \(p_{\mathrm {T}}\) resolution of 2–10% for muons with \(0.1< p_{\mathrm {T}} < 1\,\text {Te}\text {V} \) [16].

The first level of the CMS trigger system [17], composed of custom hardware processors, uses information from the calorimeters and muon detectors to select the most interesting events in a fixed time interval of less than 4\(\,\upmu \text {s}\). The high-level trigger (HLT) processor farm decreases the event rate from around 100 kHz to about 1 kHz, before data storage.

3 Event reconstruction

A global event reconstruction is performed with a particle-flow (PF) algorithm [18], which uses an optimized combination of information from the various elements of the detector to identify each stable particle reconstructed in the detector as an electron, a muon, a photon, or a charged or neutral hadron. The PF particles have to pass the charged-hadron subtraction (CHS) algorithm [19], which discards charged hadrons not originating from the primary vertex, depending on the longitudinal impact parameter of the track. The primary vertex is selected as the vertex with the largest value of summed \(p_{\mathrm {T}} ^2\) of the PF particles, including charged leptons, neutral and charged hadrons clustered in jets, and the associated missing transverse momentum \({\vec {p}}_{\mathrm {T}}^{\text {miss}}\), which is the negative vector sum of the \({\vec {p}}_{\mathrm {T}}\) of those jets.

Electrons are reconstructed in the fiducial region \(|\eta |<2.5\) by matching the energy deposits in the ECAL with charged particle trajectories reconstructed in the tracker [15]. The electron identification is based on the distribution of energy deposited along the electron trajectory, the direction and momentum of the track, and its compatibility with the primary vertex of the event. Electrons are further required to be isolated from other energy deposits in the detector. The electron relative isolation parameter is defined as the sum of transverse momenta of all the PF candidates, excluding the electron itself, divided by the electron \(p_{\mathrm {T}}\). The PF candidates are considered if they lie within \(\varDelta R = \sqrt{\smash [b]{(\varDelta \eta )^2+(\varDelta \phi )^2}} < 0.3\) around the electron direction, where \(\phi \) is the azimuthal angle in radians, and after the contributions from pileup and other reconstructed electrons are removed [15].
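As an illustration of the isolation definition above, the following Python sketch computes \(\varDelta R\) and a relative isolation. The dictionary keys, the function names, and the assumption that the candidate list already excludes the electron itself, pileup contributions, and other reconstructed electrons are illustrative choices made here; they are not part of the CMS reconstruction software.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal difference phi1 - phi2 wrapped into (-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    return dphi


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance sqrt(d_eta^2 + d_phi^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


def relative_isolation(lepton, candidates, cone=0.3):
    """Sum of candidate pT inside the cone, divided by the lepton pT.

    `lepton` and each entry of `candidates` are dicts with the
    hypothetical keys 'pt', 'eta', and 'phi'. The candidate list is
    assumed to be cleaned already (no lepton, no pileup).
    """
    sum_pt = sum(
        c["pt"]
        for c in candidates
        if delta_r(lepton["eta"], lepton["phi"], c["eta"], c["phi"]) < cone
    )
    return sum_pt / lepton["pt"]
```

The `cone` argument is left as a parameter so that the same helper can be reused with a different cone size.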

Muons are reconstructed within the acceptance of the CMS muon systems using tracks reconstructed in both the muon spectrometer and the silicon tracker [16]. Additional requirements are based on the compatibility of the trajectory with the primary vertex, and on the number of hits observed in the tracker and muon systems. Similarly to electrons, muons are required to be isolated. The muon isolation is computed from reconstructed PF candidates within a cone of \(\varDelta R< 0.4\) around the muon direction, ignoring the candidate muon, and divided by the muon \(p_{\mathrm {T}}\)  [16].

Hadronically decaying \(\tau \) leptons are used to reject \(\hbox {W}\rightarrow \tau \nu \) background events, and are reconstructed by combining one or three hadronic charged PF candidates with up to two neutral pions, the latter also reconstructed by the PF algorithm from the photons arising from the \(\pi ^{0} \rightarrow \gamma \gamma \) decay [20].

Jets are clustered using the anti-\(k_{\mathrm {T}}\) algorithm [21, 22] with a distance parameter of 0.4. The contribution of neutral particles originating from pileup interactions is estimated to be proportional to the jet area derived using the FastJet package [22, 23], and subtracted from the jet energy. Jet energy corrections, extracted from both simulation and data in multijet, \(\gamma \)+jets, and \(\text {Z}\)+jets events, are applied as functions of the \(p_{\mathrm {T}}\) and \(\eta \) of the jet to correct the jet response and to account for residual differences between data and simulation. The jet energy resolution amounts typically to 15–20% at 30\(\,\text {GeV}\), 10% at 100\(\,\text {GeV}\), and 5% at 1\(\,\text {Te}\text {V}\)  [24].
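The area-based pileup subtraction mentioned above is commonly written as \(p_{\mathrm {T}}^{\text {corr}} = p_{\mathrm {T}} - \rho A\), where \(\rho \) is an event-by-event estimate of the pileup transverse momentum density and \(A\) is the jet area. The sketch below assumes this standard form; the symbol \(\rho \), the function name, and the numerical values are illustrative and are not taken from this paper.

```python
import math


def subtract_pileup(jet_pt, jet_area, rho):
    """Return the jet pT after removing the estimated offset rho * area."""
    corrected = jet_pt - rho * jet_area
    # A jet cannot carry negative transverse momentum, so clip at zero.
    return max(corrected, 0.0)


# Illustrative numbers only: an anti-kT jet with distance parameter 0.4 has
# an area close to pi * 0.4**2, and rho = 10 GeV per unit area is hypothetical.
area = math.pi * 0.4 ** 2
corrected_pt = subtract_pileup(50.0, area, 10.0)
```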

Jets that originate from b  quarks are identified with a combined secondary vertex b-tagging algorithm [25] that uses the tracks and secondary vertices associated with the jets as inputs to a neural network. The algorithm provides a b  jet tagging efficiency of 70%, and a misidentification rate in a sample of quark and gluon jets of about 1%. The b  tagging efficiency is corrected to take into account a difference at the few percent level in algorithm performance for data and simulation [25].

4 Data and simulated samples

The data sample analyzed in this search corresponds to an integrated luminosity of 35.9\(\,\text {fb}^{-1}\) of proton–proton (pp) collisions at a center-of-mass energy of 13\(\,\text {Te}\text {V}\) collected with the CMS detector at the LHC. Data are collected using triggers that require either the presence of at least one isolated electron or isolated muon with \(p_{\mathrm {T}} >27\,\text {GeV} \), or alternatively a \(p_{\mathrm {T}} ^\text {miss}\) or \(H_{\mathrm {T}}^{\text {miss}}\) larger than 90–110\(\,\text {GeV}\), the value depending on the instantaneous luminosity. The \(p_{\mathrm {T}} ^\text {miss}\) is the magnitude of \({\vec {p}}_{\mathrm {T}}^{\text {miss}}\), and \(H_{\mathrm {T}}^{\text {miss}}\) is defined as the momentum imbalance of the jets in the transverse plane [17].

The pseudoscalar boson signal is simulated at leading order (LO) with the MadGraph5_aMC@NLO 2.2.2 matrix element generator [26] in both the gluon–gluon fusion and b quark associated production modes according to the 2HDM [5], assuming a narrow signal width. The h boson mass is set to 125\(\,\text {GeV}\), and the \(\text {A}\) boson mass ranges between 225 and 1000\(\,\text {GeV}\). The \(\text {A} \rightarrow \text {Z}\text {h}\) decay is simulated with MadSpin [27]. The Higgs boson is forced to decay to \(\text {b}\bar{\text {b}}\), and the vector boson to a pair of electrons, muons, \(\tau \) leptons, or neutrinos. In the gluon–gluon fusion production mode, up to one additional jet is included in matrix element calculations, and only the top quark contributes to the loop shown in Fig. 1 (upper). The 2HDM cross sections and branching fractions are computed at next-to-next-to-leading order (NNLO) with SusHi 1.6.1 [29] and 2HDMC 1.7.0 [28], respectively. The parameters used in the models are: \(m_{\text {h}} =125\,\text {GeV} \), \(m_{\text {H}} =m_{\text {H}^\pm }=m_{\text {A}} \), the discrete \(\mathrm {Z}_2\) symmetry is broken as in the minimal supersymmetric standard model (MSSM), and CP is conserved at tree level in the 2HDM Higgs sector [5]. The branching fractions of the Z boson are taken from the measured values [30].

The SM backgrounds in this search consist of the inclusive production of a vector boson in association with jets (\(\text {V+jets}\), with \(\text {V} =\hbox {W}\) or \(\hbox {Z}\), and \(\text {V}\) decaying to final states with charged leptons and neutrinos), and top quark pair production (\(\hbox {t}\bar{\hbox {t}}\)). \(\text {V+jets} \) events are simulated at LO with MadGraph5_aMC@NLO with up to four partons included in the matrix element calculations and using the MLM matching scheme [31]. The event yield is normalized to the NNLO cross section computed with FEWZ v3.1 [32]. The \(\text {V}\) boson \(p_{\mathrm {T}}\) spectra are corrected to account for next-to-leading order (NLO) quantum chromodynamics (QCD) and EW contributions [33]. The \(\hbox {t}\bar{\hbox {t}}\) process and single top quark production in the t channel and in association with a W boson (\(\hbox {t}\hbox {W}\)) are simulated at NLO with the POWHEG v2 generator [34, 35, 36]. The number of events for the top quark pair production process is rescaled according to the cross section computed with Top++ v2.0 [37] at NNLO+NNLL, and the transverse momenta of top quarks are corrected to match the distribution observed in data [38]. Other SM processes, such as SM vector boson pair production (\(\text {V} \text {V} \)), SM Higgs boson production in association with a vector boson (\(\text {V} \text {h}\)), single top quark (\(\hbox {t}+\hbox {X}\)) production in the s channel, and top quark production in association with vector bosons, are simulated at NLO in QCD with MadGraph5_aMC@NLO using the FxFx merging scheme [39]. The multijet contribution, estimated with the use of samples generated at LO with the same generator, is negligible after analysis selections.

All the simulated processes use the NNPDF 3.0 [40] parton distribution functions (PDFs), and are interfaced with PYTHIA 8.205 [41, 42] for the parton showering and hadronization. The CUETP8M1 underlying event tune [43] is used in all samples, except for top quark pair production, which adopts the CUETP8M2T4 tune [44].

Additional minimum bias pp interactions within the same or adjacent bunch crossings (pileup) are added to the simulated processes, and events are weighted to match the observed average number of interactions per bunch crossing. Generated events are processed through a full CMS detector simulation based on Geant4 [45] and reconstructed with the same algorithms used for collision data.
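One common way to implement such a pileup weighting is to take, for each number of interactions, the ratio of the normalized distribution in data to that in simulation. The sketch below assumes this histogram-ratio approach, which is not spelled out in the text; the function name and the toy histograms are hypothetical.

```python
def pileup_weights(data_hist, mc_hist):
    """Per-bin event weights that reshape the simulated pileup profile.

    Both inputs are lists of bin contents indexed by the number of
    interactions per bunch crossing. Each histogram is normalized to
    unit area before taking the ratio.
    """
    data_total = float(sum(data_hist))
    mc_total = float(sum(mc_hist))
    weights = []
    for data_bin, mc_bin in zip(data_hist, mc_hist):
        if mc_bin > 0:
            weights.append((data_bin / data_total) / (mc_bin / mc_total))
        else:
            # No simulated events in this bin, so no weight can be defined.
            weights.append(0.0)
    return weights
```

A simulated event with \(n\) interactions would then be given the weight `weights[n]`.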

5 Event selection

Events are classified into three independent categories (\(0\ell \), \(2\hbox {e}\), and \(2\mu \)), based on the number and flavor of the reconstructed leptons. Events are required to have at least two jets with \(p_{\mathrm {T}} >30\,\text {GeV} \) and \(|\eta |<2.4\) to be suitable candidates for the reconstruction of the \(\text {h}\rightarrow \text {b}\bar{\text {b}}\)   decay. If more than two jets fulfill the requirements, the ones with the largest b  tagging discriminator value are used to reconstruct the Higgs boson candidate. The efficiency of the correct assignment of the reconstructed jets to initial quarks originating from the Higgs boson decay varies between 80 and 97%, after applying the event selections, depending on the category and final state.
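The choice of the Higgs boson candidate jets described above can be sketched as follows. The jet representation (dicts with the hypothetical keys `pt`, `eta`, and `btag`) and the function name are assumptions made for illustration.

```python
def select_higgs_jets(jets, pt_min=30.0, eta_max=2.4):
    """Return the two selected jets with the largest b tagging discriminator.

    Jets must satisfy pT > pt_min (GeV) and |eta| < eta_max. Returns None
    when fewer than two jets pass, in which case the event is rejected.
    """
    selected = [j for j in jets if j["pt"] > pt_min and abs(j["eta"]) < eta_max]
    if len(selected) < 2:
        return None
    # Rank by discriminator value, highest first, and keep the top two.
    selected.sort(key=lambda j: j["btag"], reverse=True)
    return selected[0], selected[1]
```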

In the \(0\ell \) category, no isolated electron or muon with \(p_{\mathrm {T}} >10\,\text {GeV} \) is allowed. Events containing isolated hadronically decaying \(\tau \) leptons with \(p_{\mathrm {T}} >18\,\text {GeV} \) are vetoed as well. The reconstructed \(p_{\mathrm {T}} ^\text {miss}\) is required to be larger than 200\(\,\text {GeV}\), such that the \(p_{\mathrm {T}} ^\text {miss}\) trigger is at least 95% efficient. In order to select a topology where the Z boson recoils against the Higgs boson, the \(p_{\mathrm {T}}\) of the Higgs boson candidate, \(p_{\mathrm {T}} ^{\text {b}\bar{\text {b}}}\), is required to be larger than \(200\,\text {GeV} \).

Multijet production is suppressed by requiring that the minimum azimuthal angular separation between all jets and the missing transverse momentum vector must satisfy \(\varDelta \phi \text {(jet, } {\vec {p}}_{\mathrm {T}}^{\text {miss}}) > 0.4\). The multijet simulation is validated in a region obtained by inverting the \(\varDelta \phi \) selection, finding a good description of data. When the Z  boson decays to neutrinos, the resonance mass \(m_{\text {A}}\) cannot be reconstructed directly. In this case, \(m_{\text {A}}\) is estimated by computing the transverse mass from the \({\vec {p}}_{\mathrm {T}}^{\text {miss}}\) and the four-momenta of the two jets used to reconstruct the Higgs boson candidate, defined as \(m_{\text {Z}\text {h}}^{\text {T}} = \sqrt{\smash [b]{2 p_{\mathrm {T}} ^\text {miss} p_{\mathrm {T}} ^{\text {h}}\, [1-\cos {\varDelta \phi (\text {h}, {\vec {p}}_{\mathrm {T}}^{\text {miss}}) }]}}\), which has to be larger than 500\(\,\text {GeV}\). The efficiency of these selections for signal events with \(m_{\text {A}} \lesssim 500\,\text {GeV} \) is small, because the \(p_{\mathrm {T}}\) of the Z  boson is not sufficient to produce a \(p_{\mathrm {T}} ^\text {miss}\) large enough to pass the selection; thus, the contribution of the \(0\ell \) category is significant only for large \(m_{\text {A}}\).
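The transverse mass defined above can be evaluated directly from the magnitudes and azimuthal angles of \({\vec {p}}_{\mathrm {T}}^{\text {miss}}\) and of the Higgs boson candidate. The following minimal sketch uses hypothetical argument names, with momenta in GeV and angles in radians.

```python
import math


def transverse_mass(met_pt, met_phi, h_pt, h_phi):
    """m_T = sqrt(2 * pT_miss * pT_h * (1 - cos(delta_phi)))."""
    # The cosine is periodic, so the angle difference needs no wrapping.
    dphi = h_phi - met_phi
    return math.sqrt(2.0 * met_pt * h_pt * (1.0 - math.cos(dphi)))
```

For a back-to-back configuration with \(p_{\mathrm {T}} ^\text {miss} = p_{\mathrm {T}} ^{\text {h}} = 300\,\text {GeV} \), the function returns 600\(\,\text {GeV}\), which passes the 500\(\,\text {GeV}\) requirement.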

In the \(2\hbox {e}\) and \(2\mu \) categories, events are required to have at least two isolated electrons or muons within the detector geometrical acceptance. The \(p_{\mathrm {T}}\) threshold on the lepton is referred to as \(p_{\mathrm {T}} ^\ell \), and is set to 30\(\,\text {GeV}\) for the lepton with highest \(p_{\mathrm {T}}\), and to 10\(\,\text {GeV}\) for the lepton with next-highest \(p_{\mathrm {T}}\). The Z  boson candidate is formed from the two highest \(p_{\mathrm {T}}\), opposite charge, same-flavor leptons, and must have an invariant mass \(m_{\ell \ell }\) between 70 and 110\(\,\text {GeV}\). The \(m_{\ell \ell }\) selection lowers the contamination from \(\hbox {t}\bar{\hbox {t}}\)  dileptonic decays, and significantly reduces the contribution from \(\text {Z}\rightarrow \tau \tau \) decays. The reconstructed \(p_{\mathrm {T}} ^\text {miss}\) also has to be smaller than 100\(\,\text {GeV}\) to reject the \(\hbox {t}\bar{\hbox {t}}\)  background. In order to maximize the signal acceptance, no Lorentz boost requirement is applied to the Z  and h  boson candidates in the dileptonic categories. The \(\text {A}\) boson candidate is reconstructed from the invariant mass \(m_{\text {Z}\text {h}}\) of the Z  and h  boson candidates.

If the two jets originate from a Higgs boson, their invariant mass is expected to peak close to 125\(\,\text {GeV}\). Events with a dijet invariant mass \(m_\mathrm {jj}\) between 100 and 140\(\,\text {GeV}\) enter the signal regions (SRs); otherwise, if \(m_\mathrm {jj} <400\,\text {GeV} \), they fall in dijet mass sidebands, which are used as control regions (CRs) to estimate the contributions of the main backgrounds. Signal regions are further divided by the number of jets passing the b  tagging requirement (1, 2, or at least 3 b  tags). The 3 b  tag category has been defined to select the additional b  quarks from b  quark associated production. In this region, at least one additional jet, other than the two used to reconstruct the h  boson, has to pass the kinematic selections and b  tagging requirements. The fraction of signal events passing the \(m_\mathrm {jj}\) selection in the SR is 66–82% and 45–65% in the 1 and 2 b  tag categories, respectively. Control regions for the Z+jets background share the same selections as the corresponding SR, except for the \(m_\mathrm {jj}\) mass window.
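The assignment of an event to a signal region or a dijet mass sideband can be sketched as below. The treatment of the window boundaries, the absence of a lower bound on the sideband, and the rejection of events without b tags are assumptions of this illustration; the text does not specify them, and the region labels are hypothetical.

```python
def classify_region(m_jj, n_btag):
    """Return (region, b tag category) for a dijet mass in GeV, or None.

    Signal region: 100 < m_jj < 140 GeV.
    Sideband control region: outside the window, with m_jj < 400 GeV.
    """
    if n_btag < 1:
        return None  # assumption: events without b tags are not categorized here
    category = "3b" if n_btag >= 3 else "{}b".format(n_btag)
    if 100.0 < m_jj < 140.0:
        return ("SR", category)
    if m_jj < 400.0:
        return ("CR", category)
    return None
```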

Dedicated CRs are defined to estimate the \(\hbox {t}\bar{\hbox {t}}\)  and \(\text {W+jets}\) backgrounds, which may enter the \(0\ell \) SR if the lepton originating from the W decay is outside the detector geometrical acceptance or is not reconstructed. Two \(\text {W+jets}\) CRs share the same selection as in the \(0\ell \) categories, but require exactly one electron or one muon passing the same trigger and selections of the leading lepton in the \(2\ell \) categories. In order to mimic the kinematics of leptonic W decays, where the lepton is outside the geometrical acceptance or is not reconstructed in the detector, the \(p_{\mathrm {T}} ^\text {miss}\) is recalculated by removing the contribution of the lepton. The \(\min (\varDelta \phi )\) requirement is removed, and the dijet invariant mass selection is not applied, as the signal is absent in \(1\ell \) final states. Events are required to have three or fewer jets, none of them b  tagged, to reduce the \(\hbox {t}\bar{\hbox {t}}\)  contribution.
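One way to read the recalculation of \(p_{\mathrm {T}} ^\text {miss}\) described above is that the lepton is treated as if it were invisible, so that its transverse momentum vector is added to \({\vec {p}}_{\mathrm {T}}^{\text {miss}}\). The sketch below follows this interpretation, which is an assumption; the function and argument names are hypothetical.

```python
import math


def met_without_lepton(met_pt, met_phi, lep_pt, lep_phi):
    """Recompute the missing transverse momentum as if the lepton were unseen.

    Returns the new magnitude and azimuthal angle. Since pT_miss is the
    negative vector sum of the visible pT, dropping the lepton from that
    sum is equivalent to adding its pT vector to pT_miss.
    """
    px = met_pt * math.cos(met_phi) + lep_pt * math.cos(lep_phi)
    py = met_pt * math.sin(met_phi) + lep_pt * math.sin(lep_phi)
    return math.hypot(px, py), math.atan2(py, px)
```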

Four different CRs associated with the production of events containing top quarks are defined by inverting specific selections with respect to the SR definition. Dileptonic \(\hbox {t}\bar{\hbox {t}}\)  control regions require the same selections as the \(2\hbox {e}\) and \(2\mu \) categories with two b  tags, but the dilepton invariant mass region around the nominal Z  boson mass is vetoed (\(50<m_{\ell \ell } <70\,\text {GeV} \) or \(m_{\ell \ell } >110\,\text {GeV} \)), and the \(m_\mathrm {jj}\) selection is dropped. Two additional top quark CRs are defined specifically for \(\hbox {t}\bar{\hbox {t}}\)  events where only one of the two W bosons decays into an electron or a muon, and the lepton is not reconstructed. These events contribute to the \(\hbox {t}\bar{\hbox {t}}\)  contamination in the \(0\ell \) categories. The two single-lepton top quark CRs have the same selections as the two \(\text {W+jets}\) CRs, but in this case the jet and b  tag vetoes are inverted to enrich the \(\hbox {t}\bar{\hbox {t}}\)  composition.

An important feature of the signal is that the two b  jets originate from the decay of the h  boson, whose mass is known with better precision than that provided by the \(\text {b}\bar{\text {b}}\)  invariant mass resolution. The measured jet \(p_{\mathrm {T}}\) values are therefore scaled according to their corresponding uncertainty given by the jet energy scale corrections to constrain the dijet invariant mass to \(m_\mathrm {jj} =125\,\text {GeV} \). The kinematic constraint on the h  boson mass improves the relative four-body invariant mass resolution from 5–6 to 2.5–4.5% for the smallest and largest values of \(m_{\text {A}}\), respectively. Similarly, in the \(2\ell \) channels, the electron and muon \(p_{\mathrm {T}}\) are scaled to a dilepton invariant mass \(m_{\ell \ell } = m_{\text {Z}}\). The effect on the \(m_{\text {A}}\) resolution of the kinematic constraint on the leptons is much smaller than the one of the jets, because of their better momentum resolution.
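The effect of the mass constraint can be illustrated with a simplified version of the procedure. The analysis scales each jet \(p_{\mathrm {T}}\) according to its uncertainty; the sketch below instead applies one common scale factor to both jet four-momenta, which is a simplification assumed here for brevity. All function names are hypothetical.

```python
import math

M_H = 125.0  # GeV, h boson mass used as the constraint


def four_vector(pt, eta, phi, mass):
    """Return (E, px, py, pz) from collider coordinates."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return (energy, px, py, pz)


def invariant_mass(*vectors):
    """Invariant mass of the sum of any number of four-vectors."""
    energy, px, py, pz = (sum(v[i] for v in vectors) for i in range(4))
    m2 = energy ** 2 - px ** 2 - py ** 2 - pz ** 2
    return math.sqrt(max(m2, 0.0))


def constrain_dijet(jet1, jet2, target=M_H):
    """Scale both jets by the same factor so that m_jj equals the target.

    Simplification: a uniform scale factor replaces the
    uncertainty-weighted scaling. Assumes a nonzero dijet mass.
    """
    scale = target / invariant_mass(jet1, jet2)
    return tuple(scale * c for c in jet1), tuple(scale * c for c in jet2)
```

The four-body mass would then be obtained by passing the two lepton four-vectors and the two constrained jets to `invariant_mass`.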

In the \(2\hbox {e}\) and \(2\mu \) categories, the \(\text {A}\) boson decay chain has an additional characteristic that helps distinguish it from the SM background. Five helicity-dependent angular observables fully describe the kinematics of the \(\text {A} \rightarrow \text {Z}\text {h}\rightarrow \ell \ell \text {b}\bar{\text {b}}\) decay: the angle between the directions of the Z boson and the beam in the rest frame of the \(\text {A}\) boson (\(\cos \theta ^*\)); the decay angle of the negatively charged lepton relative to the Z boson momentum vector in the rest frame of the Z boson (\(\cos \theta _1\)), which is sensitive to the transverse polarization of the Z boson along its momentum vector; the angle between a jet from the h boson and the h boson momentum vector in the h boson rest frame (\(\cos \theta _2\)); the angle between the Z and h boson decay planes in the rest frame of the \(\text {A}\) boson (\(\varPhi \)); and the angle between the h boson decay plane and the plane where the h boson and the beam directions lie in the \(\text {A}\) boson rest frame (\(\varPhi _1\)). Their discriminating power and low cross-correlation make these angles suitable as input to a likelihood ratio multivariate discriminator. This angular discriminant is defined as:

$$\begin{aligned} \mathcal {D} (x_1, \dots , x_N) = \frac{\displaystyle \prod _{i=1}^{N} s_i (x_i)}{\displaystyle \prod _{i=1}^{N} s_i (x_i)+\prod _{i=1}^{N} b_i (x_i)} \end{aligned}$$
(1)

where the index i runs over the \(N=5\) angular variables \(x_i\), and \(s_i\) and \(b_i\) are the signal and Z+jets background probability density functions of the i-th variable, respectively. A selection of \(\mathcal {D}>0.5\) is applied in all \(2\hbox {e}\) and \(2\mu \) SRs and CRs, except those with three b tags because of the low event count. This working point retains 80% of the signal and rejects 50% of the Z+jets background.
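Equation (1) translates directly into code once the one-dimensional probability density functions are available as callables. The sketch below is a minimal illustration; the function name and the convention of returning 0 when both products vanish are assumptions, and no real signal or background shapes from the analysis are used.

```python
def likelihood_ratio(x, signal_pdfs, background_pdfs):
    """Evaluate D = prod(s_i) / (prod(s_i) + prod(b_i)) for one event.

    `x` is the list of observables; `signal_pdfs` and `background_pdfs`
    are lists of callables returning the density of each observable.
    """
    s_product = 1.0
    b_product = 1.0
    for xi, s_i, b_i in zip(x, signal_pdfs, background_pdfs):
        s_product *= s_i(xi)
        b_product *= b_i(xi)
    total = s_product + b_product
    if total == 0.0:
        return 0.0  # assumed convention when both hypotheses give zero density
    return s_product / total
```

With toy densities \(s_i(x) = (1+x)/2\) and \(b_i(x) = 1/2\) on \([-1, 1]\) for all five variables, an event with every \(x_i = 0\) gives \(\mathcal {D} = 0.5\), since the two hypotheses are then equally likely.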

Considering that top quark pair production may be as large as 50% of the total background in certain regions of the parameter space, a second likelihood ratio discriminator is built specifically to reject the \(\hbox {t}\bar{\hbox {t}}\)  events. This discriminator uses only the \(m_{\ell \ell }\) and \(p_{\mathrm {T}} ^\text {miss}\) variables. The background probability density function considers only the top quark background in order to achieve the maximum separation between events with a genuine leptonically decaying Z  boson recoiling against a pair of jets and the more complex topologies such as \(\hbox {t}\bar{\hbox {t}}\)  decays. Selecting events with a discriminator output larger than 0.5 rejects 75% of the \(\hbox {t}\bar{\hbox {t}}\)  events with a signal efficiency of 85%. This selection is applied to the dileptonic SRs and to the Z+jets CRs.

The SRs and CRs selections are summarized in Table 1. The product of the signal acceptance and selection efficiency as a function of \(m_{\text {A}}\) is presented in Fig. 2 separately for the gluon–gluon fusion and b  quark associated production modes.

Table 1 Definition of the signal and control regions. In \(2\ell \) regions, the leptons are required to have opposite electric charge. The entries marked with \(\dagger \) indicate that the \(p_{\mathrm {T}} ^\text {miss}\) is calculated after subtracting the four-momentum of the lepton
Fig. 2 Product of the signal acceptance and selection efficiency \(\varepsilon \) for an \(\text {A}\) boson produced via gluon–gluon fusion (left) and in association with b quarks (right) as a function of \(m_{\text {A}}\). The number of events passing the signal region selections is denoted as \(N^\mathrm {SR}\), and \(N^\mathrm {gen}\) is the number of events generated before applying any selection

6 Systematic uncertainties

The uncertainties in the trigger efficiency and the electron, muon, and \(\tau \) lepton reconstruction, identification, and isolation efficiencies are evaluated through studies of events with dilepton invariant mass around the Z boson mass. The resulting variation of the event yields with respect to the expectation from simulation amounts to approximately 2–3% in the categories with charged leptons, and 1% in the \(0\ell \) categories [15, 16, 20]. The impact of the lepton energy and momentum scale and resolution is small after the kinematic constraint on \(m_{\ell \ell }\). The jet energy scale and resolution [24] affect both the selection efficiencies and the shape of the \(p_{\mathrm {T}} ^\text {miss}\) and \(m_{\text {Z}\text {h}}^{\text {T}}\) distributions, and their effect is negligible in the \(2\ell \) channels after the kinematic constraint on the dijet mass has been applied. The jet four-momentum is varied by the corresponding uncertainties, and the effect is propagated to the final distributions. The jet energy scale is responsible for a 2–6% variation in the numbers of background and signal events; the jet energy resolution contributes an additional 1–2% uncertainty. The effects of jet energy scale and resolution uncertainties, as well as the energy variation of the unclustered objects in the event, are propagated to the \(p_{\mathrm {T}} ^\text {miss}\) and \(m_{\text {Z}\text {h}}^{\text {T}}\) distributions. The b tagging uncertainty [25] in the signal yield depends on the jet \(p_{\mathrm {T}}\) and thus on the mass of the resonance, and the impact on the event yield ranges from 2 to 4% in the 1 b tag category, 4 to 8% in the 2 b tag category, and 8 to 12% in the 3 b tag category.

The signal and background event yields are affected by the uncertainties on the choice of PDFs [46] and the factorization and renormalization scale uncertainties. The former are derived with SysCalc [47], and the latter are estimated by varying the corresponding scales up and down by a factor of two [48]. The effect of both these uncertainties can be as large as 30% depending on the generated signal mass. The effect of the PDF uncertainties on the signal and background lepton acceptance is estimated to be an average of 3% per lepton. The top quark background is also affected by the uncertainty associated with the simulated \(p_{\mathrm {T}}\) spectrum of top quarks [38], which results in up to a 14% yield uncertainty. The \(\text {V+jets} \) backgrounds are affected by the uncertainties on the QCD and EW NLO corrections, as described in Sect. 4.

A systematic uncertainty is assigned to the interpolation from the two mass sidebands to the SR. It is defined as the difference in the ratio between data and simulated background in the lower and upper sidebands, and ranges between 2 and 10% depending on the channel. The extrapolation to the 3 b tag regions is covered by a large uncertainty (20–46%) assigned to the overall background normalization, and derived by taking the ratio between data and the simulation in the 3 b tag control regions. In the dilepton categories, a dedicated uncertainty is introduced to cover minor mismodeling effects. The background distribution is reweighted with a linear function of the event centrality (defined as the ratio between the sums of the \(p_{\mathrm {T}}\) and the energy of the two leptons and two jets in the rest frame of the four objects) in all simulated events, and the effect is propagated to the \(m_{\text {Z}\text {h}}\) distributions as a systematic uncertainty.

Additional systematic uncertainties affecting the event yields of backgrounds and signal come from pileup contributions and integrated luminosity [49]. The uncertainty from the limited number of simulated events is treated as in Ref. [50]. A summary of the systematic uncertainties is reported in Table 2.

Table 2 Summary of statistical and systematic uncertainties for backgrounds and signal. The uncertainties marked with \(\checkmark \) are also propagated to the \(m_{\text {Z}\text {h}}\) and \(m_{\text {Z}\text {h}}^{\text {T}}\) distributions

7 Results and interpretation

The signal search is carried out by performing a combined signal and background maximum likelihood fit to the number of events in the CRs and to the binned \(m_{\text {Zh}}\) or \(m_{\text {Zh}}^{\text {T}}\) distributions in the SRs. Systematic uncertainties are treated as nuisance parameters and are profiled in the statistical interpretation [51, 52, 53]. The asymptotic approximation [54] of the modified frequentist \(\text {CL}_\text {s}\) criterion [51, 52] is used to determine limits on the signal cross section at 95% confidence level (\(\text {CL}\)). The background-only hypothesis is tested against the combined signal+background hypothesis in the nine categories, split according to the number and flavor of the leptons and the number of b-tagged jets. The normalizations of the main backgrounds (\(\text {Z+jets}\), \(\text {Z}\)+b, \(\text {Z}\)+\(\text {b}\bar{\text {b}}\), \(\hbox {t}\bar{\hbox {t}}\), \(\hbox {W}\)+jets) are allowed to float in the fit, and are constrained in the CRs. The multiplicative scale factors for the main backgrounds determined by the fit are reported in Table 3, and the overall event yields in the CRs are shown in Fig. 3 before and after the fit. The expected and observed numbers of events in the SRs are reported in Table 4, and the \(m_{\text {Zh}}\) and \(m_{\text {Zh}}^{\text {T}}\) distributions are shown in Fig. 4.
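The logic of the \(\text {CL}_\text {s}\) criterion can be illustrated with a much simpler setup than the one used here. The sketch below assumes a single-bin counting experiment with a known background and no systematic uncertainties, and uses the number of observed events as the test statistic; it is not the binned, profiled likelihood fit or the asymptotic approximation described above. All names and inputs are hypothetical, and the direct Poisson sum is only suitable for modest event counts.

```python
import math


def poisson_cdf(n, mu):
    """P(N <= n) for a Poisson distribution with mean mu (direct summation)."""
    term = math.exp(-mu)
    total = term
    for k in range(1, n + 1):
        term *= mu / k
        total += term
    return total


def cls(n_obs, s, b):
    """CLs = CL_{s+b} / CL_b for a counting experiment with n_obs observed events."""
    return poisson_cdf(n_obs, s + b) / poisson_cdf(n_obs, b)


def upper_limit(n_obs, b, alpha=0.05, s_max=100.0, tol=1e-6):
    """Signal yield s at which CLs drops to alpha (95% CL for alpha = 0.05).

    Uses bisection, relying on CLs decreasing monotonically with s.
    """
    lo, hi = 0.0, s_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cls(n_obs, mid, b) > alpha:
            lo = mid  # not yet excluded, move up
        else:
            hi = mid  # excluded, move down
    return 0.5 * (lo + hi)


# Hypothetical inputs: 3 expected background events and 4 observed events.
print(upper_limit(n_obs=4, b=3.0))
```

Dividing by \(\text {CL}_\text {b}\) protects against excluding signals to which the measurement has no sensitivity: with zero observed events the limit is \(-\ln 0.05 \approx 3\) signal events regardless of the expected background.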

Table 3 Scale factors for the main backgrounds, as derived by the combined fit in the background-only hypothesis, with respect to the event yield from simulated samples
Fig. 3

Pre- (dashed gray lines) and post-fit (stacked histograms) numbers of events in the different control regions used in the fit. The label in each bin summarizes the control region definition, the selection on the number and flavor of the leptons, and the number of b-tagged jets. The bottom panel depicts the ratio between the data and the SM backgrounds

Fig. 4

Distributions of the \(m_{\text {Z}\text {h}}^{\text {T}}\) variable in the \(0\ell \) categories (left) and \(m_{\text {Z}\text {h}}\) in the \(2\ell \) categories (right), in the 1 b tag (upper), 2 b tag (center), and 3 b tag (lower) SRs. In the \(2\ell \) categories, the contributions of the \(2\hbox {e}\) and \(2\mu \) channels have been summed. The gray dotted line represents the sum of the background before the fit; the shaded area represents the post-fit uncertainty. The hatched red histograms represent signals produced in association with b quarks and corresponding to \(\sigma _{\text {A}}\mathcal {B}(\text {A} \rightarrow \text {Z}\text {h})\mathcal {B}(\text {h}\rightarrow \text {b}\bar{\text {b}})=0.1\,\text {pb}\). The bottom panels depict the pulls in each bin, \((N^\text {data}-N^\text {bkg})/\sigma \), where \(\sigma \) is the statistical uncertainty in data

Table 4 Expected and observed event yields after the fit in the signal regions. The dielectron and dimuon categories are summed together. The “–” symbol represents backgrounds with no simulated events passing the selections. The signal yields refer to pre-fit values corresponding to a cross section multiplied by \(\mathcal {B}(\text {A} \rightarrow \text {Z}\text {h}) \, \mathcal {B}(\text {h}\rightarrow \text {b}\bar{\text {b}})\) of 0.1 pb (gluon–gluon fusion for \(m_{\text {A}} =300\,\text {GeV} \), and in association with b  quarks for \(m_{\text {A}} =1000\,\text {GeV} \))

The data are well described by the SM processes. Upper limits are derived on the product of the cross section for a heavy pseudoscalar boson \(\text {A}\) and the branching fractions for the decays \(\text {A} \rightarrow \text {Z}\text {h}\) and \(\text {h}\rightarrow \text {b}\bar{\text {b}}\). The limits are obtained by considering the \(\text {A}\) boson produced via the gluon–gluon fusion and b quark associated production processes separately, in the approximation where the natural width of the \(\text {A}\) boson \(\varGamma _\text {A} \) is smaller than the experimental resolution, and are reported in Fig. 5. The upper limits at 95% \(\text {CL}\) on \(\sigma _\text {A} \,\mathcal {B}(\text {A} \rightarrow \text {Z}\text {h}) \, \mathcal {B}(\text {h}\rightarrow \text {b}\bar{\text {b}})\) exclude values above 1 pb for \(m_{\text {A}}\) near the kinematic threshold, above \({\approx }0.3\,\text {pb}\) for \(m_{\text {A}} \approx 2 m_{\mathrm{t}}\), and above 0.02 pb at the high end (\(1000\,\text {GeV} \)) of the considered mass range. The sensitivity of the analysis is limited by the amount of data, and not by systematic uncertainties. These results extend the search for a 2HDM pseudoscalar boson \(\text {A}\) to masses up to 1\(\,\text {Te}\text {V}\), which is a kinematic region that was not explored by CMS in the 8\(\,\text {Te}\text {V}\) data analysis [10]. When \(m_{\text {A}}\) is larger than 1\(\,\text {Te}\text {V}\), the CMS analysis with merged jets [11] retains a better sensitivity. The sensitivity is comparable to that of the ATLAS search [12], which observed a mild local (global) excess of 3.6 (2.4) standard deviations corresponding to \(m_{\text {A}} \approx 440\,\text {GeV} \) in final states with \(2\mu \) and 3 or more b-tagged jets. A slight deficit is observed by CMS in the corresponding region.

Fig. 5

Observed (solid black) and expected (dotted black) 95% \(\text {CL}\) upper limits on \(\sigma _\text {A} \,\mathcal {B}(\text {A} \rightarrow \text {Z}\text {h})\,\mathcal {B}(\text {h}\rightarrow \text {b}\bar{\text {b}})\) for an \(\text {A}\) boson produced via gluon–gluon fusion (left) and in association with b  quarks (right) as a function of \(m_{\text {A}}\). The blue dashed lines represent the expected limits of the \(0\ell \) and \(2\ell \) categories separately. The red and magenta solid curves and their shaded areas correspond to the product of the cross sections and the branching fractions and the relative uncertainties predicted by the 2HDM Type-I and Type-II for the arbitrary parameters \(\tan \beta =3\) and \(\cos (\beta -\alpha ) =0.1\)

The results are interpreted in terms of Type-I, Type-II, “lepton-specific”, and “flipped” 2HDM formulations [5]. In the scenario with \(\cos (\beta -\alpha ) =0.1\) and \(\tan \beta =3\), an \(\text {A}\) boson with mass up to 380 and \(350\,\text {GeV} \) is excluded in 2HDM Type-I and Type-II, respectively, as depicted in Fig. 5. These exclusion limits are used to constrain the two-dimensional plane of the 2HDM parameters \([\cos (\beta -\alpha ), \tan \beta ]\) as reported in Fig. 6, with fixed \(m_\text {A} =300\,\text {GeV} \) in the range \(0.1\le \tan \beta \le 100\) and \(-1\le \cos (\beta -\alpha ) \le 1\), using the convention \(0<\beta -\alpha <\pi \). Because of the suppressed \(\text {A}\) boson cross section and \(\mathcal {B}(\text {A} \rightarrow \text {Z}\text {h})\), the region near \(\cos (\beta -\alpha ){\approx }0\) is not accessible in this search. On the other hand, \(\mathcal {B}(\text {h}\rightarrow \text {b}\bar{\text {b}})\) vanishes in the diagonal regions corresponding to \(\alpha \) close to 0 in Type-II and flipped 2HDM, and \(\alpha \rightarrow \pm \pi /2\) in Type-I and lepton-specific scenarios. The exclusion as a function of \(m_{\text {A}}\), fixing \(\cos (\beta -\alpha ) =0.1\), is also reported in Fig. 7.
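The location of the regions that are inaccessible because of the branching fractions can be traced to the tree-level 2HDM couplings. The expressions below are the standard textbook forms, as found in reviews such as Ref. [5], and are quoted here only as a reminder; the symbols \(g_{\text {A}\text {Z}\text {h}}\) and \(\xi _\text {h}^{\text {d}}\) (the \(\text {h}\) coupling to down-type quarks normalized to its SM value) are notation introduced for this illustration.

\[
g_{\text {A}\text {Z}\text {h}} \propto \cos (\beta -\alpha ), \qquad
\varGamma (\text {A} \rightarrow \text {Z}\text {h}) \propto \cos ^2(\beta -\alpha ),
\]

\[
\xi _\text {h}^{\text {d}} =
\begin{cases}
\dfrac{\cos \alpha }{\sin \beta } & \text {Type-I and lepton-specific},\\[2ex]
-\dfrac{\sin \alpha }{\cos \beta } & \text {Type-II and flipped}.
\end{cases}
\]

The \(\text {A} \rightarrow \text {Z}\text {h}\) partial width therefore vanishes at \(\cos (\beta -\alpha ) = 0\), while the \(\text {h}\rightarrow \text {b}\bar{\text {b}}\) partial width, proportional to \((\xi _\text {h}^{\text {d}})^2\), vanishes for \(\alpha \rightarrow \pm \pi /2\) in the first case and for \(\alpha \rightarrow 0\) in the second.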

Fig. 6

Observed and expected (with \({\pm }1,\,{\pm }2\) standard deviation bands) exclusion limits for Type-I (upper left), Type-II (upper right), flipped (lower left), lepton-specific (lower right) models, as a function of \(\cos (\beta -\alpha )\) and \(\tan \beta \). Contours are derived from the projection on the 2HDM parameter space for the \(m_{\text {A}} = 300\,\text {GeV} \) signal hypothesis. The excluded region is represented by the shaded gray area. The regions of the parameter space where the natural width of the \(\text {A}\) boson \(\varGamma _\text {A} \) is comparable to the experimental resolution and thus the narrow width approximation is not valid are represented by the hatched gray areas

Fig. 7

Observed and expected (with \({\pm }1,\,{\pm }2\) standard deviation bands) exclusion limits for Type-I (upper left), Type-II (upper right), flipped (lower left), lepton-specific (lower right) models, as a function of \(m_{\text {A}}\) and \(\tan \beta \), fixing \(\cos (\beta -\alpha ) = 0.1\). The excluded region is represented by the shaded gray area. The regions of the parameter space where the natural width of the \(\text {A}\) boson \(\varGamma _\text {A} \) is comparable to the experimental resolution and thus the narrow width approximation is not valid are represented by the hatched gray areas

8 Summary

A search is presented in the context of an extended Higgs boson sector for a heavy pseudoscalar boson \(\text {A}\) that decays into a Z boson and an h boson with mass of 125\(\,\text {GeV}\), with the Z boson decaying into electrons, muons, or neutrinos, and the h boson into \(\text {b}\bar{\text {b}}\). The SM backgrounds are suppressed by using the characteristics of the considered signal, namely the production and decay angles of the \(\text {A}\), Z, and h bosons, and by improving the \(\text {A}\) mass resolution through a kinematic constraint on the reconstructed invariant mass of the h boson candidate. No excess of data over the background prediction is observed. Upper limits are set at 95% confidence level on the product of the \(\text {A}\) boson cross sections and the branching fractions \(\sigma _\text {A} \,\mathcal {B}(\text {A} \rightarrow \text {Z}\text {h}) \,\mathcal {B}(\text {h}\rightarrow \text {b}\bar{\text {b}})\), which exclude values from 1 to 0.01 pb in the 225–1000\(\,\text {GeV}\) mass range, and are comparable to those of the corresponding ATLAS search. Interpretations are given in the context of Type-I, Type-II, flipped, and lepton-specific two-Higgs-doublet model formulations, thereby reducing the allowed parameter space for extensions of the SM with respect to previous CMS searches.