1 Introduction

A future \(\mathrm{e}^{+} \mathrm{e}^{-} \) collider, such as the Compact Linear Collider (CLIC) [1] or the International Linear Collider (ILC) [2], would be complementary to the Large Hadron Collider (LHC) and the High-Luminosity LHC (HL-LHC), providing tests of beyond the Standard Model (SM) physics scenarios through a broad programme of highly precise measurements. A central part of the linear collider physics programme is the precise study of the properties of the Higgs boson. The LHC and HL-LHC will provide an impressive range of Higgs physics measurements, establishing the general properties of the Higgs boson, such as its mass and spin. The LHC will also provide measurements of the product of the Higgs production rate and Higgs decay branching fractions into different final states. Current estimates suggest that ratios of couplings can be measured to 2–7 % (depending on the final state) with 3000 fb\(^{-1}\) of data [3]. A number of recent studies, see for example [3, 4], have indicated that modifications of the Higgs couplings due to beyond the SM (BSM) physics are almost always less than 10 % and can be as small as 1–2 % in a number of models.

An \(\mathrm{e}^{+} \mathrm{e}^{-} \) collider would be a unique facility for precision Higgs physics [5–7], providing measurements of the Higgs boson branching ratios that may be an order of magnitude more precise than those achievable at the HL-LHC. Such measurements may be necessary to reveal BSM effects in the Higgs sector. Moreover, an \(\mathrm{e}^{+} \mathrm{e}^{-} \) collider provides the opportunity to make a number of unique measurements including: (i) absolute measurements of Higgs couplings, rather than ratios; (ii) a precise measurement of possible decays to invisible (long-lived neutral) final states; and (iii) a \({<}2~\%\) [7] measurement of the total Higgs decay width, \(\varGamma _{\mathrm{H}}\). In addition, an \(\mathrm{e}^{+} \mathrm{e}^{-} \) collider operating at 1 \(\text {TeV} \) or above, for example CLIC or an upgraded ILC, would have sensitivity to the Yukawa coupling of the top quark to the Higgs boson and the Higgs self-coupling parameter \(\lambda \), thus providing a direct probe of the Higgs potential.

This paper presents the first detailed study of the potential of making a model-independent measurement of \(g_{\mathrm{H} \mathrm{Z} \mathrm{Z}} \) from the recoil mass distribution in \(\mathrm{e}^{+} \mathrm{e}^{-} \rightarrow \mathrm{H} \mathrm{Z} \) with \(\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} \), denoted as \(\mathrm{H} \mathrm{Z} (\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} )\). The studies were initially performed in the context of the CLIC accelerator operating at \(\sqrt{s} = 350~\text {GeV} \). The studies were repeated for the ILC operating at the same energy and for CLIC at \(\sqrt{s} =250~\text {GeV} \) and \(\sqrt{s} =420~\text {GeV} \).

1.1 Higgs production in \(e^+e^-\) collisions

In \(\mathrm{e}^{+} \mathrm{e}^{-} \) collisions at \(\sqrt{s} = 250{-}500\) GeV, the two main Higgs production mechanisms are the Higgsstrahlung process, \(\mathrm{e}^{+} \mathrm{e}^{-} \rightarrow \mathrm{H} \mathrm{Z} \), and the \(\mathrm{W} \mathrm{W} \)-fusion process, \(\mathrm{e}^{+} \mathrm{e}^{-} \rightarrow \mathrm{H} {{\upnu } _{\mathrm{e}}} {\bar{{\upnu }}}_{\mathrm{e}} \), shown in Fig. 1. For \(m_{\mathrm{H}} \sim 125~\)GeV, the cross section for the s-channel Higgsstrahlung process is maximal close to \(\sqrt{s} = 250~\text {GeV} \), whereas the cross section for the t-channel \(\mathrm{W} \mathrm{W} \)-fusion process increases with centre-of-mass energy, as indicated in Table 1.

Fig. 1

Feynman diagrams for: (left) the Higgsstrahlung process \(\mathrm{e}^{+} \mathrm{e}^{-} \rightarrow \mathrm{H} \mathrm{Z} \), which dominates at \(\sqrt{s} = 250\) GeV; and (right) the \(\mathrm{W} \mathrm{W} \)-fusion process \(\mathrm{e}^{+} \mathrm{e}^{-} \rightarrow {{\upnu } _{\mathrm{e}}} {\bar{{\upnu }}}_{\mathrm{e}} \mathrm{H} \), which dominates at \(\sqrt{s} > 500\) GeV

Table 1 The leading-order Higgs cross sections for the Higgsstrahlung and \(\mathrm{W} \mathrm{W} \)-fusion processes for \(m_{\mathrm{H}} =125~\text {GeV} \) at three centre-of-mass energies. The cross sections are calculated [9] including initial-state radiation and are shown for unpolarised electron/positron beams and assuming the baseline ILC polarisation of \(P(\mathrm{e}^{-},\mathrm{e}^{+}) = (-0.8,~+0.3)\)

The total \(\mathrm{H} \mathrm{Z} \) cross section is proportional to the square of the coupling between the Higgs and \(\mathrm{Z} \) bosons, \(g_{\mathrm{H} \mathrm{Z} \mathrm{Z}} \),

$$\begin{aligned} \upsigma (\mathrm{e}^{+} \mathrm{e}^{-} \rightarrow \mathrm{H} \mathrm{Z}) \propto g_{\mathrm{H} \mathrm{Z} \mathrm{Z}} ^2, \end{aligned}$$

and the cross sections for the exclusive final-state decays \(\mathrm{H} \rightarrow X\bar{X}\) can be expressed as

$$\begin{aligned} \upsigma (\mathrm{e}^{+} \mathrm{e}^{-} \rightarrow \mathrm{H} \mathrm{Z}) \times \text {BR} (\mathrm{H} \rightarrow X\bar{X})\propto & {} \frac{g_{\mathrm{H} \mathrm{Z} \mathrm{Z}} ^2 \times g_{\mathrm{H} X X} ^2}{\varGamma _{\mathrm{H}}} \\ \upsigma (\mathrm{e}^{+} \mathrm{e}^{-} \rightarrow \mathrm{H} {{\upnu } _{\mathrm{e}}} {\bar{{\upnu }}}_{\mathrm{e}}) \times \text {BR} (\mathrm{H} \rightarrow X\bar{X})\propto & {} \frac{g_{\mathrm{H} \mathrm{W} \mathrm{W}} ^2 \times g_{\mathrm{H} X X} ^2}{\varGamma _{\mathrm{H}}}. \end{aligned}$$

Once \(g_{\mathrm{H} \mathrm{Z} \mathrm{Z}} \) has been determined in a model-independent manner, the ratio of the Higgsstrahlung and \(\mathrm{W} \mathrm{W} \)-fusion cross sections for the same exclusive Higgs boson final state (e.g. \(\mathrm{H} \rightarrow \mathrm{b} {\overline{\mathrm{b}}} \)) yields \(g_{\mathrm{H} \mathrm{W} \mathrm{W}} \). Subsequently, the measurement of \(\upsigma (\mathrm{e}^{+} \mathrm{e}^{-} \rightarrow \mathrm{H} {{\upnu } _{\mathrm{e}}} {\bar{{\upnu }}}_{\mathrm{e}})\) \(\times \) \(\text {BR} (\mathrm{H} \rightarrow \mathrm{W} \mathrm{W} ^*)\) and/or \(\upsigma (\mathrm{e}^{+} \mathrm{e}^{-} \rightarrow \mathrm{H} \mathrm{Z})\) \(\times \) \(\text {BR} (\mathrm{H} \rightarrow \mathrm{W} \mathrm{W} ^*)\), which depend on \(g_{\mathrm{H} \mathrm{W} \mathrm{W}} ^4/\varGamma _{\mathrm{H}}\) and \(g_{\mathrm{H} \mathrm{W} \mathrm{W}} ^2g_{\mathrm{H} \mathrm{Z} \mathrm{Z}} ^2/\varGamma _{\mathrm{H}}\) respectively, provides a determination of \(\varGamma _{\mathrm{H}}\). At this point all measurements of exclusive Higgs decays provide absolute and model-independent determinations of the relevant coupling(s). In practice, all measurements will be used as the input to a global fit. Nevertheless, the determination of \(g_{\mathrm{H} \mathrm{Z} \mathrm{Z}} \) from the recoil mass distribution in \(\mathrm{e}^{+} \mathrm{e}^{-} \rightarrow \mathrm{H} \mathrm{Z} \) lies at the heart of this scientific programme.
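The chain of determinations described above can be illustrated with a minimal sketch. It assumes that every measured rate has been divided by its SM expectation, so that all couplings and the total width are expressed relative to their SM values. The function name and its arguments are hypothetical and chosen only for this illustration; it is not the global fit that would be used in practice.

```python
import math

def extract_couplings(sigma_hz, hz_bb, hvv_bb, hvv_ww):
    """Sequential coupling extraction, all inputs relative to the SM (assumption).

    sigma_hz : sigma(HZ) from the recoil mass,   proportional to g_HZZ^2
    hz_bb    : sigma(HZ) x BR(H->bb),            proportional to g_HZZ^2 g_Hbb^2 / Gamma_H
    hvv_bb   : sigma(H nu nu) x BR(H->bb),       proportional to g_HWW^2 g_Hbb^2 / Gamma_H
    hvv_ww   : sigma(H nu nu) x BR(H->WW*),      proportional to g_HWW^4 / Gamma_H
    """
    g_hzz2 = sigma_hz                      # step 1: recoil mass gives g_HZZ^2
    g_hww2 = g_hzz2 * hvv_bb / hz_bb       # step 2: ratio for the same final state
    gamma_h = g_hww2 ** 2 / hvv_ww         # step 3: g_HWW^4 / Gamma_H gives the width
    return math.sqrt(g_hzz2), math.sqrt(g_hww2), gamma_h

# Closure check with arbitrary (made-up) deviations from the SM.
k_z, k_w, k_b, width = 1.02, 0.97, 1.05, 1.10
inputs = (k_z**2,
          k_z**2 * k_b**2 / width,
          k_w**2 * k_b**2 / width,
          k_w**4 / width)
print(extract_couplings(*inputs))
```

The closure check simply confirms that the three steps invert the proportionalities given in the equations above; the numerical deviations are invented for the example.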

It has been pointed out [8] that off-shell effects in \(\mathrm{H} \rightarrow \text {V}\text {V}^*\) with \(\text {V} = \mathrm{W}, \mathrm{Z} \) can break the model independence of the Higgs cross section and branching ratio measurements. However, for the centre-of-mass energies considered in this paper, off-shell effects, which depend strongly on \(\sqrt{s} \), are found to be small compared to the expected statistical uncertainties from the recoil mass measurements.

1.2 The leptonic recoil mass measurement

The Higgsstrahlung process provides the opportunity to study the couplings of the Higgs boson in a model-independent manner. This is unique to an electron-positron collider, where the clean experimental environment and the relatively low SM cross sections for background processes allow \(\mathrm{e}^{+} \mathrm{e}^{-} \rightarrow \mathrm{H} \mathrm{Z} \) events to be selected based solely on the measurement of the four-momentum of the \(\mathrm{Z} \) boson, regardless of how the Higgs boson decays. The clearest topologies occur for \(\mathrm{Z} \rightarrow \mathrm{e}^{+} \mathrm{e}^{-} \) and \(\mathrm{Z} \rightarrow {{\upmu }}^{+} {{\upmu }}^{-} \) decays, which can be identified by first requiring that the measured di-lepton invariant mass \(m_{\ell \ell }\) is consistent with \(m_{\mathrm{Z}} \). The four-momentum of the system recoiling against the \(\mathrm{Z} \) boson is obtained from \(E_\mathrm {rec} = \sqrt{s}- E_{\ell \ell }\) and \({\mathbf {p}}_\mathrm {rec} = -{\mathbf {p}}_{\ell \ell }\). In \(\mathrm{e}^{+} \mathrm{e}^{-} \rightarrow \mathrm{H} \mathrm{Z} \) events, the invariant mass of this recoiling system, \(m_\mathrm {rec} \), will peak at \(m_{\mathrm{H}}\). Figure 2 shows the simulated recoil mass distribution in the ILD [10] detector concept for 250 \(\text {fb}^{-1}\) of ILC data at \(\sqrt{s} = 250~\text {GeV} \) with beam polarisation \(P(\mathrm{e}^{-},\mathrm{e}^{+}) = (-0.8,~+0.3)\). For this data sample, combining the \(\mathrm{Z} \rightarrow {{\upmu }}^{+} {{\upmu }}^{-} \) and \(\mathrm{Z} \rightarrow \mathrm{e}^{+} \mathrm{e}^{-} \) decays allows \(\upsigma (\mathrm{H} \mathrm{Z})\) to be measured to 2.6 %; since \(\upsigma (\mathrm{H} \mathrm{Z}) \propto g_{\mathrm{H} \mathrm{Z} \mathrm{Z}} ^2\), this corresponds to a precision of 1.3 % on \(g_{\mathrm{H} \mathrm{Z} \mathrm{Z}} \) [5, 10, 11].
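The recoil mass definition above can be sketched in a few lines. The example assumes an idealised situation (the \(\mathrm{e}^{+} \mathrm{e}^{-} \) system at rest with exactly the nominal \(\sqrt{s} \), massless muons, no detector resolution); the function names and the toy event generator are hypothetical and serve only to illustrate the kinematics.

```python
import math

def recoil_mass(sqrt_s, lepton1, lepton2):
    """Recoil mass (GeV) against a di-lepton system.

    Each lepton is a tuple (E, px, py, pz) in GeV. The e+e- system is assumed
    to be at rest with total energy sqrt_s (no ISR or beamstrahlung).
    """
    e_ll = lepton1[0] + lepton2[0]
    p_ll = [a + b for a, b in zip(lepton1[1:], lepton2[1:])]
    m2 = (sqrt_s - e_ll) ** 2 - sum(c * c for c in p_ll)
    # Resolution effects can push m2 slightly negative; clamp at zero.
    return math.sqrt(max(m2, 0.0))

def coupling_precision(rel_sigma_precision):
    """sigma(HZ) is proportional to g_HZZ^2, so delta(g)/g = 0.5 * delta(sigma)/sigma."""
    return 0.5 * rel_sigma_precision

def toy_hz_muons(sqrt_s, m_z=91.2, m_h=125.0):
    """Idealised Z -> mu mu pair (massless muons) from two-body HZ kinematics."""
    e_z = (sqrt_s**2 + m_z**2 - m_h**2) / (2.0 * sqrt_s)
    p_z = math.sqrt(e_z**2 - m_z**2)
    # Symmetric decay: each muon has E_Z/2, p_Z/2 along z and +-m_Z/2 along x.
    mu1 = (e_z / 2, +m_z / 2, 0.0, p_z / 2)
    mu2 = (e_z / 2, -m_z / 2, 0.0, p_z / 2)
    return mu1, mu2

mu1, mu2 = toy_hz_muons(250.0)
print(recoil_mass(250.0, mu1, mu2))   # close to the assumed m_H
print(coupling_precision(0.026))      # the step from 2.6 % on sigma(HZ) to 1.3 % on g_HZZ
```

In real data the peak is broadened by the lepton momentum resolution, the natural width of the \(\mathrm{Z} \) and the beam energy spectrum, none of which is modelled here.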

Fig. 2

The recoil mass distribution for the Higgsstrahlung process for \(\mathrm{Z} \rightarrow {{\upmu }}^{+} {{\upmu }}^{-} \) at \(\sqrt{s} = 250~\text {GeV} \). The distribution is shown for 250 \(\text {fb}^{-1}\) with a beam polarisation of \(P(\mathrm{e}^-,\mathrm{e}^+) = (-80~\%,+30~\%)\). Taken from [7]

1.3 Recoil mass measurement at different centre-of-mass energies

The narrowness of the recoil mass peak is an important factor in determining the precision to which \(g_{\mathrm{H} \mathrm{Z} \mathrm{Z}} \) can be measured. The measured recoil mass can be expressed as

$$\begin{aligned} m_{\text {rec}}^2= & {} (\sqrt{s}- E_{\ell \ell } )^2 -{\mathbf {p}}_{\ell \ell }^2 = s - 2\sqrt{s} ~E_{\ell \ell } + E^2_{\ell \ell } -{\mathbf {p}}_{\ell \ell }^2 \\= & {} s - 2\sqrt{s} \, (E_{\ell 1} + E_{\ell 2}) + m^2_{\ell \ell } , \end{aligned}$$

where \(\sqrt{s} \) is the centre-of-mass energy and \(E_{\ell 1}\) and \(E_{\ell 2}\) are the energies of the two leptons. Since \(m_{\ell \ell }\) will peak around \(m_{\mathrm{Z}} \), it can be seen that the contribution to the width of the recoil mass peak from the experimental resolution scales with both \(\sqrt{s} \) and the lepton energy (or momentum) resolution. For high-momentum muons, where multiple scattering in the tracking chambers is relatively unimportant, the fractional momentum resolution \(\upsigma _p/p\) will scale approximately as the transverse momentum \(p_\mathrm {T} \); thus \(\upsigma _{E_\ell }\) will scale quadratically, as \(p\cdot p_\mathrm {T} \). Consequently, in the range \(\sqrt{s} = 250 {-} 500~\text {GeV} \), where the energy of the fermions from the \(\mathrm{Z} \) decay approximately scales as \(\sqrt{s} \), the width of the recoil mass distribution increases significantly with increasing centre-of-mass energy. This picture is complicated by the contributions to the width of the recoil mass peak from the natural width of the \(\mathrm{Z} \) and the intrinsic beam energy spread. Nevertheless, for the momentum resolutions assumed for the ILC detectors, the leptonic recoil mass analysis leads to a higher precision on \(g_{\mathrm{H} \mathrm{Z} \mathrm{Z}} \) for \(\sqrt{s} \sim 250~\text {GeV} \) [5], where \(\upsigma (\mathrm{H} \mathrm{Z})\) is largest and the reconstructed recoil mass peak is relatively narrow, compared to higher centre-of-mass energies. This has been one of the strongest arguments for the initial operation of the ILC at a relatively low centre-of-mass energy.
This argument does not apply to the recoil mass measurement with hadronic \(\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} \) decays. Here the recoil mass resolution depends less strongly on \(\sqrt{s} \) than for leptonic final states, because the jet energy resolution for the linear collider detector concepts scales linearly with energy, \(\upsigma _E \sim 0.03 E\) [12]. Although the hadronic recoil mass measurement has been considered previously [13], this paper presents the first detailed study of its potential.
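The different \(\sqrt{s} \) dependence of the leptonic and hadronic cases can be illustrated with a rough first-order estimate. From the recoil mass expression above, at fixed di-fermion mass, \(|\partial m_\mathrm {rec} /\partial E| = \sqrt{s}/m_\mathrm {rec} \). The sketch below propagates only the energy resolution; it ignores the mass and angular resolution of the di-fermion system, the width of the \(\mathrm{Z} \) and the beam energy spread. The symmetric sharing of the \(\mathrm{Z} \) energy, the approximation \(p \approx p_\mathrm {T} \) and the value of the tracking parameter `a` are assumptions made purely for illustration.

```python
import math

M_Z, M_H = 91.2, 125.0  # GeV, nominal values assumed for illustration

def z_energy(sqrt_s):
    """Energy of the Z in e+e- -> HZ from two-body kinematics."""
    return (sqrt_s**2 + M_Z**2 - M_H**2) / (2.0 * sqrt_s)

def recoil_width(sqrt_s, sigma_e):
    """First-order propagation: sigma(m_rec) ~ (sqrt_s / m_rec) * sigma(E)."""
    return sqrt_s / M_H * sigma_e

def sigma_e_jets(sqrt_s, frac=0.03):
    """Two jets, each with ~E_Z/2 and sigma_E = frac * E, added in quadrature."""
    e_jet = z_energy(sqrt_s) / 2.0
    return math.sqrt(2.0) * frac * e_jet

def sigma_e_muons(sqrt_s, a=2e-5):
    """Two muons with sigma_p / p = a * pT (a in 1/GeV, placeholder value).

    Taking p ~ pT ~ E_Z/2 per muon gives sigma_E ~ a * p * pT per muon.
    """
    p = z_energy(sqrt_s) / 2.0
    return math.sqrt(2.0) * a * p * p

for label, res in (("hadronic", sigma_e_jets), ("leptonic", sigma_e_muons)):
    growth = recoil_width(500.0, res(500.0)) / recoil_width(250.0, res(250.0))
    print(label, "width growth from 250 to 500 GeV:", round(growth, 2))
```

Under these assumptions the energy-resolution contribution grows as \(\sqrt{s}\, E_{\mathrm{Z}} \) for jets but as \(\sqrt{s}\, E_{\mathrm{Z}} ^2\) for muons, which is the sense in which the hadronic recoil mass resolution depends less strongly on \(\sqrt{s} \).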

2 Monte Carlo samples, detector simulation and event reconstruction

The CLIC results presented in this paper are based on detailed Monte Carlo (MC) simulation using: a full set of SM background processes; a detailed Geant4 [14, 15] simulation of the CLIC_ILD detector concept [16]; and a full reconstruction of the simulated events.

2.1 Monte Carlo event generation

The simulated SM event samples were generated using the WHIZARD 1.95 [9] program. The expected energy spectra for the CLIC beams, including the effects from beamstrahlung and the intrinsic machine energy spread, were used for the initial-state electrons and positrons. The process of fragmentation and hadronisation of final-state quarks and gluons was simulated using PYTHIA 6.4 [17] with a parameter set [18] that was tuned to OPAL \(\mathrm{e}^{+} \mathrm{e}^{-} \) data recorded at LEP. The decays of \(\uptau \) leptons were simulated using the TAUOLA package [19]. The mass of the Higgs boson was taken to be \(m_{\mathrm{H}} = 126~\text {GeV} \) and the decays of the Higgs boson were simulated using PYTHIA with the branching fractions of [20]. A dedicated sample of \(\mathrm{e}^{+} \mathrm{e}^{-} \rightarrow \mathrm{H} \mathrm{Z} \) events with Higgs decays to “invisible” long-lived neutral particles was produced by artificially setting the Higgs boson lifetime to infinity. Because of the 0.5 ns bunch spacing in the CLIC beams, the pile-up of beam-induced backgrounds from the \({\upgamma } {\upgamma } \rightarrow \text {hadrons}\) process was included in the simulated event samples to ensure its effect on the event reconstruction was accounted for.

2.2 CLIC detector simulation and event reconstruction

The Geant4-based Mokka [21] program was used to simulate the detector response of the CLIC_ILD detector concept [16]. The QGSP_BERT physics list was used to model the hadronic interactions of particles in the detectors. The hit digitisation and the event reconstruction were performed using the Marlin [22] software packages. Particle flow reconstruction was performed using PandoraPFA [12, 23]. An algorithm, using the individual reconstructed particles, was used to identify and remove approximately 90 % of the out-of-time background due to pile-up from \({{\upgamma } {\upgamma } \rightarrow \text {hadrons}} \); here the Loose particle flow object selection [16] was used.

Jet finding was performed using the FastJet [24] package. Because of the presence of pile-up from \({\upgamma } {\upgamma } \rightarrow \text {hadrons}\), the ee_kt (Durham) algorithm employed at LEP is not effective as it clusters particles from pile-up into the reconstructed jets. Instead, the hadron-collider-inspired \(k_t\) algorithm, with the distance parameter R based on \(\varDelta \eta \) and \(\varDelta \phi \), was used with \(R=\pi /2\). This algorithm allows particles to be clustered into “beam jets”, aligned with the beam axis, in addition to jets seeded by high-momentum particles. Background from the pile-up of \({{\upgamma } {\upgamma } \rightarrow \text {hadrons}} \) can, to a large degree, be removed by ignoring particles in the “beam jets”, largely mitigating the impact of the beam background.

The hadronic recoil mass study, presented in this paper, covers a wide range of \(\mathrm{H} \mathrm{Z} (\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} )\) final-state topologies ranging from two jets where the Higgs decays to long-lived neutral particles, \(\mathrm{e}^{+} \mathrm{e}^{-} \rightarrow \mathrm{H} \mathrm{Z} \rightarrow (\text {invis.})(\mathrm{q} {\overline{\mathrm{q}}} )\), to six-jet topologies from, for example, \(\mathrm{e}^{+} \mathrm{e}^{-} \rightarrow \mathrm{H} \mathrm{Z} \rightarrow (\mathrm{W} \mathrm{W} ^*\rightarrow \mathrm{q} {\overline{\mathrm{q}}} \mathrm{q} {\overline{\mathrm{q}}} )(\mathrm{q} {\overline{\mathrm{q}}} )\). For this reason, each reconstructed event is clustered into two-, three-, four-, five- and six-jet topologies, with “y-cut” variables used to indicate the underlying physical topology. For example, if an event is forced into a three-jet topology, \(y_{34}\) is the \(k_t\) value at which the event would be reconstructed as four jets and \(y_{23}\) is the \(k_t\) value at which the event would be reconstructed as two jets.

2.3 ILC detector simulation and event reconstruction

The event generation and reconstruction for the ILC studies, presented in Sect. 4, closely follow the procedure described above. The main differences are: (i) the ILC beam spectrum, where the effects of beamstrahlung are less pronounced; (ii) the detector simulation used the ILD detector concept for the ILC, rather than the CLIC_ILD model adapted for CLIC; and (iii) the much longer ILC bunch spacing means that only in-time background from \({{\upgamma } {\upgamma } \rightarrow \text {hadrons}} \) needs to be included.

3 Hadronic recoil mass measurement at CLIC

In the process \(\mathrm{e}^{+} \mathrm{e}^{-} \rightarrow \mathrm{H} \mathrm{Z} \) it is possible to cleanly identify \(\mathrm{Z} \rightarrow \mathrm{e}^{+} \mathrm{e}^{-} \) and \(\mathrm{Z} \rightarrow {{\upmu }}^{+} {{\upmu }}^{-} \) decays regardless of the \(\mathrm{H} \) decay mode. Consequently, the selection efficiency is almost independent of the nature of the \(\mathrm{H} \) decay. For \(\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} \) decays, the selection efficiency will depend more strongly on the Higgs decay mode. For example, in \((\mathrm{H} \rightarrow \mathrm{b} {\overline{\mathrm{b}}})(\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} )\) events, the reconstruction of the \(\mathrm{Z} \) boson is complicated by mis-associations of particles to jets and by the threefold ambiguity in associating four jets to the \(\mathrm{Z} \) and \(\mathrm{H} \). These ambiguities will increase with the number of jets in the final state. For this reason, it is much more difficult to construct an event selection, based only on the reconstructed candidate \(\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} \) decay, with a selection efficiency that is independent of the Higgs decay mode. Nevertheless it is possible to minimise this dependence. 
The strategy adopted here is to: (i) separate all simulated events into candidates for Higgs decays to “invisible” long-lived neutral particles and decays to visible final states; (ii) identify the di-jet system that is the best candidate for the \(\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} \) decay; (iii) reject events consistent with a number of clear background topologies using the information from the whole event; (iv) identify \(\mathrm{H} \mathrm{Z} (\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} )\) events solely based on the properties from the candidate \(\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} \) decay, first for the candidate visible Higgs decays and then for the candidate invisible Higgs decays; and (v) combine the results into a single measurement of \(\upsigma (\mathrm{H} \mathrm{Z})\).

3.1 Separation into candidate visible and invisible Higgs decay samples

Hadronic events are selected by forcing each event into a two-jet topology and requiring at least three charged particles in each jet. The surviving events are then divided into candidates for either visible \(\mathrm{H} \) decays or invisible \(\mathrm{H} \) decays, in both cases produced in association with a \(\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} \). Events are categorised as potential invisible \(\mathrm{H} \) decays on the basis of the y-cut values in the \(k_t\) jet-finding algorithm. For invisible \(\mathrm{H} \) decays, only the \(\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} \) system is visible in the detector, typically resulting in a two-jet topology (with the possibility that QCD radiation can increase the number of reconstructed jets). Consequently, invisible \(\mathrm{H} \) decays will have small values of \(y_{23}\) and \(y_{34}\), the variables respectively representing the \(k_t\) value at which an event transitions from two to three jets and from three to four jets, as indicated in Fig. 3. Events are categorised as candidate invisible \(\mathrm{H} \) decays if \(-\log _{10}(y_{23})> 2.0\) and \(-\log _{10}(y_{34})> 3.0\). Due to gluon radiation in the parton shower, only 74 % of the simulated \(\mathrm{H} \mathrm{Z} \, (\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} )\) events with invisible \(\mathrm{H} \) decays are placed in this two-jet topology candidate invisible \(\mathrm{H} \) decay sample. 
To improve the efficiency for correctly categorising SM Higgs decays with low-energy leptons, for example \(\mathrm{H} \rightarrow \mathrm{W} \mathrm{W} ^*\rightarrow {\uptau } {\upnu } {\uptau } {\upnu } \), events with \(-\log _{10}(y_{23})<2.5\) and \(-\log _{10}(y_{34})< 3.5\) are forced into three jets and are excluded from the invisible Higgs decay sample if the lowest-energy jet has fewer than four reconstructed tracks or contains an identified \(\mathrm{e}^\pm /{\upmu } ^\pm \) with energy \(E>5~\text {GeV} \). Only 2.2 % of simulated \(\mathrm{H} \mathrm{Z} \) events with SM Higgs decays end up in the candidate invisible Higgs sample.
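The categorisation described above can be summarised as a short sketch. It takes the y-cut variables and the properties of the lowest-energy jet in the forced three-jet reconstruction as precomputed inputs; the function and argument names are hypothetical, and applying the three-jet veto only to events that already satisfy the two-jet requirement is one reading of the text rather than a confirmed detail of the analysis code.

```python
def is_invisible_candidate(neg_log_y23, neg_log_y34,
                           soft_jet_tracks, soft_jet_lepton_e=0.0):
    """Return True if the event enters the candidate invisible Higgs decay sample.

    neg_log_y23, neg_log_y34 : -log10 of the y-cut values
    soft_jet_tracks          : number of tracks in the lowest-energy jet
                               of the forced three-jet reconstruction
    soft_jet_lepton_e        : energy (GeV) of an identified e/mu in that jet,
                               0.0 if none was found
    """
    # Two-jet-like topology: both y-cut values must be small.
    if not (neg_log_y23 > 2.0 and neg_log_y34 > 3.0):
        return False
    # Borderline events are checked for a tau- or lepton-like third jet.
    if neg_log_y23 < 2.5 and neg_log_y34 < 3.5:
        if soft_jet_tracks < 4 or soft_jet_lepton_e > 5.0:
            return False
    return True

print(is_invisible_candidate(3.1, 4.2, soft_jet_tracks=9))
print(is_invisible_candidate(2.2, 3.2, soft_jet_tracks=2))
```

Events for which the function returns False are treated as candidates for visible Higgs decays.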

Fig. 3

The distributions of \(-\log _{10}(y_{23})\) and \(-\log _{10}(y_{34})\) for simulated \(\mathrm{H} \mathrm{Z} \,(\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} )\) events for visible and invisible Higgs decays at \(\sqrt{s} = 350~\text {GeV} \). The distributions are normalised to an integrated luminosity of 500 \(\text {fb}^{-1}\). The distribution for the invisible \(\mathrm{H} \) decays assumes a 100 % branching fraction into invisible decay modes. The vertical lines with arrows indicate the cut values used in this analysis

Fig. 4

(a) The reconstructed hadronic recoil mass distribution for the candidate \(\mathrm{H} \mathrm{Z} \) events with \(\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} \) and \(\mathrm{H} \rightarrow \text {invis.}\) (b) The reconstructed hadronic recoil mass distributions for candidate \(\mathrm{H} \mathrm{Z} (\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} )\) and either \(\mathrm{H} \rightarrow \mathrm{b} {\overline{\mathrm{b}}} \), \(\mathrm{H} \rightarrow \mathrm{W} \mathrm{W} ^*\rightarrow \mathrm{q} {\overline{\mathrm{q}}} \mathrm{q} {\overline{\mathrm{q}}} \) or \(\mathrm{H} \rightarrow {{\uptau }}^{+} {{\uptau }}^{-} \). In each case the distributions are normalised to unit area. An underflow (not shown) contains the small fraction of events where no good \(\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} \) candidate is identified

3.2 Recoil mass reconstruction

For each candidate \(\mathrm{H} \mathrm{Z} (\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} )\) event, the recoil mass is calculated from \(m_\mathrm {rec} ^2 = ( \sqrt{s}- E_{\mathrm{q} {\overline{\mathrm{q}}} } )^2 -{\mathbf {p}}_{\mathrm{q} {\overline{\mathrm{q}}} }^2\), where \(E_{\mathrm{q} {\overline{\mathrm{q}}} }\) and \({\mathbf {p}}_{\mathrm{q} {\overline{\mathrm{q}}} }\) are the summed energy and momentum of the di-jet system from the identified candidate \(\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} \) decay. In the case of the candidate invisible Higgs decay sample, the two jets are assumed to be from \(\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} \). The resulting recoil mass distribution for candidate invisible Higgs decays, which is strongly peaked around \(m_\mathrm {rec} \sim m_{\mathrm{H}} \), is shown in Fig. 4a. In the case of the candidate visible Higgs decay sample, the situation is more complicated as this sample encompasses many different \(\mathrm{H} \mathrm{Z} \) event topologies. For example, \(\mathrm{H} \rightarrow \mathrm{b} {\overline{\mathrm{b}}} \) decays will result in a four-quark \(\mathrm{H} \mathrm{Z} \) final state, usually yielding four jets, whereas, \(\mathrm{H} \rightarrow \mathrm{W} \mathrm{W} ^*\rightarrow \mathrm{q} {\overline{\mathrm{q}}} {\ell } {\upnu } \) and \(\mathrm{H} \rightarrow \mathrm{W} \mathrm{W} ^*\rightarrow \mathrm{q} {\overline{\mathrm{q}}} \mathrm{q} {\overline{\mathrm{q}}} \) decays will respectively usually yield five- and six-jet final states. In all cases gluon radiation in the parton shower can increase the reconstructed jet multiplicity relative to the tree-level expectation.

In order to achieve the desired (near) model independence of the analysis, it is necessary to have a similar quality of recoil mass reconstruction for all Higgs boson visible decay modes. This hinges on the correct identification and reconstruction of the \(\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} \) di-jet system. The first stage is to force events in the candidate visible Higgs decay sample into a four-jet topology. From the three possible di-jet combinations, the di-jet system with invariant mass \(m_{\mathrm{q} {\overline{\mathrm{q}}}} \) closest to \(m_{\mathrm{Z}} \) is identified as the candidate \(\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} \) decay and its energy and momentum are used to calculate the recoil mass \(m_\mathrm {rec} \). In selecting the candidate \(\mathrm{Z} \) decay, only jets containing more than three charged particles are considered. To improve the reconstruction of higher-jet-multiplicity final states, such as \(\mathrm{H} \rightarrow \mathrm{W} \mathrm{W} ^*\rightarrow \mathrm{q} {\overline{\mathrm{q}}} \mathrm{q} {\overline{\mathrm{q}}} \), each event is also forced into five jets and the di-jet system with mass closest to \(m_{\mathrm{Z}} \) is again identified as the candidate \(\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} \) decay. The five-jet topology is only used if \(-\log _{10}(y_{45})<3.5\), i.e. if the event has a clear five-jet structure, and both \(m_{\mathrm{q} {\overline{\mathrm{q}}}} \) and \(m_\mathrm {rec} \) are respectively closer to \(m_{\mathrm{Z}} \) and \(m_{\mathrm{H}} \) than the corresponding values from the four-jet reconstruction. Even in the genuine six-parton topology \(\mathrm{H} \mathrm{Z} \rightarrow (\mathrm{W} \mathrm{W} ^*)\mathrm{q} {\overline{\mathrm{q}}} \rightarrow (\mathrm{q} {\overline{\mathrm{q}}} \mathrm{q} {\overline{\mathrm{q}}} )\mathrm{q} {\overline{\mathrm{q}}} \) only 13 % of events are reconstructed as five jets; for the remainder, the four-jet reconstruction is preferred.
However, provided the jets from the \(\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} \) decay are correctly identified, there is no need to correctly reconstruct the recoiling system as only the properties of the \(\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} \) decay are used in the subsequent analysis. For this reason, allowing the possibility of reconstructing events as six jets was found not to improve the overall recoil mass reconstruction. Figure 4b shows the resulting recoil mass distribution for simulated \(\mathrm{H} \mathrm{Z} \) events with \(\mathrm{H} \rightarrow \mathrm{b} {\overline{\mathrm{b}}} \), \(\mathrm{H} \rightarrow \mathrm{W} \mathrm{W} ^*\rightarrow \mathrm{q} {\overline{\mathrm{q}}} \mathrm{q} {\overline{\mathrm{q}}} \) and \(\mathrm{H} \rightarrow {{\uptau }}^{+} {{\uptau }}^{-} \). Despite the very different final states, similar recoil mass distributions are obtained.
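The choice of the candidate \(\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} \) di-jet system and the decision between the four- and five-jet reconstructions can be sketched as follows. Jets are represented as simple tuples and the \(y_{45}\) requirement is passed in as a boolean evaluated by the caller. The function names, the jet representation and the handling of events without a valid candidate are assumptions made for this illustration, not details taken from the analysis software.

```python
import math
from itertools import combinations

M_Z, M_H = 91.2, 126.0  # GeV; m_H as assumed in the simulated samples

def best_z_candidate(jets, sqrt_s):
    """Pick the di-jet pair with mass closest to m_Z.

    jets : list of (E, px, py, pz, n_charged), energies and momenta in GeV.
    Only jets with more than three charged particles are considered.
    Returns (m_qq, m_rec), or None if fewer than two jets qualify.
    """
    best = None
    usable = [j for j in jets if j[4] > 3]
    for a, b in combinations(usable, 2):
        e = a[0] + b[0]
        p2 = sum((a[i] + b[i]) ** 2 for i in (1, 2, 3))
        m_qq = math.sqrt(max(e * e - p2, 0.0))
        m_rec = math.sqrt(max((sqrt_s - e) ** 2 - p2, 0.0))
        if best is None or abs(m_qq - M_Z) < abs(best[0] - M_Z):
            best = (m_qq, m_rec)
    return best

def choose_reconstruction(four_jet, five_jet, passes_y45_cut):
    """Prefer the five-jet result only if the y45 requirement is met and both
    m_qq and m_rec are closer to m_Z and m_H than in the four-jet case.

    four_jet, five_jet : (m_qq, m_rec) tuples from best_z_candidate, or None.
    """
    if four_jet is None or five_jet is None or not passes_y45_cut:
        return four_jet  # fallback behaviour is an assumption of this sketch
    closer_z = abs(five_jet[0] - M_Z) < abs(four_jet[0] - M_Z)
    closer_h = abs(five_jet[1] - M_H) < abs(four_jet[1] - M_H)
    return five_jet if (closer_z and closer_h) else four_jet

print(choose_reconstruction((85.0, 130.0), (90.0, 127.0), passes_y45_cut=True))
```

Because only the selected di-jet pair enters the result, the jets assigned to the recoiling system never need to be paired correctly, in line with the argument above.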

3.3 Preselection

After dividing all events into either candidates for visible or invisible Higgs decays and having identified the two jets forming the candidate \(\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} \) system, preselection cuts are applied to reduce backgrounds from larger cross section SM processes such as \(\mathrm{e}^{+} \mathrm{e}^{-} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} \) and \(\mathrm{e}^{+} \mathrm{e}^{-} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} \mathrm{q} {\overline{\mathrm{q}}} \). Cuts are based on the invariant mass of the \(\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} \) candidate, \(m_{\mathrm{q} {\overline{\mathrm{q}}}} \), and corresponding recoil mass, \(m_\mathrm {rec} \). In addition, the invariant mass of all the visible particles not originating from the candidate \(\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} \) decay, \(m^\prime _{\mathrm{q} {\overline{\mathrm{q}}}} \), is calculated. It is important to note that \(m^\prime _{\mathrm{q} {\overline{\mathrm{q}}}} \) is only used to reject specific background topologies in the preselection and is not used in the main selection; in \(\mathrm{H} \mathrm{Z} \) events \(m^\prime _{\mathrm{q} {\overline{\mathrm{q}}}} \) will depend strongly on the Higgs decay mode. The preselection cuts (most of which are common to the visible and invisible Higgs selections) are:

  • the event must be broadly consistent with being an \(\mathrm{H} \mathrm{Z} \) event: \(70~\text {GeV} < m_{\mathrm{q} {\overline{\mathrm{q}}}} < 110~\text {GeV} \) and \(80~\text {GeV} < m_\mathrm {rec} < 200~\text {GeV} \).

  • background from \(\mathrm{e}^{+} \mathrm{e}^{-} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} \) is suppressed by removing events with net transverse momentum \(p_\mathrm {T} < 20~\text {GeV} \) and \(-\log _{10}(y_{34})>2.5\), indicating a final-state system consisting of fewer than four primary particles.

  • events in the invisible Higgs decay sample are rejected if \(|\cos \theta _{\text {mis}}|>0.7\), where \(\theta _{\text {mis}}\) is the polar angle of the missing momentum vector, almost completely eliminating the contribution from \(\mathrm{e}^{+} \mathrm{e}^{-} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} \) with unobserved initial-state radiation (ISR).

  • background from \(\mathrm{e}^{+} \mathrm{e}^{-} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} \) with unobserved ISR, including radiative return to the \(\mathrm{Z} \) resonance, is suppressed by rejecting events with net transverse momentum \(p_\mathrm {T} < 20 ~\text {GeV} \) and \(|\cos \theta _{\text {mis}}|>0.9\).

  • events in the invisible Higgs decay sample are rejected if there is an isolated identified \(\mathrm{e}^\pm /{\upmu } ^\pm \) with energy \(E_{\ell } >10~\)GeV, suppressing background from \(\mathrm{W} \mathrm{W} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} {\ell } {\upnu } \).

  • the background from \(\mathrm{e}^{+} \mathrm{e}^{-} \rightarrow \mathrm{W} \mathrm{W} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} \mathrm{q} {\overline{\mathrm{q}}} \) is suppressed by forcing events into four jets and selecting the di-jet pair with the mass \(m_{\mathrm{q} {\overline{\mathrm{q}}}} \) closest to \(m_{\mathrm{W}} \). Events are rejected if \(p_\mathrm {T} < 20 ~\text {GeV} \) and \(65~\text {GeV} < m_{\mathrm{q} {\overline{\mathrm{q}}}} < 100~\text {GeV} \) and \(65~\text {GeV} < m^\prime _{\mathrm{q} {\overline{\mathrm{q}}}} < 100~\text {GeV} \), where \(m^\prime _{\mathrm{q} {\overline{\mathrm{q}}}} \) is the measured invariant mass of the second di-jet pair.

  • the background from \(\mathrm{e}^{+} \mathrm{e}^{-} \rightarrow \mathrm{Z} \mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} \mathrm{q} {\overline{\mathrm{q}}} \) is suppressed in a similar manner. Events are forced into four jets and the di-jet pair with the \(m_{\mathrm{q} {\overline{\mathrm{q}}}} \) closest to \(m_{\mathrm{Z}} \) is identified. Events are rejected if \(p_\mathrm {T} < 20 ~\text {GeV} \) and \(70~\text {GeV} < m_{\mathrm{q} {\overline{\mathrm{q}}}} < 105~\text {GeV} \) and \(70~\text {GeV} < m^\prime _{\mathrm{q} {\overline{\mathrm{q}}}} < 105~\text {GeV} \), where \(m^\prime _{\mathrm{q} {\overline{\mathrm{q}}}} \) is the measured mass of the second di-jet pair.
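The preselection cuts listed above can be collected into a single function, as in the minimal sketch below. All quantities are assumed to have been computed beforehand; the `Event` container and its field names are hypothetical. The sketch applies the \(\mathrm{W} \mathrm{W} \) and \(\mathrm{Z} \mathrm{Z} \) vetoes to both samples, which is an assumption, since the list does not state explicitly which of the common cuts apply to the invisible Higgs decay sample.

```python
from dataclasses import dataclass, replace

@dataclass
class Event:
    invisible: bool           # True if in the candidate invisible Higgs sample
    m_qq: float               # mass of the candidate Z -> qq system (GeV)
    m_rec: float              # recoil mass (GeV)
    pt: float                 # net transverse momentum (GeV)
    neg_log_y34: float        # -log10(y34)
    abs_cos_theta_mis: float  # |cos(theta)| of the missing momentum
    iso_lepton_e: float       # energy of an isolated e/mu (GeV), 0.0 if none
    ww_masses: tuple          # (m_qq, m'_qq) for the pairing closest to m_W
    zz_masses: tuple          # (m_qq, m'_qq) for the pairing closest to m_Z

def passes_preselection(ev):
    """Apply the preselection cuts in the order in which they are listed."""
    if not (70.0 < ev.m_qq < 110.0 and 80.0 < ev.m_rec < 200.0):
        return False
    if ev.pt < 20.0 and ev.neg_log_y34 > 2.5:          # e+e- -> qq
        return False
    if ev.invisible and ev.abs_cos_theta_mis > 0.7:    # qq with unobserved ISR
        return False
    if ev.pt < 20.0 and ev.abs_cos_theta_mis > 0.9:    # ISR / radiative return
        return False
    if ev.invisible and ev.iso_lepton_e > 10.0:        # WW -> qq l nu
        return False
    if ev.pt < 20.0 and all(65.0 < m < 100.0 for m in ev.ww_masses):  # WW -> qqqq
        return False
    if ev.pt < 20.0 and all(70.0 < m < 105.0 for m in ev.zz_masses):  # ZZ -> qqqq
        return False
    return True

good = Event(invisible=False, m_qq=91.0, m_rec=126.0, pt=45.0,
             neg_log_y34=2.0, abs_cos_theta_mis=0.3, iso_lepton_e=0.0,
             ww_masses=(80.0, 120.0), zz_masses=(91.0, 126.0))
print(passes_preselection(good))
print(passes_preselection(replace(good, pt=5.0, ww_masses=(80.0, 82.0))))
```

Note that the second di-jet mass \(m^\prime _{\mathrm{q} {\overline{\mathrm{q}}}} \) appears only in the two diboson vetoes, consistent with its restricted role in the preselection.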

The effects of the preselection cuts are summarised in Table 2. The events passing the preselection cuts are put forward as candidate \(\mathrm{H} \mathrm{Z} (\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} )\) events with either: (i) visible \(\mathrm{H} \) decay products; or (ii) invisible \(\mathrm{H} \) decay products, depending on whether the event was consistent with a two-jet topology or not. The first two cuts listed above result in the largest loss of signal efficiency for the visible Higgs decay selection. The \(\mathrm{q} {\overline{\mathrm{q}}} {\ell } {\upnu } \) background in the visible Higgs decay preselection could have been significantly reduced by rejecting events with visible high-energy isolated leptons, but this would have introduced a bias against \(\mathrm{H} \) decays with leptons in the final state.

Table 2 Summary of the effects of the preselection cuts for the visible and invisible recoil mass analyses. The efficiencies \(\varepsilon \) include the effects of the preselection cuts and the division into the candidate visible and invisible Higgs decay samples. The expected numbers of events passing the preselection cuts correspond to an integrated luminosity of \(500~\text {fb}^{-1}\) at CLIC, assuming unpolarised beams at \(\sqrt{s} =350~\text {GeV} \). The numbers shown for the invisible Higgs decay modes correspond to a 100 % branching ratio

3.4 Selection of HZ\(\rightarrow \)q\(\overline{\text {q}}\) with visible Higgs decays

After preselection, the main backgrounds in the visible Higgs decay analysis arise from \(\mathrm{e}^{+} \mathrm{e}^{-} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} {\ell } {\upnu } \) and \(\mathrm{e}^{+} \mathrm{e}^{-} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} \mathrm{q} {\overline{\mathrm{q}}} \), dominated by \(\mathrm{W} \mathrm{W} \), single-\(\mathrm{W} \) (\(\mathrm{W} \mathrm{e}{{\upnu } _{\mathrm{e}}} \)) and \(\mathrm{Z} \mathrm{Z} \) processes. The event selection is based entirely on the reconstructed candidate \(\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} \) decay in the event. The properties of the remainder of the event (or the event as a whole) are not used, as their inclusion would break the desired model independence of the selection. For example, the background from \(\mathrm{e}^{+} \mathrm{e}^{-} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} {\ell } {\upnu } \) can be significantly reduced by placing a lower bound on the total visible energy in the event; however, such a cut would bias the selection against Higgs decays with missing energy, such as \({\mathrm{H} \rightarrow {{\uptau }}^{+} {{\uptau }}^{-} }\).

The event selection uses a relative likelihood approach with discriminant variables based on the properties of the candidate \(\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} \) decay. Two event categories are considered: (a) the \(\mathrm{e}^{+} \mathrm{e}^{-} \rightarrow \mathrm{H} \mathrm{Z} \rightarrow \mathrm{H} \mathrm{q} {\overline{\mathrm{q}}} \) signal; and (b) all non-Higgs background processes. The relative likelihood for an event being classified as signal is defined as

$$\begin{aligned} \mathcal{{L}} = \frac{L_{\text {signal}} }{L_{\text {signal}} + L_{\text {back}}}, \end{aligned}$$

where the individual absolute likelihood \(L_j\) for the event class j (signal or background) is formed from normalised probability distributions \(P^j_i(x_i)\) of the discriminant variables \(x_i\) for that event class j:

$$\begin{aligned} L_j = {\upsigma }^j_{\text {presel}} \times \prod _i^{N} P^j_i(x_i) , \end{aligned}$$

where \(\upsigma ^j_{\text {presel}}\) is the cross section after preselection for event class j.
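A minimal sketch of the relative likelihood defined above follows. It assumes that the normalised probability densities \(P^j_i(x_i)\) have already been evaluated for the event (for example from binned simulated distributions); the function names and the numbers in the example are purely illustrative and are not taken from the analysis.

```python
from math import prod


def absolute_likelihood(xsec_presel, pdf_values):
    """L_j: cross section after preselection times the product of P_i^j(x_i)."""
    return xsec_presel * prod(pdf_values)


def relative_likelihood(xsec_sig, pdfs_sig, xsec_bkg, pdfs_bkg):
    """L = L_signal / (L_signal + L_back) for a single event."""
    l_sig = absolute_likelihood(xsec_sig, pdfs_sig)
    l_bkg = absolute_likelihood(xsec_bkg, pdfs_bkg)
    return l_sig / (l_sig + l_bkg)


# Illustrative placeholder inputs: a background cross section 25 times the
# signal one, and an event whose variables are more signal-like.
example = relative_likelihood(1.0, [0.5, 0.2, 0.1], 25.0, [0.01, 0.1, 0.1])
```

Weighting by the cross sections after preselection means that a large background rate must be overcome by the shape information before an event is classified as signal-like.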

The discriminant variables used in the likelihood selection, all of which are based on the candidate \(\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} \) decay, are: (i) the two-dimensional distribution of \(m_{\mathrm{q} {\overline{\mathrm{q}}}} \) and \(m_\mathrm {rec} \); (ii) the polar angle of the \(\mathrm{Z} \) candidate, \(|\cos \theta _{\mathrm{Z}}|\); and (iii) the modulus of the angle of the jets from the \(\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} \) decay in the \(\mathrm{Z} \) rest frame, relative to its laboratory frame direction of motion, \(|\cos \theta _{\mathrm{q}}|\). The two-dimensional distributions of \(m_{\mathrm{q} {\overline{\mathrm{q}}}} \) and \(m_\mathrm {rec} \) are shown separately for the signal and background in Fig. 5. As expected, the \(\mathrm{H} \mathrm{Z} \) signal events peak around \(m_{\mathrm{q} {\overline{\mathrm{q}}}} \approx m_{\mathrm{Z}} \) and \(m_\mathrm {rec} \approx m_{\mathrm{H}} \). The anti-correlation between \(m_\mathrm {rec} \) and \(m_{\mathrm{q} {\overline{\mathrm{q}}}} \) is expected; when the reconstructed jet energies are higher than the true energies, the reconstructed value of \(m_{\mathrm{q} {\overline{\mathrm{q}}}} \) will be higher than \(m_{\mathrm{Z}} \) and \(m_\mathrm {rec} \) will be lower than \(m_{\mathrm{H}} \) due to the \(-2\!\sqrt{s} \,E_{\mathrm{Z}}\) term in the expression for the recoil mass, \(m_\mathrm {rec} ^2= s - 2\!\sqrt{s} \,E_{\mathrm{Z}} + m_{\mathrm{q} {\overline{\mathrm{q}}}} ^2\,\). The broad peaked structure in the background distribution at lower values of \(m_{\mathrm{q} {\overline{\mathrm{q}}}} \) arises from \(\mathrm{q} {\overline{\mathrm{q}}} {\ell } {\upnu } \) events (which have been forced into a four- or five-jet topology). The use of the two-dimensional distribution of \(m_{\mathrm{q} {\overline{\mathrm{q}}}} \) versus \(m_\mathrm {rec} \) in the likelihood accounts for the associated correlations.
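The anti-correlation described above can be checked numerically with the recoil mass expression. The sketch below makes several assumptions that are not part of the analysis: nominal masses \(m_{\mathrm{Z}} = 91.19~\text {GeV} \) and \(m_{\mathrm{H}} = 125~\text {GeV} \), no ISR or beamstrahlung, and two massless jets whose energies are mis-measured by a common scale factor, so that both \(E_{\mathrm{Z}}\) and \(m_{\mathrm{q} {\overline{\mathrm{q}}}} \) scale by the same factor.

```python
from math import sqrt

M_Z = 91.19  # GeV, assumed nominal Z mass
M_H = 125.0  # GeV, assumed nominal Higgs mass


def recoil_mass(sqrt_s, e_z, m_qq):
    """m_rec from m_rec^2 = s - 2*sqrt(s)*E_Z + m_qq^2 (all in GeV)."""
    m2 = sqrt_s**2 - 2.0 * sqrt_s * e_z + m_qq**2
    if m2 < 0.0:
        raise ValueError("unphysical recoil mass squared")
    return sqrt(m2)


def true_z_energy(sqrt_s):
    """Z energy in e+e- -> HZ from two-body kinematics (no radiation)."""
    return (sqrt_s**2 + M_Z**2 - M_H**2) / (2.0 * sqrt_s)


sqrt_s = 350.0
e_z = true_z_energy(sqrt_s)
nominal = recoil_mass(sqrt_s, e_z, M_Z)

# Jet energies over-measured by 5 %: E_Z and m_qq both scale by 1.05.
scale = 1.05
shifted = recoil_mass(sqrt_s, scale * e_z, scale * M_Z)
```

With these inputs `nominal` reproduces the assumed Higgs mass, while the over-measured case gives a di-jet mass above \(m_{\mathrm{Z}} \) and a recoil mass below \(m_{\mathrm{H}} \), which is the direction of the correlation seen in the signal distribution.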

Fig. 5

The distribution of the reconstructed \(\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} \) mass, \(m_{\mathrm{q} {\overline{\mathrm{q}}}} \), versus the hadronic recoil mass, \(m_\mathrm {rec} \), for \(\mathrm{H} \mathrm{Z} (\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} )\) events (top) and for all background processes (bottom). In both cases the plots show all events passing the visible Higgs preselection for CLIC operating at \(\sqrt{s} =350~\text {GeV} \)

The two angular variables used in the likelihood selection are shown in Fig. 6. The discriminating power arises from the fact that the Higgs boson is a scalar particle, and the angular distributions in \(\mathrm{H} \mathrm{Z} \) production are different from those in the dominant backgrounds which mostly arise from the production of two vector particles.

Fig. 6

a The polar angle of the reconstructed \(\mathrm{Z} \) candidates, \(|\cos \theta _{\mathrm{Z}}|\), for both signal and background events for CLIC operating at \(\sqrt{s} =350~\text {GeV} \), and b the modulus of the angle of the jets from the \(\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} \) decay relative to the \(\mathrm{Z} \) direction after boosting into its rest frame, \(|\cos \theta _{\mathrm{q}}|\). The signal and background distributions are normalised to 500 \(\text {fb}^{-1}\), but the \(\mathrm{H} \mathrm{Z} (\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} )\) signal has been scaled by a factor of 25 to improve its visibility

The resulting relative likelihood distribution is shown in Fig. 7. Despite the fact that the signal-to-background ratio in the preselected event sample is approximately 1:25, the likelihood selection provides good separation. The statistical precision on the cross section for \(\mathrm{H} \mathrm{Z} \) production (where the \(\mathrm{Z} \) decays hadronically and the \(\mathrm{H} \) has SM branching fractions) is maximised with a likelihood cut of \(\mathcal{{L}}>0.65\). The resulting efficiencies and the expected numbers of selected events for an integrated luminosity of 500 \(\text {fb}^{-1} \) are shown in Table 3. The corresponding statistical uncertainty on the production cross section is \({\pm }1.9~\%\). The precision can be improved by extracting the number of signal events by performing a maximum-likelihood fit to the shape of the simulated likelihood distribution by varying the normalisations of the signal and background components, yielding a statistical error of

$$\begin{aligned} \varDelta {\upsigma }_{\text {vis.}} = {\pm } 1.7~\% . \end{aligned}$$
Fig. 7

The resulting likelihood distribution for the hadronic recoil mass analysis. The distributions correspond to 500 \(\text {fb}^{-1}\) of CLIC operation at \(\sqrt{s} =350~\text {GeV} \) with unpolarised electron and positron beams. The optimal likelihood cut at \(\mathcal{{L}}=0.65\) is indicated by the arrow

Table 3 Summary of the CLIC \((\mathrm{H} \rightarrow \text {vis.})(\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} )\) event selection at \(\sqrt{s} =350~\text {GeV} \), giving the cross sections, preselection efficiency, overall selection efficiency for a likelihood cut of \(\mathcal{{L}}>0.65\) and the expected numbers of events passing the event selection for an integrated luminosity of \(500~\text {fb}^{-1}\) assuming unpolarised electron and positron beams. The numbers shown for the invisible Higgs decay modes correspond to a 100 % branching ratio

3.5 Selection of HZ\(\rightarrow \)q\(\overline{\text {q}}\) with invisible Higgs decays

The main backgrounds after preselection for the invisible Higgs decay selection arise from \(\mathrm{e}^{+} \mathrm{e}^{-} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} {\ell } {\upnu } \) and \(\mathrm{e}^{+} \mathrm{e}^{-} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} {\upnu } {\overline{\upnu }} \), which are dominated respectively by single-\(\mathrm{W} \) (\(\mathrm{W} \mathrm{e}{{\upnu } _{\mathrm{e}}} \)) and \(\mathrm{Z} \mathrm{Z} \) processes. A relative likelihood selection is used to separate the \((\mathrm{H} \rightarrow \text {invis.})(\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} )\) signal from the non-Higgs background. The discriminant variables employed are the same as those used for the visible Higgs decay likelihood function, namely the two-dimensional distribution of \(m_{\mathrm{q} {\overline{\mathrm{q}}}} \) versus \(m_\mathrm {rec} \), \(|\cos \theta _{\mathrm{Z}}|\) and \(|\cos \theta _{\mathrm{q}}|\). The most powerful of these variables is the recoil mass itself, shown in Fig. 8a, where the signal is plotted for the artificial case of \(\text {BR} (\mathrm{H} \rightarrow \text {invis.}) = 100~\%\). The resulting relative likelihood distribution for signal and background is shown in Fig. 8b, where good separation between signal and background is achieved.

Fig. 8

a The reconstructed hadronic recoil mass distribution for events passing the preselection cuts in the clear two-jet topology. b The invisible Higgs decay relative likelihood distribution for signal and background. In both distributions the event rates are normalised to a CLIC integrated luminosity of \(500~\text {fb}^{-1} \) at \(\sqrt{s} =350~\text {GeV} \). The \(\mathrm{H} \mathrm{Z} \) signal is shown for the artificial case of \(\text {BR} (\mathrm{H} \rightarrow \text {invis.}) = 100~\%\). The optimal likelihood cut at \(\mathcal{{L}}=0.60\) is indicated by the arrow

In the limit where the \(\mathrm{H} \rightarrow \text {invis.}\) branching ratio is small (as expected), the expected uncertainty on the number of invisible Higgs decays selected by a particular likelihood cut is driven by the statistical fluctuations on the number of background events, \(\sqrt{B}\). In this limit, the corresponding uncertainty on the cross section for \(\mathrm{H} \mathrm{Z} \) production with \(\mathrm{H} \rightarrow \text {invis.}\) is given by

$$\begin{aligned} \varDelta {\upsigma }_{\mathrm{{invis.}}}= \frac{\sqrt{B}}{S} {\upsigma }_{\mathrm{H} \mathrm{Z}}^\mathrm{SM}, \end{aligned}$$

where S is the number of signal events that would have been selected for the case of a 100 % branching fraction for \(\mathrm{H} \rightarrow \text {invis.}\) This uncertainty is minimised for a relative likelihood cut of \(\mathcal{{L}}>0.60\), resulting in a \({\pm }0.58~\%\) statistical uncertainty on \(\upsigma _\text {invis.}\), relative to the SM cross section for \(\mathrm{e}^{+} \mathrm{e}^{-} \rightarrow \mathrm{H} \mathrm{Z} \). The corresponding selection efficiencies are shown in Table 4, where the expected background from SM \(\mathrm{H} \mathrm{Z} \) production includes the \(\mathrm{H} \rightarrow \mathrm{Z} \mathrm{Z} ^*\rightarrow {\upnu } {\overline{\upnu }} {\upnu } {\overline{\upnu }} \) component that has a SM branching fraction of 0.1 %.
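The background-dominated estimate \(\sqrt{B}/S\) is straightforward to evaluate for a given likelihood cut. The sketch below uses round placeholder event counts chosen only for illustration; they are not the values of Table 4, and the function name is hypothetical.

```python
from math import sqrt


def invisible_xsec_uncertainty(n_sig_100, n_bkg):
    """Uncertainty on sigma_invis as a fraction of the SM HZ cross section.

    n_sig_100: signal events selected if BR(H -> invis.) were 100 %.
    n_bkg:     background events passing the same likelihood cut.
    """
    return sqrt(n_bkg) / n_sig_100


# Placeholder counts, NOT taken from the paper's tables.
frac = invisible_xsec_uncertainty(n_sig_100=10000.0, n_bkg=2500.0)
```

Scanning the likelihood cut and recomputing this ratio for each value of S and B is one way to locate the cut that minimises the uncertainty.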

Table 4 Summary of the CLIC \((\mathrm{H} \rightarrow \text {invis.})(\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} )\) event selection at \(\sqrt{s} =350~\text {GeV} \), giving the raw cross sections, preselection efficiency, overall selection efficiency for a likelihood cut of \(\mathcal{{L}}>0.60\) and the expected numbers of events passing the event selection for an integrated luminosity of \(500~\text {fb}^{-1}\) and unpolarised electron and positron beams. The numbers shown for the invisible Higgs decay modes correspond to a 100 % branching ratio

A more optimal approach to extracting the signal cross section is to fit the shape of the likelihood distribution of Fig. 8b, rather than simply imposing a single likelihood cut. In the limit that the invisible branching ratio is small, the resulting Gaussian uncertainty on the \(\mathrm{H} \mathrm{Z} \) production cross section with invisible Higgs decays is

$$\begin{aligned} \frac{ \varDelta {\upsigma }_{\text {invis.}}}{{\upsigma }_{\mathrm{H} \mathrm{Z}}^\mathrm{SM}} = {\pm } 0.56~\% , \end{aligned}$$

relative to the SM \(\mathrm{e}^{+} \mathrm{e}^{-} \rightarrow \mathrm{H} \mathrm{Z} \) cross section. For the SM Higgs, the corresponding expected 95 % confidence level upper limit on the invisible Higgs branching ratio is

$$\begin{aligned} \text {BR} (\mathrm{H} \rightarrow \text {invis.}) < 0.9~\%\quad \text {at} \ 95~\% \ \text {C.L.} \end{aligned}$$
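The quoted limit is consistent with a simple Gaussian estimate. The sketch below assumes a one-sided upper limit on a signal whose expected measured value is zero, with the \({\pm }0.56~\%\) uncertainty from the fit; the limit-setting procedure is not spelled out in the text, so this is an assumption rather than a description of the method used.

```python
from statistics import NormalDist


def one_sided_upper_limit(sigma, cl=0.95):
    """Gaussian one-sided upper limit for a null expected measurement."""
    return NormalDist().inv_cdf(cl) * sigma


# Uncertainty on sigma_invis relative to the SM HZ cross section, in percent.
limit_percent = one_sided_upper_limit(0.56, cl=0.95)
```

Under these assumptions the limit is about 1.64 times the Gaussian uncertainty, which rounds to the 0.9 % quoted above.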

3.6 Model independence of the hadronic recoil mass measurement

By combining the two analyses for \(\mathrm{H} \mathrm{Z} \) production where \(\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} \) and the Higgs decays either to visible or invisible final states,

$$\begin{aligned} \upsigma (\mathrm{H} \mathrm{Z}) = \frac{\upsigma _{\text {vis.}} + \upsigma _\text {invis.}}{\text {BR} (\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} )} , \end{aligned}$$

it is possible to determine the absolute \(\mathrm{e}^{+} \mathrm{e}^{-} \rightarrow \mathrm{H} \mathrm{Z} \) cross section in a nearly model-independent manner. Since the fractional uncertainties (relative to the total cross section) on the visible and invisible cross sections are 1.7 and 0.6 % respectively, the fractional uncertainty on the total cross section will be the quadrature sum of these two fractional uncertainties, namely

$$\begin{aligned} \varDelta \upsigma (\mathrm{H} \mathrm{Z}) = {\pm }1.8~\%. \end{aligned}$$
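The combination can be verified with a short calculation. It assumes that the visible and invisible uncertainties are uncorrelated and that \(\text {BR} (\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} )\) contributes no additional uncertainty; the function name is illustrative.

```python
from math import hypot


def combined_fractional_uncertainty(d_vis, d_invis):
    """Quadrature sum of two uncorrelated fractional uncertainties."""
    return hypot(d_vis, d_invis)


# Both inputs are in percent of the total HZ cross section.
total_percent = combined_fractional_uncertainty(1.7, 0.6)
```

Because both inputs are already expressed relative to the total cross section, no further weighting by the individual cross sections is needed.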

Thus, the Higgsstrahlung cross section can be measured with a precision of better than 2 % at \(\sqrt{s} = 350~\text {GeV} \) using the hadronic recoil mass (for 500 \(\text {fb}^{-1}\) of data with unpolarised beams). Such a measurement is competitive with that obtainable from the leptonic recoil mass measurement at \(\sqrt{s} =250~\text {GeV} \), where a precision of \({\pm } 2.6~\%\) [5] is achievable with 250 \(\text {fb}^{-1}\) of data (assuming \(-80~\%\) and \(+30~\%\) polarisation of the electron and positron beams). The strongest physics argument for operating a linear collider at \(\sqrt{s} =250~\text {GeV} \) is the model-independent measurement of \(\upsigma (\mathrm{H} \mathrm{Z})\) that provides a determination of \(g_{\mathrm{H} \mathrm{Z} \mathrm{Z}} \). If it can be argued that the hadronic recoil mass measurement is effectively independent of the nature of the Higgs boson decay modes (including possible extensions to the SM), then the arguments for operating an \(\mathrm{e}^{+} \mathrm{e}^{-} \) linear collider at \(\sqrt{s} \sim 250~\text {GeV} \) are greatly reduced; almost all other measurements of the properties of the Higgs boson are found to benefit from higher centre-of-mass energies [5]. In addition, operating at \(\sqrt{s} \sim 350~\text {GeV} \) allows the study of Higgs production through the \(\mathrm{W} \mathrm{W} \)-fusion process and the pair production of top quarks. It is worth noting that the hadronic recoil mass analysis will not deliver the precise Higgs mass measurement that can be obtained from the leptonic recoil mass distribution.

The hadronic recoil mass measurement of \(\upsigma (\mathrm{H} \mathrm{Z})\) can only be truly model independent if the overall (visible \(+\) invisible) selection efficiency is independent of the Higgs decay mode. Table 5 summarises the combined selection efficiency for \(\mathrm{e}^{+} \mathrm{e}^{-} \rightarrow \mathrm{H} \mathrm{Z} (\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} )\), broken down into the different Higgs decay modes. Also shown are the efficiencies for \(\mathrm{H} \rightarrow \mathrm{W} \mathrm{W} ^*\) decays broken down into the different \(\mathrm{W} \) decay modes, covering a very wide range of event topologies, from four-jet final states \((\mathrm{q} {\overline{\mathrm{q}}} \mathrm{q} {\overline{\mathrm{q}}} )\) to final states with two relatively soft particles, for example the visible tau decay products from \(\mathrm{W} \mathrm{W} ^*\rightarrow {\uptau } {\upnu } {\uptau } {\upnu } \). For all final-state topologies, the combined (visible \(+\) invisible) selection efficiency lies between 19 and 26 % compared to the mean selection efficiency of \({\sim }23~\%\); a relative variation of \({\pm }15~\%\). It should be noted that these numbers are only indicative, since the measured cross sections are extracted from fits to the likelihood distributions, rather than from a likelihood cut.

Table 5 Summary of the efficiencies of the \(\mathrm{H} (\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} )\) analyses at \(\sqrt{s} =350~\text {GeV} \), giving the overall selection efficiency for the visible analysis (\(\mathcal{{L}}>0.65\)) and the invisible Higgs analysis (\(\mathcal{{L}}>0.60\)). Here \({\ell } \) refers to either \(\mathrm{e}\) or \({\upmu } \)
Table 6 Biases in the extracted \(\mathrm{H} (\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} )\) cross section for cases where the Higgs BR to a specific final state is increased by 5 %, i.e. \(\text {BR} (\mathrm{H} \rightarrow X) \rightarrow \text {BR} (\mathrm{H} \rightarrow X) + 0.05\)

To assess the impact of the different sensitivities to the different \(\mathrm{H} \) decay topologies, the different Higgs decay modes in the \(\mathrm{H} \mathrm{Z} \) MC samples are reweighted to correspond to modified (non-SM) branching fractions and the total (visible \(+\) invisible) cross section is extracted as before (assuming the SM Higgs branching ratios). Table 6 shows the resulting biases in the extracted total cross section for the case where \(\text {BR} (\mathrm{H} \rightarrow X) \rightarrow \text {BR} (\mathrm{H} \rightarrow X) + 0.05\). In all cases, the resulting biases in the extracted total \(\mathrm{H} \mathrm{Z} \) cross section are less than 1 %, which should be compared to the 1.8 % statistical uncertainty. These variations represent large deviations from the SM which would be observable in studies of exclusive final states. For example, for an integrated luminosity of 500 \(\text {fb}^{-1}\), a 5 % (absolute) increase in branching ratio would result in an increase of 3350 \(\mathrm{H} \mathrm{Z} \) events in that particular Higgs decay topology, including an increase of 230 events with either \(\mathrm{Z} \rightarrow {{\upmu }}^{+} {{\upmu }}^{-} \) or \(\mathrm{Z} \rightarrow \mathrm{e}^{+} \mathrm{e}^{-} \) decays. Such large effects would be observable at a linear collider either through their impact on exclusive Higgs branching ratio analyses or they would manifest themselves as large excesses of events in the \(\mathrm{Z} \rightarrow \mathrm{e}^{+} \mathrm{e}^{-} /{{\upmu }}^{+} {{\upmu }}^{-} \) recoil mass analysis event samples. It is therefore reasonable to conclude that unless very large BSM effects had been previously discovered, the hadronic recoil mass study gives an effectively model-independent measurement of the \(\mathrm{H} \mathrm{Z} (\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} )\) cross section.

4 The hadronic recoil mass measurement at the ILC

The hadronic recoil mass study presented in this paper was first performed in the context of the CLIC accelerator, where the first stage of the machine was assumed to operate at \(\sqrt{s} =350~\text {GeV} \). The study was then repeated for the ILC at \(\sqrt{s} =350~\text {GeV} \). Again a full Geant4 simulation of the detector response and a full reconstruction of the simulated events was performed. Since both studies used the same simulation and reconstruction software, only small differences in precisions on \(\upsigma (\mathrm{H} \mathrm{Z})\) from the hadronic recoil mass measurement at the ILC and CLIC are expected. There are two main effects. Firstly, because of the smaller beam spot at CLIC, the impact of beamstrahlung is greater than for the ILC, leading to a larger number of events towards lower values of \({\sqrt{s^\prime }}\) at CLIC compared to the ILC, where \({\sqrt{s^\prime }}\) is the effective centre-of-mass energy of the colliding electron and positron after the radiation of beamstrahlung photons, although the difference is not large at \(\sqrt{s} =350~\text {GeV} \). Secondly, the ILD detector concept used for the ILC studies has more complete calorimeter coverage down to low polar angles than the CLIC_ILD detector concept used for the CLIC studies. Both effects will tend to degrade the hadronic recoil mass reconstruction for the CLIC configuration compared to the ILC. However, the impact is not large, as can be seen from Fig. 9.

Fig. 9

The reconstructed hadronic recoil mass distributions for events passing the preselection cuts a for the clear two-jet topology of the invisible Higgs decay analysis and b for the visible Higgs decay analysis. In both cases the distributions compare the CLIC and ILC simulations for \(500~\text {fb}^{-1} \) at \(\sqrt{s} =350~\text {GeV} \), with unpolarised beams

Table 7 Summary of the statistical precision achievable on \(\upsigma (\mathrm{H} \mathrm{Z})\) from the hadronic recoil mass analysis at \(\sqrt{s} =350~\text {GeV} \) for CLIC and the ILC. The ILC numbers are shown for both zero and the nominal beam polarisations

Table 7 compares the statistical precision achievable at a centre-of-mass energy of \(\sqrt{s} =350~\text {GeV} \) for: 500 \(\text {fb}^{-1} \) at CLIC with unpolarised beams; 500 \(\text {fb}^{-1} \) at the ILC with unpolarised beams; and 350 \(\text {fb}^{-1} \) at the ILC with the nominal ILC beam polarisations (see footnote 1) of \(P(\mathrm{e}^-,\mathrm{e}^+) = (-0.8,+0.3)\). For the same integrated luminosity and unpolarised beams, the precision achievable at the ILC is approximately 8 % better than that at CLIC, reflecting the slightly better recoil mass resolution at the ILC seen in Fig. 9. Since the instantaneous luminosity at the ILC is expected to scale with the Lorentz boost of the colliding beams \(\gamma _{\mathrm{e}}\), the time taken to accumulate 350 \(\text {fb}^{-1} \) of data at \(\sqrt{s} =350~\text {GeV} \) is comparable to the time required for 250 \(\text {fb}^{-1} \) at \(\sqrt{s} =250~\text {GeV} \). Hence, for the nominal ILC beam polarisation of \(P(\mathrm{e}^-,\mathrm{e}^+) = (-0.8,+0.3)\), the statistical precision of \(1.8~\%\) achievable on the \(\mathrm{H} \mathrm{Z} \) cross section at \(\sqrt{s} =350~\text {GeV} \) using \(\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} \) is directly comparable to the statistical precision of 2.6 % [5, 10, 11] achievable with \(250~\text {fb}^{-1} \) of data at \(\sqrt{s} =250~\text {GeV} \) using \(\mathrm{Z} \rightarrow {\ell } ^{+} {\ell } ^{-} \) decays. This conclusion weakens the motivation for operating a future linear collider significantly below the top-pair production threshold.

5 Centre-of-mass energy dependence of the hadronic recoil mass analysis

The hadronic recoil mass analysis described above for \(\sqrt{s} =350~\text {GeV} \) was repeated for CLIC at \(\sqrt{s} = 250~\text {GeV} \) and \(\sqrt{s} = 420~\text {GeV} \). In each case a full set of SM background processes was generated using the Geant4 simulation of the CLIC_ILD detector concept. Because the complete simulation of the CLIC beam is not available for these centre-of-mass energies, the 250 \(\text {GeV} \) samples used the same \(\sqrt{s^\prime }/\sqrt{s} \) distribution as for \(\sqrt{s} = 350~\text {GeV} \), whereas the 420 \(\text {GeV} \) samples used the \(\sqrt{s^\prime }/\sqrt{s} \) distribution for the 500 \(\text {GeV}\) option for CLIC. The analysis described in Sect. 3 was repeated at each centre-of-mass energy using the appropriate distributions for the likelihood function. The binning and range used for \(m_\mathrm {rec} \) in the two-dimensional distribution of \(m_{\mathrm{q} {\overline{\mathrm{q}}}} \) versus \(m_\mathrm {rec} \) were optimised for each centre-of-mass energy. The resulting sensitivities are listed in Table 8. Compared to \(\sqrt{s} = 350~\text {GeV} \), the overall sensitivity for \(\varDelta \upsigma (\mathrm{H} \mathrm{Z})\) is worse at both \(\sqrt{s} =250~\text {GeV} \) and \(\sqrt{s} =420~\text {GeV} \), although for two different reasons (explained below).

Table 8 Summary of the statistical precision achievable on \(\upsigma (\mathrm{H} \mathrm{Z})\) from the hadronic recoil mass analysis at CLIC for \(\sqrt{s} =250~\text {GeV} \), \(\sqrt{s} =350~\text {GeV} \) and \(\sqrt{s} = 420~\text {GeV} \). In each case unpolarised beams were assumed
Fig. 10

The two-dimensional distributions of the reconstructed \(\mathrm{Z} \) mass, \(m_{\mathrm{q} {\overline{\mathrm{q}}}} \), versus the reconstructed hadronic recoil mass, \(m_\mathrm {rec} \), in the visible Higgs decay analysis, broken down into \(\mathrm{H} \mathrm{Z} (\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} )\) signal and the SM background for CLIC operating at \(\sqrt{s} = 250~\text {GeV} \), \(\sqrt{s} = 350~\text {GeV} \) and \(\sqrt{s} = 420~\text {GeV} \). All events passing the preselection cuts are included

Figure 10 shows two-dimensional distributions of \(m_{\mathrm{q} {\overline{\mathrm{q}}}} \) versus \(m_\mathrm {rec} \), broken down into signal and background for the three centre-of-mass energies considered. For all centre-of-mass energies, the most significant backgrounds are from \(\mathrm{e}^{+} \mathrm{e}^{-} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} \mathrm{q} {\overline{\mathrm{q}}} \) and \(\mathrm{e}^{+} \mathrm{e}^{-} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} {\ell } {\upnu } \). The \(\mathrm{q} {\overline{\mathrm{q}}} {\ell } {\upnu } \) background (predominantly from \(\mathrm{e}^{+} \mathrm{e}^{-} \rightarrow \mathrm{W} \mathrm{W} \)) accounts for the broad band of events on the left-hand side of the background plots. This event population is well separated from the signal region. The more significant background arises from the \(\mathrm{q} {\overline{\mathrm{q}}} \mathrm{q} {\overline{\mathrm{q}}} \) final state, populating the regions with \(m_{\mathrm{q} {\overline{\mathrm{q}}}} \sim m_{\mathrm{Z}} \). In this region, the \(\mathrm{q} {\overline{\mathrm{q}}} \mathrm{q} {\overline{\mathrm{q}}} \) background arises primarily from \(\mathrm{e}^{+} \mathrm{e}^{-} \rightarrow \mathrm{W} \mathrm{W} \), \(\mathrm{e}^{+} \mathrm{e}^{-} \rightarrow \mathrm{Z} \mathrm{Z} ^*\) and \(\mathrm{e}^{+} \mathrm{e}^{-} \rightarrow \mathrm{Z} {\upgamma } ^*\), where the “\(*\)” indicates an off-mass-shell particle; the component from \(\mathrm{e}^{+} \mathrm{e}^{-} \rightarrow \mathrm{Z} \mathrm{Z} \), where both \(\mathrm{Z} \) bosons are on-shell, is largely suppressed by the preselection cuts.
The broad recoil mass distribution for the preselected \(\mathrm{q} {\overline{\mathrm{q}}} \mathrm{q} {\overline{\mathrm{q}}} \) background is pushed towards the kinematic limit due to two main effects: (i) the pair of jets with the invariant mass closest to \(m_{\mathrm{Z}} \) is used to calculate the four-momentum of the assumed \(\mathrm{Z} \) boson; in the case of the \(\mathrm{e}^{+} \mathrm{e}^{-} \rightarrow \mathrm{W} \mathrm{W} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} \mathrm{q} {\overline{\mathrm{q}}} \) background, this can lead to pairing of two jets from different \(\mathrm{W} \)-boson decays; (ii) for events with significant ISR or beamstrahlung, the calculated recoil mass (which uses the assumed centre-of-mass energy \(\sqrt{s} \), rather than \(\sqrt{s^\prime } \)) is higher than the invariant mass of the recoiling system.

From Fig. 10 it can be clearly seen that the width of the recoil mass distribution for \(\mathrm{H} \mathrm{Z} (\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} )\) events increases with increasing centre-of-mass energy. This can be understood from the expression for the recoil mass:

$$\begin{aligned} m^2_{\text {rec}}= & {} (\sqrt{s}- E_{\mathrm{Z}})^2 -(-{\mathbf {p}}_{\mathrm{Z}})^2 \\= & {} s -2\sqrt{s} ~ E_{\mathrm{Z}} + E_{\mathrm{Z}}^2 - {\mathbf {p}}_{\mathrm{Z}}^2 \\\approx & {} s + m_{\mathrm{Z}} ^2 -2\sqrt{s} \,(E_1+E_2), \end{aligned}$$

where \(E_1\) and \(E_2\) are the energies of the two jets forming the reconstructed \(\mathrm{Z} \) boson and assuming \(E_{\mathrm{Z}}^2 - {\mathbf {p}}_{\mathrm{Z}}^2 \approx m_{\mathrm{Z}} ^2\), which is true for the signal region. Propagating the errors on the jet energy measurements, \(\upsigma _1\) and \(\upsigma _2\), implies that

$$\begin{aligned} {\upsigma }_{m_{\text {rec}}}= \frac{\sqrt{s}}{m_{\text {rec}}}\,( {\upsigma }_1^2+{\upsigma }_2^2)^{\frac{1}{2}}. \end{aligned}$$

Therefore, the recoil mass resolution is expected to worsen with increasing centre-of-mass energy due to both the \(\sqrt{s} \) dependence and the fact that the absolute uncertainty on the jet energies increases with jet energy (\(\upsigma _E \sim 0.03E\); see footnote 2) and therefore with centre-of-mass energy. The expected increased width of the recoil mass distribution accounts for the increase of \(\varDelta \upsigma _{\text {invis.}}\) with \(\sqrt{s} \), listed in Table 8, and the larger value of \(\varDelta \upsigma _{\text {vis.}}\) at \(\sqrt{s} =420\,\text {GeV} \). However, despite the better recoil mass resolution, the sensitivity to \(\varDelta \upsigma _{\text {vis.}}\) at \(\sqrt{s} =250~\text {GeV} \) is significantly worse than for the other centre-of-mass energies considered. The reason for this can be seen clearly in Fig. 10. At \(\sqrt{s} =250~\text {GeV} \), \(\mathrm{H} \mathrm{Z} \) production is not very far above threshold and the recoil mass distribution is relatively close to the kinematic limit. This is the region populated by the large \(\mathrm{q} {\overline{\mathrm{q}}} \mathrm{q} {\overline{\mathrm{q}}} \) background passing the preselection cuts, resulting in a greatly reduced separation between signal and background in the variable that provides the best distinguishing power, namely \(m_\mathrm {rec} \).
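The scaling of the recoil mass resolution with centre-of-mass energy can be illustrated by evaluating the propagated error at the three energies considered. This is an idealised sketch, not the simulated resolution: it assumes \(m_{\mathrm{Z}} = 91.19~\text {GeV} \), \(m_{\mathrm{H}} = 125~\text {GeV} \), no ISR or beamstrahlung, a \(\mathrm{Z} \) energy shared equally between the two jets, and \(\upsigma _E = 0.03E\) for each jet.

```python
from math import hypot

M_Z = 91.19  # GeV, assumed nominal Z mass
M_H = 125.0  # GeV, assumed nominal Higgs mass


def z_energy(sqrt_s):
    """Z energy in e+e- -> HZ from two-body kinematics (no radiation)."""
    return (sqrt_s**2 + M_Z**2 - M_H**2) / (2.0 * sqrt_s)


def recoil_mass_resolution(sqrt_s, jet_res=0.03):
    """sigma_mrec = sqrt(s)/m_rec * sqrt(sigma_1^2 + sigma_2^2), in GeV."""
    e_jet = z_energy(sqrt_s) / 2.0   # assumed symmetric Z -> qq decay
    sigma_jet = jet_res * e_jet      # absolute jet energy uncertainty
    return sqrt_s / M_H * hypot(sigma_jet, sigma_jet)


resolutions = {e: recoil_mass_resolution(e) for e in (250.0, 350.0, 420.0)}
```

Under these assumptions the resolution grows from roughly 5 GeV at \(\sqrt{s} =250~\text {GeV} \) to roughly 10 GeV at 350 GeV and 14 GeV at 420 GeV, reflecting both the explicit \(\sqrt{s} \) factor and the larger jet energies.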

6 Summary and conclusions

This paper presents the first detailed study of the potential of the hadronic recoil mass analysis at a future linear collider, both for visible Higgs decay modes and possible BSM invisible decay modes. By combining the analyses for visible and invisible modes, it is shown that the measured \(\mathrm{e}^{+} \mathrm{e}^{-} \rightarrow \mathrm{H} \mathrm{Z} (\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} )\) cross section does not depend strongly on the nature of the Higgs boson decay and thus provides a model-independent determination of \(g_{\mathrm{H} \mathrm{Z} \mathrm{Z}} \). The statistical precision achievable at CLIC operating at \(\sqrt{s} =350~\text {GeV} \) with \(500~\text {fb}^{-1} \) of data with unpolarised beams is \(\varDelta \upsigma _{\mathrm{H} \mathrm{Z}}\approx {\pm } 1.8~\%\). A similar precision is obtained for the ILC with \(350~\text {fb}^{-1} \) and the nominal beam polarisation of \(P(\mathrm{e}^+,\mathrm{e}^-) = (+30~\%,-80~\%)\). In both cases the branching ratio to invisible decay modes can be constrained to \(\text {BR} (\mathrm{H} \rightarrow \text {invis.}) < 1~\%\) at 90 % confidence level. It is demonstrated that \(\sqrt{s} = 350~\text {GeV} \) is likely to be close to the optimal energy for the hadronic recoil mass analysis; at lower centre-of-mass energies there is less discrimination between signal and background and at higher centre-of-mass energies the measurement is limited by the worsening recoil mass resolution.

It is often stated that operation of a future \(\mathrm{e}^{+} \mathrm{e}^{-} \) linear collider close to threshold (\(\sqrt{s} \sim 250~\text {GeV} \)) is necessary to provide an absolute measurement of the coupling between the Higgs boson and the Z boson, \(g_{\mathrm{H} \mathrm{Z} \mathrm{Z}}\). This is based on the determination of \(g_{\mathrm{H} \mathrm{Z} \mathrm{Z}}\) from the recoil mass analysis for \(\mathrm{e}^{+} \mathrm{e}^{-} \rightarrow \mathrm{H} \mathrm{Z} (\mathrm{Z} \rightarrow {\ell } ^{+} {\ell } ^{-} )\). The results presented in this paper show that, for a comparable running time, a statistically more precise measurement can be obtained from \(\mathrm{e}^{+} \mathrm{e}^{-} \rightarrow \mathrm{H} \mathrm{Z} (\mathrm{Z} \rightarrow \mathrm{q} {\overline{\mathrm{q}}} )\) events at \(\sqrt{s} =350~\text {GeV} \). This conclusion argues against initial operation of a future linear collider significantly below the top-pair production threshold.