1 Introduction

After the observation of a Standard Model-like Higgs boson [1, 2] the main physics program has shifted to measuring its precise properties. The exotic possibility of the new particle having spin one or two is already disfavored [3, 4] by the analysis of its decays into \(\gamma \gamma \) [5–8], \(WW^*\) [9–11], and \(ZZ^*\) [4, 12–21]; so is the possibility of it being a pure pseudo-scalar [22, 23], though it could still be an admixture of scalar and pseudo-scalar [24–31]. The couplings to gauge bosons and fermions have also been measured for various decay modes, and they are so far consistent with the SM predictions within uncertainties [32, 33].

With no apparent deviation from the SM so far, it is important to closely examine the channels where one has a fighting chance to encounter new physics. One such promising process is Higgs production via gluon fusion. In order to avoid unnatural fine-tuning while still obtaining a light Higgs mass, loops of new particles need to soften the UV sensitivity of the Higgs mass squared that is induced by the top loop. If these particles are charged under the \(SU(3)\) color gauge group (which they are in almost all known cases), gluons will couple to the loop. With two gluons coupled to the new physics loop and one Higgs set to its vacuum expectation value, one gets a contribution to gluon fusion, the dominant Higgs production mechanism; see e.g. [34, 35].

At the same time, top partners can lead to a modified top-Yukawa coupling. A change in the top-Yukawa affects the Higgs production cross section and can even compensate for new particles in the loop such that a SM-like inclusive cross section is obtained even though new physics is present. The reason for this is that already for the top quark the effective gluon–Higgs interaction [36, 37] obtained from the low energy theorem is a very good description [38, 39] which works even better for heavier particles. Therefore the inclusive amplitude can be expressed as the sum of two identical Feynman diagrams with the effective interaction (one from the top loop and one from the non-SM loop) which differ only by a coefficient, \(c_t\) and \(\kappa _g\), respectively. The cross section is therefore only sensitive to the absolute square of the sum of these coefficients. The effects of this in composite Higgs models were calculated in [40–44] where it is shown that the contributions to the inclusive cross section indeed cancel in minimal models.

The main idea now is to study boosted Higgs shapes above a certain \(p_{ T}\) scale. This scale should be high enough to resolve the top loop beyond the effective description, but low enough to keep the effective description of the loop of the new particle valid; see [45] for a discussion in a concrete model. In this regime the simple relation \(\sigma \propto |c_t+\kappa _g|^2\) for the inclusive cross section no longer applies; it is modified in a way that allows the two coefficients to be extracted separately when combined with the inclusive measurement. Early studies looking for New Physics in the Higgs \(p_T\) distribution in the gluon-fusion production mode include [46–48] and recent preliminary studies looking at highly boosted Higgs shapes include [45, 49–51]. An alternative approach to measure the coupling \(c_t\) in boosted \(p p \rightarrow HZ\) was presented in [52, 53]. Although there are attempts to measure \(c_t\) directly by looking into the difficult \(t\bar{t}H\) channel [54–60], it is important to explore boosted Higgs production from gluon fusion as a complementary approach.

To simplify the extraction of the small amount of high-\(p_{ T}\) signal from the background, we focus on the clean decay of a Higgs to two leptons \(\ell =e^\pm ,\mu ^\pm \) and missing transverse momentum \(\mathbf {p}\!\!/_\mathrm{T}\). For a \(125\,\text {GeV}\) Standard Model-like Higgs boson, this occurs almost entirely via \(H\rightarrow WW^*\) and \(H\rightarrow \tau \tau \); we will focus on these two channels separately as detailed in Sect. 4.

The organization of the paper is as follows. In Sect. 2 we discuss some examples of beyond the Standard Model physics which motivate this analysis. Section 3 outlines how we generated our signal and background samples. Section 4 contains our signal versus background analyses for boosted \(H\rightarrow 2\ell +\mathbf {p}\!\!/_\mathrm{T}\) in the Standard Model. Section 5 contains a discussion of the analysis and we conclude in Sect. 6.

2 New physics models

2.1 Minimal Composite Higgs Model

In the Minimal Composite Higgs Model (MCHM) [61], electroweak symmetry is broken dynamically by a strong interaction based on the coset \(SO(5)/SO(4)\). For reviews of the MCHM see [62, 63]. In this class of models, the Higgs arises as a pseudo-Nambu–Goldstone Boson (pNGB) of the symmetry breaking, which naturally explains its small mass. Fermionic resonances of the strong sector, coming in multiplets of \(SO(4)\), will contribute to the gluon-fusion loop diagram. These resonances also mix with the SM fermions and thus modify their couplings to the Higgs. Interestingly, the contributions of these resonances to the sum of the coefficients \(\kappa _g\) and \(c_t\) cancel exactly in a broad class of MCHM models and lead, up to small corrections which are negligible at the LHC [64], to [40–44]

$$\begin{aligned} c_t+\kappa _g=f_g(\xi ) \end{aligned}$$
(1)

where \(f_g\) is a function satisfying \(f_g(\xi \rightarrow 0)=1\) with \(\xi \equiv v^2/f^2\) and \(f\) is the decay constant of the non-linear sigma model. The gluon-fusion cross section is therefore independent of the mass spectrum of the fermionic resonances, and for small \(\xi \) is even SM-like. This makes it impossible to find traces of the top partner spectrum in the inclusive gluon-fusion process.

While the resonances are needed to cut off the UV-divergences of the Higgs mass and thus must not be too heavy to avoid excessive fine tuning, they should still be heavy enough to allow for an effective description of the boosted Higgs production. In [45] it was shown that as long as the mass of the lightest resonance is at least of the order of the Higgs transverse momentum, the result of the calculation in the heavy top limit lies within \(\mathcal {O}(10\,\%)\) of the full calculation. Considering that the masses of the resonances have to be heavier than \(600{-}800~\mathrm{GeV}\) depending on the representation [65–74], the effective description is well justified within the scope of the paper.

2.2 Supersymmetry

In the Minimal Supersymmetric SM (MSSM), an analogous flat direction of the inclusive cross section exists which can be resolved by looking at boosted Higgs shapes. For certain choices of the stop masses \(m_{\tilde{t}_{1}}\), \(m_{\tilde{t}_{2}}\) and \(A_t\), the effects of two contributions cancel, yielding a SM-like inclusive signal strength [75–83]. Assuming the MSSM is in the decoupling limit, and neglecting small D-term contributions, the inclusive signal strength is given by [84]

$$\begin{aligned} \frac{\Gamma (gg\rightarrow H)}{\Gamma (gg\rightarrow H)_{\mathrm{SM}}}=(1+\Delta _t)^2 \end{aligned}$$
(2)

where

$$\begin{aligned} \Delta _t\approx \frac{m_t^2}{4}\left( \frac{1}{m^2_{\tilde{t}_1}} +\frac{1}{m^2_{\tilde{t}_2}}-\frac{(A_t-\mu /\tan \,\beta )^2}{m^2_{\tilde{t}_1}m^2_{\tilde{t}_2}}\right) \end{aligned}$$
(3)

quantifies the deviation from the SM value and can vanish due to the relative minus sign. A \(125~\mathrm {GeV}\) Higgs can easily be achieved by extending the MSSM by additional D- or F-terms which should, of course, not have a major impact on the couplings of the SM-like lightest Higgs.
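As a minimal sketch of how the flat direction in Eqs. (2) and (3) arises, the snippet below evaluates \(\Delta _t\) and the inclusive signal strength numerically. The function names, the top mass value and the stop masses in the usage note are illustrative assumptions, not values taken from the text; `x_t` stands for the combination \(A_t-\mu /\tan \beta \).

```python
import math

M_TOP = 173.0  # GeV; assumed top-quark mass, for illustration only


def delta_t(m_stop1, m_stop2, x_t, m_top=M_TOP):
    """Eq. (3), with x_t = A_t - mu/tan(beta); all masses in GeV."""
    return (m_top ** 2 / 4.0) * (
        1.0 / m_stop1 ** 2
        + 1.0 / m_stop2 ** 2
        - x_t ** 2 / (m_stop1 ** 2 * m_stop2 ** 2)
    )


def inclusive_signal_strength(m_stop1, m_stop2, x_t, m_top=M_TOP):
    """Eq. (2): gluon-fusion rate relative to the SM."""
    return (1.0 + delta_t(m_stop1, m_stop2, x_t, m_top)) ** 2


def x_t_flat_direction(m_stop1, m_stop2):
    """Mixing for which Eq. (3) vanishes: x_t^2 = m_stop1^2 + m_stop2^2."""
    return math.hypot(m_stop1, m_stop2)
```

For hypothetical stop masses of 300 and 600 GeV, `delta_t(300.0, 600.0, x_t_flat_direction(300.0, 600.0))` vanishes up to rounding, so the inclusive rate looks SM-like even though light stops are present.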

Since the \(A_t\)-dependent parts of the production cross section are less sensitive to the boost of the Higgs than the \(A_t\)-independent ones, the aforementioned degeneracy gets broken in the boosted regime. Therefore the non-SM nature of the Higgs production can be revealed by looking at the boosted production. Moreover, this can make light stops [85–98] accessible which are hidden in the stealth region and challenging to extract given the similarity to the top background [99–103]. An outline showing this sensitivity and taking vacuum stability constraints into account has been presented in [45].

2.3 Effective description

It is useful to parametrize our ignorance of new physics in terms of an effective Lagrangian. Out of the 59 dimension six operators one can add to the SM [104, 105], only four can affect the Higgs production through gluon fusion [106–108]. These four operators as well as the other dimension six operators involving the Higgs are already constrained to some extent by LHC data [108–113]. We will focus on CP-conserving effects and omit the CP-violating operator containing the dual of the QCD gauge field strength. The remaining three important operators are

$$\begin{aligned}&\mathcal {O}_y=\frac{y_t}{v^2}|H|^2\bar{Q}_L \widetilde{H} t_R,\quad \mathcal {O}_H=\frac{1}{2v^2}\partial _\mu |H|^2\partial ^\mu |H|^2,\nonumber \\&\qquad \quad \text {and}\quad \mathcal {O}_g=\frac{\alpha _S}{12 \pi v^2}|H|^2 G_{\mu \nu }^a G^{a\,\mu \nu }. \end{aligned}$$
(4)

After adding them to the SM Lagrangian and extracting the terms relevant for the gluon-fusion process we obtain

$$\begin{aligned} \mathcal {L}_{\text {eff}}=-c_t \frac{m_t}{v}\bar{t}th+\kappa _g\frac{\alpha _S}{12\pi }\frac{h}{v}G^a_{\mu \nu }G^{a\,\mu \nu } + \mathcal {L}_{\text {QCD}}, \end{aligned}$$
(5)

where \(c_t=1-\text {Re}(C_y)-C_H/2\) scales the top Yukawa coupling which enters the process via the top loop and \(\kappa _g=C_g\) controls the direct gluon–Higgs interaction. The \(C_i\) are the coefficients of the corresponding operators in (4) and the coefficients are chosen such that for \(c_t=1\) and \(\kappa _g=0\) the SM Lagrangian is obtained.

The full matrix element for boosted Higgs production is then given by\(^{1}\)

$$\begin{aligned} \mathcal {M}(c_t,\kappa _g)=c_t \mathcal {M}_{\mathrm{IR}}+\kappa _g \mathcal {M}_{\mathrm{UV}} \end{aligned}$$
(6)

where \(\mathcal {M}_{\mathrm{IR}}\) is the matrix element taking the full top mass dependence into account [115] and \(\mathcal {M}_{\mathrm{UV}}\) is the one obtained from \(\mathcal {M}_{\mathrm{IR}}\) in the heavy top limit or equivalently from the tree-level diagram generated by \(\mathcal {O}_g\). From Eq. (6) we see that the differential cross section, normalized by the SM value, can be described as

$$\begin{aligned} \frac{\sigma ({p_T^\mathrm{cut}})}{\sigma ^{\mathrm{SM}}({p_{ T}^\mathrm{cut}})} = \frac{\int _{p_T^\mathrm{cut}}^{\infty } \mathrm{d} p_{ T} \mathrm{d} \Omega | c_t \mathcal{M}_{\mathrm{IR}}(m_t) + \kappa _g \mathcal{M}_{\mathrm{UV}}|^2}{\int _{p_{ T}^\mathrm{cut}}^{\infty } \mathrm{d} p_{ T}\mathrm{d} \Omega |\mathcal{M}_{\mathrm{IR}}(m_t)|^2}\nonumber \\ = (c_t + \kappa _g)^2 + \delta (p_T^\mathrm{cut}) c_t \kappa _g + \epsilon (p_T^\mathrm{cut}) \kappa _g^2, \end{aligned}$$
(7)

where

$$\begin{aligned}&\delta (p_{ T}^\mathrm{cut}) = \frac{2\int _{p_{ T}^\mathrm{cut}}^{\infty } \mathrm{d} p_{ T} \mathrm{d} \Omega { Re}(\mathcal{M}_{\mathrm{IR}}(m_t) \mathcal{M}_{\mathrm{UV}}^*)}{\int _{p_T^\mathrm{cut}}^{\infty }\mathrm{d} p_{ T} \mathrm{d} \Omega |\mathcal{M}_{\mathrm{IR}}(m_t)|^2} - 2,\end{aligned}$$
(8)
$$\begin{aligned}&\epsilon (p_{ T}^\mathrm{cut}) = \frac{\int _{p_{ T}^\mathrm{cut}}^{\infty } \mathrm{d} p_{ T}\mathrm{d} \Omega |\mathcal{M}_{\mathrm{UV}}|^2}{\int _{p_{ T}^\mathrm{cut}}^{\infty } \mathrm{d} p_{ T} \mathrm{d} \Omega |\mathcal{M}_{\mathrm{IR}}(m_t)|^2} - 1. \end{aligned}$$
(9)

For small \(p_{ T}^\mathrm{cut}\), the coefficients \(\delta , \epsilon \) are very small, modifying the cross section only by a few percent, which is less than the uncertainty expected in the inclusive Higgs cross section measurements [116–118]. This is expected, since the effective interaction describes both the top loop and the new-particle loop very well in this regime. On the other hand, \(\delta , \epsilon \) grow significantly as \(p_{ T}^\mathrm{cut}\) increases, and they become \(\mathcal{O}(1)\) for \(p_{ T}^\mathrm{cut}> 300\) GeV [45]. Measuring the Higgs \(p_{ T}\) distribution can therefore break the degeneracy along the \(c_t + \kappa _g = \mathrm{const.}\) direction, which the inclusive cross section alone cannot resolve.
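The algebra behind Eqs. (7)–(9) can be checked numerically. The sketch below replaces the phase-space integrals by sums over sampled points and uses random complex numbers as stand-ins for \(\mathcal{M}_{\mathrm{IR}}\) and \(\mathcal{M}_{\mathrm{UV}}\); these toy amplitudes and all function names are assumptions made purely for illustration and carry no physics.

```python
import random


def coefficients(m_ir, m_uv):
    """delta and epsilon of Eqs. (8)-(9), integrals replaced by sums."""
    norm = sum(abs(a) ** 2 for a in m_ir)
    interference = sum((a * b.conjugate()).real for a, b in zip(m_ir, m_uv))
    delta = 2.0 * interference / norm - 2.0
    epsilon = sum(abs(b) ** 2 for b in m_uv) / norm - 1.0
    return delta, epsilon


def ratio_direct(c_t, kappa_g, m_ir, m_uv):
    """First line of Eq. (7): |c_t M_IR + kappa_g M_UV|^2 over |M_IR|^2."""
    num = sum(abs(c_t * a + kappa_g * b) ** 2 for a, b in zip(m_ir, m_uv))
    return num / sum(abs(a) ** 2 for a in m_ir)


def ratio_expanded(c_t, kappa_g, delta, epsilon):
    """Second line of Eq. (7)."""
    return (c_t + kappa_g) ** 2 + delta * c_t * kappa_g + epsilon * kappa_g ** 2


# Toy amplitudes: random complex numbers, NOT physical matrix elements.
rng = random.Random(0)
m_ir = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(1000)]
m_uv = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(1000)]

delta, epsilon = coefficients(m_ir, m_uv)
difference = ratio_direct(0.7, 0.3, m_ir, m_uv) - ratio_expanded(0.7, 0.3, delta, epsilon)
print(abs(difference) < 1e-9)
```

If the two amplitudes coincide at every sampled point, which mimics the low-\(p_T\) limit where the effective description is accurate, both coefficients vanish and only \((c_t+\kappa _g)^2\) survives.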

3 Event generation

3.1 Signal sample

In this paper we consider \(H+\)jet events with subsequent \(H\) decays to \(WW^* \rightarrow \ell ^+\ell ^- \nu \bar{\nu }\) and \(\tau ^+\tau ^-\) modes as a signal. The signal events are generated with MadGraph5, version 1.5.15 [119] and showered with HERWIG++ [120–122], where only \(WW^*\) and \(\tau ^+\tau ^-\) decays are specified.

We have used MadGraph5 to generate \(H+\)jet events using the ‘HEFT’ model with SM couplings which makes use of the low energy theorem. The generated cross section is proportional to \(|\mathcal {M}(0,1)|^2\) and does not take into account finite top mass effects which are crucial to our analysis. To obtain the correct weight of the events we reweighted them by a weight factor

$$\begin{aligned} w(c_t,\kappa _g)=\frac{|\mathcal {M}(c_t,\kappa _g)|^2}{|\mathcal {M}(0,1)|^2} \end{aligned}$$
(10)

making use of our own code, which is based on an implementation of the formulas for the matrix elements given in [115] and also calculated in [123]. At present no finite top mass NLO computation of the SM Higgs \(p_{\mathrm{T}}\) spectrum is available. An exact NLO prediction of the SM Higgs \(p_{\mathrm{T}}\) spectrum would be very desirable and help to exploit the full potential of this observable. Recent progress in the precision prediction of \(h+\mathrm {jet}\) can be found in Refs. [124–126]. We will approximate the NNLO (+ NNLL) result of 49.85 pb [127–130] by multiplying the exact LO result with a \(K\) factor of 1.71.
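The per-event reweighting of Eq. (10) can be sketched as follows, using Eq. (6) and the fact that \(\mathcal {M}(0,1)=\mathcal {M}_{\mathrm{UV}}\). This is an illustration under simplifying assumptions, not the code used for the analysis: each event is assumed to carry a single complex value for each amplitude, whereas a real implementation would sum squared amplitudes over helicities and colours. The function name is hypothetical.

```python
def event_weight(c_t, kappa_g, m_ir, m_uv):
    """Eq. (10) for one event, with M(c_t, kappa_g) taken from Eq. (6).

    m_ir: amplitude with full top-mass dependence (complex number)
    m_uv: amplitude in the heavy-top limit (complex number), equal to M(0, 1)
    """
    return abs(c_t * m_ir + kappa_g * m_uv) ** 2 / abs(m_uv) ** 2
```

By construction `event_weight(0.0, 1.0, m_ir, m_uv)` returns 1 for any amplitudes, reproducing the unweighted HEFT sample, while `event_weight(1.0, 0.0, m_ir, m_uv)` gives the SM weight.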

We reweight the events for points along the line \(c_t + \kappa _g = 1\) for \(\kappa _g\in [ -0.5, 0.5]\) with steps of 0.1, as shown in the left panel of Fig. 1. This is consistent with the SM inclusive Higgs production cross section. The size of \(c_t\) alone is only weakly constrained by the current \(t\bar{t}H\) measurement. Although we only consider the most difficult points satisfying \(c_t + \kappa _g = 1\) (i.e. an exactly SM-like inclusive cross section), an analysis along different \(c_t + \kappa _g = \mathrm{const.}\) lines would be straightforward as a different choice essentially just corresponds to an overall rescaling of the signal.
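The scan described above is small enough to enumerate explicitly. The helper below is a hypothetical illustration that lists the \((c_t,\kappa _g)\) points on the line \(c_t+\kappa _g=1\); rounding is used only to avoid floating-point noise in the step arithmetic.

```python
def scan_points(kappa_min=-0.5, kappa_max=0.5, step=0.1, total=1.0):
    """Return (c_t, kappa_g) pairs with c_t + kappa_g = total."""
    n_steps = round((kappa_max - kappa_min) / step)
    points = []
    for i in range(n_steps + 1):
        kappa_g = round(kappa_min + i * step, 10)
        points.append((round(total - kappa_g, 10), kappa_g))
    return points
```

With the default arguments this yields eleven points, including the SM point \((1, 0)\) and the point \((0.7, 0.3)\) used as an example in the text.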

Fig. 1 Left panel: model points generated for this analysis in the \((c_t, \kappa _g)\) plane. The shaded area shows the parameter space which gives an inclusive cross section consistent with the SM prediction within 20 %. Right panel: parton-level \(p_{T,H}\) distributions for the SM, and \((c_t,\kappa _g)=(1 -\kappa _g, \kappa _g)\) with \(\kappa _g =\pm 0.1,\pm 0.3,\pm 0.5\)

The right panel of Fig. 1 shows the \(p_{T,H}\) distributions for several model points. In the region with low \(p_{T,H}\), the distributions are degenerate, but for high \(p_{T,H}\) the distributions start to split. For the model points with \(\kappa _g>0\) we see an enhancement in the high \(p_{T,H}\) region, while we see a suppression for the model points with \(\kappa _g<0\). Table 1 shows the Higgs production cross sections relative to the SM value for several model points \((c_t,\kappa _g)\) and \(p_{T,H}\) cuts. As one can see, for \(p_{T,H}>10\,\text {GeV}\) the cross sections agree with the SM value within 3 %, while for increasing \(p_T^\mathrm{cut}\), significant differences from the SM predictions can be observed. For the model point \((c_t,\kappa _g)=(0.7,0.3)\), for example, a 6 % difference would be observed for \(\sigma (p_{T,H} > 200~\mathrm {GeV})\), and a \(\sim \) 20 % difference for \(\sigma (p_{T,H} > 300~\mathrm {GeV})\). We will see that these effects are comparable to the sensitivity of the boosted Higgs shape measurements; see Sect. 4. For very hard cuts, \(\mathcal{O}(1)\) differences can be observed, as can be seen from the cross section ratios for \(p_{T,H} > 500\) GeV and harder.

Table 1 Cross sections normalized to the SM value after applying several \(p_{T,H}\) cuts at parton level for several model points \((c_t,\kappa _g)\)

3.2 Background sample

We include \(WW\)+jets, \(Z\)+jets and \(t\bar{t}\)+jets as background processes, which we have generated with ALPGEN + PYTHIA [131, 132]. Since we consider boosted Higgs reconstruction and since we will require the existence of one hard recoil jet, we apply a pre-selection cut in the generation step, where we demand at least one recoil parton of \(p_{T}>150\) GeV. We merge up to two partons for \(WW\)+jets and \(Z\)+jets, and up to one parton for \(t\bar{t}\)+jets using the MLM matching scheme [133, 134]. As we only consider the dilepton mode in this paper, we preselect the \(W\) decay mode so that every \(W\), including those from top decays, decays only to leptons (\(e\), \(\mu \), and \(\tau \)). For the \(Z\) decay, we consider only \(Z \rightarrow \tau ^+\tau ^-\) since for the other leptonic decay modes we can reconstruct the \(Z\)-peak and reject them. We rescale the \(t\bar{t}\) sample to obtain a NLO inclusive cross section of 918 pb [135–137]. For the \(Z+\)jets and \(WW+\)jets samples we used LO cross sections.

Our analysis is performed at particle level with a simple detector simulation with a granularity of \(\Delta \eta \times \Delta \phi =0.1 \times 0.1\). After removing the isolated leptons, the energies of the remaining visible particles falling into each cell are summed up. Cells with transverse energy above \(0.5~\mathrm {GeV}\) are used for the further jet reconstruction.
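A minimal sketch of such a cell-based detector simulation is given below. The particle layout `(energy, eta, phi)`, the function name, and the choice to compute the cell transverse energy as \(E/\cosh \eta \) at the cell centre are assumptions for illustration; the wrap-around of the last \(\phi \) bin is ignored for brevity.

```python
import math
from collections import defaultdict

CELL_SIZE = 0.1  # Delta eta x Delta phi granularity
ET_MIN = 0.5     # GeV, threshold on the cell transverse energy


def calorimeter_cells(particles, cell=CELL_SIZE, et_min=ET_MIN):
    """Sum particle energies per (eta, phi) cell and apply the E_T threshold.

    particles: iterable of (energy, eta, phi) for visible, non-isolated-lepton
    particles. Returns a list of (et, eta_centre, phi_centre) tuples.
    """
    towers = defaultdict(float)
    for energy, eta, phi in particles:
        key = (math.floor(eta / cell), math.floor(phi / cell))
        towers[key] += energy

    cells = []
    for (i_eta, i_phi), energy in towers.items():
        eta_c = (i_eta + 0.5) * cell
        phi_c = (i_phi + 0.5) * cell
        et = energy / math.cosh(eta_c)  # assumed massless cell at its centre
        if et > et_min:
            cells.append((et, eta_c, phi_c))
    return cells
```

The surviving cells would then be passed to the jet algorithm as massless pseudo-particles.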

Jet clustering was performed using FastJet [138], version 3.0.4. We use the Cambridge–Aachen (C/A) algorithm [139, 140] with \(R=0.5\) to define both normal jets and the jets used for \(b\)-tagging. We also define ‘fat’ jets, as explained later, using the C/A algorithm with \(R=1.5\).

In this paper, we only consider the events with isolated leptons for simplicity. There is room for improving the analysis with hadronic tau modes with tau tagging for example [141, 142], which is, however, beyond the scope of our current study.

4 Boosted \(H\rightarrow 2\ell +\mathbf {p}\!\!/_\mathrm{T}\) in the standard model

In our notation a subscript \(\ell \) will denote leptonically decaying: \(\tau _\ell \) thus represents \(\tau \rightarrow \ell +2\nu \), \(W_\ell \) is mostly \(W\rightarrow \ell \nu \) with some \(W\rightarrow \tau _\ell \nu _\tau \), and \(t_\ell \) is \(t\rightarrow bW_\ell \). The decay \(H\rightarrow 2\ell +\mathbf {p}\!\!/_\mathrm{T}\) is mostly\(^{2}\) through \(H\rightarrow W_\ell W_\ell ^*\) and \(H\rightarrow \tau _\ell \tau _\ell \). As noted in [143, 144], in the decay \(H\rightarrow WW^*\rightarrow 2\ell +2\nu \) spin correlations ensure that the two lepton momenta have similar directions, as do the two neutrino momenta. In \(H\rightarrow \tau _\ell \tau _\ell \), however, the two \(\tau \) leptons are back-to-back in the Higgs rest frame, and each of them gives rise to a highly collimated \(\ell +2\nu \) trio. These two facts imply that for a boosted \(H\rightarrow 2\ell +\mathbf {p}\!\!/_\mathrm{T}\) decay, the \(\mathbf {p}\!\!/_\mathrm{T}\) is typically outside the lepton pair for the \(H\rightarrow W_\ell W_\ell ^*\) contribution and inside the lepton pair for \(H\rightarrow \tau _\ell \tau _\ell \), as shown in Table 2. We use this binary criterion (\(\mathbf {p}\!\!/_\mathrm{T}\) inside or outside the leptons) to split our analysis into two sub-analyses, which differ in their background compositions as well as signals.

Table 2 Showing the difference in the relative positioning of the neutrinos/\(\mathbf {p}\!\!/_\mathrm{T}\) and the dilepton system between \(H\rightarrow W_\ell W_\ell ^*\) and \(H\rightarrow \tau _\ell \tau _\ell \) decays

4.1 Common cuts for \(H\rightarrow \tau _\ell \tau _\ell \) and \(H\rightarrow W_\ell W_\ell ^*\)

In both of our sub-analyses the cuts begin by requiring the following:

  • Two opposite-sign isolated leptons each having \(p_{T} > 10\) GeV and \(|\eta |<2.5\). If a third isolated lepton with \(p_{T} > 5\) GeV and \(|\eta |<2.5\) is present, the event is vetoed. Our isolation criterion is \(E_{T,\mathrm{had}}^{R<0.2}/E_{T,\ell } < 0.1\), where \(E_{T, \mathrm{had}}^{R<0.2}\) is the sum of transverse energies over all hadronic activity in the cone \(\Delta R<0.2\) around the lepton. (The signal leptons are typically hard, so our \(p_T\) threshold could be raised with minimal loss of efficiency.)

  • A dilepton mass \(m_{\ell \ell }\) exceeding 20 GeV, which is necessary in practice to suppress Drell–Yan dilepton production (not simulated here).

  • At least 200 GeV of transverse momentum for the system obtained by vectorially summing the dilepton and missing transverse momenta:

    $$\begin{aligned} |{\mathbf p}_{T,H}| \equiv |\mathbf {p}\!\!/_\mathrm{T} + {\mathbf p}_{T,\ell _1} + {\mathbf p}_{T,\ell _2}| > 200~\mathrm{GeV}. \end{aligned}$$
    (11)

    The system thus defined has a transverse (but not longitudinal) momentum coinciding with the Higgs in the case of the signal: herein lies our restricted focus on highly energetic/boosted Higgs bosons.

  • One ‘fat’ jet, resulting from clustering using the C/A algorithm with a distance parameter \(R_\mathrm{jet}= 1.5\). This jet should be very hard:

    $$\begin{aligned} p_{T,j}> 200~\mathrm{GeV}. \end{aligned}$$
    (12)

    The presence of a very hard jet coincides with our parton-level picture of the signal process: a boosted Higgs recoiling against a gluon/quark. Defining geometrically large ‘fat’ jets allows us to capture the radiation emitted by this gluon/quark (which might otherwise be clustered into a separate jet when clustering with traditional ‘skinny’ jets). We veto the event if there is a second fat jet with \(p_{T,j}>100\) GeV. Vetoing on additional hadronic activity beyond the first hard fat jet suppresses higher-multiplicity backgrounds, i.e. \(t\bar{t}+\)jets. Not vetoing additional fat jets approximately doubles the \(t \bar{t}\) background, while the signal increases by roughly \(30 \%\). These numbers hold even if regular jets with a cone size of \(R=0.4\) are vetoed instead. When vetoing jets, large logarithms \(\sim \ln ^2 (\sqrt{\hat{s}}/p_{T,\mathrm {veto}})\) can be induced which need to be resummed [145, 146]. However, due to the high veto scale we do not expect these contributions to spoil the reliability of our analysis. As an alternative to jet vetoes, 2-jet observables can be used to disentangle signal from background in this process [147, 148].

  • Zero \(b\)-tags. This considerably reduces the (until now dominant) \(t\bar{t}+\)jets background while having negligible effect on the signal. We re-cluster the hadronic activity into jets, again using the C/A algorithm but now with \(R_\mathrm{jet}= 0.5\), to use for the \(b\)-tagging. We assume a flat 70 % (1 %) efficiency for \(b\) (light quark or gluon) initiated jets, i.e. a 30 % (99 %) probability for such a jet not to provoke the veto. We only consider \(b\)-jets of \(p_{T,b}>30\) GeV and \(|\eta _b| < 2.5\).

The efficiencies of these cuts for the signal and various backgrounds are shown in the first part of Table 3. At this stage the backgrounds from \(WW\)/\(Z\)/\(t\bar{t}\)+jets are seen to contribute at similar levels. The set of cuts described so far is common to both our \(H \rightarrow \tau _\ell \tau _\ell \) and \(H \rightarrow W_\ell W_\ell ^*\) analyses; from this point onwards they diverge.
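The common selection can be summarized as a single predicate on an event record. The sketch below is an illustration only: the event layout (dictionary keys, already-isolated leptons, a precomputed \(b\)-tag count) and the function names are hypothetical, and leptons are treated as massless when computing \(m_{\ell \ell }\).

```python
import math


def dilepton_mass(l1, l2):
    """Invariant mass of two massless leptons from pt, eta, phi."""
    d_eta = l1["eta"] - l2["eta"]
    d_phi = l1["phi"] - l2["phi"]
    m2 = 2.0 * l1["pt"] * l2["pt"] * (math.cosh(d_eta) - math.cos(d_phi))
    return math.sqrt(max(m2, 0.0))


def passes_common_cuts(event):
    """Common cuts of Sect. 4.1 applied to a toy event record.

    Assumed keys: 'leptons' (isolated leptons with pt, eta, phi, charge),
    'met' (px, py), 'fat_jet_pts' (pT of C/A R=1.5 jets),
    'n_btags' (b-tagged R=0.5 jets with pT > 30 GeV, |eta| < 2.5).
    """
    central = [l for l in event["leptons"] if abs(l["eta"]) < 2.5]
    signal = [l for l in central if l["pt"] > 10.0]
    # Exactly two signal leptons; a third lepton above 5 GeV vetoes the event.
    if len(signal) != 2 or sum(l["pt"] > 5.0 for l in central) > 2:
        return False
    l1, l2 = signal
    if l1["charge"] * l2["charge"] >= 0:
        return False
    if dilepton_mass(l1, l2) <= 20.0:
        return False
    # Eq. (11): transverse momentum of the dilepton + missing-momentum system.
    px = event["met"][0] + sum(l["pt"] * math.cos(l["phi"]) for l in signal)
    py = event["met"][1] + sum(l["pt"] * math.sin(l["phi"]) for l in signal)
    if math.hypot(px, py) <= 200.0:
        return False
    # Eq. (12) plus the veto on a second fat jet above 100 GeV.
    fat = sorted(event["fat_jet_pts"], reverse=True)
    if not fat or fat[0] <= 200.0 or (len(fat) > 1 and fat[1] > 100.0):
        return False
    return event["n_btags"] == 0
```

Writing the cuts as one function keeps the order of the selection explicit, which makes it straightforward to turn into a cut flow by returning the index of the first failed requirement instead of a boolean.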

Table 3 Cut efficiencies for our analysis aimed at \(H \rightarrow \tau _\ell \tau _\ell \). The values for each process are cross sections in fb. \(S/\sqrt{B}\) has been calculated for \({300~\mathrm{fb}^{-1}}\). The \(W\) bosons in our \(WW\)/\(t\bar{t}\)+jets backgrounds were forced to decay to \(e\), \(\mu \), or \(\tau \)

4.2 \(H \rightarrow \tau _\ell \tau _\ell \) analysis

The Higgs mass in the decay \(H\rightarrow \tau \tau \) can be reconstructed using the collinear approximation [123]. The large hierarchy between the Higgs and tau masses ensures a very large boost for the taus, highly collimating their visible and invisible decay products. We can approximate the neutrino momenta by a decomposition of the missing transverse momentum, which assumes that each invisible momentum is parallel to the corresponding visible momentum. (This procedure can be extended to decays of more than one particle; see [149].) As was noted in [123], and further explored in [150], this procedure gains sensitivity with increasing transverse momentum of the Higgs, i.e. when the Higgs recoils against a hard jet. It suffers for a low-\(p_T\) Higgs because the two \(\tau \) daughters are then nearly back-to-back, providing a poor basis for the \(\mathbf {p}\!\!/_\mathrm{T}\) decomposition. For our high-\(p_T\) Higgs study the mass reconstruction of the signal in this manner is very good and provides a sharp peak.\(^{3}\)

In more detail, the Higgs mass in \(H \rightarrow \tau _\ell \tau _\ell \) is reconstructed via the collinear approximation as follows. We require the missing transverse momentum \(\mathbf {p}\!\!/_\mathrm{T}\) to be inside the two leptons (more precisely, projecting the two lepton momenta into the transverse plane defines two segments; ‘inside’ the leptons means inside the smaller segment). We decompose \(\mathbf {p}\!\!/_\mathrm{T}\) as a linear combination of the two lepton momenta (defining for it a longitudinal component in the process):

$$\begin{aligned}&\mathbf {p}\!\!/_\mathrm{T} = {\mathbf p}_{T,\nu _1,\mathrm{col}} + {\mathbf p}_{T,\nu _2,\mathrm{col}}: \qquad {\mathbf p}_{\nu _1,\mathrm{col}} = \alpha _1 {\mathbf p}_{\ell _1}, \ \ \ \nonumber \\&\qquad \quad {\mathbf p}_{\nu _2,\mathrm{col}} = \alpha _2 {\mathbf p}_{\ell _2}. \end{aligned}$$
(13)

The requirement that \(\mathbf {p}\!\!/_\mathrm{T}\) be inside the leptons is equivalent to demanding that the decomposition coefficients are both positive:

$$\begin{aligned} \alpha _1 > 0 \qquad \mathrm{and} \qquad \alpha _2 > 0. \end{aligned}$$
(14)

\({\mathbf p}_{\nu _1,\mathrm{col}}\) and \({\mathbf p}_{\nu _2,\mathrm{col}}\) thus defined approximate the neutrino three-momenta. Promoting them to massless four-momenta and adding them to the lepton four-momenta gives an approximate Higgs four-momentum, the mass of which we refer to as the collinear Higgs mass:

$$\begin{aligned} p_\mathrm{col}=p_{\nu _1,\mathrm{col}} + p_{\nu _2,\mathrm{col}} + {p}_{\ell _1} + {p}_{\ell _2}, \quad M_\mathrm{col}^2= p_\mathrm{col}^2. \end{aligned}$$
(15)
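The decomposition in Eqs. (13)–(15) amounts to solving a \(2\times 2\) linear system in the transverse plane. The following sketch shows one way to do this; the four-vector layout `(E, px, py, pz)` and the function name are assumptions, and no detector effects are modelled.

```python
import math


def collinear_mass(lep1, lep2, met):
    """Collinear approximation of Eqs. (13)-(15).

    lep1, lep2: lepton four-momenta (E, px, py, pz) in GeV
    met: missing transverse momentum (px, py) in GeV
    Returns (alpha1, alpha2, M_col), or None if the leptons are parallel
    in the transverse plane and the decomposition is undefined.
    """
    det = lep1[1] * lep2[2] - lep1[2] * lep2[1]
    if abs(det) < 1e-12:
        return None
    # Cramer's rule for met = alpha1 * pT(l1) + alpha2 * pT(l2).
    alpha1 = (met[0] * lep2[2] - met[1] * lep2[1]) / det
    alpha2 = (lep1[1] * met[1] - lep1[2] * met[0]) / det

    # Massless neutrino four-momenta parallel to the lepton three-momenta.
    energy = lep1[0] + lep2[0]
    momentum = [0.0, 0.0, 0.0]
    for alpha, lep in ((alpha1, lep1), (alpha2, lep2)):
        p_abs = math.sqrt(lep[1] ** 2 + lep[2] ** 2 + lep[3] ** 2)
        energy += abs(alpha) * p_abs
        for i in range(3):
            momentum[i] += (1.0 + alpha) * lep[i + 1]
    m2 = energy ** 2 - sum(p ** 2 for p in momentum)
    return alpha1, alpha2, math.sqrt(max(m2, 0.0))
```

The requirement of Eq. (14) then reads `alpha1 > 0 and alpha2 > 0` on the returned coefficients.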

We apply one more cut before making use of the collinear mass variable: an upper limit for the dilepton mass, \(m_{\ell \ell } < 70\) GeV. This cut reduces the \(t\bar{t}+\)jets and \(WW+\)jets backgrounds very efficiently while retaining most of the \(H+\)jets signal and the \(Z\rightarrow \tau \tau \) background (see Fig. 2, left panel). At this stage \(Z\rightarrow \tau \tau \) becomes the dominant background for extracting the \(H \rightarrow \tau \tau \) signal. The size of the \(t\bar{t}\) and \(WW\) backgrounds can be estimated in a data-driven way by removing the \(m_{\ell \ell } < 70\) GeV cut. We discuss this in detail in the appendix.

Fig. 2 Left panel: the invariant mass of the two leptons, \(m_{\ell \ell }\), after cut 6. Central panel: the collinear mass \(M_\mathrm{col}\) after cut 7, stacking the different processes. Histograms are normalized to the respective cross sections. Right panel: stacked distributions of the ‘Higgs’ transverse momentum \(p_{T,H}\) (defined in Eq. 11) after selection cut 8, with a logarithmic scale

The collinear mass is shown in the central panel of Fig. 2. Note that any particle decaying to \(\tau _\ell \tau _\ell \) with enough boost that the two \(\tau \) are not back-to-back will have its mass reconstructed by this process; indeed the most striking feature of the collinear mass distribution is the \(Z\) mass peak from the large irreducible \(Z \rightarrow \tau _\ell \tau _\ell \) background. A peak due to the signal is visible at \(M_\mathrm{col}\sim m_H=125\) GeV. By selecting events in the window \(|M_\mathrm{col} - m_H| < 10\) GeV we achieve \(S/B \sim 0.4\) with \(S/\sqrt{B} > 9\) for 300 fb\(^{-1}\). The signal is taken to include the \(H \rightarrow W W^*\) contribution, which makes up about 10 % of the \(H \rightarrow \tau \tau \) selection. We estimate the statistical error of the high-\(p_T\) cross section measurement as \(\sqrt{S+B}/S\). We obtain uncertainties of 12 % for \(\sigma (p_{T,H}>200~\mathrm{GeV})\), 22 % for \(\sigma (p_{T,H}>300~\mathrm{GeV})\), and 41 % for \(\sigma (p_{T,H}>400~\mathrm{GeV})\), respectively. Assuming we can achieve the same efficiencies for the high-luminosity run of the LHC (HL-LHC) at 3 ab\(^{-1}\), we obtain \(\sim 4\) % for \(\sigma (p_{T,H}>200~\mathrm{GeV})\), \(\sim 7\) % for \(\sigma (p_{T,H}>300~\mathrm{GeV})\), and \(\sim 13\) % for \(\sigma (p_{T,H}>400~\mathrm{GeV})\).
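The HL-LHC numbers follow from the 300 fb\(^{-1}\) ones by the usual \(1/\sqrt{L}\) scaling of \(\sqrt{S+B}/S\) at fixed efficiencies. The short sketch below makes this explicit; the function names are hypothetical, and the cross sections passed to `relative_stat_error` would be the post-selection values in fb.

```python
import math


def relative_stat_error(signal_fb, background_fb, lumi_fb_inv):
    """sqrt(S + B) / S for given cross sections (fb) and luminosity (fb^-1)."""
    s = signal_fb * lumi_fb_inv
    b = background_fb * lumi_fb_inv
    return math.sqrt(s + b) / s


def rescale_to_luminosity(rel_error, lumi_old, lumi_new):
    """Scale a relative statistical error, assuming unchanged efficiencies."""
    return rel_error * math.sqrt(lumi_old / lumi_new)


# Uncertainties quoted in the text at 300 fb^-1, projected to 3 ab^-1.
for pt_cut, err_300 in ((200, 0.12), (300, 0.22), (400, 0.41)):
    err_3000 = rescale_to_luminosity(err_300, 300.0, 3000.0)
    print(f"pT > {pt_cut} GeV: {100 * err_3000:.1f} %")
```

The projected values round to the 4 %, 7 % and 13 % quoted above.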

As seen in the central panel of Fig. 2 the smooth side-band distribution can be used for estimating the background contribution. We show in the appendix that these side-bands are available even after hard \(p_{T,H}^\mathrm{rec}\) cuts. We therefore expect that a data-driven strategy for background estimation will be available, and take the statistical errors as a background uncertainty estimate. There will of course be further systematic uncertainties induced by MC background modeling.

In this analysis we mostly use the recoiling fat jet to remove the \(t\bar{t}+\)jets background. It could be beneficial to make use of the difference between the jet substructure of gluon and quark jets [152–156] since the dominant background at the last stage is \(Z+\)jets, which gives a different fraction of gluon and quark jets than the \(H+\)jets signal. We leave this for future work.

4.3 \(H \rightarrow W_\ell W_\ell ^*\) analysis

Our selection criteria for extracting \(H \rightarrow W_\ell W_\ell ^*\) from the background begin with those described in Sect. 4.1. In Sect. 4.2 we required that the \(\mathbf {p}\!\!/_\mathrm{T}\) vector be inside the two lepton momenta, after which the signal was dominated by \(H \rightarrow \tau _\ell \tau _\ell \) and the background by \(Z_{\rightarrow \tau _\ell \tau _\ell }\)+jets. Here we will remove most of the contribution of these processes by requiring that \(\mathbf {p}\!\!/_\mathrm{T}\) be outside the two lepton momenta. This is equivalent to demanding that the \(m_{T2}^{\ell \ell }\) variable [157] be greater than zero, as \(m_{T2}^{\ell \ell }=0\) when this is not satisfied—the ‘trivial zero’ [158]. In fact we go further and impose

$$\begin{aligned} m_{T2}^{\ell \ell } > 10~\mathrm{GeV}. \end{aligned}$$
(16)

This rejects essentially all of the contributions from \(H \rightarrow \tau \tau \) and \(Z\rightarrow \tau \tau \)+jets, which have the same end point close to \(m_\tau \). Allowing for endpoint smearing we cut a little harder at \(10\) GeV instead of \(m_\tau \).

We are now left with \(H \rightarrow W_\ell W_\ell ^*\) as our signal process, competing with the \(WW\)/\(t\bar{t}\)+jets backgrounds. Their kinematics unfortunately allow for little discriminating power: all of them contain two leptonic \(W\) bosons, with no possibility of mass reconstruction. Luckily, the transverse mass provides some discrimination. As shown in [159], the transverse mass variable satisfying \(m_{T,\ell \ell } \le m_H\) that gives the greatest lower bound on the Higgs mass in its decay to \(W_\ell W_\ell ^*\) is

$$\begin{aligned} m_{T,\ell \ell }^2 = m_{\ell \ell }^2 + 2(E_{T,\ell \ell }/\!\!{E}_T - \mathbf {p}_{T,\ell \ell }\cdot \mathbf {p}\!\!/_\mathrm{T}), \end{aligned}$$
(17)

where \(E_{T,\ell \ell } = (m_{\ell \ell }^2 + p_{T,\ell \ell }^2)^{1/2}\) is the transverse energy of the dilepton system, and \(/\!\!{E}_T=|\mathbf {p}\!\!/_\mathrm{T}|\) is the missing transverse energy. We adopt this definition of \(m_{T,\ell \ell }\), also used by the ATLAS Collaboration [160].\(^{4}\) The end point at \(m_H\) for the transverse mass of the signal is shown in the left panel of Fig. 3, where all the selection cuts up to step \(6'\) in Table 4 have been applied. We therefore impose

$$\begin{aligned} m_{T,\ell \ell } < m_H = 125~\mathrm{GeV}. \end{aligned}$$
(18)
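As an illustration of Eqs. (17) and (18), the following minimal sketch evaluates \(m_{T,\ell \ell }\) from two lepton four-momenta and the missing transverse momentum. The function names and the `(E, px, py, pz)` tuple layout are assumptions made for this example, and the numerical inputs are hypothetical.

```python
import math


def transverse_mass_ll(l1, l2, met):
    """Evaluate m_{T,ll} as defined in Eq. (17).

    l1, l2: lepton four-momenta (E, px, py, pz) in GeV.
    met:    missing transverse momentum (px, py) in GeV.
    """
    # Four-momentum of the dilepton system.
    e = l1[0] + l2[0]
    px = l1[1] + l2[1]
    py = l1[2] + l2[2]
    pz = l1[3] + l2[3]
    # Guard against tiny negative values from rounding.
    m_ll_sq = max(e * e - px * px - py * py - pz * pz, 0.0)
    # E_{T,ll} = sqrt(m_ll^2 + p_{T,ll}^2) and missing E_T = |missing p_T|.
    et_ll = math.sqrt(m_ll_sq + px * px + py * py)
    et_miss = math.hypot(met[0], met[1])
    dot = px * met[0] + py * met[1]
    mt_sq = m_ll_sq + 2.0 * (et_ll * et_miss - dot)
    return math.sqrt(max(mt_sq, 0.0))


def passes_mt_cut(l1, l2, met, m_higgs=125.0):
    """Apply the cut of Eq. (18): m_{T,ll} < m_H."""
    return transverse_mass_ll(l1, l2, met) < m_higgs


# Hypothetical event: two massless leptons recoiling against the missing p_T.
lep1 = (30.0, 30.0, 0.0, 0.0)
lep2 = (40.0, 0.0, 40.0, 0.0)
met = (-30.0, -40.0)
print(transverse_mass_ll(lep1, lep2, met), passes_mt_cut(lep1, lep2, met))
```

For this hypothetical event \(m_{\ell \ell }^2 = 2400~\mathrm{GeV}^2\), \(E_{T,\ell \ell } = 70\) GeV and \(/\!\!{E}_T = 50\) GeV, so that \(m_{T,\ell \ell } = 120\) GeV and the event passes the cut.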

Finally, backgrounds are further suppressed by requiring that the leptons have similar directions,

$$\begin{aligned} \Delta R_{\ell \ell } < 0.4, \end{aligned}$$
(19)

which is typically the case for the signal due to the aforementioned spin correlations.
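The separation cut of Eq. (19) can be sketched as follows, assuming the usual definition \(\Delta R = \sqrt{(\Delta \eta )^2 + (\Delta \phi )^2}\) with \(\Delta \phi \) wrapped into \([-\pi , \pi ]\). The function names are hypothetical.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Angular separation in the (eta, phi) plane."""
    # math.remainder wraps the azimuthal difference into [-pi, pi].
    dphi = math.remainder(phi1 - phi2, 2.0 * math.pi)
    return math.hypot(eta1 - eta2, dphi)


def passes_delta_r_cut(eta1, phi1, eta2, phi2, max_dr=0.4):
    """Apply the cut of Eq. (19): Delta R_ll < 0.4."""
    return delta_r(eta1, phi1, eta2, phi2) < max_dr


# Two leptons on either side of the phi = +-pi boundary are still close.
print(passes_delta_r_cut(0.0, 3.1, 0.0, -3.1))
```

Wrapping \(\Delta \phi \) matters here: without it, the two leptons in the example would appear to be separated by 6.2 rad instead of about 0.08 rad.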

Fig. 3 Distributions of the transverse mass \(m_{T, \ell \ell }\) after all selection cuts up to \(6'\) have been imposed (left panel) and of the dilepton separation \(\Delta R_{\ell \ell }\) after all selection cuts up to \(7'\) have been imposed (central panel). Right panel: stacked distribution of the ‘Higgs’ transverse momentum \(p^\mathrm{rec}_{T,H}\) (defined in Eq. 11) after all selection cuts for the \(H \rightarrow W_\ell W_\ell ^*\) optimization, shown on a logarithmic scale

Table 4 Cut efficiencies for our analysis aimed at \(H \rightarrow W_\ell W_\ell ^*\), continued from the first part of Table 3. The values for each process are cross sections in fb. \(S/\sqrt{B}\) has been calculated for \({300~\mathrm{fb}^{-1}}\)

The efficiencies of the cuts aimed at \(H \rightarrow W_\ell W_\ell ^*\) are shown in Table 4, together with the last common cut—the \(b\) veto. We finally find \(S/B \sim 0.4\), with \(S/\sqrt{B} > 6\) for 300 fb\(^{-1}\). The table also shows the event numbers left after increased \(p_T\) cuts on the reconstructed Higgs. The resulting reconstructed Higgs \(p_{T,H}^\mathrm{rec}\) distributions are shown in Fig. 3 (right panel), stacked with the signal and background processes. As \(p_{T,H}^\mathrm{rec}\) increases, the signal over background ratio drops faster for the \(WW\) mode selection than for the \(\tau \tau \) selection.
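The relation between the cross sections after cuts and the quoted figures of merit can be made explicit with a short sketch. The cross sections used below are hypothetical placeholders chosen to land near \(S/B \sim 0.4\) and \(S/\sqrt{B} \approx 6\); they are not the entries of Table 4.

```python
import math


def figures_of_merit(sigma_s_fb, sigma_b_fb, lumi_fb_inv):
    """Return (S/B, S/sqrt(B)) for cross sections in fb and luminosity in fb^-1."""
    s = sigma_s_fb * lumi_fb_inv  # expected signal events
    b = sigma_b_fb * lumi_fb_inv  # expected background events
    return s / b, s / math.sqrt(b)


# Hypothetical cross sections after all cuts (NOT taken from Table 4).
s_over_b, s_over_sqrt_b = figures_of_merit(0.30, 0.75, 300.0)
print(s_over_b, s_over_sqrt_b)
```

Since \(S/B\) is independent of the luminosity while \(S/\sqrt{B}\) grows like \(\sqrt{\mathcal{L}}\), only the latter has to be quoted for a specific integrated luminosity.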

5 Discussion

In this section, we discuss how much of the difference in the \(p_{T, H}\) distributions due to the modified couplings can be observed after the realistic reconstruction of the previous section has been performed. The left panel of Fig. 4 shows the signal \(M_\mathrm{col}\) distributions for the model points after applying the analysis described in Sect. 4.2 up to cut 7. We see the peak in the observable for all points. The central and right panels show the signal \(p_{T,H}^\mathrm{rec}\) distributions after the reconstruction described in Sect. 4 for the \(H\rightarrow \tau \tau \) and \(H \rightarrow W_\ell W_\ell \) optimizations, respectively. As expected, the difference in shape seen in the parton-level result of Fig. 1 also manifests itself in the reconstructed \(p_{T,H}^\mathrm{rec}\) distributions. A detailed breakdown after successive selection cuts is shown in Table 5 for the \(H\rightarrow \tau \tau \) optimization and in Table 6 for the \(H\rightarrow W_\ell W_\ell \) optimization, quoting cross sections relative to the corresponding SM value. Compared with the parton-level numbers in Table 1, the \(p_{T,H}^\mathrm{rec}\) dependence is more pronounced at the reconstructed level. This is because most of the selection cuts are more efficient for the boosted Higgs event topology.

Fig. 4 Signal distributions for the SM and six model points, normalized to the respective cross sections. Left panel: the collinear mass \(M_\mathrm{col}\) after cut 7. \(p_{T,H}^\mathrm{rec}\) is shown for \(H\rightarrow \tau \tau \) (\(H \rightarrow W_\ell W_\ell \)) in the central (right) panel after all optimized selection cuts

Table 5 The relative cross section \(\sigma /\sigma _\mathrm{SM}\) for several new physics model points after successive selection cuts for \(\tau \tau \) optimization
Table 6 Relative size \(\sigma /\sigma _\mathrm{SM}\) for several new physics model points after successive selection cuts for \(WW\) optimization

We will now estimate how much integrated luminosity is needed to reach a given significance for the signal. We perform a binned likelihood analysis of signal and background using the \(\mathrm {CL}_s\) method, as described in [162]. We include systematic errors on the cross section normalization assuming a Gaussian probability distribution.
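To give a feeling for how a binned comparison of signal-plus-background against background-only scales with luminosity, the following sketch uses the Asimov approximation for the median expected significance instead of the \(\mathrm {CL}_s\) procedure of [162], and it neglects systematic uncertainties altogether. It is therefore only a rough stand-in for the analysis described above. The function names and bin contents are hypothetical.

```python
import math


def asimov_z(signal, background):
    """Median expected significance of s+b against b-only, summed over bins.

    Uses q0 = 2 * sum[(s + b) * ln(1 + s / b) - s], i.e. the Poisson
    likelihood ratio evaluated on the Asimov data set n = s + b.
    No systematic uncertainties are included.
    """
    q0 = 0.0
    for s, b in zip(signal, background):
        if b <= 0.0:
            raise ValueError("each bin needs a positive background expectation")
        q0 += 2.0 * ((s + b) * math.log(1.0 + s / b) - s)
    return math.sqrt(q0)


def p_value(z):
    """One-sided Gaussian tail probability for significance z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def lumi_for_z(z_ref, lumi_ref, z_target):
    """Luminosity needed for z_target, using Z proportional to sqrt(L)."""
    return lumi_ref * (z_target / z_ref) ** 2


# Hypothetical expected yields per p_T bin at a reference luminosity of 100 fb^-1.
sig = [30.0, 8.0, 2.0]
bkg = [80.0, 18.0, 5.0]
z_ref = asimov_z(sig, bkg)
print(z_ref, p_value(z_ref))
# 1.645 is the one-sided Gaussian quantile corresponding to 95 % CL.
print(lumi_for_z(z_ref, 100.0, 1.645))
```

The \(\sqrt{\mathcal{L}}\) scaling used in `lumi_for_z` holds only in the absence of systematic uncertainties; with a fixed relative normalization error the sensitivity saturates at high luminosity.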

Figure 5 shows the expected \(p\) values as a function of the integrated luminosity \(\mathcal{L}\) for the SM (left panel) and for the model point \(\kappa _g=0.5\) (central panel), both using the \(H \rightarrow \tau \tau \) analysis, and for \(\kappa _g=0.5\) using the \(H \rightarrow W_\ell W_\ell \) analysis (right panel). The analysis tests the expected signal-plus-background against a background-only hypothesis. Three different systematic errors on the cross section normalization, of 0, 5, and 10 %, are assumed. While achieving theoretical uncertainties of less than \(10~\%\) is challenging, in the separation of signal and background we rely predominantly on the lepton momenta, which can be measured very precisely. As one can see from the left panel in Fig. 5, with \(\mathcal{L}\) between 20 and 60 fb\(^{-1}\), depending on the assumed systematic uncertainty, we are able to see the SM signal at 95 % confidence level.

Fig. 5 CL\(_s\) vs. the integrated luminosity for the model points \(\kappa _g= 0\) (SM, left) and \(\kappa _g=0.5\) (central) against a background-only hypothesis using the \(\tau \tau \) mode. Right panel: CL\(_s\) plot for the model point of \(\kappa _g=0.5\) against a background-only hypothesis using the \(WW\) mode

For \(\kappa _g>0\), the signal is enhanced and the required integrated luminosity decreases: for \(\kappa _g=0.5\), an integrated luminosity between 15 and 30 fb\(^{-1}\) suffices to observe the signal at 95 % CL, as shown in the central panel.

The right panel of Fig. 5 shows the \(p\) values for \(\kappa _g=0.5\) using the \(H \rightarrow W_\ell W_\ell \) mode. The sensitivity compared to the \(\tau \tau \) mode is slightly reduced. However, it is still possible to exploit the \(W_\ell W_\ell \) final state to observe a boosted Higgs boson.

We also perform a binned likelihood analysis to estimate how well we can distinguish these model points from the SM given the presence of backgrounds. The left panel of Fig. 6 shows the expected \(p\) values to observe the signal and background against the SM and background hypothesis as a function of the integrated luminosity \(\mathcal{L}\) for the model point of \(\kappa _g=0.5\) using the \(H \rightarrow \tau \tau \) analysis. Again, systematic errors of 0, 5, and 10 % are assumed. We find that we are able to distinguish the model point \(\kappa _g=0.5\) from the SM with \(\mathcal{L}=1000\) fb\(^{-1}\) even assuming 10 % systematic uncertainty.

Fig. 6 CL\(_s\) vs. the integrated luminosity using the \(\tau \tau \) mode for the model points \(\kappa _g=0.5\) (left) and \(\kappa _g=-0.5\) (central). Right: CL\(_s\) as a function of \(\kappa _g\) for an integrated luminosity of 3000 fb\(^{-1}\)

It is more difficult to prove a deviation from the SM for model points with \(\kappa _g < 0\), compared to \(\kappa _g>0\) with the same \(|\kappa _g|\) value, since this gives a deficit rather than a surplus of signal events. The central panel of Fig. 6 shows the \(p\) values for \(\kappa _g=-0.5\) using the \(H \rightarrow \tau \tau \) analysis. As expected we have less sensitivity, and even smaller values of \(|\kappa _g|\) require larger integrated luminosities.

The right panel of Fig. 6 shows the \(p\) values as a function of \(\kappa _g\) using the \(H \rightarrow \tau \tau \) analysis for an integrated luminosity of 3000 fb\(^{-1}\). If we assume 0 % systematic uncertainty we can exclude \(\kappa _g < -0.29\) and \(\kappa _g > 0.24\) for \(\mathcal{L}=3000\) fb\(^{-1}\) at 95 % CL. For the same integrated luminosity, assuming 10 % systematic uncertainty, we can still exclude \(\kappa _g < -0.4\) and \(\kappa _g > 0.3\) at 95 % CL.

We have not combined the \(\tau \tau \) and \(W_\ell W_\ell \) analyses although it could improve our sensitivity by some amount. Combining both channels is a complex task since the systematic uncertainties of both channels have to be evaluated by the experimental collaborations. Furthermore, it is not easy to avoid double-counting of events when combining both decay modes, as the final state reconstructions discussed in Sect. 4 are not able to strictly separate them (see Table 3).

6 Conclusions

The dominant production mode of the Higgs boson at the LHC—gluon fusion—is an important probe of new physics. Even though the inclusive rate has been measured to be in agreement with the SM, the study of a Higgs boosted by recoil against a hard jet constitutes an interesting, albeit challenging, measurement. It is motivated in the context of supersymmetry and composite Higgs models, and indeed generically in natural new physics: the Higgs coupling to a top-quark loop is both central to the question of natural electroweak symmetry breaking, and the chief source of gluon fusion. Due to the low energy theorem, however, the details of this loop-induced process are entirely obscured unless one can access the boosted Higgs regime.

We have shown how to isolate the boosted Higgs signal in the dilepton channel via \(H \rightarrow \tau \tau \) and \(H \rightarrow WW\). The boost enhances the efficiency of the collinear approximation for mass reconstruction in the \(H \rightarrow \tau \tau \) mode, giving a peak at \(m_H\) visible above the dominant \(Z\)+jets background. \(Z\)+jets provides its own peak for this reconstructed mass distribution; using the side-bands around the \(m_H\) peak we expect a relatively precise background estimate. In the end we achieve \(S/B\sim 0.4\). For the \(H \rightarrow WW\) mode, we can also achieve \(S/B\sim 0.4\) but with fewer events. This is nevertheless a helpful addition to the statistical significance. We expect a 12 % error on the cross section measurement for \(p_T>200\) GeV, 22 % for \(p_T>300\) GeV, and 41 % for \(p_T>400\) GeV with an integrated luminosity of 300 fb\(^{-1}\).

A direct measurement of the top Yukawa coupling in the \(t\bar{t}H\) channel is also instrumental for breaking the degeneracy between the coupling of the Higgs to gluons and its coupling to the top quark, and the \(H+\)jets mode provides a complementary determination. We have shown that we can distinguish several new physics models in an effective field theory approach using the reconstructed Higgs \(p_T\) distribution. With an integrated luminosity of 3000 fb\(^{-1}\) at the 14 TeV LHC, we can exclude \(\kappa _g < -0.4\) and \(\kappa _g > 0.3\) along the line \(c_t + \kappa _g=1\) at 95 % confidence level, assuming a systematic uncertainty of 10 %.