Highly Boosted Higgs Bosons and Unitarity in Vector-Boson Fusion at Future Hadron Colliders

We study the observability of new interactions which modify Higgs-pair production via vector-boson fusion processes at the LHC and at future proton-proton colliders. In an effective-Lagrangian approach, we explore in particular the effect of the operator $h^2 W_{\mu\nu}^a W^{a,\mu\nu}$, which describes the interaction of the Higgs boson with transverse vector-boson polarization modes. By tagging highly boosted Higgs bosons in the final state, we determine projected bounds for the coefficient of this operator at the LHC and at a future 27 TeV or 100 TeV collider. Taking into account unitarity constraints, we estimate the new-physics discovery potential of Higgs pair production in this channel.


Introduction
In 2012, the Higgs boson was discovered at the LHC [1,2]. Current experimental data show that its properties agree with the predictions of Standard Model (SM), but further measurements are still necessary to examine the SM and search for new physics.
The SM predicts that the Higgs boson has three kinds of interaction at tree level: (1) the Yukawa interaction with fermions; (2) the interaction with massive vector bosons (W ± and Z ); (3) the cubic and quartic Higgs self-interactions. One of the prime targets of the second run and future runs of the LHC is to measure the Higgs self-couplings (3).
Higgs pair-production in vector-boson fusion (VBF) [3], V V → hh, is sensitive to the last two kinds of interaction. VBF-type processes are possible at both electron-positron and hadron colliders. At a hadron collider, the two incoming vector bosons V = W ± , Z are radiated from two initial quarks. In the final state, in addition to two Higgs bosons, there are two back-to-back hard jets which should be tagged in the forward and backward regions of a detector, respectively. This allows us to use VBF cuts to reject the QCD type backgrounds efficiently.
In this paper, we study double Higgs production in the VBF process in an effective-field theory (EFT) approach. We follow the conventions given in our previous paper [4], and use the following phenomenological effective Lagrangian: Dots indicate higher-dimensional interactions which are not relevant for the VBF Higgs production process that we consider. We only include the CP-conserving interactions and omit any CP-violating operators. The relations g W,a1 = g W,a2 = g Z ,a1 = g Z ,a2 = 1 and g V,b1 = g V,b2 = g V,b3 = g W,a3 = g Z ,a3 = 0 characterize the SM reference values at the tree level, where the subscript letter V denotes W, A, X , Z , respectively. The corresponding terms have been removed from the SM Lagrangian, as indicated by the overline notation SM , such that they are not doublecounted.
Introducing gauge degrees of freedom in this phenomenological Lagrangian, the vertices can be rewritten as gauge-invariant operators which are understood as the low-energy effect of short-range structure or new heavy degrees of freedom beyond the SM. Such effects would appear, for instance, in a composite Higgs model -for concrete examples, cf. [5][6][7][8]. In Sec.3, we relate the parameterization to the formulation in terms of gauge-invariant operators, adopting a concrete basis and truncating the expansion at dimension six, as commonly done in the literature.
The single Higgs couplings g V,a1 can be determined to up to 5 − 10% via measuring the decay fractions h → W W * and h → Z Z * at the LHC. Current LHC data exclude deviations from the SM prediction [9][10][11][12] of more than about 15%. (This limit, as well as the bounds discussed below, depends on a universal assumption about the absence of undetected Higgs decays which we will adopt for this paper.) For the couplings of type g V,b1 , the bounds are weaker. For instance (g V,b1 ∈ [0. 8, 4.5]) was reported in Ref. [9]. The measurement of double Higgs couplings hhVV (g V,a2 and g V,b2 ) is challenging at the LHC. Recently, ATLAS reported a search for double Higgs production in VBF [13], which excludes the ranges g V,a2 < −0.56 and g V,a2 > 2.89.
At the LHC, searches for a double Higgs final state focus on the gluon-gluon fusion process g g → hh. The Higgs decay channels bbbb [14,15], bbγγ [16,17], bbττ [18,19] and bbV V [20,21] have been investigated by both ATLAS and CMS. In addition, result for the channels W W γγ [22] and W W W W [23] were reported by ATLAS. A combination of these searches can be found in Ref. [24,25]. It is shown that the Higgs self-coupling λ 3 can be constrained to [−5, 12] by data, while no constraint of κ 5 has been reported.
As a complementary process, double Higgs production via VBF at hadron colliders has been extensively studied in the literature [26][27][28][29][30]. The NLO QCD and higher order correction for this process has been calculated in Ref. [31][32][33][34][35][36]; they find an enhancement of around 7% as it is natural for a pure electroweak process. For the high-luminosity LHC (HL-LHC), assuming L = 3 ab −1 at 14 TeV, it is expected that the couplings of type g V,a2 can be constrained to 20% [28], while a precision of around 1% should be achievable at a future 100 TeV hadron collider [29]. Furthermore, the hhVV coupling is also accessible via hV V or hhV final states. Ref. [37] argues that a measurement of the W ± W ± h final state can constrain the hhW W coupling to O (100%) at the HL-LHC, and to 20% at a 100 TeV collider, while the determination of this coupling from the hhV final state can only yield a weak bound [38].
Possible measurements of the g V,a2 couplings, i.e., the Higgs interacting with the logitudinal components of massive vector bosons, have thus been covered in some detail in previous work. However, without further assumptions it is not evident that couplings of the Higgs to transversal vector bosons play a lesser role. In this work, we will aim at filling this gap and thus perform a Monte-Carlo study of the sensitivity to couplings of type g V,b2 , both for the LHC and for future high-energy hadron colliders, and correlate this with the determination of g V,a2 .
The VBF process V V → hh receives a contribution from the Higgs cubic self-coupling. It is well known that at hadron colliders, the Higgs self-coupling is most accessible in the gluongluon fusion process [39], whose cross section is one order of magnitude larger than that of the VBF process. This fact has received a lot of attention . A measurement of the VBF process will provide additional precision for the Higgs self-coupling. However, for simplicity we will assume here that the couplings which are accessible in gluon-gluon fusion are known, and we fix those at their SM value. This allows us to focus on the couplings which are specific to the VBF class of processes, and lets us more easily estimate the sensitivity potential for those. This paper is organized as follows. In Sec. 2, we briefly introduce the mass-drop method used for tagging highly boosted Higgs bosons, and use it to explore the SM case at the LHC and at a 100 TeV collider. In Sec. 3, we investigate the potential for the discovery of new physics via multi-Higgs production, in form of the interactions discussed above. In Sec. 4, we provide projections for bounds on g V,b2 and g V,a2 for different collision energies. We conclude this paper with a discussion of our findings in Sec. 5.

Higgs pair production in the Standard Model
We will focus on the signal process pp → hh j j → 4b2 j due to the reason that the decay channel h → bb has the largest branch ratio. The partonic signal events are generated using WHIZARD [65] with the cuts listed in Table 1. We takes the parton distribution functions from CTEQ6l1 [66].
In this work, we consider the main backgrounds pp → tt → 2b4 j , pp → 2b4 j (QCD) and pp → 4b2 j . The background events pp → tt → 2b4 j are generated by WHIZARD. We have cross-checked the results with Madgraph [67]. Pure-QCD partonic events of type pp → 2b4 j and pp → 4b2 j are generated using ALPGEN [68].
For all event samples, parton shower and hadronization are performed by Pythia 8 [69]. Jets are reconstructed by FastJet [70] using the anti-k t algorithm [71] with a jet radius R = 0.4 Cuts s = 14 TeV s = 27 TeV s = 100 TeV  and transverse momentum cut P t > 20 GeV. We do not account for detector effects in detail but insert values for efficiency and mistagging rates where appropriate.

Analysis method at the 14 TeV LHC
In Table 2 (first column) we list the expected number of signal and background events in the SM for the LHC with s = 14 TeV and luminosity L = 3 ab −1 . To suppress the QCD background, we require four b-tagged jets in the final state (n b = 4). We assume the b-tagging efficiency b = 0.7 and mistagging rate mi ss = 0.001. The number of events after applying the b-tagging requirement are listed in the 2nd column of Table 2.
To identify two forward jets in the VBF process, we first select the jet with highest energy and label it as j 1 . If its energy satisfies E j 1 > 500 GeV, we scan over all other jets and determine the maximal rapidity difference ∆η( j 1 , j ) and invariant mass m( j 1 , j ) with respect to the leading jet. If the conditions ∆η( j 1 , j ) max > 3.6 and m( j 1 , j ) max > 500 GeV are met simultaneously, we identify the most energetic j as j 2 and label corresponding pair of jets as the tagging forward jets of a VBF process. Otherwise, the event is rejected. The number of events after applying this VBF cut are listed in the 3rd column of Table 2.
With these b-jet and forward-jet tagging requirements, the backgrounds pp → tt and pp → 2b4 j are greatly reduced. The dominant remaining background originates from QCD processes pp → 4b2 j , where the cross section after cuts is still five orders of magnitude larger than that of the signal.
To further suppress this huge background, we select those events with massive jets formed by highly boosted Higgs bosons and adopt the mass drop method [72].

The mass drop method for highly-boosted Higgs boson tagging
A significant fraction (of the order of 10 %) of the VBF event sample in the SM contains highly boosted Higgs bosons. A highly boosted Higgs boson has a large transverse momentum (P t > 200 GeV) and can be detected in the central region of the detector. The decay products of the Higgs boson typically form a fat jet with a large jet mass, if a large cone parameter is used for the analysis. A fraction of the QCD background events also contains massive jets, but the fat jets originating from Higgs pairs can in principle be distinguished by their characteristic jet substructure.
In recent years, various methods for jet substructure analysis have been developed: (1) Jetgrooming methods aim at removing soft radiation which is unlikely to originate from the hard process (see, e.g., [72]). (2) Radiation-constraint methods impose a cut on jet shape to separate the signal from the background (see, e.g., [73]). (3) Prong-finder methods detect a massive boosted object as a fat jet with multiple hard cores by exploiting the recombination history of the jet algorithm. As a particular prong-finder algorithm, the mass-drop tagger method is particularly suited for isolating boosted Higgs bosons, decaying to bb, from the QCD background [72]. A detail review of jet substructure can be found in Ref. [74] In this work, we adopt the mass-drop tagger [72] as a means for tagging highly boosted Higgs bosons in the final state of the VBF process. The method consists of the following two steps: (1) We first identify jets by the standard anti-k t algorithm with the cone parameter R = 0.4. After identifying the forward jets associated with the VBF process, we recluster the remaining jets using the Cambridge/Aachen (CA) algorithm [75,76] with R = 1.2. (2) If there are 2 to 4 CA jets in an event, and its leading or subleading jet has a transverse momentum satisfying P t > 200 GeV and a jet mass m j > 100 GeV, we apply the following procedure to tag candidates for highly boosted Higgs bosons.
1. Undo the last step of clustering of jet j to get two daughter jets j 1 and j 2 with m j 1 > m j 2 .
2. If the conditions m j 1 < µm j and min(P t ( j 1 ), P t ( j 2 )) m 2 j ∆R 2 j 1, j 2 > y cut (2.1) are satified, we identify j as a fat jet associated with a highly boosted Higgs boson.
3. Otherwise, we redefine j as j 1 and repeat the above procedure.
There are two dimensionless parameters µ and y cut in this method, as given in Eq. (2.1). In this work, we fix them as µ = 0.67 and y cut = 0.09. After two subjets are found, we also apply the filtering method to remove soft radiation which originates from the underlying event and contaminates CA jets with a larger cone size.
After applying this tagging algorithm, both signal and background events fall into three classes: 2-boosted Higgs (2BH), 1-boosted Higgs (1BH) and 0-boosted Higgs (0BH) candidate events. The number of events for each class are listed in Table 3.
We observe that the fraction of signal events in the 2BH category is still small (2 − 3%). Nevertheless, the corresponding background is two orders of magnitude lower than for the 0BH category, where the fraction of signal events is even less. The 1BH category falls in between. The tagger significantly improves the chance for finding signal events, but by itself it is clearly not sufficient for a measurement if the rate is SM-like.
In order to construct kinematic observables which improve the signal vs. background discrimination, we can reconstruct the mass peaks of the Higgs bosons for each event category. For each of the 2BH events, two jet masses m j should peak around Higgs the boson mass m h , as shown in Fig. (1). For a 1BH event, the jet mass of the leading fat jet should peak around the mass of the Higgs boson, while the mass of the second reconstructed Higgs candidate should coincide with the invariant mass of two b jets. For a 0BH event, two Higgs boson candidates are reconstructed by using the χ 2 method. To this end, we define χ 2 as follows: We assume m h = 125 GeV, and m( j 1 , j 2 ) and m( j 3 , j 4 ) are the invariant mass of two jet pairs in the final state, scanning over each combination. σ j = 10 GeV is used to take into account the error in jet energy resolution. The pairing which minimizes χ 2 is selected, and the corresponding invariant masses determined by the pairs of jets are taken as the reconstructed masses of Higgs bosons.
We display the distributions of the reconstructed mass of the leading Higgs boson in Fig. 1. The shapes of the 2BH and 1BH cases are almost identical. We note that the Higgs peak in the 2BH case is narrower than in the 0BH case, since wrong pairings are rejected more efficiently.

Multivariate analysis
After applying the VBF cuts and boosted-Higgs tagging, in order to further improve the ratio of signal over background, we employ the method of boosted decision tree (BDT) which utilizes the correlation of observables in the signal and can help to further suppress background. We select the following observables as the input to our BDT analysis: • P t (h i ): the transverse momenta of the two reconstructed Higgs bosons.  • m(h i ): the invariant masses of the reconstructed Higgs bosons.
• P t ( j i ): the transverse momenta of the two forward jets.
• E ( j i ): the energies of the forward jets.
• η( j i ): the pseudo-rapidity of the forward jets.
• m( j , j ): the invariant mass of the forward jets.
• P t ( j sub i ): the transverse momenta of the two subjets of each highly boosted Higgs boson.
• m(h, h): the invariant mass of the two Higgs boson candidates.
• χ 2 mi n (only for the 0BH case): the minimum value of χ 2 .
The results of the BDT response are presented in the Fig. 2. Obviously, signal and background can separated best in the 2BH case. For the 1BH and 0BH cases, extracting the signal is challenging even after exploiting the BDT method.
We can optimize the BDT cut to achieve the maximal significance, which is defined as where S is the event number of the signal and B is the event number of the total background. The efficiencies and significances of the optimized BDT cut are listed the Table 4 for all three event classes.
Comparing the results given in Table 3 and Table 4, we conclude that the BDT reduces the background by one additional order of magnitude, improving on the sequential cut method. Nevertheless, the final number of SM signal events is still tiny compared to the background in all cases, and the significance can only reach 0.02 (2BH, 1BH) or 0.04 (0BH). This is still far from the requirement of a discovery at the LHC.

Analysis of pp → hh j j in SM at 100 TeV hadron collider
We now apply the methods as described above to the VBF process at a future 100 TeV hadron collider. We assume a high integrated luminosity of L = 30 ab −1 , and adapt all selection parameters to the different environment as appropriate.
We require that the most energetic jet has an energy E j > 800 GeV. The VBF cuts are adjusted as ∆η j j ,max > 4.0 and m j j ,max > 800 GeV. Table 5 shows the number of events after imposing btagging and VBF cuts. Due to the increased cross section at high energy and the high luminosity   of the collider, the number of signal events is increased by a factor 1000 compared to the highluminosity LHC. After applying b-tagging and VBF cuts, the total number of the signal events is 8.96 × 10 4 . Of course, the number of background events also increases and reaches 10 9 , so background reduction is still essential.
The number of events for the 2BH, 1BH and 0BH cases are listed in Table 6. As one would expect, more events with highly boosted Higgs bosons can be observed at the 100 TeV collider than at the LHC. This is illustrated by the SM curves of Fig. (4). To enable signal/background discrimination in these three cases, we apply the BDT method as described above. The results are shown in Fig. 3 and Table 7. We can set the BDT cut in a region where the residual background is effectively zero.
These results show that it is possible in principle to discover the SM signal at a 100 TeV collider. The large rate of the VBF process at high collision energy leaves enough signal events after all measures for background reduction have been applied. We note, however, that pileup effects in a high luminosity run might pose a challenge for analyses such as this one. For a final verdict on the detectability of the SM signal, a more sophisticated full-simulation study would be necessary which is beyond the scope of this paper.

Multi-Higgs production in VBF processes with dimension-6 operators
In this section, we extend our study of the pp → hh j j process to contributions beyond the SM. Given the phenomenological Lagrangian 1.3, such effects are parameterized in terms of the coefficients g i , where we focus in particular on the g V,b2 coupling that describes a non-SM double-Higgs interaction with the transverse polarization components of the vector bosons.

Effective-Theory description
If we introduce the gauge symmetry of the SM in the phenomenological description, any anomalous effects can be re-expressed in terms of higher-dimensional gauge-invariant operators. To avoid redundancy, it is convenient to choose a particular operator basis such as the SILH basis [77] that we adopt for this work. As usual, we truncate the power-series expansion of the gauge-invariant effective theory at dimension six. This truncation allows us to express the phenomenological couplings in 1.3 in terms of a small number of SILH operator coefficients. Such a simplification allows for relating quantitative results for the VBF Higgs-pair process to the analysis of existing data, as well as to studies of different processes and interations.
We list the relevant terms in the SILH basis, In the following, we make use of the result from [4,78] in Table 8 which relates the phenomenological coefficients such as g V,b to the parameters of 3.1. In particular, the couplings g V,b that we are interested in, only depend on c HW , c H B and c γ , but not on c W and c B which determine g V,a . The SILH parameterization has been evaluated as the low-energy limit of various ultra-violet theories, such as models where the Higgs becomes a pseudo Nambu-Goldstone boson theory or holographic completions with extra dimensions; cf. [5][6][7][8]. We note the correlations between g V,a1 and g V,a2 , g V,b1 and g V,b2 which follow from the dimension-six truncation, and should be tested against real data if actual deviations from the SM show up.
In the SILH effective Lagrangian, Higgs interactions include further dimension-six operators such as The coefficients of these operators are switched off in our analysis because they are to be measured in different processes. In particular, the first one entails a global shift to all Higgs interactions which is equivalent to a modified Higgs total width, while the latter violates the custodial symmetry of weak interactions and globally modifies Z Z vs. W W Higgs couplings. In our current work we assume that no custodial-symmetry violation beyond the SM is present, as detailed below.
In the third column of Table 8, we present some typical numerical values for these parameters. Considering g ∼ 0.654, g ∼ 0.350, v ∼ 246GeV, tanθ = g /g = 0.535, α = c HW  [4,78]) are induced by the Higgs and gauge-boson wave-function normalization, respectively. with c HW (3.4) (Our notation slightly differs from the definitions forc i used in Ref. [79].) To simplify our discussion on the future collider sensitivity on g V,b2 and g V,a2 , we assume custodial-symmetry relations, namely that g W,b2 = g Z ,b2 and g X ,b2 = g A,b2 = 0 and g W,a2 = g Z ,a2 . This roughly renders c γ , c H B , c B ∼ 0, so we can perform a two-parameter study on c W and c HW . The interactions proportional to g W,b2 and g W,a2 (g Z ,b2 and g Z ,a2 ) account for dominant contributions to the cross section of pp → hh j j process, up to 70% (30%), respectively.

Higgs-pair couplings to transverse vector polarizations in VBF
Introducing non-SM effects proportional to the g V,b2 couplings, we consider kinematical distributions and their discriminating power. In Fig. 4, we display the distributions of the P t of the leading Higgs boson and the invariant mass the Higgs-bosons pair, respectively. The subfigures (a) and (b) show the distributions at the LHC with collision energy 14 TeV. The new-physics effect is illustrated by the green and blue curves which correspond to two different values of g V,b2 , namely g V,b2 = 0.09 and g V,b2 = 0.18, repectively. We observe a huge enhancement at high P t in the curves which include the new interaction of Higgs bosons with transversal gauge bosons. For the selected parameter values, the fraction of events in the region with P t > 200 GeV increases to 50% and 70%, respectively, while in the SM this fraction is just 18%.
The corresponding distributions at a 100 TeV hadron collider are shown in the Fig. 4 (c) and (d), where we select g V,b2 = 0.018 and 0.024. In this case, the fraction of events in the region with P t > 200 GeV increases to 50% and 60%, respectively, while in the SM, it is 25%.      Table 9 we demonstrate the cut flow for the value g V,b2 = 0.18 at the 14 TeV LHC. For this parameter value, the total signal cross section (σ × L ) is enhanced by a factor of 4 over the SM value. As Figure (4a) demonstrates, most of enhancement occurs in the boosted region. After b-tagging and VBF cuts have been applied, there are more than 600 signal events left that can be observed.   In Table. 10 we present the resulting numbers for the three classes of events. In this scenario, the signal accounts for 25% and 35% of the total events in the 2BH and 1BH classes, respectively. Compared with the results given in Table 4, the increase in signal events in the 2BH (1BH) case amounts to a factor 35 (10), respectively. When the BDT method is applied, this result is further improved, as shown in Fig. 5 and Table. 11. By comparing this with the results of Table 4, we conclude that isolating the boosted-Higgs region significantly enhances the discovery potential of this process, both for the 2BH and 1BH cases.
By contrast, in the 0BH case the signal significance is too low for the chosen collider parameters and benchmark values of g V,b2 .

Unitarity limits and discovery reach
The interactions of Higgs bosons with transverse gauge bosons in the Lagrangian (1.3) involve derivative couplings, and therefore are enhanced over the SM interactions for high values of the four-momenta. This is clearly visible in Fig. 4, where the contribution of the new interactions dominates for sufficiently large values of the transverse momentum P t or the Higgs-pair invariant mass m(h, h). If this behavior is naively extrapolated, the computed amplitudes will violate unitarity constraints. In Ref. [4], we have derived a generic unitarity bound for the pp → hh j j process, which relates the value of g V,b2 to a UV cutoff Λ UV .
For energy-momentum values beyond Λ UV , the EFT expansion breaks down. We expect higherorder contributions and, eventually, a new structure of underlying interactions to dampen the rise of the amplitudes, e.g., a resonance. Conversely, for the EFT description to remain useful, the value of g V,b2 must be such that the cutoff implied by (3.5) is outside the accessible kinematic range, or we would have to apply a cutoff or form factor on the calculated distributions. In Fig. 6, we display the sensitivity on Λ UV by using the (a) 2BH case and (b) 1BH case at the 14 TeV LHC, respectively. To this end, we determine the maximally allowed g V,b2 value as a function of Λ UV using (3.5). The y-axis indicates the number of signal events that result for corresponding values of the cutoff Λ UV and the maximal g V,b2 , respectively. The dashed blue line, for each plot, marks the 5σ discovery threshold as it follows from the signal and background event rates derived above.
The crossing points of the signal curves and the discovery thresholds in Fig. 6 determine the sensitivity of the analysis to heavy new physics, assuming that the bound (3.5) for g V,b2 is saturated. We conclude that the effect of a heavy resonance, for instance, can be accessible up to 4.4 TeV in the 2BH case and up to 3.6 TeV in the 1BH case. The cleaner environment of the 2BH case is clearly preferred. By contrast, the discovery reach of the 0BH case is limited to about 2.4 TeV.
We can apply the analogous analysis to a 100 TeV hadron collider. Based on the results obtained for the LHC with s = 14 TeV where the 2BH case has been most useful, below we only present the results for this analysis which requires two highly boosted Higgs bosons.
As shown above, at a 100 TeV machine the signal can be discovered already in the SM case where no enhancement is present. Therefore, we define the observability of new physics in By using the BDT analysis as before, we derive the 5σ excess bound for new physics at the 100 TeV collider, as marked in Fig. 7 by the blue dashed line. We read off this figure that this collider can yield a meaningful constraint on g V,b2 at a value which corresponds to a sensitivity in the new-physics scale of Λ UV = 27 TeV.

Two-parameter bounds
So far, we have considered only the dependence of the Higgs-pair VBF process on the couplings g V,b2 to the transverse polarization of vector bosons, which dominate the distribution in the highly boosted region. In this section, we complement the discussion by taking into account also the g V,a2 couplings which describe the interaction of a Higgs pair with longitudinally polarized vector bosons. This interaction exists in the SM but can receive a correction if dimension-6 operators are included.   Table 12. Coefficients σ 0 a 2 , σ 1 a 2 , and σ 2 a 2 in the expression (4.1) for pp → hh j j at three different collider energies.
In order to express the cross sections of pp → hh j j in terms of the parameters given in Eq. (1.1), we impose some simplifying assumptions on the phenomenological information from expected data on single-Higgs processes. For instance, the W W h vertex should be strongly constrained by data from the Higgs decay to W W as well as from VBF single-Higgs production. As stated before, we ignore the universal ambiguity in the Higgs-coupling determination due to undetected Higgs decays at the LHC, which can be lifted by model-independent measurements at Higgs factories such as the CEPC, the ILC, or the CLIC collider. Accordingly, we fix g W,a1 and g W,b1 , the single-Higgs couplings to longitudinal and transverse W polarizations, to their SM values. As a further simplification, we assume the custodial-symmetry relations g W,ai = g Z ,ai and g W,bi = g Z ,bi (i = 1, 2) whenever contributions of Z bosons are considered.
Since the amplitudes are at most linear in the parameters, we can parameterize the cross section of pp → hh j j in terms of g V,a2 as given below We compute the coefficients σ 0 a 2 , σ 1 a 2 , and σ 2 a 2 numerically using Monte-Carlo methods, evaluating the total cross section for a sufficient number of different coupling values. The results are given in Table 12.
Analogously, we can parameterize the cross section as a function on g V,b2 as follows  are given in Table 13. When both g V,b2 and g V,a2 are turned on, we have to include a mixed coefficient proportional to g V,b2 g V,a2 . The corresponding results can be found in [4]; the values are 1.7 fb for 14 TeV, 9.6 fb for 27 TeV, and 95 fb for 100 TeV, respectively.
The numerical results given in Table 12 and Table 13 have been obtained with Madgraph5 and Whizard by independent calculations, with very good numerical agreement. It should be pointed out that the results in Table 12 clearly reflect the strong gauge cancellation which occurs between individual terms of (4.1) in the SM limit. Some of the coefficients σ 0 a 2 , σ 1 a 2 , and σ 2 a 2 are one order of magnitude larger than that of the cross section in the SM, as given as σ 0 b 2 in Table 13. Using this parameterization and applying the results of our study, we derive the parameter ranges tabulated in Table 14. Outside the given limits, the deviation from the SM can be detectable as a 5σ discovery. We show the projected bounds on g V,a2 and g V,b2 in Fig. (8). We have compared our results for g V,a2 with those given in Ref. [29] and found good agreement. Regarding the unitarity bounds shown in the plots, the bounds on g V,b2 correspond to Eq. (3.5), while for g V,a2 we make use of the result [4] It is interesting to explore whether it is possible to disentangle the effects of the operator h 2 W a µν W a,µν from those of the operator h 2 W a µ W a,µ . As mentioned in our previous work [4], the azimuthal angle between two forward jets can be a useful observable for this purpose.
In Fig. (9), we display the azimuthal angle distributions for the 14 TeV, 27 TeV and 100 TeV cases. We take into account only events in category 2BH and apply no further cuts beyond the VBF cuts. The relative azimuthal angle is defined as ∆φ = |φ( j 1 )−φ( j 2 )|, where φ( j 1 ) denotes the azimuthal angle of the leading forward jet, and φ( j 2 ) denotes that of the second leading forward jet.
We note the main features of the distributions given in Fig. (9): • In the SM, the azimuthal-angle distribution of the SM is flat due to the dominance of longitudinal polarized vector bosons in the process pp → hh j j , as shown by the red curve in each of plot. Similarly, the distribution is also flat for the case where the term h 2 W a µ W a,µ is turned on, as shown by the green curve in each of plot where both the contrbutions of the SM and new physics have been included. Although new physics represented by the operator h 2 W a µ W a,µ enhances the total cross section, the azimuthal angle distribution remains similar to that of the SM.
• If the operator h 2 W a µν W a,µν is turned on, the Higgs pair being coupled to transversely polarized vector bosons leads to distributions that significantly differ from that of the SM. As the blue curves indicate, this type of new interaction causes more events with back-toback outgoing jets.
• At the 100 TeV collider, the difference in shape of the angular distribution is much less pronounced. In fact, the chosen benchmark values of the operator coefficients cause just a small disturbance of the SM signal in this plot, which is still detectable due to the much larger event rate for the high energy and high luminosity of the machine. To distinguish the two kinds of contributions, a more detailed analysis of this and other distributions becomes necessary.   To fully utilise the differential information, we perform a χ 2 -fit on the distribution of ∆φ( j , j ), using the 2BH events after the BDT cuts with the collision energy s = 100 TeV and integrated luminosity 30 ab −1 . In Fig. 10, we show the 2σ allowed region obtained with differential information (blue) or just from the total cross section (green). For the leftmost plot, we assume the SM for the true values of the coefficients. It is evident that the information from the differential distribution significantly improves the precisions on both δg V,a2 and g V,b2 . The middle plot assumes (δg true V,a2 , g true V,b2 ) = (0.07, 0) for the true value, i.e., only g V,a2 receives a new-physics contribution. The inclusive cross section confines the region allowed by a measurement to a ring, due to the quadratic dependence on both δg V,a2 and g V,b2 , while the differential distribution singles out a small area around the point (δg V,a2 , g V,b2 ) = (0.07, 0). Analogously, for the right plot we assume the true values (δg true V,a2 , g true V,b2 ) = (0, 0.07). Again, the differential distribution selects a small region, but in this case there is a two-fold sign ambiguity left for g V,b2 . This reflects the fact that the effects of g V,b2 is dominated by the squared term, and thus the sign of g V,b2 cannot be determined.
Finally, using the relations given in Table 8, we can map this contour onto the plane spanned by c W and c HW . The assumption of a linearly realized symmetry and truncation at the dimension-6 order enforces the constraint δg V,a2 ∼ 6g V,b2 , if both couplings depend only on a single parameter c HW . The two-parameter analysis that we describe in this paper, would allow us to search for the relation of c W and c HW , which would point to the presence of contributions that do not follow the simple power-counting assumption underlying the dimension-six truncation of the SILH basis. Currently, the experimental data constraints on the parameter set are rather weak, to the level of c V ∼ O(10 3 ) [79]. In this study, we conclude that we can reach a sensitivity of up to c V ∼ O(10) at future colliders.

Discussions and Conclusions
We have studied double-Higgs production in vector-boson at a proton-proton collider, pp → hh j j → 4b2 j . We compare the 14 TeV LHC with future high-energy and high-luminosity colliders of 27 TeV and 100 TeV. The analysis of this process benefits greatly from identifying highlyboosted Higgs bosons. To this end, we use the mass-drop method to analyse the jet substructure, and we optimize the significance by the boosted decision-tree method. Our results show that the 2BH case where two highly-boosted Higgs are tagged can provide the cleanest experimental environment to discover new physics in the the VBF signal.
At the LHC, the number of 2BH events is too small to discover this channel if the SM is valid without new contributions. Conversely, at a 100 TeV collider with a high luminosity of the order of 30 ab −1 , the number of 2BH events is large enough to discover the SM signal.
To study the effects of new physics, we use a phenomenological effective Lagrangian (1.3). We explore the effect of the interactions of type h 2 V a µ V a,µ and h 2 V a µν V a,µν and extract bounds for the associated coefficients g V,a2 and g V,b2 . Our results demonstrate that new-physics scales up to 4.4 TeV at the 14 TeV LHC, and 27 TeV at a 100 TeV hadron collider, are within reach of discovery. Fig. 11 collects the projected bounds on g V,a2 and g V,b2 that we have determined in this work, cf. also Table 14. The bounds that can be obtained at 27 TeV, are one order of magnitude stronger than for the 14 TeV LHC. The 100 TeV machine can further constrain these parameters by another order of magnitude. At both the 27 TeV and 100 TeV machines, this offers a significant indirect sensitivity to new physics in this sector.
Our study does not account for underlying-event or pile-up effects. These produce additional soft radiation and may render the extraction of a hard-process signal more difficult. In fact, the mass-drop method selects two-pronged jets in the final state. This reduces the impact of QCD jets from underlying events or pile-up, which are one-pronged. We also use filtering to remove soft radiation after reconstructing two sub-jets in the final state, which is helpful to reject jets from extra radiation produced by the parton shower. Recent studies indicate that modern pile-up mitigation techniques [80] can minimize the pile-up contamination efficiently for the 4b final state [57]. A detailed pile-up analysis can be done but is beyond the scope of the current paper.
For a further improvement of our result, we may consider color-flow properties [81] as a tool to further discriminate h → bb decays from b jets in the QCD background. Color-connection information can be quantified by observables such as the pull vector [82]. In a recent study of double Higgs production at LHC [83], it was argued that while the color flow is very different between the double-Higgs signal and the QCD background, this information may be diluted after applying kinematics cuts. The authors of [83] proposed to use jet-image and a Deep Neutral  Figure 11. The summarized bounds of g V,a2 and g V,b2 at 14 TeV, 27 TeV and 100 TeV hadron collider, which is the same as Table 14.
Network analysis methods to discriminate the signal from background, rather than constructing the pull vector. We defer these refinements of our study to future work.
In summary, the analysis of highly boosted Higgs-boson pairs in vector-boson fusion is promising and can be used to significantly improve our knowledge about Higgs-sector interactions. In particular, the method should become important for a future high-energy proton collider where the sensitivity is sufficient to extract a signal down to the SM rate.