Phenomenological analysis of Higgs boson production through gluon fusion in association with jets

We present a detailed phenomenological analysis of the production of a Standard Model Higgs boson in association with up to three jets. We consider the gluon fusion channel using an effective theory in the large top-quark mass limit. Higgs boson production in gluon fusion constitutes an irreducible background to the vector boson fusion (VBF) process; hence the precise knowledge of its characteristics is a prerequisite for any measurement in the VBF channel. The calculation is carried out at next-to-leading order (NLO) in QCD in a fully automated way by combining the two programs GoSam and Sherpa. We present numerical results for a large variety of observables for both standard cuts and VBF selection cuts. We find that for all jet multiplicities the NLO corrections are sizeable. This is particularly true in the presence of kinematic selections enhancing the VBF topology, which are based on vetoing additional jet activity. In this case, precise predictions for the background can be made using our calculation by taking the difference between the inclusive H+2 jets and the inclusive H+3 jets result.


Introduction
The discovery of a Higgs boson [1,2] during Run I of the Large Hadron Collider (LHC) has ushered in an era of precision measurements to determine the nature of the new particle. A major step was the analysis of its spin structure [3][4][5], resulting in a very good agreement of the measurement with the Standard Model prediction. A second major step was the measurement of different production times decay rates by ATLAS [6][7][8][9][10][11][12] and CMS [13][14][15][16]. All experimental results point towards the mechanism of electroweak symmetry breaking [17][18][19][20] being indeed as intimately linked to the generation of fermion masses as predicted by the Standard Model. This hypothesis will be further scrutinized during Run II of the LHC, where the Vector Boson Fusion (VBF) channel will play a leading role. In this production mode, a Higgs boson is created by annihilation of virtual W or Z bosons, radiated off the initial-state (anti-)quarks in a t-channel scattering process with no color exchange at leading order [21,22]. The experimental signature thus consists of two forward jets, separated by a large rapidity gap with no hadronic activity [23][24][25][26][27]. This topology is the key to identifying the signal among an overwhelming number of backgrounds, which include Higgs boson production through gluon fusion accompanied by two or more jets. Despite the latter production mechanism of the Higgs boson being dominant, it can be largely suppressed compared to the VBF channel by requiring a large rapidity separation between the leading two jets (called the tagging jets), a large invariant mass of the corresponding dijet system, and an additional veto on jet activity in the central rapidity region. Higher-order QCD and EW corrections are known to alter the signal rate only insignificantly under these cuts [28][29][30][31][32]. In the present article, we investigate to which extent QCD corrections alter the background.
We also present a comprehensive analysis of H + 3 jets production in its own right. The high phenomenological relevance of the gluon fusion channel has spurred an unprecedented effort in the theoretical community. For Higgs boson production in conjunction with up to two jets, the NLO corrections have been available for some time [33][34][35][36][37][38][39][40][41][42][43][44][45][46][47][48][49][50][51]. In addition it has been shown how to include parton shower resummation on top of the fixed-order result. Especially, in the case of Higgs boson plus two jets, this has been demonstrated recently [52,53]. The first computation of Higgs boson production in association with three jets was accomplished just two years ago [54]. The development of the improved reduction algorithm Ninja [55][56][57] to compute the virtual corrections then allowed a first phenomenological analysis that was published in Ref. [58]. For the Higgs boson plus one jet final state, the full NNLO QCD results were computed lately [59][60][61][62] whereas the NNLO results for inclusive Higgs boson production have been around for a decade [63][64][65]. The frontier concerning the latter however has been pushed further with a seminal calculation of the NNNLO corrections that has just been finalized [66][67][68][69][70][71][72][73][74][75].
In this article we focus on the behavior of the NLO results under different scale choices, and on the scaling with increasing number of jets. We also test for potential high-energy effects, which may require resummation [76,77]. We note that the scaling of jet cross sections with increasing number of jets is comparable to W plus jets production [78,79], once the number of jets is large enough. This can be understood using jet calculus [80,81]. We note that the effect could be tested experimentally to a high accuracy by measuring the ratio of jet rate ratios in different processes. Systematic uncertainties should cancel to a large extent in the analyses.
In our calculations we use an effective gluon-to-Higgs coupling, where the top quark is treated as an infinitely heavy particle. Although this requires the top mass to be much larger than the Higgs mass, the approximation has been shown to work very well [82][83][84].
This article is organized as follows: Section 2 presents the technical prerequisites for our calculation. Section 3 discusses the properties of our results under generic cuts. Section 4 finally focuses on the experimentally most interesting case of VBF background predictions, and on the behavior under different selection cuts for the tagging jets. Section 5 contains our conclusions and an outlook.

Calculational setup
The calculation of the NLO corrections is performed by combining the two automated programs GoSam [85,86] for the generation and evaluation of the virtual one-loop amplitudes, and the Monte Carlo event generator Sherpa [87]. The two are linked using the Binoth Les Houches Accord [88,89], a standard for event and parameter passing between one-loop programs and Monte-Carlo generators.

Virtual corrections
The GoSam framework is based on an algebraic generation of d-dimensional integrand using a Feynman diagrammatic approach, employing QGraf [90] and Form [91,92] for the diagram generation, and Spinney [93], Haggies [94] and Form to write an optimized Fortran output. For the reduction of the tensor integrals we used Ninja [55][56][57], an automated package for integrand reduction via Laurent expansion. Alternatively one can use other reduction techniques such as integrand reduction in the OPP method [95][96][97] as implemented in Samurai [98] or methods of tensor integral reduction as implemented in Golem95 [99][100][101]. The resulting scalar integrals are evaluated using OneLoop [102]. As already anticipated, with respect to the very first computation [54], the integrand reduction performed with Ninja allows an improved timing and also a better stability in the evaluation of the virtual. This was a crucial aspect for both the analysis presented here and also the previous one presented in [58].

Real emission and phase space integration
The calculation of tree-level matrix elements, real emission contributions and subtraction terms in the Catani-Seymour framework [103], as well as phase space integration have been performed using Sherpa [87]. We have used the matrix element generator Comix [104,105]. We emphasize that this is different to the results obtained previously [54,58], where we used a combination of MadGraph 4 [106,107], MadDipole [108,109] and MadEvent [110] for the calculation of real emission matrix elements, subtraction terms and phase space integration of the real emission contribution. We recalculated the results obtained in [54,58] with the Sherpa setup and found complete agreement. This is a very strong consistency check on both results.

Definitions relevant to the calculation
In the approximation of an infinitely large top mass, the Higgs coupling to gluons, which at LO is mediated by a top-quark loop, becomes independent of m t , and can be described by an effective operator [111], as In the MS scheme, the coefficient c i is given by [33,34] c i = − α s 3πv 1 + 11 4π α s + O(α 3 s ), (2.2) in terms of the Higgs vacuum expectation value v, set to v = 246 GeV. The operator (2.1) leads to new Feynman rules, with vertices involving the Higgs field and up to four gluons.
In the absence of accompanying jets, i.e. in the Higgs boson production process, it is natural to evaluate the strong coupling associated with the effective vertex at a scale equal to the Higgs boson mass m H . When further jets are present, the natural scale choice is more ambiguous. One possibility is to keep the two powers of the strong coupling in the effective vertex fixed to m H , while the other powers of α s are computed at a different scale µ R . When choosing this method at NLO, an additional finite correction has to be added to the virtual contribution, taking into account the fact that the strong couplings in the Born contribution are evaluated at different scales [112].
In the case of Higgs boson production associated with N jets the general formula for computing the cross section is where B, V and R are the Born, the virtual and the real contribution respectively, and β 0 is the one loop beta function coefficient: In Section 2.5 we will discuss and compare different settings for the renormalization and factorization scale and their impact on the theoretical predictions.

Ntuples generation and usage
In order to simplify our analysis, the numerical results have been produced and stored in the form of Root Ntuples. This format is particularly useful for changing cuts and observables, as they can be directly extracted from the given set of Ntuples without having to generate new results. It also simplifies changing PDFs and scale choices, as this can be achieved without having to reevaluate the matrix elements, which is the most time consuming contribution. The writing of the Ntuples is implemented in Sherpa, and we refer to [113] for more details. Table 1 summarizes the total number of Ntuple files which were generated for the analyses presented in the next sections. The files are available upon request.

Kinematic requirements and parameter settings
The following numerical results have been calculated for center of mass energies of 8 and 13 TeV. We discuss first a set of basic cuts and second a set of VBF cuts. In both cases jets are clustered using the anti-kt algorithm [114,115] as implemented in the FastJet package [116]. If not specified differently, the jet radius has been set to R = 0.4 and the PDF set was CT10nlo [117]. In the basic selection, the following set of jet cuts is applied For the VBF analysis additional cuts on the jets are imposed, namely m j 1 j 2 > 400 GeV, |∆y j 1 ,j 2 | > 2.8 . If not stated otherwise the tagging jets j 1 and j 2 are defined as ordered in p T , i.e. the jets with the highest and second highest transverse momentum. In sections 3.5 and 4 we will also discuss different tagging schemes. Concerning the choice of renormalization and factorization scale we define our central scale to be given by where the sum runs over all partons accompanying the Higgs boson in the event. This scale was shown to be sensible in other multi-leg calculations [118]. However, in the Higgs-production process it is not obvious that this scale should be adopted for all strong couplings. Because the top quark has been integrated out, one could argue that the strong coupling associated to the effective Higgs-gluon vertex should be fixed to the natural scale of this vertex, and that this would be of the order of m H . Furthermore it is not obvious whether one should vary this scale together with the scales for the other powers of α s when performing the usual scale variation to assess the theoretical uncertainties. We note that there is no 'correct' choice and that the differences between the choices are formally of higher order. It is therefore interesting to investigate their effect on phenomenological results. We consider three different scale choices, defined as The presence of the factor x indicates that this scale is varied in the range x ∈ [0.5 . . .

2].
A variant of scale A, where the two powers of α s evaluated at m H do not change with varying x was used in previous computations of H + 3 jets [54,58]. This however leads to a somewhat artificial reduction of the scale uncertainty, and therefore we do not consider the corresponding scale in the following.

Higgs boson plus jets phenomenology
In this section we discuss the results that have been obtained with the set of basic gluon fusion cuts as described in Eq. (2.5).

Cross sections, scale dependence and technicalities
We start our analysis by presenting the results for the inclusive cross sections using basic gluon fusion cuts and the different scale choices shown above. Fig. 1 for n + 2, 3, 4 at LO and n = 2, 3 at NLO accuracy. Independent of the scale choice and order in perturbation theory we see that the ratio r 2/1 is larger than the ratio r 3/2 . One might expect this behavior for two reasons: On the one hand the H + 2 jets process has new kinematical topologies compared to H + 1 jet. In particular, it is possible that the Higgs boson is (almost) at rest in a H + 2 jets final state, while it must always carry some transverse momentum in a H + 1 jet final state, at least when computed at LO. This means that the phase space is larger in the H + 2 jets case. On the other hand, new partonic channels open up for the H + 2 jets process which do not exist in the H + 1 jet case. Adding a third jet neither opens additional phase nor further partonic channels.
Regarding the scale choices we observe that except for scale C the ratios are largely independent of the choice and that the ratios between LO and NLO are very similar, in particular for scale B there is a very good agreement. Scale C works well for the NLO results but yields an increased ratio for the LO results. This is an indication that this is not a sensible choice. Going from 8 TeV to 13 TeV, the jet rate ratios increase, as may be expected due to the increase in phase space for jet production.
One interesting point in this context is the fact that for scale B the ratio r 3/2 is very similar between LO and NLO. Given the additional evidence from W/Z +jets calculations, which also points towards dynamical scales being more suitable for multi-leg processes, we choose scale B (2.8b) as our default. It is also striking that the NLO ratio is almost identical in the r 2/1 and in the r 3/2 case. This hints at the possibility to extrapolate to the r 4/3 case. The idea has been worked out in Ref. [79] for the case of W production in association with jets, and it is supported by jet calculus [80,81]. However, in our case the fit to r n/n−1 that was performed in Ref. [79] would be trivial, with the first nontrivial check of staircase scaling being the ratio r 4/3 itself. Fig. 1 also suggests that the quality of the fit will depend on the scale choice. Furthermore, it remains to be seen whether the good agreement for the ratios will hold for differential distributions.
Nevertheless a very interesting experimental opportunity opens up at this point: Let us assume a set of hard processes that contain all possible kinematical topologies, and also all possible partonic channels, like W/Z + 2 jets production and H + 2 jets production. Then it should be possible to test the universality of staircase scaling to a very good precision by measuring the ratio of jet rate ratios r n/n−1 for different hard processes, like W/Z and H production. The differences between the dominant partonic production mechanisms at leading, next-to-leading and next-to-next-to-leading order will be reduced at higher jet multiplicity, and the ratio should be largely independent of n. A major deviation from this behavior would very likely signal the presence of new physics.
A more detailed summary of the total cross sections and their ratios is presented in Table 2. It lists the total cross sections for the one-, two-and three-jet process as well as it theoretical uncertainties from scale variations, the global K-factors, cross section ratios, labelled with r, and jet fractions denoted by f .
It is interesting to investigate the NLO result and its stability as a function of the jet radius, R. The corresponding results are shown in Fig. 2 for 8 TeV (left) and 13 TeV (right). The cross sections are normalized to σ tot (R = 0.4). At leading order, the cross section decreases with increasing R, because the probability for two partons to be clustered into a single jet increases, leading to a rejection of the event. From this basic consideration it is clear that the dependence on the jet radius is stronger for H + 3 jets than for H + 2 jets, simply because with an increasing number of jets, the average distance in R-space between two jets becomes smaller.
At NLO the situation is more involved. The additional parton present in the real  corrections leads, on average, to softer partons than at LO. For small values of R the partons will not be clustered which means that there is an increased probability that the event will not pass the cuts because there are not enough jets above the p T threshold. If the jet radius is increased, partons that are relatively close in R-space but where each of them is not above the threshold, get clustered into a jet. If the radius is increased further we end up with the same situation as for the leading order result and the cross section decreases again. Therefore we expect a change of shape and the interplay between the behavior of the leading order results with the one of the real emission contribution leads to a stabilization of the R-dependence. In general we see that the effects between different scale choices are very small, also largely owing to the fact that the plots in Fig. 2 show ratios, so differences due to different runnings and factors of α s cancel out. Fig. 3    Independent of the jet multiplicity we observe the typical change of shape when going from LO to NLO. Both processes have their maximum approximately at half of the central scale, except for scale C in the 3 jet case, which is shifted towards higher values. This means that taking the central scale and varying up and down by a factor of two yields a reliable 0.5 1 2 x, µR = µF = x µ0  estimation of the theoretical uncertainties. The plots show the results for three different PDF sets, CT10nlo [117], MSTW08 [119] and NNPDF23 [120]. This is compared to the scale choices A, B and C. For scale C we show the result only for one PDF set. Although the different PDF sets lead to slightly different results, their effect is considerably smaller than a different choice of the scale. If we compare the 2-jet result and the 3-jet result we see that for the former the effects of scale and PDF choice leads to a broader range of results, i.e. the spread between the single curves is bigger compared to the 3-jet case, where the curves (except for scale C) seem to be more bundled. For the 2-jet process the scale C is in quite good agreement with the other scales, which is clearly not the case for the 3-jet process. This indicates that this scale (i.e. the Higgs mass) is smaller than the other scales hence shifting curves to higher values of x. Another interesting point is that for both jet multiplicities the shapes are almost independent of the center of mass energy, the plots for 8 TeV and 13 TeV are very similar. Only for scale choice C we find a visible deviation, which is not surprising as it is a fixed scale and therefore does not account for a change in the center of mass energy. Fig. 4 shows the exclusive jet cross sections for Higgs plus one, two and three jets for both 8 and 13 TeV. At NLO a H + n jets process contributes of course to two jet multiplicities,  For each of the three processes two bins are plotted. The first contains its exclusive NLO contribution, the second the contribution to the n+1 process, i.e. the real emission with LO accuracy. The darker shaded areas denote the uncertainty from scale variation.
to the n-jet bin and to the (n + 1)-jet bin. The contribution to the n-jet bin is given at NLO accuracy, whereas the (n + 1)-jet contribution is only present at LO accuracy, as it is given by the real emission contribution. For each subprocess, the left bin contains the exclusive contribution to the n-jet bin, the right plot its contribution to the (n + 1)-jet bin. As we have used the same set of NLO PDFs for both LO and NLO contributions the real emission contribution of the n-jet process is exactly equal to the LO contribution of the (n + 1)-jet process. The dark shaded areas on the different jet-bins denote the theoretical uncertainty stemming from the scale variation. As the contribution to the (n + 1)-jet bin is only given at leading order accuracy the error bars are substantially larger compared to the n-jet bin, which is given at NLO accuracy. It is important to mention that for all three subprocesses the (n + 1)-jet contribution constitutes a substantial fraction of the total cross section. This is particularly important when investigating observables that separate the two contributions, as in the case of vetoed cross sections. We will return to this point in Sec. 4, when dealing with VBF topologies. Another interesting class of observables which is somewhat related to VBF topologies since it requires at least two jets in the final state, is the one combining the Higgs boson and the two leading-p T jets momenta. These observables are particularly interesting because they are directly sensitive to any additional radiation beyond the two leading jets. We start with the transverse momentum distribution of the system consisting of the Higgs and the two leading jets p T, Hj 1 j 2 . Fig. 5 shows this observable, computed using H + 2 jets NLO and H + 3 jets LO and NLO, both at 8 and 13 TeV. To better disentangle the two energies, the distributions for 8 TeV were divided by a factor of 10. The lower part of the plot shows the LO and NLO H + 3 jets curves normalized to the NLO H + 2 jets predictions separately for the two center of mass energies.
We are using the same PDF set for both LO and NLO calculations, hence the NLO H + 2 jets and the LO H + 3 jets curves are identical. They predict the transverse momentum spectrum of a third parton at LO. In the H + 3 jets case, the jet-p T threshold at 30 GeV   Figure 5: The transverse momentum distribution of the combined Higgs boson plus associated leading dijet system (5a), and the distribution of their relative azimuthal angle, ∆φ H, j1j2 , (5b) at both LHC center-of-mass energies of 13 TeV as well as 8 TeV.
is made explicit. In the H + 2 jets sample instead the third parton can become arbitrarily soft, leading to a distribution which diverges for p T, Hj 1 j 2 → 0, with the divergence being canceled by the virtual corrections that contribute to the first bin of the spectrum only.
Our H + 3 jets NLO calculation adds additional, well-known features to this spectrum: Sudakov suppression effects around the jet-p T threshold, and NLO K factors at larger values of p T, Hj 1 j 2 . Kinematical configurations with four jets or with three jets and an unresolved parton lead to p T balancing as well as p T enhancing effects such that we find large, O(3), fairly constant corrections for higher p T, Hj 1 j 2 as well as a contribution to the spectrum below the p T threshold. Universal parts of the virtual corrections in combination with kinematical effects in the radiative corrections lead to a depletion right above the threshold. The reduction of scale uncertainty when going from H + 3 jets at LO to H + 3 jets at NLO is small at large p T, Hj 1 j 2 , anticipating a large influence of higher-multiplicity processes. This will be discussed in Sec. 3.4. In contrast to other H + 3 jets NLO distributions, we find a much more symmetric uncertainty, which is again a consequence of the dominance of the four-jet contributions that feature a LO scale variation 1 . Figure 5 (right) shows the azimuthal separation between the Higgs boson and the two 1 On a more quantitative level, we note that the 'BVI' contributions of the pT, Hj 1 j 2 and pT,j 3 distributions are exactly the same while the 'RS' ones of the former are considerably harder than those of the latter. This leads to a different cancellation pattern when combining the scale varied 'BVI' and 'RS' predictions (they work in opposite directions). For the pT, Hj 1 j 2 , we then find the 'RS' uncertainties to be the dominating ones for pT, Hj 1 j 2 60 GeV.
leading transverse momentum jets, ∆φ H, j 1 j 2 . The peak at π is common to all curves and indicates that the Higgs boson is preferably recoiling against the jet-jet system. In fact, this is the only possible kinematical configuration in H + 2 jets at LO, therefore the spread to smaller values of ∆φ H, j 1 j 2 indicates the size of the higher-order corrections. Correspondingly, the curves for H + 2 jets at NLO show a very large scale dependence over the full spectrum, the only exception being the bin at π, which contains the singular real emission configurations and the virtual corrections. In contrast to p T, Hj 1 j 2 , the LO predictions for H + 3 jets do not overlap with the NLO results for H + 2 jets, as real radiation can become relatively soft in the H + 2 jets case, while still contributing to the spectrum at ∆φ H, j 1 j 2 < 3, i.e. outside the bin at π. In other words the gap between the H + 3 jets and H + 2 jets curves is filled by events that contribute to the p T, Hj 1 j 2 spectrum below 30 GeV. The apparent difference in scale uncertainty between H + 3 jets at LO and H + 2 jets at NLO shown in the ratio plots is an artifact of the different central value of these predictions. Our H + 3 jets NLO calculation allows to describe ∆φ H, j 1 j 2 at true NLO accuracy. For both collider energies the relative size of the corrections is more than a factor of 2 for ∆φ H, j 1 j 2 → 0 and decreases to about 40% for ∆φ H, j 1 j 2 → π. We conclude this section discussing two more technicalities that are of relevance to all Higgs boson plus jets final states. The left panel of Fig. 6 shows the azimuthal separation between the Higgs boson and the leading p T jet for the three different final state multiplicities considered. Increasing the number of jets has a very dramatic effect on this observable. In fact for H + 1 jet, deviations from a pure back-to-back configuration are only possible through real radiation. The distribution is therefore only LO accurate and displays a large uncertainty band. For H + 2 jets the azimuthal separation in a Born-like event cannot exceed 90 degrees, because the Higgs boson and the leading jet cannot recoil together against a second softer jet. This however becomes possible in the presence of a third jet originating from the real radiation contribution. For this reason the H + 2 jets curve features a reduced uncertainty band for π/2 < ∆φ H,j 1 < π and a larger uncertainty band for smaller angles. This also explains the large drop of the H + 1 jet curve for angles ∆φ H,j 1 < π/2. The difference between the LO H + 2 jets distribution, which vanishes at ∆φ H,j 1 = π/2 and the NLO H + 1 jet is due to real-radiation events which have the hardest jet with very large pseudorapidity (|η| > 4.4), therefore falling outside the applied jet cuts. Thus the events populating the distribution of H + 1 jet below π/2 have large missing transverse energy. A full NLO accuracy is finally reached over the whole kinematical range for H + 3 jets. Despite the several jets present in the final state, there is a clear preference for the Higgs boson to recoil against the hardest jet.
As theĤ T scale plays a central role in our calculations, we are interested in how it compares to other reasonable choices characterizing the transverse activity of Higgs boson occurrences in association with jets. Using the generalized transverse mass definitions, m T,1...n = given in Ref. [121], we can construct observables that may serve as a proxy for alternative scale choices. At the level of observables, i.e. where jets rather than partons are used to build the variable, we can investigate to what extent the alternative choices lead to deviations from  Figure 6: Azimuthal separation between the Higgs boson and the hardest jet (6a) for H + 1 jet, H + 2 jets and H + 3 jets production at 13 TeV LHC center-of-mass energy. Note that the results for the H + 1 jet and H + 2 jets channels have been multiplied by additional factors of 4 and 2, respectively. While both the LO and NLO predictions are shown in the main plot, the lower subplot depicts differential ratios, R k/n , between various H + jet cross sections at NLO only. Alternative transverse mass/momentum observables (6b) in comparison to the H T observable as derived from our default scale choice. NLO results are shown for all three H + n jets channels (separated by additional factors of 10) and an LHC collision energy of 13 TeV. For each observable, the differential K-factor is given in the associated ratio plot where again separation factors have been applied to enhance the visibility of the results. theĤ T default. We then obtain at least a qualitative understanding of the possible size of theoretical uncertainties that arise from a variation of the scale's functional form. This theoretical scale dependence should be kept in mind as it may turn out to be important in the interpretation of H + n jets cross section measurements.
The results of this comparison are presented in the right panel of Fig. 6 where for the three different jet multiplicity final states, the main plot shows the transverse observables including H T (as green curves) while the ratio plot shows the NLO vs. LO K-factors in dependence of the respective observable. These K-factors are noticed to show a great amount of overlap supporting the fact that the higher-order corrections to the different observables lead to rather similar effects. As before the bands indicate the size of standard scale B variations by factors of two. Note that for better visibility, the H + 1, 2, 3 jets results are separated by additional factors of 10 (by factors of 10 and 4) in the main (ratio) plot. The differential K-factors of the ratio plot are moreover divided by their respective inclusive H + n jets K-factors. The black curves, A T ≡ M T,H,j 1 ...jn , represent a choice whose threshold is always given by m H and does not shift upwards with increasing jet multiplicity as it occurs for H T . In the H + 1 jet case, the differences are small while for H + 2, 3 jets, they are more pronounced as A T dives into the softer region and thus gives rise to softer tails. In the H + 2, 3 jets cases, we also show the effects of choosing a more VBF-inspired M T scale (orange curves). Although the soft region is covered even more widely, M T,j 1 j 2 ,H provides much harder tails compared to A T with no overlap among them. Again, softer tails but similar coverage of the soft region can be achieved by switching to m T,Hj 1 j 2 (purple curve) which we only demonstrate for H + 3 jets. For H + 1 jet events, an m T based choice such as a T ≡ m T,Hj 1 ...jn (depicted by the lightblue shaded curve) would then offer a scale setting very similar to H T but neglecting the Higgs boson mass. This is depicted by the curve shaded in lightblue. Based on the overall behavior of the alternative scales apart from M T,j 1 j 2 ,H , they can be anticipated to yield larger H + jets cross sections, for some cases they may even be outside the uncertainty range as the standard scale variation bands do not overlap in all cases. This is where further investigation will have to step/set in.

Single-particle observables
We now turn to the discussion of one-particle or one-jet observables. Figures 7b-7d show the transverse momentum distribution of the Higgs in the H + 3 jets process for the three different scale choices A, B and C of Eq. (2.8), while Fig. 7a shows the results for the different scales normalized to the NLO result for scale A. The advantage of scale B is the flatness of the K-factor over the entire p T range. This supports our choice to make scale B the default scale. For the lower p T region up to ∼ 250 GeV also scale C seems to be a sensible choice. However it completely breaks down for higher p T , and the K-factor can even become negative. 2 The different behaviors of the scales are more pronounced in Fig. 7a. For scale C the leading order curve shows the opposite behavior compared to the NLO curve. For scale A the situation is not as bad, but the K-factor still has a strong dependence on the p T of the Higgs. This is much better for scale B, for this choice the LO and NLO curve are almost parallel, which is a further confirmation that usingĤ T for all factors of α s is a sensible choice.
In Fig. 8 we compare the p T distributions of the Higgs and the associated jets at center of mass energy of 8 TeV (left plots) with the 13 TeV (right plots) prediction. The upper row shows the LO and NLO result for the Higgs p T . One observes that the NLO corrections lead only to a mild change of the shape, with a small decrease of the K-factor in the high-p T tail. Comparing the two center of mass energies, we also see that the peak of the distribution remains in the region slightly below 100 GeV, for both LO and NLO. The distribution is steeper for 8 TeV, and the theoretical uncertainties at NLO are slightly smaller at 13 TeV.
In the lower row we show the transverse momentum distribution for the three leading-p T jets. For a better visibility of the different curves we have rescaled the first jet by a factor of 10, the third jet by a factor of 0.1, the second jet is shown unchanged. On a logarithmic scale this leads to a vertical shift of the different curves, preserving however their shapes and the size of the uncertainty bands. It is therefore possible to better appreciate the different behavior of the curves over the considered kinematical range. In the ratio plots below we show the NLO/LO ratio for each pair of curves, this means that each jet distribution is normalized to its own LO distribution. Looking at the distributions we observe to a larger   Figure 8: The p T distributions of the Higgs boson (upper row) and the three hardest jets (lower row) for 8 TeV (l.h.s.) and 13 TeV (r.h.s.). For the p T distribution of the jets, the curves for the first and the third jet have been rescaled by a factor of 10 and 1/10 respectively. extent the patterns seen for the Higgs-p T : The peak of the distribution is almost insensitive to a change of the center of mass energy, but in the 8 TeV result the tail of the curve decreases faster. For the first and the second jet one also obtains a slight decrease in the K-factor towards higher energies and a small reduction of the theoretical uncertainties when going from 8 to 13 TeV. The third jet however behaves a bit differently. The K-factor is almost flat and the size of the scale uncertainties remains constant towards higher energies whereas for the first two jets one obtains a slight reduction.
Another important observable is the rapidity distribution which is shown in Fig. 9 for the Higgs (upper row) and the three jets (lower row) again for both 8 TeV (left column) and GoSam + Sherpa pp → H + 3 jets at 13 TeV Higgs rapidity GoSam + Sherpa pp → H + 3 jets at 8 TeV GoSam + Sherpa pp → H + 3 jets at 13 TeV Figure 9: The rapidity distributions of the Higgs boson (upper row) and the three hardest jets (lower row) for 8 TeV (l.h.s.) and 13 TeV (r.h.s.). For the rapidity distribution of the jets, the curves for the first and the third jet have been rescaled by a factor of 10 and 0.1 respectively.
13 TeV (right column). Starting with the rapidity of the Higgs we first of all observe a very flat K-factor across the whole range of the distribution. Comparing the 8 TeV with the 13 TeV result this still holds and is accompanied by a mild reduction of the scale uncertainty in the 13 TeV result, as already observed for the p T distributions. However the shape of the distribution changes for both LO and NLO predictions when increasing the center of mass energy. For 8 TeV the fraction of Higgs particles in the central region is higher and we see a steeper decline of the cross section towards large values of the rapidity. This is much less pronounced for the 13 TeV case. There we get a relative enhancement of the regions with a large rapidity.   For the rapidity distribution of the jets in the lower row of Fig. 9 we have applied the same scaling procedure as for the p T distribution in order to obtain a better readability of the plots. Also for the jets the K-factor is flat to a very good approximation, with a small reduction of the scale uncertainties when going from 8 to 13 TeV. As for the Higgs, one can observe a relative enhancement of the regions with large rapidities in the 13 TeV result. This is due to the fact that the increased center of mass energy increases allows to fill more the phase space corners where the particles are scattered towards the forward/backward regions, while at the same time fulfilling the p T requirements on the jets.
We conclude this section by discussing the impact of higher-order corrections on the wimpiest jet in H + 1 jet, H + 2 jets and H + 3 jets configurations. In NLO calculations for W/Z +jets performed with scale B, it was noted that the transverse momentum spectrum of this jet exhibits a flat K-factor [123]. We test for the effect in Higgs+jets production for the first time, and we find a similar behavior, as exemplified in Fig. 10 (left). The green curves show the first jet in H + 1 jet, the blue ones the second jet in H + 2 jets, and the red ones the third jet in H + 3 jets. The ratio plots show the transverse momentum dependent K-factors for the three cases, scaled by factors of 20/3 (H + 1 jet), 2 (H + 2 jets), and 2/3 (H + 3 jets). It is evident that the K-factors are not only flat over the entire range, but they are also very similar for all three calculations. Fig. 10 (right) shows the same analysis for the scale choice A. In this case the K-factors have a larger transverse momentum dependence. Their average also differs much more between the different jet multiplicities.

Multi-particle observables and correlations
Multi-particle or multi-jet observables are at the core of any measurement that involves many objects in the final state. They allow to test QCD dynamics at the LHC to an unprecedented precision, and they often reveal inappropriate modeling by LO calculations or by MC event generators. Figure 11 shows the dijet invariant mass distribution, m j i j k , for each of the three possible combination (i = 1, k = 2), (i = 1, k = 3), (i = 2, k = 3), where the jets are ordered in transverse momentum. The left panels show results for 8 TeV, the right panels are for 13 TeV. In order to avoid overlaps in the figure, the curves for m j 1 j 2 are scaled by a factor of 10, whereas the curves for m j 2 j 3 are scaled by a factor 1/10. We observe a steeper decrease of the distribution in the 8 TeV case and also for softer jets, as expected. Comparing the left and right panels, one observes that the maximum of the curves is to a good approximation independent of the collider energy.
In the lower part of the plots we show separately the K-factors for the three distributions. Apart from the expected reduction of the theoretical uncertainty we observe a K-factor that is approximately constant for both energies and for all the three jet combinations. Only the invariant mass of the two leading jets, m j 1 j 2 , shows a small decrease in the relative size of the NLO corrections for higher values, in particular at 13 TeV. This is of course to a large extent due to the scale choice.
A further observable that is particularly important in view of vector boson fusion processes is the azimuthal angle ∆φ between jets, as shown in Fig. 12 right. As for the invariant masses one observes a flat K-factor for all combinations and for both energies. The shape however changes when increasing the energy. This is particularly visible in the peak regions, which are slightly more pronounced at 13 TeV. Their position is related to the choice of the jet radius, and in particular with configurations where the two jets have ∆y = 0 and the separation in φ exactly corresponds to the chosen value of R. Further multi-particles observables, less related to multi-jet QCD dynamics and more specific to Higgs production in association with jets were already shown before in Section 3.1 in Figs. 5 and 6.

Multi-jet ratios at NLO
In this section we ask the question how observables change in the presence of additional QCD radiation, starting with a core process of H + 1 jet. The VBF topology requires at least two jets, but our observations hold in both cases, largely because the phase space available to QCD radiation at the LHC is tremendous. This has been pointed out many times before, and a particularly nice example of the effects is given in Ref. [124]. In this section we use our NLO results for H + 3 jets to make some of the statements explicit.
The visible energy, H T , is the classical example of a 1-jet inclusive observable which is impacted by higher-order radiative effects, simply because it sums the Higgs p T and all jet transverse momenta, irrespective of their correlations in azimuth. The corresponding spectrum is shown in Fig. 13 for 8 TeV (left) and 13 TeV (right). The upper plots show the distribution for one, two and three jets. Unless stated otherwise, the jet multiplicity is exclusive, i.e. with a veto on an additional jet activity. The one jet and the three jet subprocesses are shown twice, once for the exclusive case, and once for the inclusive case,        labelled by 'incl'. The lower panel shows the ratio of each contribution to the inclusive H + 1 jet prediction. In the following we will only discuss the results for 13 TeV. The exclusive H + 1 jet contribution dominates below 200 GeV, but it falls off steeply towards higher values of H T . The exclusive H + 2 jets contribution is negligible in the low H T region, but just above 250 GeV it takes over from the H + 1 jet contribution and dominates the H T spectrum. The H + 3 jets exclusive contribution shows a similar pattern. Being completely negligible up to around 250 GeV, its relative importance rises quickly and it crosses the exclusive H + 1 jet curve at around 350 GeV. In the range of 500-600 GeV it then becomes equally important as the exclusive H + 2 jets contribution. Comparing the exclusive H + 3 jets to the inclusive H + 3 jets result, we observe that even the fourth jet plays a very important role. Above 300 GeV the inclusive H + 3 jets result is the second largest contribution to the H T spectrum, and it rises to 80 per cent of the inclusive H + 1 jet at around 500 GeV.
In conclusion, processes of higher multiplicity give rise to important contributions to the H T spectrum. This does not only happen in the high H T region but already at around 250 GeV. Including higher multiplicities is therefore important for a reliable prediction of the distribution.
In the plots of the middle row of Fig. 13 the same constellation is shown without the H + 1 jet process. Instead, all the contributions are now normalized to the inclusive H + 2 jets result. Similar to the H + 1 jet case, the H + 2 jets process the exclusive contribution is dominant in the low H T range but its contribution drops fast with increasing H T . In the range around 500 GeV the exclusive H + 3 jets is already of the same size. This has to be kept in mind, particularly for VBF searches.
Finally the plots of the lowest row show the inclusive and exclusive 3 jet predictions and in the ratio plot the results were normalized to the inclusive cross section. The pattern is very similar to the previous cases, namely that the exclusive contribution dominates the low H T region and becomes increasingly unimportant in the high H T range.
It is to be expected from the definition of the observable, that H T is largely influenced by additional radiation, as shown above. What is more striking though, is that also more inclusive observables, like the Higgs boson transverse momentum, are susceptible to the same effect. This is exemplified in Fig. 14. We observe that higher-multiplicity processes play an equally important role for the Higgs p T spectrum as they do for the visible energy.
A comparison of the inclusive differential cross sections for both the Higgs p T and for the H T distribution is shown in Fig. 15. The left panels show results for 8 TeV, the right panels for 13 TeV. For better visibility, the H + 1 jet contribution has been multiplied by a factor of 10, and the H + 2 jets result has been multiplied by a factor of 2. The ratio plots are not scaled. They show the ratio of the (n + 1)-jet cross section normalized to the n-jet cross section. This allows to judge the relative importance of a contribution compared to the one is which the jet multiplicity is one unit lower. Starting with the Higgs p T distribution we see the same qualitative behavior as in Figs. 13 and 14. The sole difference is that the effects of higher jet multiplicities are more pronounced in the case of inclusive cross sections. In the high tail the H + 2 jets contribution becomes as big as the H + 1 jet contribution and the H + 3 jets contribution reaches about 60 per cent of the H + 2 jets contribution. Comparing 8 and 13 TeV one also observes an enhancement of this effect for higher energies. The lower row shows the same for the H T observable. Again the pattern is qualitatively the same as for the exclusive case shown in Fig. 13 but also more pronounced in the inclusive case. The H + 2 jets contribution supersedes the H + 1 jet already at moderate values of H T for both 8 and 13 TeV and the H + 3 jets contribution easily makes up 60-80 per cent of the H + 2 jets result.
The same behavior as discussed in previous plots is found in the transverse momentum distribution of the leading jet, shown on the l.h.s. of Fig. 16. It can be compared to the transverse momentum of the Higgs, with an increasing importance of the higher jet multiplicities when increasing the transverse momentum. However for the rapidity of the leading jet, shown on the r.h.s. of Fig. 16 this is not the case. GoSam + Sherpa pp → H + 1, 2, 3 jets at 13 TeV H+1 incl H+2 incl H+3 incl Figure 16: Successive ratios of inclusive H + n jets differential cross sections at 13 TeV for the transverse momentum and rapidity distributions of the leading jet.

Comparing tagging jet selections and testing high-energy effects
Typically the definition of tagging jets is based on the jet transverse momentum. An alternative that is more suitable for the VBF Higgs analysis to be investigated as well is to order the jets according to their rapidity and choosing the most forward to the most backward jets. We will denote the first option as p T -tagging and the latter as y-tagging. y-tagging is theoretically motivated not only because of Higgs coupling measurements in the VBF channel, but also because it allows to confirm the universal properties of QCD in the high-energy limit. In this limit, t-channel gluon exchange dominates the cross section. Jet production can then be described by Lipatov effective vertices that are resummed in the BFKL equation [125][126][127]. Event generators based on a Monte-Carlo solution to this equation [128][129][130][131][132][133] were constructed for the LHC in order to describe the relevant event topologies at high precision [76,77,[134][135][136]. It is interesting to test how much phase space a calculation performed in collinear factorization can cover before high-energy resummation becomes relevant. Our calculation allows to study this question in Higgs-boson production through gluon fusion for the very first time.
Naturally, the observables most affected by the two tagging options above are the rapidity distance of the two tag jets and their inclusive transverse momentum. The spectrum for the latter is shown in Fig. 17 together with the transverse momentum distribution of the subleading jets. The rapidity distance is instead shown on the left in Fig. 18 whereas on the right we display the (averaged) rapidity difference between the tag jets and the Higgs boson, y * H,j 1 j 2 . This latter obsevable is defined as y * H,j 1 j 2 = |y H − (y j 1 + y j 2 )/2| . The ratio of H + 3 jets LO and H + 2 jets NLO result (blue shaded region), and the ratio of H + 3 jets NLO and H + 2 jets NLO (red shaded area), both using p T -tagging. And, finally, the same two ratios, but this time using y-tagging. We will call the latter two ratios R and R (y) 3/2 , respectively. Since the subleading jets are present only in H + 3 jets beyond LO, the right plot in Fig. 17 does not follow the same conventions. There the H + 3 jets LO and NLO curves are shown together with their ratio for the two tagging schemes. Figure 18 (left) shows a significant shift towards larger rapidity difference in y-tagging, as compared to p T -tagging. Jets are still predominantly produced centrally, because we apply democratic transverse momentum cuts, and the phase space for centrally produced jets is larger. Figure 17 (left) exemplifies the change in p T -spectra, with the y-selection leading to softer tag jets than the p T -selection. This is most easily seen in the upper ratio plot. For both p T -and y-tagging the H + 3 jets NLO results are very similar to the LO results, albeit with reduced scale uncertainty. Compared to the H + 2 jets NLO calculation, the shape of tag jet distributions changes mostly in the low-p T region. This indicates that the emission of a third jet, especially at large p T , is very likely. The large radiative corrections are described  Figure 18: NLO predictions for rapidity difference observables in H + 2 jets and H + 3 jets production at the 13 TeV LHC. The rapidity separation between the tagging jets is presented on the left while the y * variable measuring the distance between the Higgs boson and the two tagging jets is depicted in the right panel. Distributions are shown for the two tagging jet definitions, p T jet tagging and y jet-tagging. See text for more details, and definition of the depicted ratios. more precisely by a H + 3 jets than by a H + 2 jets calculation, leading to largely reduced scale uncertainties. This confirms the findings of Sec. 3.4.
Another important feature appears in the lower ratio plots on the left of Figs. 17-18. On the one hand, R On the other hand, R (y) 3/2 (∆y) is roughly proportional to ∆y, with slightly larger slope at NLO, indicating an increasing NLO over LO ratio. We can compare this, on a qualitative level, to the results presented in Ref. [137], where R (y) 3/2 (∆y) has also been computed. This calculation was performed in an approach based on the high-energy resummation (using Hej [77,135]) and compared to results from collinear factorization (using MCFM [37]). The authors observed a considerable discrepancy between the two calculations, particularly for large rapidity differences. In this context it is important to stress that MCFM describes the H + 3 jets topology at LO accuracy only. Comparing the findings with Fig. 18 (left) we note that the discrepancy observed in [137] is largely reduced by the NLO correction to the H + 3 jets process.
It is interesting to consider the subleading jets shown in Figure 17 (right), which are defined for both selections as the jets with highest transverse momentum, excluding tag jets. We observe that the subleading jets have very different p T -spectra in the two selections, which is simply due to the fact that the hardest jet is preferably produced at central rapidity, making it the leading tag jet in the p T -selection, but the first subleading jet in the y-selection. The differential K-factors shown in the middle and lower ratio plots display only modest variation over the entire kinematic range, with the theoretical uncertainty being smaller for y-tagging.
Another important observable is y * H,j 1 j 2 as shown in Fig. 18 (right) for H + 3 jets at 13 TeV. The overall behavior is comparable to the rapidity distance between the tag jets shown on the left, however the inclusive R 3/2 ratios are flatter for y-tagging and show a more pronounced decrease for p T tagging.

Vector boson fusion phenomenology
The production of a Higgs boson in the VBF channel is phenomenologically highly relevant, as it allows to measure the couplings between electroweak gauge bosons and the Higgs boson. It also provides sensitivity to the CP structure of the Higgs couplings [138], as well as access to possible anomalous couplings in both the Higgs sector and the electroweak sector of the Standard Model.
As gluon fusion is an irreducible background to the VBF channel, the challenging task for theory is to provide a precise prediction of its rate compared to the signal. NLO precision for the signal (VBF with up to three jets) has already been achieved [24,25,139,140]. In this section we therefore focus on the background.
The main obstacle is the extraction of the exclusive H + 2 jets cross section in the fiducial region of typical VBF analyses. We have already seen in Sec. 3.4 and Sec. 3.5 that higher-multiplicity final states contribute sizeably to the inclusive cross section. If we extract the effect of a central or global jet veto on the H + 2 jets final state from the NLO H + 2 jets calculation, the prediction is of leading order accuracy and the associated theoretical uncertainty is therefore large. A more reliable fixed-order prediction is derived from a simultaneous calculation of H + 2 jets and H + 3 jets. In this case, one obtains the exclusive H + 2 jets rate as the difference between the inclusive H + 2 jets result and the H + 3 jets result in the vetoed region of the phase space, thus improving on the theoretical  Figure 19: Total cross sections for H + 2 jets and H + 3 jets using VBF cuts and two tagging jet definitions. Left plot is for 8 TeV, right plot for 13 TeV.
accuracy of logarithmically enhanced contributions related to the veto on additional jet activity [141]. The kinematic distribution of H + 3 jets events may also help to devise cuts for improving the purity of an LHC event sample. In this section we therefore provide results for the gluon fusion process when applying the typical VBF selection criteria as described in Eq. (2.6), m j 1 j 2 > 400 GeV, |∆y j 1 ,j 2 | > 2.8 . , (4.1) and we focus in particular on observables where we expect different shapes between signal and background.

Cross sections and scale dependence
We start our discussion with the total cross sections as displayed in Fig. 19. In contrast to Sec. 3.1, which implemented more generic multijet cuts, we refrain from including the H + 1 jet result, as the VBF signal requires at least two jets. Having excluded the fixed scale as a sensible choice in the sections above we only show the two scale choices A and B for comparison. Simultaneously, we include two different definition of the tagging jets: defining them as the two jets with the largest transverse momenta, referred to in the following as "p T -tag", and defining them as the pair which spans the largest rapidity interval between them, referred to as "y-tag". The third (and fourth) jet are then those among the remaining jets with the largest (second largest) transverse momentum. We observe that the choice of the tagging jet scheme has a considerable impact on the total cross section for the H + 3 jets process whereas the H + 2 jets process is almost unaffected by it. This is easily understood as the latter is independent of the tagging scheme at leading order. At NLO a mild dependence is then introduced. This effect gets enhanced for the H + 3 jets process, introducing a difference of the total cross section of almost a factor of two. This shows that the y-tag definition is much more sensitive to additional radiation beyond the two tagging jets. However, independent both of the choice of the tagging scheme and the collider energy we see a good agreement between the LO ratios and the NLO ratios. Similarly, the effect of the scale choice is almost negligible.    20 shows the exclusive jet cross sections for the H + 2 jets and the H + 3 jets process for the two tagging jet definitions at both center of mass energies considered in this paper. We show the contribution of the subprocess to the different jet multiplicities, i.e. the H + 2 jets NLO calculation contributes to the two-jet and the three-jet bins while the H + 3 jets NLO calculation contributes to the three-jet and four-jet bins, where we have to keep in mind that the (n + 1) contribution is only present at leading order accuracy. Comparing these to the results presented in Fig. 4 where no topological cuts were applied it can be observed that the (n + 1)-jet contribution is enhanced in the VBF fiducial region. In the y-tag scheme this effect is a somewhat stronger than for the p T -tag scheme. The same is true comparing the 13 TeV results to the 8 TeV ones. The relative enhancement of the (n + 1)-jet contribution implies that the contribution, which is only described with leading order accuracy, becomes more important. Turning this argument around, this means that the theoretical uncertainty is increased in the VBF fiducial region as the leading order pieces of the calculation have a larger contribution to the total result then in the simple dijet region of the previous section. This stresses the importance of the H + 3 jets calculation also for the VBF region as it allows to determine the radiation of a third jet with NLO accuracy. A detailed summary of the total cross sections of the various contributions, cross section ratios and jet fractions for 8 and 13 TeV is listed in Tab. 3. It shows values for both tagging schemes and gives the precise numbers for the respective K-factors.  . We report the results for both jet tagging definitions described in the text. NLO-to-LO K-factors, K n , for the inclusive 2-jets (n = 2) and 3-jets (n = 3) bin, the cross section ratios r 3/2 , r 4/3 and m-jet fractions, f m , are given in addition. Since the predictions for H+4 jets are only LO accurate, f 4 and r 4/3 coincide.

Differential observables
In order to separate events tagged by the presence of a dijet configuration which is compatible with a VBF process, experimental analyses [7,15,142] rely on multivariate discriminants which are based on boosted decision trees (BDT). The typical discriminating variables used in these BDT are the invariant mass of the tagging jet system m j 1 j 2 , the rapidity separation between the two tagging jets ∆y j 1 , j 2 and their transverse momenta, p T,j 1 and p T,j 2 . The rapidity of the leading tagging jet y j 1 is also taken into account as well as the azimuthal separation and the rapidity separation between the Higgs boson and the tagging jet system, ∆φ H, j 1 j 2 and y * H, j 1 j 2 , respectively. Furthermore, in the measurements of the Higgs boson decaying to two photons, one also uses the transverse momentum of the diphoton system with respect to its thrust axis in the transverse plane, p T,γγ,t , and some observables directly related to one of the two photons. The latter are not considered in the following since the Higgs boson is not decayed in our analysis. Instead of p T,γγ,t we will directly consider the transverse momentum p T,H of the Higgs boson itself.
Because of the very peculiar signature of the VBF events, the tagging jet invariant mass distribution m j 1 j 2 plays a key role in determining whether an event could stem from a VBF process or not. For this reason, it is interesting to consider a third jet tagging scheme besides the p T -tagging and the y-tagging introduced in Sec. 3.5: one can define the two tagging jets based on the pair of jets that generates the largest invariant mass. In the presence of three or more jets, the treatment of the subleading jets is the same as in the other two schemes where they are ordered according to their transverse momenta. Although closely related to the y-tagging scheme, the new scheme referred to as m jj -tagging will serve as another benchmark scenario in the following discussion.
All figures presented in this section will have the same structure: they will show our results for H + 2 jets at NLO and H + 3 jets at both LO and NLO after the application of the VBF selection criteria. The main plots on the left and on the right always contain the differential distributions, which we obtained by utilizing the p T -tagging and y-tagging scheme, respectively. The differential cross sections of each main plot are accompanied by four ratio plots. Starting from the top we display (i) the three-jet fraction, (ii) the ratio to an alternative tagging scheme definition (p T -tagging/y-tagging on the left and y-tagging/m jj -tagging on the right), (iii) the ratio to a different scale choice where instead of the default scale B, we chose scale A and, finally, (iv) the reduction of the respective baseline cross sections due to the VBF requirements given in Eq. (2.6). Note that the basic gluon fusion cuts as stated in Eq. (2.5) are used to define the baseline of the respective H + n jets analysis. In the topmost subplot, all ratios are taken with respect to the central H + 2 jets prediction at NLO accuracy using scale choice B, cf. Eq. (2.8b). The other three subplots show the ratios between the respective H + 2 jets and H + 3 jets samples that were generated based on different (ii) jet tagging, (iii) scale setting and (iv) selection cut level. In all cases, the shaded bands indicate the respective standard scale uncertainties.
We start by considering the tagging jet invariant mass distribution m j 1 j 2 reported in Fig. 21. After applying the VBF cuts the three-jet fraction varies between 0.3 and 0.4 in the p T -tagging scheme, increasing to 0.5-0.7 in the y-tagging scheme. The contribution from The four ration plots now detail, from top to bottom, the three-jet fraction, the difference between tagging schemes, the difference between different functional forms of the central scale choice, and the impact of the VBF cuts with respect to the baseline dijet selection. Scale uncertainties with respect to the central H + 2 jets NLO prediction are indicated by the shaded bands.
H + 3 jets is therefore non-negligible also for values of m j 1 j 2 close to the cut. As already observed for the inclusive cross section, the ratios between the results of different tagging strategies shows a 25% increase in the cross section for H + 2 jets at NLO over the whole kinematic range and a 100% increase for H + 3 jets both at LO and NLO when moving from p T -tagging to y-tagging. The results are instead almost identical for y-tagging and m jj -tagging. Also, varying the scale from choice B to choice A does not have a big impact, in particular at NLO. Finally we observe that the reduction in the cross section, due to remaining ∆y j 1 ,j 2 cut, is of about 50% at around 400 GeV and is, unsurprisingly, almost absent at 1 TeV. There, almost all dijet configurations also fulfill the rapidity separation criterion. As we will see, this decrease is much more dramatic for other observables which are dominated by the bulk of the events just beyond the cut.
The distribution of the tagging jet rapidity separation is in turn shown in Fig. 22. On the left plot, in the p T -tagging scheme, we observe an important change in the shape of the distribution going from H + 2 jets to H + 3 jets. For small rapidity separations the presence of a further jet gives an additional contribution as large as 60%. This decreases to less than 20% for ∆y j 1 ,j 2 > 7. It instead remains approximately constant in the case of a yor m jj -tagging. Again, as discussed in Sec. 3.5 this reproduces the high energy behavior reported in [137]. Also in this case varying the scale choice has almost no impact. The lowest ratio leads to the conclusion that events with a rapidity interval of at least five units automatically fulfill the m j 1 j 2 cuts independent of the tagging scheme.
One of the most important distributions in the VBF process is the difference in the azimuthal angle φ between the two tagging jets. It allows a distinction between the different possible CP-structures of the Higgs and is an interesting channel to detect anomalous couplings. We present the contribution from the gluon fusion channel after VBF cuts in Fig. 23. Comparing these plots with the ones from the basic cuts, cf. Fig. 12, a clear change in shape, in particular for high values of ∆φ j 1 ,j 2 , is evident. This can easily be understood by recalling that the requirements of the VBF cuts, namely high invariant mass and a considerable difference in rapidity forces the two jets into a back-to-back configuration which is given at ∆φ j 1 ,j 2 π. Interestingly, the largest scale dependence, originating in the large three-jet fraction (which approximately doubles here from 0.3 to 0.6) occurs at ∆φ j 1 ,j 2 ≈ π 2 . This effect is even more pronounced in the y-tagging scheme, which largely coincides with the m jj -tagging scheme, where the impact of the H + 3 jets contribution is close to 90% near the perpendicular azimuth.
Turning now to the transverse momentum distribution of the Higgs boson, displayed in Fig. 24, we observe that the shape remains largely unaffected by the more stringent VBF cuts wrt. the more liberal dijet selection (Fig. 14). This is no surprise as the additional VBF cuts do not act directly on the Higgs boson itself. The cross section, however, decreases by almost an order of magnitude over the whole kinematic range in the p T -tagging scheme while the reduction is again only a factor of 3 in the y-tagging and m jj -tagging schemes. The choice of central renormalization scale introduces a slight tilt in the distributions irrespective of the tagging scheme. Consequently many observations done for Fig. 8 still apply in this case. The contribution of H + 3 jets at NLO becomes as large as 50% of the H + 2 jets contribution already around 160 GeV. This increases to 70% when a yor m jj -tagging strategy is used, stressing the effective LO nature of the H + 2 jets NLO calculation in this region.
In Figs. 25 and 26 the inclusive tagging jet transverse momentum and the transverse momentum of the leading non-tagging jet is shown. These plots can directly be compared with Fig. 17. Apart from the general decrease in the cross section, the curves are qualitatively very similar. Since H + 2 jets describes additional jet activity beyond the tagging jets only through the resolved real radiation contribution, the predictions from H + 2 jets at NLO and H + 3 jets at LO are identical (up to statistical fluctuations) for p T,j 3 . The contribution from NLO corrections to H + 3 jets therefore has a large impact leading to a large K-factor varying between 1.3 and 1.8. Consequently, as already seen in Sec. 3.5, this observable also experiences a large impact on its properties by the choice of tagging scheme, leading to much steeper fall-off in the p T -tagging scheme than in either the y-tagging scheme. The m jj -tagging scheme here exhibits larger variations from the y-tagging scheme, distorting its behavior at large transverse momenta to smaller rates, while giving an overall similar behavior. The effect of the VBF cuts on the shape of the leading non-tagging jet transverse momentum is only moderate in the y-tagging scheme, while in the p T -tagging scheme they reject more events with larger p T,j 3 than with lower. This effect is more pronounced in the transverse momentum of the tagging jets themselves. Again, the choice of functional form of the renormalization scale introduces a noticeable tilt into all observables.
The situation changes for observables that are directly affected by the VBF cuts. In  The dip is of course caused by the ∆y cut between the two tagging jets, that forces the jets towards higher rapidities, leaving a gap in the central region. The precise shape of this gap strongly depends on the choice of tagging scheme: using y-tagging it is somewhat wider than using p T -tagging, while, again, y-tagging and m jj -tagging give very similar results. Unsurprisingly, in three jet events in p T -tagging, comprising on average about 40% of all events, the dip is less pronounced. While in y tagging the presence of such a third jet leads to a wider separation of the tagging jets, as always the most forward and backward ones are chosen. Here, the three jet fraction again is larger overall and ranges up to 60%. The precise choice of scale hardly matters.
As the VBF cuts only act on the two tagging jets and Higgs boson production through gluon fusion, unlike production through weak boson fusion, comprises topologies with color Tagging jets rapidity ytag-jet  connections between all colored partons, the above dip is not present neither in the rapidity distribution of any non-tagging jet nor for the Higgs boson, cf. Fig. 28. Again, this observable is described at LO only by the H + 2 jets NLO calculation, coinciding with the H + 3 jets LO calculation and necessitating the H + 3 jets calculation at NLO accuracy. Characteristic differences are present between both tagging schemes. Besides the difference in the differential K-factor (1.4 for p T -tagging and 1.5 for y-tagging), the shape is a direct consequence of which of the three jets are selected as tagging jets. Thus, in y-tagging the third jet is much more central than in p T -tagging and only somewhat more central than in m jj -tagging. As in Fig. 27, the choice of scale is almost inconsequential.
We finish discussing two observables which relate the Higgs boson to the tagging jets. In Fig. 29 we plot the azimuthal separation between the Higgs boson and the dijet system defined by the two tagging jets. This predictions can be compared with the ones in Fig.  5 (right), where they same observables is shown with baseline cuts only. As shown in the bottom ratios, it is clear that the shape of the predictions is very similar apart from a slightly milder increase of the curve towards the back-to-back configuration. The large uncertainty in the H + 2 jets curve reminds that this in only a LO description. Therefore the contributions coming from H + 3 jets NLO corrections are particularly large and need to be taken into account for a precise theoretical prediction. The previous considerations hold both for p T -tagging and for y-tagging. When directly comparing the tagging schemes we observe that the predictions for the y-tagging scheme are flatter than both p T -and m jj -tagging scheme. While the y-tagging scheme favors configurations with the Higgs boson recoiling from both tagging jets as much as the p T -tagging scheme and only somewhat less likely than the m jj -tagging scheme, it allows for a factor of 3 (1.3) more events where the Higgs boson and the tagging jets recoil against the rest of the event. Again, this is easily understood as a consequence of the tagging jet selection process, leaving more or less energetic jets to recoil against and, thus, more or less opportunities for the Higgs and the tagging jet system to be boosted into the same direction. Herein, scale choice A and B do not differ significantly. Finally, Fig. 30 shows again y * H,j 1 j 2 , defined in Eq. (3.2). Compared to the results with baseline cuts of Fig. 18 (right), the distributions fall off a bit faster for very large rapidity separations, independently of the tagging method. Another difference which is worth mentioning is that the differential three-jet fraction increases from about 40% (60%) at y * H,j 1 j 2 4 to 80% (90%) at y * H,j 1 j 2 ≈ 5 in the p T -tagging (y-tagging) scheme, making the contribution of H + 3 jets NLO corrections even more important. Furthermore, in both cases the ratio compared to baseline cuts is more constant using the H + 3 jets NLO calculation compared to the LO one. Qualitatively, all tagging schemes predict the same shape for y * H,j 1 j 2 4 with both the yand m jj -tagging scheme predicting a few more events at even larger y * H,j 1 j 2 . Again, scale choice A and B do not differ significantly.

Conclusions
Gluon fusion is the dominant production mechanism for a Standard Model Higgs boson at the LHC. The production of a Higgs boson in gluon fusion in association with jets also constitutes an irreducible background to the vector boson fusion mechanism. Reliable predictions for the Higgs boson plus jets processes are therefore indispensable for a precise determination of the Higgs boson couplings and its properties in the VBF signal.
In this paper we have presented a detailed phenomenological analysis of the gluon fusion contribution to Higgs boson plus jets were we focused on two and three additional jets in the final state. The calculations have been performed in the limit of an infinitely heavy top quark, at next-to-leading order in QCD. Results for LHC collision energies of 8 TeV and 13 TeV have been obtained by the combination of the fully automated tools GoSam and Sherpa. The numerical results have been generated in two steps. First we have produced sets of Ntuples for the two energies and the three different jet multiplicities with a minimal set of kinematic requirements, which in a second step, have been analyzed for the particular scenarios. The entire set of Ntuples will be made publicly available.
We have investigated two major scenarios, one defined by applying only basic selection cuts, and the second by applying the considerably more constraining VBF cuts where we also investigated alternative tagging jet selections. We found that independent of the final state jet multiplicity the NLO QCD corrections remain sizeable and are therefore an important prerequisite for a reliable prediction. In particular in the VBF scenario, for both the two jet as well as the three jet bin, the additional jet production accounts for a considerable fraction of the total cross section which means that the results, to a large extent, are only given with leading order accuracy. However, if one considers a veto on the third jet in a two jet calculation, this again would introduce large theoretical uncertainties. Therefore the calculation of the three jet process with NLO accuracy provides important information also for the exclusive two jet result.
For inclusive observables, i.e. observables that are not a priori dependent on a specific number of jets, such as the transverse momentum of the Higgs boson, we find that the higher jet multiplicities are important for the correct description of the shape of the observables. In particular in the tails of the distributions, which might be sensitive to new physics, they can make up the dominant contribution. Also here, the inclusion of the NLO corrections of H + 3 jets leads to an improvement of the theoretical prediction.
We discussed a large variety of differential distributions which are suitable to distinguish the gluon fusion process from that of the vector boson fusion. Some of these observables are also used as input variables for the boosted decision trees in the experiment. We particularly described the effects of a third jet as well as the impact of the NLO corrections.
Further improvements could certainly be achieved by providing a merged NLO result of the different jet multiplicities, but also through the inclusion of top-quark mass effects as well as the matching of the H + 3 jets NLO result with a parton shower. Due to the complexity of these improvements they are however beyond the scope of this paper and we leave them for future work.