Full mass dependence in Higgs boson production in association with jets at the LHC and FCC

The first computation of Higgs production in association with three jets at NLO in QCD has recently been performed using the effective theory, where the top quark is treated as an infinitely heavy particle and integrated out. This approach is restricted to the regions in phase space where the typical scales are not larger than the top quark mass. Here we investigate this statement at a quantitative level by calculating the leading-order contributions to the production of a Standard Model Higgs boson in association with up to three jets taking full top-quark and bottom-quark mass dependence into account. We find that the transverse momentum of the hardest particle or jet plays a key role in the breakdown of the effective theory predictions, and that discrepancies can easily reach an order of magnitude for transverse momenta of about 1 TeV. The impact of bottom-quark loops is found to be visible in the small transverse momentum region, leading to corrections of up to 5 percent. We further study the impact of mass corrections when VBF selection cuts are applied and when the center-of-mass energy is increased to 100 TeV.


Introduction
The gluon fusion mechanism yields the largest contribution to the production cross section of a Standard Model (SM) Higgs boson. However, the fact that already at leading order (LO) this process is mediated by a closed loop of heavy fermions, in other words it is a loop-induced process, leads to a tremendous complication in the computation of theoretical predictions and higher order corrections. This holds not only for the production of a Higgs boson alone, but also and especially for the calculation of its production in association with jets.
When the mass of the fermions is much larger than the Higgs boson mass, the heavy fermion can be integrated out and the coupling between gluons and the Higgs can be described by an effective vertex [1], simplifying the calculations considerably. Since the top quark gives the dominant contribution in the heavy fermion loops, this approximation is also referred to as the infinite top-quark mass limit. The validity of the effective theory is however limited. In particular, it breaks down when the momentum flow through the effective vertex becomes of the same order as the fermion masses.
This behaviour can be understood better by comparing the high-energy limit of a pointlike gluon-gluon-Higgs interaction with a resolved interaction mediated via a loop. The latter provides a form factor responsible for softening the amplitude in this limit. More specifically, one has to consider the transverse momentum behaviour of the amplitude for producing a Higgs boson out of two off-shell gluons in the effective and in the full theory [2][3][4][5]. The contribution in the limit of large gluon transverse momenta (much larger than the heavy-quark mass) is suppressed by the massive quark loop in the full theory, whereas in a pointlike interaction the same transverse momenta are allowed to reach the kinematic limit given by the center-of-mass energy $\sqrt{s}$. In the high-energy limit, this leads to a different scaling for the two predictions in terms of the leading logarithmic contribution in $m_H^2/s$, where $m_H$ is the Higgs boson mass. The effective theory has a double logarithmic scaling, whereas the full theory scales as a single logarithm of $m_H^2/s$. Recently the corresponding scaling in terms of the Higgs boson transverse momentum $p_{T,H}$ was derived in the same high-energy limit [6,7], finding that, as $p_{T,H} \to \infty$, the distribution drops as $(p_{T,H}^2)^{-1}$ in the effective theory, whereas it goes as $(p_{T,H}^2)^{-2}$ in the full theory. The predictions obtained in the infinite top-quark mass limit are therefore an increasingly poor approximation as the transverse momentum of the Higgs boson increases and becomes larger than roughly the top quark mass. This affects an increasing fraction of the phase space when the Higgs boson is produced in association with several jets. The case in which the Higgs boson is produced via the gluon fusion process in association with at least two further jets also represents the most relevant irreducible background to Higgs boson production via vector boson fusion (VBF).
These two production mechanisms can be distinguished by introducing topological cuts, which may however further enlarge the portion of phase space in which the effective gluon-gluon-Higgs theory is a poor approximation of the full theory prediction.
The purpose of this paper is to investigate the range of validity and the breakdown of the effective theory approach at a more quantitative level, when the Higgs boson is produced in association with up to three jets. In other words, we will pursue the question of whether, and to what extent, the large top-quark mass limit is justified for higher jet multiplicities. Taking full top- and bottom-quark mass dependence in the loop into account, we perform a leading-order calculation for the production of a Higgs boson in association with up to three jets and compare this to predictions from the effective theory.
A precise treatment of massive bottom quarks at leading order requires the use of four flavor PDFs and the corresponding removal of initial state bottom quarks. Furthermore, massive b-quarks also invoke a Higgs-bottom Yukawa coupling, which leads to tree-level contributions with massive bottom quarks in the final state. However, in this paper we are interested in the mass effects caused by massive quarks in the loops. In other words we want to determine the effect of taking bottom quarks into account, compared to predictions in which only top quarks are considered. Therefore we keep the external quarks massless and leave the aforementioned approach for further studies.
Leading order results in the full theory for up to two jets have been known for some time [8][9][10][11][12][13], and partial results for Higgs boson plus three jet production were first computed in [14], whereas lately multijet merged predictions for up to two or three jets with full mass dependence were computed in [15] and [16] respectively. As already anticipated, very recently predictions of mass effects beyond LO on the Higgs boson transverse momentum spectrum became available too [7]. A different study investigated instead the effect of light-quark mediated contributions [17].
Studying the effects of mass corrections to the infinite top-quark mass limit becomes even more important for proton colliders with very large center-of-mass energies. Recently, an analysis similar to the one we present here was performed in the context of a comprehensive report about physics at the Future Circular Collider (FCC) for a center-of-mass energy of 100 TeV [18].
The paper is structured in the following way: in Section 2 we present the setup used to perform the computation, the choice of the phenomenological parameters and the cuts we applied. Section 3 is dedicated to the total cross section results for LHC and FCC, whereas the results at the differential level are presented in Section 4. In Section 5 we conclude and offer an outlook on possible future improvements.

Calculational setup
In this paper we will compare predictions for the production of a Higgs boson in association with one, two or three jets at LO and next-to-leading order (NLO) in the effective Higgs-gluon theory, already computed in [19,20], with predictions at LO in the full SM for all three multiplicities. The latter were computed in two different manners: once considering only massive top-quark loop contributions, and once taking into account both massive top and bottom quarks running in the loop.
For H + 1 jet there are two different partonic channels which have to be considered:
$$q\bar{q} \to H g , \qquad g g \to H g . \qquad (2.1)$$
For both H + 2 jets and H + 3 jets there are instead four different independent subprocesses, namely
$$q\bar{q} \to H q'\bar{q}' (g) , \qquad q\bar{q} \to H q\bar{q} (g) , \qquad q\bar{q} \to H g g (g) , \qquad g g \to H g g (g) . \qquad (2.2)$$
All the remaining subprocesses are related by crossing symmetry. Both the one-loop amplitudes for the NLO effective theory results as well as the one-loop amplitudes for the LO results with mass dependence were generated using GoSam [21,22], a publicly available package for the automated generation of one-loop amplitudes. It is based on an algebraic generation of d-dimensional integrands using a Feynman diagrammatic approach, employing QGRAF [23] and FORM [24,25] for the diagram generation, and Spinney [26], Haggies [27] and FORM to write an optimized Fortran output. For the reduction of the tensor integrals we use Ninja [28][29][30], a tool for the integrand reduction via Laurent expansion. Alternatively one can use other reduction techniques such as integrand reduction using the OPP method [31][32][33] as implemented in Samurai [34] or using methods of tensor reduction as offered by the Golem95 [35][36][37][38] library. The remaining scalar integrals have been evaluated using OneLoop [39].
For the NLO prediction in the effective theory the tree-level amplitudes for the Born and real radiation contribution, the subtraction terms and their integrated counterpart were computed with Sherpa [40] and the matrix element generator Comix [41,42]. Sherpa and GoSam were linked via the Binoth Les Houches Accord interface [43,44].
Because of the high statistics needed for such a large multiplicity final state, the Monte Carlo events are stored in the form of Root Ntuples. They are generated by Sherpa and were first used in the context of vector boson production in association with jets [45]. Very recently first studies appeared about possible extensions for NNLO computations [46,47].
For the calculation in the effective theory, sets of Ntuple files with Born (B), virtual (V), integrated subtraction term (I) and real minus subtraction term (RS) type of events have been generated for H + 1, 2 and 3 jets at the center-of-mass energies of 13 and 100 TeV. The events were generated such that jets can be clustered using the $k_T$ or anti-$k_T$ algorithm [48,49] as implemented in the FastJet package [50] and with radii that can vary between R = 0.1 and R = 1. At 13 TeV a minimal generation cut was imposed on the jets by requiring $p_{T,\mathrm{jet}} > 25$ GeV and $|\eta_{\mathrm{jet}}| < 4.5$.
Because of the much wider rapidity span available, at 100 TeV the generation cuts are: [...], which allows the events to be post-processed in every analysis with more exclusive cuts. Furthermore, the Ntuples allow the user to change a posteriori both the renormalization and the factorization scales as well as the choice of the parton distribution functions (PDFs). More details about the format of the Ntuples and an extension of their content used for this work are discussed in Appendix A.

Generation of Ntuples incorporating finite-mass effects
The set of Ntuples with the full mass dependence was generated starting from the available Born-type Ntuples used for the study presented in [20]. These Ntuples contain the Born matrix element weight in the effective theory. To obtain a set of LO Ntuples in the full theory, we have therefore reweighted these events using the matrix elements with the full quark mass dependence.
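The reweighting step described above can be sketched in a few lines. The following Python fragment is only an illustration under stated assumptions: the `BornEvent` record, its field names and the `me2_full` callback are hypothetical and do not reflect the actual Ntuple format or the GoSam interface. It shows the basic idea that, since each stored event carries the effective-theory Born matrix element, a full-theory weight can be obtained by multiplying with the ratio of squared matrix elements evaluated at the same phase-space point.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Sequence, Tuple

FourMomentum = Tuple[float, float, float, float]  # (E, px, py, pz) in GeV


@dataclass(frozen=True)
class BornEvent:
    """Hypothetical stand-in for one Born-type Ntuple entry."""
    momenta: Sequence[FourMomentum]  # external momenta of the phase-space point
    weight: float                    # event weight computed in the effective theory
    me2_eff: float                   # effective-theory |M|^2 stored with the event


def reweight_to_full_theory(
    events: Sequence[BornEvent],
    me2_full: Callable[[Sequence[FourMomentum]], float],
) -> List[BornEvent]:
    """Rescale each weight by |M_full|^2 / |M_eff|^2 at the same phase-space point."""
    reweighted = []
    for ev in events:
        if ev.me2_eff <= 0.0:
            raise ValueError("effective-theory matrix element must be positive")
        ratio = me2_full(ev.momenta) / ev.me2_eff
        reweighted.append(replace(ev, weight=ev.weight * ratio))
    return reweighted


# Toy usage with made-up numbers: the "full theory" callback returns a constant.
toy_events = [BornEvent(momenta=[(1.0, 0.0, 0.0, 1.0)], weight=2.0, me2_eff=5.0)]
toy_full = reweight_to_full_theory(toy_events, lambda momenta: 4.0)
```

Everything that is common to both theories (PDFs, flux factor, phase-space weight) cancels in the ratio, which is why only the matrix elements need to be re-evaluated in such a scheme.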
Since the effective theory is obtained by assuming an infinitely heavy top quark, which is integrated out, the one-loop amplitudes could be checked in a robust way by setting the top mass to large values and observing that the effective theory result is reproduced. We have checked this behavior numerically by setting the top mass to 10 TeV and found agreement at the sub-permille level between the one-loop amplitudes of the full theory and the tree-level amplitudes of the effective theory for random phase space points. This is a strong consistency check for the whole setup, and in particular for the correctness of the one-loop amplitudes.
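The logic of this heavy-top check can be illustrated with a much simpler analytic analogue. The sketch below does not use the H + jets amplitudes of this paper; it assumes instead the textbook leading-order form factor for inclusive gg → H through a single quark loop, normalized to its infinite-mass limit. The function names, the Higgs mass of 125 GeV and the top mass of 173 GeV are illustrative assumptions.

```python
import math


def f_tau(tau: float) -> complex:
    """Standard one-loop function, with tau = m_H^2 / (4 m_q^2)."""
    if tau <= 1.0:
        # Below the quark-pair threshold the function is real.
        return complex(math.asin(math.sqrt(tau)) ** 2)
    # Above threshold it develops an imaginary part.
    beta = math.sqrt(1.0 - 1.0 / tau)
    log_term = math.log((1.0 + beta) / (1.0 - beta)) - 1j * math.pi
    return -0.25 * log_term ** 2


def ggh_ratio_to_heavy_limit(m_quark: float, m_higgs: float = 125.0) -> complex:
    """LO gg -> H quark-loop amplitude divided by its m_quark -> infinity limit."""
    tau = m_higgs ** 2 / (4.0 * m_quark ** 2)
    a_half = 2.0 * (tau + (tau - 1.0) * f_tau(tau)) / tau ** 2
    return 0.75 * a_half  # a_half tends to 4/3 for tau -> 0


# Squared ratio for a 10 TeV "top quark" and for an assumed physical top mass.
ratio2_heavy = abs(ggh_ratio_to_heavy_limit(10_000.0)) ** 2
ratio2_top = abs(ggh_ratio_to_heavy_limit(173.0)) ** 2
```

In this toy setting the squared ratio for a 10 TeV quark differs from one only far below the permille level, while for the assumed physical top mass the deviation is at the level of several percent. A check of the same kind, performed numerically on random phase-space points, is what the text describes for the full H + jets amplitudes.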

Physical parameters
In the following we will present numerical results for center-of-mass energies of 13 TeV and for a possible future collider at 100 TeV. As input parameters we use [...]. When we also consider bottom-quark loops, the bottom-quark mass is set to $m_b = 4.75$ GeV in the propagator mass and to $m_b(m_H) = 3.38$ GeV in the Yukawa coupling [51]. This allows us to quantify the effect due to bottom-quark loops and their interference with top-quark loops. For external partons we keep working with $n_f = 5$ light active flavours. From the input parameters listed above we derive the corresponding values for $m_W$ and $\sin\theta_w$ that enter in the definition of the gluon-Higgs coupling in the effective theory.
We define our central renormalization and factorization scale to be $\hat{H}_T/2$, where the sum entering $\hat{H}_T$ is understood to run over partons rather than over jets. Scale uncertainties are obtained by varying both scales simultaneously by factors of 0.5 and 2 around the central value. The strong coupling constant is calculated at this scale and taken according to the CT14NLO PDF set [52].
We investigate two different sets of cuts: one that is suited for a general analysis of the gluon fusion scenario, with only a basic set of cuts to render the cross section finite, and a second set, which is more suitable in the context of the vector boson fusion scenario. The baseline cuts for the jets consist of [...]. In addition to these cuts, to investigate the vector boson fusion (VBF) scenario, we further demand $m_{j_1 j_2} > 400$ GeV, where $j_1$ and $j_2$ are the leading jets for a given tagging scheme. We will refer to them as tagging jets in the following. In the next sections, we will mainly consider a $p_T$ jet-tagging strategy, in which jets are ordered by decreasing transverse momentum. In this case $j_1$ and $j_2$ are the leading and the second-leading transverse momentum jets.
For some specific observables, we will however also consider a y jet-tagging scheme in which the tagging jets are defined as the two jets with the most forward and most backward rapidity.
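The two tagging schemes and the dijet invariant-mass requirement can be made concrete with a small sketch. It assumes massless jets described by transverse momentum, rapidity and azimuth; the `Jet` type, the function names and the example event are hypothetical and serve only as an illustration. Only the $m_{j_1 j_2} > 400$ GeV requirement quoted in the text is implemented.

```python
import math
from typing import NamedTuple, Sequence, Tuple


class Jet(NamedTuple):
    """Massless jet: transverse momentum in GeV, rapidity, azimuthal angle."""
    pt: float
    y: float
    phi: float


def tagging_jets(jets: Sequence[Jet], scheme: str = "pt") -> Tuple[Jet, Jet]:
    """Select the two tagging jets in the pT or in the y jet-tagging scheme."""
    if len(jets) < 2:
        raise ValueError("need at least two jets")
    if scheme == "pt":
        ordered = sorted(jets, key=lambda j: j.pt, reverse=True)
        return ordered[0], ordered[1]
    if scheme == "y":
        # Most forward and most backward jet in rapidity.
        return max(jets, key=lambda j: j.y), min(jets, key=lambda j: j.y)
    raise ValueError(f"unknown tagging scheme: {scheme}")


def dijet_mass(j1: Jet, j2: Jet) -> float:
    """Invariant mass of two massless jets: m^2 = 2 pT1 pT2 (cosh dy - cos dphi)."""
    m2 = 2.0 * j1.pt * j2.pt * (math.cosh(j1.y - j2.y) - math.cos(j1.phi - j2.phi))
    return math.sqrt(max(m2, 0.0))


def passes_vbf_mass_cut(jets: Sequence[Jet], scheme: str = "pt",
                        mjj_min: float = 400.0) -> bool:
    """Apply the dijet-mass requirement to the tagging jets of the chosen scheme."""
    j1, j2 = tagging_jets(jets, scheme)
    return dijet_mass(j1, j2) > mjj_min


# Made-up three-jet event: two central hard jets and one soft forward jet.
example = [Jet(120.0, 0.3, 0.0), Jet(60.0, -0.5, 3.0), Jet(35.0, 4.0, 1.0)]
mass_pt_tagging = dijet_mass(*tagging_jets(example, "pt"))
mass_y_tagging = dijet_mass(*tagging_jets(example, "y"))
```

In this example the $p_T$-tagged pair is hard but close in rapidity and fails the mass requirement, whereas the y-tagged pair passes it because of its large rapidity separation, even though one of its jets is soft.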

Total cross sections including top and bottom quark contributions
We start the discussion of the numerical results with a comparison of the total cross sections, for which we consider the effective theory predictions at LO and NLO (labeled as $\sigma_{\mathrm{LO,\,eff}}$ and $\sigma_{\mathrm{NLO,\,eff}}$, respectively) and compare them with the full theory results at leading order when considering both top-quark and bottom-quark loops, called $\sigma_{\mathrm{LO},\,m_{t,b}}$, as well as with the case where only top-quark loops are taken into account, labeled $\sigma_{\mathrm{LO},\,m_t}$. In Table 1 we summarize the results for the different jet multiplicities and for center-of-mass energies of 13 TeV and 100 TeV. The effect of varying the renormalization and factorization scales by factors of 0.5 and 2 is reported as a relative variation with respect to the nominal value. As expected, all the LO results both in the effective and in the full theory suffer from large scale dependencies, which become as large as 70% for H + 3 jets, and which reduce to 10-20% at NLO in the effective theory. The results of Table 1 are visualized in Figure 1, where we also include the ratios to the leading-order result in the effective theory. For better visibility we show two different ratio plots, both normalized to the LO result in the effective theory. The upper ratio shows the K-factor between LO and NLO in the effective theory. The lower one, with a much smaller range on the y-axis, highlights the differences between the LO in the effective and in the full theory. By combining the LO prediction that includes the exact top-quark mass dependence and the NLO K-factor from the effective theory calculation, one could estimate the Higgs boson plus multi-jet cross section with exact top-quark mass dependence at NLO. This approach was used successfully for lower-multiplicity calculations in [15].
The much more demanding computation of the exact mass dependence of the cross section has been performed for inclusive Higgs production at NLO [11] and even at NNLO [53][54][55][56][57][58], but the exact result at NLO is not within reach for the large jet multiplicities considered here. A combination of the LO result with full top-quark mass dependence and the NLO K-factor from the effective theory would constitute the current best estimate of Higgs boson production in association with up to three jets. However, we refrain from quoting the corresponding number, as it does not add new information to our existing predictions.
Focusing on the central values, we observe that the leading-order contribution in the effective theory agrees in general very well with the predictions based on the full theory. Taking bottom-quark loops into account leads to corrections, which are as small as one percent for all three final-state multiplicities we are considering, and, as expected, they become even smaller at 100 TeV. However, it is interesting to note the change in the sign of these corrections with increasing jet multiplicity. While for H + 1 jet production at 13 TeV the cross section is reduced when bottom-quark loop contributions are included, for H + 2 jets and H + 3 jets the cross section increases instead. This is clearly visible in the first column of Table 1 and displayed in the second ratio plot on the left in Figure 1. If we compare the predictions at 100 TeV given in Table 1 for the two different transverse momentum cuts applied to the jets, we observe a sign flip in the interference effects for H + 1 jet. While for p T, jet > 30 GeV, the pattern is similar to the 13 TeV results, increasing the minimum p T of the jet instead yields a small positive overall contribution once both quark-loop contributions of the heaviest generation are included. This means that the effect of the interference on the cross section is destructive in the low transverse momentum region, whereas at high p T the sum of top-quark and bottom-quark contributions leads to constructive interference effects. We will come back to this in Section 4.2, where we discuss the impact of these interference effects on differential distributions.

Heavy-quark mass effects in differential distributions
Recalling the 100 TeV collider results from Table 1, we have already seen that an increase of the jet p T threshold leads to a noticeable change of the total Higgs boson plus jet cross sections. By studying differential cross sections for various classes of observables, we want to identify the phase-space regions that receive important corrections as a result of the finite-mass treatment of the heavy-quark loops. For a broader understanding of this issue, we consider different scenarios that are of relevance to ongoing and future hadron collider experiments.

LHC predictions for 13 TeV collisions
We start with the discussion of differential distributions relevant for the LHC operated at a collider energy of 13 TeV. In the figures presented here, we compare the effective theory predictions at LO and NLO with results obtained in the full SM. This provides us with a direct comparison of the size of the different corrections. We can decide more easily whether we need to pay attention to including the NLO effects in the effective theory or the finite-mass effects based on the full theory. We are also able to identify observables and/or kinematical environments where it will be mandatory to incorporate both effects in one way or another.
For the full theory calculations, we usually consider both of the heaviest quarks running in the loop, i.e. we take the top quark as well as the bottom quark contributions into account. The associated predictions will hence show the additional label 'm t,b ' in the figures. Depending on the specific observable, we will present all three or two of the H + n jets predictions together in one plot (with n = 3 being the maximum jet number). In the lower part of these plots, we display for each jet bin separately, the ratios of the three different types of predictions taken with respect to the corresponding LO result in the effective theory. For example, in Figure 2a, the upper, middle and lower ratio plots show the different ratios based on the predictions for H + 1 jet, H + 2 jets and H + 3 jets, respectively.
Transverse momentum distributions are known to receive significant corrections that lead to a softening of the distribution for larger $p_T$ values. We thus investigate this class of observables first, and summarize most of our results in Figure 2. For all three processes, i.e. for the production of H + 1 jet, H + 2 jets and H + 3 jets, Figure 2a displays next to each other the transverse momentum distributions of the Higgs boson (left panel) and the hardest jet (center panel), and the scalar sum of the transverse momenta of all jets (right panel). These three distributions clearly show the expected behaviour of $p_T$-tail softening. For small transverse momenta (or small $H_T$), we find that the leading-order predictions based on the effective theory are in very good agreement with the respective leading-order predictions given by the SM. This is because the heavy top-quark approximation works very well in this low-$p_T$ region. Furthermore, we see that this statement holds for all three jet multiplicities considered here. Focusing on the pure $p_T$ distributions of Figure 2a, we observe that the point at which the effective theory approach starts to break down occurs around Higgs boson or lead-jet values of $p_T = 200$ GeV and is to a good approximation independent of the jet multiplicity of the Higgs boson production processes. This observation would support a rather simple explanatory model, in which we assume that the resolution of the effective vertex is mainly driven by a quantity which is very strongly correlated with the event's hardest single-particle $p_T$. The inner structure of the ggH vertex will therefore be probed with any interaction where the leading particle $p_T$ exceeds the top-quark mass. In H + n jets production, the hardest particle is either the Higgs boson itself or the leading jet. This right away explains why the breakdown occurs for both the $p_{T,H}$ and the $p_{T,j_1}$ spectra at the same scale. As we will argue further below, this assumption for the main resolution driver seems to also work well for the other examples of transverse observables discussed in this section.
Figure 2a: Comparison of the effective theory predictions at LO and NLO with results (carrying the '$m_{t,b}$' label) obtained in the full SM for various observables monitoring the transverse jet activity in the various Higgs boson plus jet production processes. The Higgs boson transverse momentum, $p_{T,H}$, is shown on the left, while the leading jet transverse momentum, $p_{T,j_1}$, and the scalar $p_T$ sum of all jets, $H_{T,\mathrm{jets}}$, are presented in the middle panel and in the right panel, respectively. Note that the H + 1 jet (green curves) and H + 3 jets (red curves) predictions have been rescaled for better visibility. The smaller plots in the lower part of each panel show the ratios of the three different predictions normalized to the LO effective theory prediction. This is done separately for each of the H + 1 jet, H + 2 jets and H + 3 jets processes.
Figure 2b: Differential H + 2 jets over H + 1 jet ratios (upper row) and H + 3 jets over H + 2 jets ratios (lower row) for the transverse momentum spectra of the Higgs boson, on the left, and the leading jet, in the middle, as well as the all-jets scalar $p_T$ sum, on the right. Each panel shows the ratios for the three theories considered here, the LO and NLO effective theory as well as the full theory at LO.
Above the breakdown scale, the deviation from the full SM predictions becomes sizeable very rapidly, resulting in a strong suppression by a factor of 10 at $p_T \sim 1$ TeV. Compared to this, the NLO corrections in the effective theory lead to enhancements of the cross section, which are distributed in a relatively uniform way. Since we use $\hat{H}_T/2$ as our central scale, the differential K-factor between the LO and NLO effective prediction turns out to be flat for the production of H + 1 jet [20], whereas it has a non-trivial shape for H + 2 jets and H + 3 jets production. The latter two K-factors however approach 1 for transverse momenta of about 600 GeV or larger. It is therefore fair to say that the NLO corrections turn into a subleading effect already at $p_T \sim 400$ GeV, as one has to contrast their behaviour with the strong $p_T$ dependence of the finite-mass corrections.
We note that similar observations regarding finite-mass effects have been made before. They have been pointed out in particular for the $p_{T,H}$ distribution [15,16,46]. In the one-jet case, we can use the leading-order exact statement that $p_{T,H} = p_{T,j_1}$ ($= H_{T,\mathrm{jets}}$). It is however interesting to see that the differential ratios associated with $p_{T,H}$ and $p_{T,j_1}$ (see the lower part of Figure 2a) are strikingly similar in their characteristics even beyond the one-jet case. In addition, they are also very similar among the different jet bins, suggesting that the relative $1/p_T^2$ scaling between the effective and full theory at LO can be applied in a more universal manner (cf. Section 1). In fact, if we concentrate on the $p_{T,H}$ predictions, we observe that the suggested scaling holds to a fairly good extent. For example, at $p_{T,H} \approx 400$ GeV, the mass effects reduce the cross section to roughly 60% of the effective theory result. At $p_{T,H} > 1$ TeV, this reduction then turns into a one-order-of-magnitude effect, which fixes the related cross section ratio, evaluated at $p_{T,H} = 1.0$ TeV, at a value of [...]. The above number (as given by our computation) can be compared with the number one expects from exploiting the relative scaling property between the effective and full theory predictions. Based on the additional suppression of the full result by two powers of $p_{T,H}$, the expected value for the same cross section ratio amounts to $(400\ \mathrm{GeV}/1000\ \mathrm{GeV})^2 = 4/25$, which is very close to the value extracted from the theory data. This result for the scaling does not change much among the different jet bins because the three ancillary plots in the lower part of Figure 2a show that the ratio between full and effective theory predictions only marginally loses some of its steepness for an increasing final-state multiplicity.
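The scaling estimate quoted above can be written out explicitly. The block below is a sketch under the assumption that the quoted factor is read as the change of the full-to-effective ratio between the two reference points of 400 GeV and 1 TeV. The symbol $R(p_{T,H})$ is introduced here only for illustration and does not appear in the original text.

```latex
\[
R(p_{T,H}) \equiv
  \frac{d\sigma_{\mathrm{LO},\,m_{t,b}}/dp_{T,H}}{d\sigma_{\mathrm{LO,\,eff}}/dp_{T,H}}
  \;\propto\; \frac{(p_{T,H}^{2})^{-2}}{(p_{T,H}^{2})^{-1}} = \frac{1}{p_{T,H}^{2}}
\quad\Longrightarrow\quad
\frac{R(1000~\mathrm{GeV})}{R(400~\mathrm{GeV})}
  = \left(\frac{400~\mathrm{GeV}}{1000~\mathrm{GeV}}\right)^{2}
  = \frac{4}{25} = 0.16 .
\]
```

If this factor is anchored at the roughly 60% quoted for 400 GeV, one obtains about 0.6 × 0.16 ≈ 0.1 at 1 TeV, in line with the order-of-magnitude suppression mentioned in the text.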
The cumulative characteristics of the $H_{T,\mathrm{jets}}$ observable shown in the right panel of Figure 2a lead to an amplification of the NLO effects in the effective theory, since every additional jet adds to the scalar sum, while the full theory distributions in the multijet cases fall off less severely at larger scales than they do for the single-object $p_T$ spectra discussed above. We also notice that the breakdown of the effective approach occurs at higher scales. In fact, an increasing jet multiplicity tames the finite-mass effects further, i.e. yields a weaker scaling and pushes the breakdown scale out to larger values of $H_T$. The reason for these changes becomes clear by looking at a fixed $H_{T,\mathrm{jets}}$ point, for example $H_{T,\mathrm{jets}} = 1$ TeV. The transverse hardness is shared among all jets, which also means that the leading jet appears at a scale lower than 1 TeV. At this lower scale, the deviation of the full theory $p_{T,j_1}$ prediction has not grown as large as for exactly 1 TeV. This has to be reflected by the finite-mass $H_{T,\mathrm{jets}}$ distribution, which therefore cannot fall as quickly as the $p_{T,j_1}$ distribution.
As we are interested in the scaling properties of the finite-mass effects, it is beneficial to study the ratios of successive differential jet cross sections. Given a specific observable $O$, we define these ratios as
$$R_n(O) = \frac{d\sigma(H + n\ \mathrm{jets})/dO}{d\sigma(H + (n-1)\ \mathrm{jets})/dO} .$$
Figure 2b visualizes the $R_2$ and $R_3$ ratios (i.e. the differential ratios for H + 2 jets/H + 1 jet and H + 3 jets/H + 2 jets cross sections) for the transverse momentum observables discussed above. The ratios are shown for each type of our predictions. Figure 2b therefore supplements Figure 2a, as it clearly exhibits the relative importance of the respective subleading jet multiplicity at higher scales, and the robustness of this feature under finite-mass effects. For all three observables, we essentially find two regimes independent of the type of the prediction: at low transverse scales, the n-jet contribution is always significantly smaller than the (n − 1)-jet contribution, but rises quickly with increasing transverse scales. This has already been pointed out in Ref. [20]. Above a certain scale (which appears around 400 GeV for $p_{T,H}$ and $p_{T,j_1}$, and around twice that scale for $H_{T,\mathrm{jets}}$), one enters the saturation or scaling regime where the $R_2$ and $R_3$ can be roughly described by a constant. This indicates (and confirms our earlier statements) that nearly the same scaling is in place in successive jet bins. The largest deviations from this behaviour and between the different predictions can be found for $R_2(H_{T,\mathrm{jets}})$, which is again no surprise given the cumulative nature of the observable. The $R_2$ generally level off at higher values than the $R_3$, where the inclusion of finite-mass effects yields a slight increase of the respective LO effective ratios. Again, the $R_2$ are somewhat more affected by this. The NLO corrections to the effective predictions work in the opposite direction. In all ratio distributions, they stabilize the constant behaviour in the saturation regime.
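The ratios of successive jet cross sections are straightforward to form once the differential cross sections are available as histograms with a common binning. The following sketch assumes plain lists of bin contents; the function name and the numbers are made up for illustration and are not taken from the paper.

```python
from typing import List, Optional, Sequence


def successive_jet_ratio(
    dsigma_n: Sequence[float],
    dsigma_n_minus_1: Sequence[float],
) -> List[Optional[float]]:
    """Bin-wise ratio of the H + n jets to the H + (n-1) jets distribution.

    Bins with an empty denominator are reported as None.
    """
    if len(dsigma_n) != len(dsigma_n_minus_1):
        raise ValueError("histograms must share the same binning")
    return [
        num / den if den > 0.0 else None
        for num, den in zip(dsigma_n, dsigma_n_minus_1)
    ]


# Toy bin contents (arbitrary units), chosen only to exercise the function.
toy_h1 = [10.0, 4.0, 1.0, 0.0]
toy_h2 = [3.0, 2.0, 0.6, 0.1]
toy_r2 = successive_jet_ratio(toy_h2, toy_h1)
```

Applying the same function to each type of prediction (LO effective, NLO effective, LO full theory) gives the three curves per panel that the text compares.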
The left panel of Figure 3 shows the transverse momentum distribution of the respective 'wimpiest' jet in all three H + n jets channels, i.e. it shows the leading jet in H + 1 jet, the second leading jet in H + 2 jets and the third leading jet in H + 3 jets production. On the right hand side of the same figure, the rapidity distributions of the Higgs boson are presented for the three cases of producing the Higgs boson in association with one jet, two jets or three jets. Compared to the observables discussed so far, the rapidity distributions show a completely different behaviour. Here the full theory and the effective theory agree throughout the entire rapidity range. This is expected since the regions of the phase space where the top-quark loop is resolved are more or less uniformly distributed in rapidity, and their contribution is suppressed by at least one order of magnitude (as shown by the $p_T$ spectra in Figure 2a) compared to the bulk of events, for which the full and effective theory approaches agree. The NLO corrections to the latter are sizeable, although they mainly enhance the cross section while leaving the shape more or less unaltered.
As can be seen in the left panel of Figure 3, the effective theory approach starts to deviate at even smaller values of the transverse momentum, namely around 125 GeV or 100 GeV, if one considers the second leading jet in H + 2 jets production or the third leading jet in H + 3 jets production, respectively. This is a consequence of the $p_T$ ordering of the jets. In both cases there has to be a harder jet present in the event that is distributed according to $d\sigma/dp_{T,j_{n-1}}$. For the leading jet in H + 2 jets and H + 3 jets events, the deviation between the effective and full theory results begins around 200 GeV (see center panel in Figure 2a). The distributions for the second hardest jet must therefore deviate around (200 − X) GeV where X > 0, and similarly, for H + 3 jets final states, the third-jet distribution must break down around (200 − X − Y) GeV where Y > 0. Hence, the $p_T$ ordering of the jets translates into an ordering of breakdown scales. In other words, if the superior jet does not resolve the heavy-quark loop, the softer one will not do so at all.
For the two-jet and three-jet processes, it is interesting to study the invariant mass spectrum between the two hardest jets, which are also the tagging jets in the $p_T$ jet-tagging scheme. This observable plays an important role in the definition of kinematic constraints for VBF analyses. The other key observable needed in VBF studies is the rapidity separation of the same pair of jets. Both distributions are shown in Figure 4 for H + 2 jets and H + 3 jets final states. For the invariant mass distribution, one observes only a mild deviation between the full and the effective theory predictions, which becomes more pronounced towards the higher end of the kinematical range. As a matter of fact, large invariant tag-jet masses are not only generated in events with hard tagging jets. The geometry of the momentum flow is an important criterion for the production of dijet masses. For example, in a situation where the two jets appear in a back-to-back configuration, large invariant masses can emerge despite the absence of energetic jets. In these cases, the effect of the heavy quarks is significantly reduced, and the effective theory therefore can be used to describe this observable accurately.
For the rapidity separation between the two tagging jets, shown on the right hand side in Figure 4, the situation is similar to that of the rapidity distribution of the Higgs boson discussed above. It is therefore clear that there is a good agreement between the effective theory and the full theory predictions for this observable.
Figure 5 shows on the left hand side the radial separation ∆R H, j 1 between the Higgs boson and the leading jet. On the right hand side, we display the smallest of the radial distances between the Higgs boson and any of the jets in the event, ∆R min, H, j k . In H + 1 jet production at LO, the Higgs boson and the only jet present in the event are forced into a back-to-back configuration. The ∆R H, j 1 distribution therefore has a natural cut-off at π, where it also peaks. At NLO, the presence of a second jet, which can become unresolved, opens up the previously kinematically forbidden range between 0 and π. The H + 2 jets distribution at LO also has a kinematical constraint owing to the presence of two jets that must be resolved. The radial distance between the Higgs boson and the leading jet therefore cannot be smaller than the minimal azimuthal angle of ∆φ = π/2. The presence of at least a third jet (in H + 2 jets final states at NLO and H + 3 jets final states) allows one to finally populate the full kinematical spectrum. From the distributions, it is however clear that the Higgs boson preferably recoils against the leading jet, because independent of the jet multiplicity, the distributions are all peaked at ∆R H, j 1 = π. The finite-mass effects are only very mild in the H + 1 jet and H + 2 jets case. In the H + 3 jets case, they give a small correction that slightly increases the cross section at small radial separation, and decreases it at values larger than ∆R H, j 1 = π. This is a consequence of ∆R being derived from the rather robust variables ∆y and ∆φ.
For the same reason mentioned previously, the minimal radial separation between the Higgs boson and a jet has a kinematic edge at π in H + 1 jet production at LO. In all other cases, the distribution is spread over the entire kinematical range. It is interesting to notice that, contrary to the plot on the left, in the H + 2 jets and H + 3 jets final states, the distributions flatten out for ∆R min, H, j k values between 2 and π. Based on these and earlier findings, a typical event where the Higgs boson is produced in association with several jets will likely have these features: the Higgs boson will tend to recoil against the leading jet, but clearly, as the multiplicity increases, it will occur more often in the company of a close, rather soft jet. Finally, we remark on the good agreement between the predictions from the effective theory and the predictions including the mass corrections. Only for H + 3 jets production, and for large radial separations between the jets, do the two curves start to deviate. In contrast to this, the NLO corrections in the effective approach have a significantly larger impact, and this statement extends to all two-body correlations considered in this section.
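The two radial-separation observables discussed above can be sketched as follows, assuming the standard definition of ∆R as the distance in the (y, φ)-plane with the azimuthal difference wrapped into [0, π]. The function names and the example kinematics are hypothetical and serve only as an illustration.

```python
import math

def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into the interval [0, pi]."""
    dphi = abs(phi1 - phi2) % (2.0 * math.pi)
    return 2.0 * math.pi - dphi if dphi > math.pi else dphi

def delta_r(y1, phi1, y2, phi2):
    """Radial distance in the (y, phi)-plane."""
    return math.hypot(y1 - y2, delta_phi(phi1, phi2))

def min_delta_r_higgs_jets(higgs, jets):
    """Smallest Delta R between the Higgs boson and any jet.

    higgs is a (y, phi) tuple, jets a non-empty list of (y, phi) tuples.
    """
    return min(delta_r(higgs[0], higgs[1], y, phi) for (y, phi) in jets)

# H + 1 jet at LO: Higgs and jet are back-to-back in azimuth (dphi = pi),
# so Delta R cannot fall below pi, whatever the rapidity difference is.
dr_lo = delta_r(0.3, 0.0, -1.1, math.pi)

# With a second, nearby jet the minimal separation can become small.
dr_min = min_delta_r_higgs_jets((0.0, 0.0), [(0.0, math.pi), (0.5, 0.4)])
```

Because ∆φ = π is fixed in the LO H + 1 jet case, the rapidity difference can only push ∆R above π, which reproduces the kinematic edge described in the text.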

The case of massless bottom quarks
In this section we compare the predictions for differential distributions in the full theory with and without the b-quark loop contribution, with the aim of assessing the effects of neglecting the bottom-quark contribution. In Table 1 and Figure 1 above we already commented on the impact on the total cross sections, and observed a change in the sign of this contribution between the H + 1 jet predictions and the H + 2 jets and H + 3 jets predictions. This can now be quantified better at the differential level. For this we focus our attention on a few selected observables. In Figure 6 we show the Higgs boson and the leading jet transverse momenta, in Figure 7 we compare the scalar sum of the jet transverse momenta H T,jets and the invariant mass of the Higgs boson and the leading transverse momentum jet, whereas in Figure 8 we plot the invariant mass of the tagging jets and their azimuthal angle difference ∆φ j1,j2 . All the plots show the corresponding observable as well as ratio plots for the three different jet multiplicities. The ratio is given by the result in which only top-quark loops are considered, divided by the prediction where both top- and bottom-quark contributions are taken into account. The color-shaded areas denote the scale uncertainty. For better visibility we do not show the full scale uncertainty band in the ratios, but rather zoom in around the central scale.
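The ratio shown in the lower panels can be written compactly as follows. The symbols R_b and O, as well as the subscripts t and t+b, are notation introduced here for illustration and do not appear in the original text.

```latex
R_b(\mathcal{O}) \;=\;
  \frac{\mathrm{d}\sigma_{t}/\mathrm{d}\mathcal{O}}
       {\mathrm{d}\sigma_{t+b}/\mathrm{d}\mathcal{O}}
```

With this convention, values R_b < 1 correspond to bins in which including the bottom-quark loops increases the prediction, while R_b > 1 signals a net reduction.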
It is clearly visible that the scale uncertainty outweighs the bottom mass effects by far, for all the considered observables. The size of the effects strongly depends on the observable but never exceeds five percent. In general, the bottom-quark mass effects are most visible in the observables involving transverse momenta and sums thereof and in invariant masses involving the Higgs. It is however interesting to observe which of the two predictions is larger as a function of the kinematical region considered. The largest effects can be observed in the soft region of the observables. This is to be expected, since especially when the kinematical scales involved are not too large compared to the bottom-quark mass, bottom-quark loops can lead to sizable corrections to the predictions in which only the top quark is considered. Far away from these kinematic regions the bottom quark can be considered massless. Furthermore, as already discussed in Section 3, the size and sign of the effect depend on the jet multiplicity. This can be seen for instance in Figure 7. The destructive interference for H + 1 jet at the level of the total cross section stems from the soft region, whereas the net contribution becomes positive in regions where the b-quark can be considered massless. For H + 2 jets and H + 3 jets the destructive interference is considerably reduced, leading to an increase of the total cross section when bottom-quark loops are taken into account. For angular variables these contributions are instead flat over the whole kinematical range. This is expected since the effects are uniformly distributed in these variables, as already discussed in the previous section.

VBF measurements at the LHC
The production of a Higgs boson in association with two or more jets in the gluon-fusion channel is also the main background to the VBF production channel. Since the latter has a very characteristic topological signature, in which two jets are produced mainly at high rapidities with a large invariant mass and a large azimuthal separation, leaving little jet activity in the central region of the detector, this channel can be enhanced with respect to the background by additional cuts, similar to those of Eq. 2.8. In this section we investigate the impact of the mass effects on the gluon-fusion predictions when these additional cuts are applied. As already demonstrated by Figure 4, both the m j 1 j 2 variable and the ∆y j 1 ,j 2 variable are almost unaffected by mass corrections. We can therefore expect that at the level of the total cross section the pattern stays similar to the case without VBF cuts. The total cross sections reported in Table 2 confirm this. The effect of the massive bottom-quark loops is also very small, leading to changes in the range of 1 to 2%. Therefore, for the remainder of this section, we will not discriminate between top-quark only and combined top-quark and bottom-quark effects, and instead always include both massive quark loop contributions at the same time.
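A VBF-type selection on the two tagging jets can be sketched as below. This is only a schematic illustration: the threshold values are placeholders chosen for the example and are not the cuts of Eq. 2.8, the jets are assumed to be massless, and all function names are hypothetical.

```python
import math

# Placeholder thresholds for illustration only (NOT the values of Eq. 2.8)
M_JJ_MIN = 400.0   # GeV, minimum invariant mass of the tagging jets
DY_JJ_MIN = 2.8    # minimum rapidity separation of the tagging jets

def pt_tagging_jets(jets):
    """Return the two hardest jets; each jet is a (pt, y, phi) tuple."""
    ordered = sorted(jets, key=lambda jet: jet[0], reverse=True)
    return ordered[0], ordered[1]

def passes_vbf_cuts(jets, m_min=M_JJ_MIN, dy_min=DY_JJ_MIN):
    """Apply dijet mass and rapidity-gap cuts to the pT-tagged jet pair."""
    if len(jets) < 2:
        return False
    (pt1, y1, phi1), (pt2, y2, phi2) = pt_tagging_jets(jets)
    # Massless-jet approximation for the dijet invariant mass
    m2 = 2.0 * pt1 * pt2 * (math.cosh(y1 - y2) - math.cos(phi1 - phi2))
    m_jj = math.sqrt(max(m2, 0.0))
    return m_jj > m_min and abs(y1 - y2) > dy_min

# Invented example events
forward_backward = [(60.0, 2.5, 0.0), (40.0, -2.5, 3.0), (35.0, 0.1, 1.0)]
central = [(60.0, 0.5, 0.0), (40.0, -0.5, 3.0)]
```

Selecting the tagging jets by transverse momentum corresponds to the p T jet-tagging scheme; a rapidity-based tagging would only change the sorting key.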
Two of the key observables in the VBF scenario are the invariant mass distribution of the tagging jets m j 1 j 2 and their azimuthal angular separation ∆φ j 1 ,j 2 . These two observables are shown in Figure 9. The comparison between the full theory and the effective theory predictions reveals a good agreement between the two, indicating that mass effects are rather small.
These effects become, however, considerably larger when one considers observables that have already been seen to be sensitive to heavy-quark loops in the previous sections. In Figure 10 we show the scalar sum of the transverse momenta H T,jets of the jets before (left hand side) and after applying the VBF cuts (right hand side). The effect is clearly visible when comparing the ratio plots normalized to the LO in the effective theory: after VBF cuts the mass effects become more severe, leading to larger deviations of the effective theory from the full theory for high transverse momenta. In the lowest row we show the differential ratio between H + 3 jets and H + 2 jets, which remains roughly unchanged.
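The quantities entering Figure 10 can be illustrated with a short sketch: the scalar sum of jet transverse momenta for a single event, and a bin-by-bin ratio of two equally binned distributions such as H + 3 jets over H + 2 jets. The function names and all numbers are invented for the example.

```python
def ht_jets(jet_pts):
    """Scalar sum of the jet transverse momenta of one event (GeV)."""
    return sum(jet_pts)

def binwise_ratio(numerator, denominator):
    """Bin-by-bin ratio of two equally binned histograms.

    Bins with an empty denominator are returned as None.
    """
    if len(numerator) != len(denominator):
        raise ValueError("histograms must have the same binning")
    return [n / d if d != 0 else None for n, d in zip(numerator, denominator)]

# Invented numbers, for illustration only
ht = ht_jets([120.0, 45.0, 32.5])
ratio = binwise_ratio([1.0, 0.5, 0.0], [2.0, 1.0, 0.0])
```

Applying the same bin-by-bin ratio once to the effective theory histograms and once to the full theory histograms gives the two curves whose agreement is discussed in the text.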
Another important observable is the radial separation between the two tagging jets ∆R j 1 ,j 2 . This observable is considered in Figure 11. On the left and in the middle column we show this observable before applying VBF cuts. The first plot shows the observable for a p T jet-tagging, whereas in the second plot we adopt a y jet-tagging strategy. Although NLO effects in the effective theory lead to substantial differences between the two tagging schemes, the two leading order results in the full and the effective theory agree very well for both tagging schemes and for both multiplicities. At least at leading order the choice of the tagging scheme is therefore insensitive to mass effects. The plot on the right hand side shows the observable using the original p T jet-tagging but after applying VBF cuts. In this case mass corrections have an impact especially for ∆R ≈ 3, where deviations can become larger than 20%. When the two jets are not too far apart in radial distance, the Higgs boson must in general be harder and recoil against them. This explains the discrepancy between the two LO predictions. The bottom plots show the ratio between the two jet multiplicities. As expected, both the tagging scheme and the VBF cuts have a significant impact on this ratio. However, heavy-quark mass effects do not lead to deviations from the prediction of the effective theory.

FCC predictions for 100 TeV collisions
In view of a possible future circular collider operating at a center-of-mass energy of 100 TeV, we also investigate the mass effects for such a high collider energy. Values for the total cross section at the different multiplicities were already presented and discussed above in Section 3 and Table 1. In this section we focus on differential distributions. In Figure 12 we investigate the impact of the mass effects depending on the cut on the jet transverse momentum by looking at the Higgs boson rapidity distribution. On the left the transverse momentum cut on the jets is p T, jet > 30 GeV, whereas on the right it is increased to p T, jet > 100 GeV. At 13 TeV the leading order contributions of the full and the effective theory agree very well and we find the same good agreement at 100 TeV for the looser transverse momentum cut. Requiring a minimum transverse momentum of 100 GeV leads to visible deviations between full and effective theory. This is clearly related to the fact that the bulk of the cross section comes from the softest allowed region of the phase space, where the mass effects play only a very minor role. Increasing the p T threshold, however, cuts away this large and mass-insensitive part of the cross section and the remaining contribution is much more affected by mass effects. In Figure 13 we show the transverse momentum of the Higgs and the leading jet as well as H T of the jets with a p T cut on the jets of 100 GeV. Owing to the increase in the cross section and the possibility to produce much harder jets and Higgs bosons, all these observables suffer from large mass effects, which for transverse momenta larger than 1 TeV lead to corrections larger than one order of magnitude.
The lowest panel shows the inclusive differential H + 2 jets/H + 1 jet and H + 3 jets/H + 2 jets ratios for the three different observables. These ratios remain unchanged for the transverse momentum of the Higgs boson, meaning that the relative importance of higher multiplicity contributions is stable under mass corrections, and we also see only a very mild deviation for the transverse momentum of the leading jet. However, for H T the ratio increases when passing from the effective theory predictions to the full SM. The massive quark loop effects are therefore stronger in the high transverse momentum tails for the lower multiplicities than for the higher ones. Figure 14 shows again the radial separation between the two leading jets ∆R j 1 ,j 2 , this time for 100 TeV center-of-mass energy. The two plots show the impact of the finite mass corrections when the minimum transverse momentum threshold is raised from p T, jet > 30 GeV to p T, jet > 100 GeV. The very small differences between the effective theory predictions and the full SM curve at 13 TeV (Figure 11, left) become larger, especially for H + 3 jets, when increasing the collider energy, even if the cuts are kept equal. In the lowest ratio panel of the left plot we observe a non-trivial shape of the mass corrections, which are minimal when the separation is about ∆R j 1 ,j 2 ≈ π. For smaller separations the two leading jets must be close in the (y, φ)-plane and combine such that the momentum flow through the effective vertex is increased, leading to a breakdown of the effective theory prediction. The tiny variations in the shape of the mass corrections dramatically increase when the transverse momentum of the jets is required to be above 100 GeV. The plots on the right reveal a very non-trivial dependence of the mass corrections on the radial distance, and overall the impact of these corrections is much larger than the NLO corrections in the effective theory.
Figure 14: Geometric separation between the leading jets for a low and high jet threshold and the associated R 3 ratios using p T jet-tagging.
As can easily be foreseen, the increased deviation of the full SM predictions from the effective theory is also present in the observables that we discuss in the following. We will only present plots for which p T, jet > 100 GeV, where the effects are much more visible. In Figure 15 we present the two components which combined give rise to the radial distance discussed above: on the left the rapidity separation ∆y j 1 ,j 2 and in the center the azimuthal angle separation ∆φ j 1 ,j 2 between the two leading jets. In the former case the mass corrections are roughly constant over the full kinematical range, whereas in the latter case they are much larger for small angle separation, and become almost negligible when the two jets are back-to-back in azimuth. The reason is similar to the one outlined previously for ∆R j 1 ,j 2 . On the right of the same figure we show the leading invariant dijet mass. As already stressed previously, this observable is particularly important when studying VBF scenarios. Compared to the curve shown for 13 TeV and a transverse momentum cut of 30 GeV (Fig. 4), where the mass corrections barely affected the distribution, we now observe a clear decrease of the cross section, which reaches −50% for invariant masses of the order of 3 TeV.
Figure 16 shows the radial separation between the Higgs boson and the leading jet for the two different jet tagging strategies. The plot on the left shows ∆R H,j 1 when using a p T -jet tagging strategy, whereas on the right we apply the y-jet tagging, which by definition needs the presence of at least two jets. This observable demonstrates that mass effects can lead to fairly complicated corrections with respect to the effective theory predictions. Apart from the H + 1 jet predictions, which because of the presence of only a single jet are not affected too much by mass corrections, the full SM predictions increase the cross section at small radial distance and decrease it for larger values of ∆R H,j 1 . The differences are slightly milder in the right plot when considering a y-jet tagging strategy. This is due to the fact that the tagging jets in the rapidity tagging are not necessarily hard jets, which means that the phase space region of hard jets is rather diluted across the observable. To conclude, we compare the Higgs boson transverse momentum using full SM predictions with and without bottom-quark loops. Figure 17 shows the two results for a minimum jet transverse momentum of p T, jet > 30 GeV on the left, and for p T, jet > 100 GeV on the right. The ratios make visible the difference between the two predictions, which is relevant mainly for H + 1 jet and when the looser jet cut is used. Increasing the jet cut or the final state multiplicity leads to a flatter ratio in which the predictions with only top-quark loops are lower than those with top- and bottom-quark loops by 0.5 to 2%.

Conclusions
The production of a Higgs boson in association with jets in gluon fusion is one of the key processes in precision Higgs physics. Accurate theoretical predictions are fundamental for a detailed understanding of the electroweak symmetry breaking mechanism. Usually, calculations of higher order corrections to the production of a Higgs boson in association with jets rely on the approximation of an infinitely heavy top quark. In this paper we have computed the cross section at LO in perturbation theory in the full Standard Model considering a Higgs boson coupling to both top-quark and bottom-quark loops, including the interference between the two. Furthermore, we have compared these results to the NLO predictions in the effective theory approach. We give quantitative predictions for a variety of observables for H + 1 jet, H + 2 jets and H + 3 jets, confirming that transverse-momentum related observables are particularly affected by these corrections for values above the top mass. We have calculated predictions for two center-of-mass energies, for the LHC at 13 TeV and for a possible future circular collider of 100 TeV. For the LHC, we also investigated the impact of finite mass effects when VBF selection cuts are applied on the tagging jets, in order to enhance the VBF signal. We find that mass effects typically play an important role, leading to deviations of up to one order of magnitude. The breakdown of the effective theory predictions is driven by the particle with the highest transverse momentum in the event and is largely independent of the final state multiplicity. This is of course highly dependent on the specific observable and scenario under consideration. In particular, since the corrections affect the harder transverse momentum regions, choosing a harder p T -cut in the analysis results in larger mass effects for all observables. We further find that including massive bottom quarks in the loop has only a mild impact.
For the total cross section, the bottom-quark contribution (including its interference with top-quark loops) is around one percent for a 13 TeV LHC. Applying VBF cuts does not lead to significant changes, which is also true for pp-collisions at a 100 TeV collider. The bottom-quark effects are particularly visible in the low energy region, where they can lead to deviations of up to five percent.
In summary, the inclusion of mass effects and their control at an accuracy beyond leading order will be indispensable for reliable predictions for the LHC, and even more so for a future collider with considerably higher center-of-mass energy.

                  Bjorken-x of incoming parton 1
x2                Bjorken-x of incoming parton 2
x1p               x' for I-piece of incoming parton 1
x2p               x' for I-piece of incoming parton 2
nuwgt             Number of additional ME weights for loops and integrated subtraction terms
usr_wgt[nuwgt]    Additional ME weights for loops and integrated subtraction terms
Table 3: Branches format of the Ntuples files as generated by Sherpa.
These two requirements led to the development of a new Ntuple format which we will refer to as EDNtuples (Exact Double Ntuples). In this new format an entry called ncount was introduced, which keeps track of the number of trials between two good events during generation. This allows for an exact statistical treatment when events are reprocessed a posteriori. Furthermore, the momenta are stored in double precision instead of float, to allow for a more precise kinematical reconstruction. This is needed for example when the branching history of an event is reconstructed for the MiNLO scale choice. In addition, in order to correctly map the subtraction counter-events to the appropriate real radiation events when performing the clustering in MiNLO, additional branches had to be created to store the information on the initial state flavours in the subtraction events. Finally, to be able to change the matrix element weight a posteriori, the phase space weight, which is already multiplied with the weight coming from the amplitude in the branches called me_wgt and me_wgt2, is now also stored separately, giving the possibility to change the weight of the amplitude. This last extension was of particular interest for this work, since the events stored in the Ntuples could be reused when computing the amplitude in the full Standard Model, where the heavy-quark loops are present at LO accuracy. The new entries added in the EDNtuples are summarized in Table 4.
Branch    Description
ncount    Number of trials between the previous and current event during generation
ps_wgt    Phase space weight
id1p      PDG code of incoming parton 1 in subtraction event
id2p      PDG code of incoming parton 2 in subtraction event
Table 4: Additional new branches introduced for the EDNtuples.
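How the ps_wgt and ncount branches might be used when events are reprocessed can be sketched as follows. This is a minimal illustration under two stated assumptions that are not confirmed by the text: the event weight is taken to factorise as the phase space weight times a recomputed amplitude weight with no further factors, and each stored entry is treated as statistically independent, which ignores the correlation between real-emission events and their subtraction counter-events. The function names are hypothetical and no Sherpa or ROOT API is used.

```python
import math

def reweight(ps_wgt, new_me_wgt):
    """New event weight from the stored phase space weight and a recomputed
    amplitude weight (assumes a simple product with no further factors)."""
    return ps_wgt * new_me_wgt

def cross_section(weights, ncounts):
    """Monte Carlo estimate and its statistical error.

    The sum of ncount gives the total number of trials, including the
    rejected ones that contribute zero weight. Entries are assumed to be
    statistically independent.
    """
    n_trials = sum(ncounts)
    if n_trials < 2:
        raise ValueError("need at least two trials")
    mean = sum(weights) / n_trials
    mean_sq = sum(w * w for w in weights) / n_trials
    variance = max(mean_sq - mean * mean, 0.0) / (n_trials - 1)
    return mean, math.sqrt(variance)

# Invented numbers: two stored events, reweighted with new amplitude weights
new_weights = [reweight(2.0, 1.5), reweight(1.0, 3.0)]
xs, err = cross_section(new_weights, ncounts=[1, 3])
```

Normalising to the total number of trials rather than to the number of stored events is what makes the estimate exact when rejected points are not written to file.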