Top-Yukawa contributions to bbH production at the LHC

We study the production of a Higgs boson in association with bottom quarks (bb̄H) in hadronic collisions at the LHC, including the different contributions stemming from terms proportional to the top-quark Yukawa coupling (y2 t ), to the bottom-quark one (y 2 b ), and to their interference (yb yt). Our results are accurate to next-to-leading order in QCD, employ the four-flavour scheme and the (Born-improved) heavy-top quark approximation. We find that next-to-leading order corrections to the y2 t component are sizable, making it the dominant production mechanism for associated bb̄H production in the Standard Model and increasing its inclusive rate by almost a factor of two. By studying final-state distributions of the various contributions, we identify observables and selection cuts that can be used to select the various components and to improve the experimental sensitivity of bb̄H production on the bottom-quark Yukawa coupling.


Introduction
After the discovery of a scalar resonance with a mass of about 125 GeV [1,2], the accurate determination of its couplings to Standard-Model (SM) particles has become one of the major objectives of LHC Run II and beyond. Data collected at the LHC so far supports the hypothesis that this resonance is the scalar boson predicted by the Brout-Englert-Higgs mechanism of electroweak symmetry breaking [3,4] as implemented in the SM [5]: the Higgs couplings are universally set by the masses of the corresponding particles the Higgs boson interacts with. Global fits of various production and decay modes of the Higgs boson [6][7][8][9][10] constrain its couplings to third-generation fermions and to vector bosons to be within 10 − 20% of the values predicted by the SM. In particular, the recent measurement of Higgs production in association with a top-quark pair [11,12] provides the first direct evidence of the coupling between the Higgs boson and the top quark, thereby proving that gluon-gluon Higgs production proceeds predominantly via top-quark loops. The coupling of the Higgs boson to τ leptons has also been established at the 5σ level for some time [13,14], while the Higgs coupling to bottom quarks has been observed only very recently [15]. By contrast, to date, we have no experimental confirmation that the Higgs boson couples to first-/second-generation fermions, nor about the strength of the Higgs self-interaction. The ability to probe elementary couplings and to improve the experimental sensitivity strongly relies on precise theoretical predictions for both production and decay. The bottomquark Yukawa coupling (y b ) plays a rather special role in this context: despite having a relatively low coupling strength with respect to the couplings to vector bosons and top quarks, the H → bb decay dominates the total decay width in the SM for a Higgs-boson mass of about 125 GeV due to kinematical and phase space effects. The observation of this decay is, however, quite challenging because of large backgrounds generated by QCD, especially in the gluon-fusion production mode [16], and has for now only been searched for in vector-boson fusion [17,18] and Higgsstrahlung [16,19]. The latter is the most sensitive channel, yielding a signal strength for the decay branching ratio of µ bb = 1.0 ± 0.2 [15]. However, since the total Higgs width is dominated by H → bb, the corresponding branching ratio has a rather weak dependence on y b . As a result, the sensitivity of processes involving Higgs decays to bottom quarks on this parameter is in fact rather low.
Studying production modes featuring a bbH coupling is a promising alternative: on the one hand, Higgs production in the SM (inclusive over any particles produced in association) proceeds predominantly via the gluon-fusion process, where the Higgs-gluon coupling is mediated by heavy-quark loops. In particular, bottom-quark loops have a contribution of about −6% to the inclusive cross section, which can become as large as −10% for Higgs bosons produced at small transverse momentum [20][21][22][23][24][25][26][27]. On the other hand, the associated production of a Higgs boson with bottom quarks (bbH production) provides direct access to the bottom-quark Yukawa coupling already at tree-level [28]. It yields a cross section comparable to the one of the associated production with top quarks (roughly 0.5 pb at 13 TeV), which is about 1% of the fully-inclusive Higgs-production rate in the SM. Furthermore, the inclusive rate decreases dramatically once conditions on the associated b jets are imposed to make it distinguishable from inclusive Higgs-boson production.
The SM picture outlined above might be significantly modified by beyond-SM effects: while a direct observation in the SM is challenging at the LHC, bbH production plays a crucial role in models with modified Higgs sectors. In particular in a generic two Higgs-doubletmodel (2HDM), or in a supersymmetric one such as the MSSM, the bottom-quark Yukawa coupling can be significantly increased, promoting bbφ to the dominant Higgs production mode [29,30] in many benchmark scenarios, φ being any of the scalars or pseudo-scalars in such theories. Given that a scalar sector richer than that of the SM has not yet been ruled out experimentally, this is a fact that one must bear in mind, and that constitutes a strong motivation for theoretical studies of scalar-particle production in association with bottom quarks.
The production of bbH final states receives additional contributions the loop-induced gluon-fusion process (proportional to y 2 t ; y t being the top-quark Yukawa coupling), which in the SM is of similar size as y 2 b contributions, but have rarely been studied in the literature. In this paper, we consider Higgs production in association with bottom quarks for all contribu-tions proportional to y 2 b and y 2 t at NLO QCD, as well as their interference terms proportional to y b y t . The bbH process is particularly interesting also from a theoretical viewpoint in many respects. First, as for all mechanisms that feature bottom quarks at the level of the hard process, there are two schemes applicable to performing the computation. These so-called four-flavour scheme (4FS) and five-flavour scheme (5FS) reflect the issue that arise from different kinematic regimes, where either the mass of the bottom-quark can be considered a hard scale or bottom quarks are treated at the same footing as the other light quarks. Hence, the bottom-quark is considered to be massive in the 4FS, while its mass can be set to zero in the 5FS. The advantages of either scheme in the context of bbH production have been discussed in detail in ref. [31]. We employ the 4FS throughout this paper, owing to its superior description of differential observables related to final-state bottom quarks and the definition of bottom-flavoured jets, which is particularly striking in fixed-order computations. Another theoretical motivation lies in the nature of the loop-induced gluon-fusion process that leads to the contributions proportional to y 2 t . Being dominated by kinematical configurations where the Higgs boson recoils against a gluon which splits into a bottom-quark pair, this collider process features the cleanest and most direct access to g → bb splittings. Thus, as a bonus, our computation also allows us to study the effect of NLO corrections on such splittings.
Given that the NLO QCD corrections to bbH production for y 2 b contributions (and the LO y b y t terms) were studied in great detail in ref. [31], including the effect of parton showers, we focus here on the computation of NLO QCD corrections to the terms proportional to y t and analyse their behaviour with respect to the y 2 b contribution. We note that our computation of NLO corrections to the y b y t and y 2 t terms employs an effective field theory, where the top quark is integrated out from the theory and the Higgs directly couples to gluons, to which we refer as Higgs Effective Field Theory (HEFT). Besides a detailed description of the application of this approach to our problem, we will show that this approximation is quite accurate in the bulk of the phase space region which is relevant for this study.
Before introducing our calculation in the next section, we briefly summarise the status of the results for bbH production available in the literature. As far as 4FS computations are concerned, seminal NLO fixed-order parton-level predictions were obtained in refs. [32,33], and later updated to the case of MSSM-type couplings [34], and to SUSY-QCD corrections in the MSSM [35,36]. The presentation of differential results in these papers is very limited as the focus is on the total cross section. Given that computations in the 5FS are technically much simpler, far more results in this scheme exist in the literature: the total cross section are known at NLO [37,38] since a long time and even NNLO QCD [39] predictions were among the first computations at this level of accuracy ever achieved relevant for LHC phenomenology. Parton-level distributions were obtained at NLO for H+b and H+jet production [40,41], and at NNLO for jet rates [42] and fully differential distributions [43]. The analytical transversemomentum spectrum of the Higgs boson was studied up to O(α 2 s ) in ref. [44], while analytically resummed NLO+NLL and NNLO+NNLL results were presented in ref. [45] and ref. [46], respectively. 1 NLO+PS predictions for both the 4FS and the 5FS were presented for the first time in ref. [31], including a comprehensive comparison of the two schemes and the discussion several differential distributions with NLO QCD accuracy. Other NLO+PS results were later obtained in Powheg [49] and Sherpa [50]. At the level of the total cross section advancements have been made by first understanding the differences between results obtained in the two schemes refs. [51,52] and then by consistently combining state-of-the-art 4FS and 5FS predictions in refs. [53][54][55][56].
The paper is organised as follows. In section 2 our computation is described in detail. We first discuss the various contributions to the bbH cross section (section 2.1), then introduce the HEFT approximation to determine the y 2 t terms (section 2.2) and finally perform a comprehensive validation of the HEFT approximation for the y 2 t cross section (section 2.3); phenomenological results are presented in section 3 -see in particular section 3.1 for the input parameters, section 3.2 for SM results, section 3.3 for how to obtain the best sensitivity to extract y b in the measurements, and section 3.4 for our analysis on NLO corrections to g → bb splitting. We conclude in section 4 and collect relevant technical information in the appendices.
2 Outline of the calculation 2.1 Coupling structure of the bbH cross section Figure 1. Examples of Feynman diagrams for bbH production at LO and at NLO, which contain virtual and real diagrams proportional to y b , and virtual diagrams with a top loop proportional to y t . The corresponding amplitudes are named A The leading contribution to the associated production of a Higgs boson with bottom quarks in the 4FS starts at O α 2 s in QCD perturbation theory, and is mediated by the bottom-quark Yukawa coupling. Hence, the coupling structure of the LO process is y 2 b α 2 s . A sample Feynman diagram is shown in figure 1a. At the next order in α s the typical one-loop (figure 1b) and real-emission (figure 1c) diagrams are included, and yield a contribution of O y 2 b α 3 s . At the same order in α s additional one-loop diagrams appear featuring a closed top-quark loop which the Higgs boson couples to (figure 1c). These diagrams introduce for the first time a dependence on top-quark Yukawa coupling in the bbH cross section and lead to contributions of O y b y t α 3 s through their interference with y b diagrams as shown in figure 1a. At the next order in α s , the square of these y t amplitudes yields a contribution that starts at O y 2 t α 4 s . Thus, it is suppressed by two powers of α s with respect to the first non-zero contribution to bbH production of O y 2 b α 2 s and could be formally considered a NNLO contribution. However, it is easy to understand that a naïve power counting just based on the single parameter α s is not suitable for describing bbH production, since the strong hierarchy between the top-quark and the bottom-quark Yukawa couplings in the SM is such that y 2 b α 2 s terms turn out to be of a similar size as the y 2 t α 4 s contributions. In this respect, one also expects that α s corrections to the y 2 t contributions might turn out to be important, which are of O y 2 t α 5 s and formally part of the N 3 LO corrections with respect to the leading O y 2 b α 2 s terms. They enter via virtual and real diagrams of the type shown in figure 2a and in figure 2b, respectively. Collecting all relevant terms at different orders in α s , one can express the cross section as where the A (i) q amplitudes are introduced with the respective sample diagrams in figures 1-2, and dΦ denotes the appropriate phase space of the extra real emission with all relevant factors in each case. An equivalent, yet more appropriate and transparent way of organising the computation above is to consider a double coupling expansion in terms of y b and y t and then to systematically include α s corrections to each of these terms. Up to NLO, the cross section can be written as It is trivial to see that eq. (2.1) and eq. (2.2) feature exactly the same terms. In this formulation QCD corrections to y 2 b , y b y t and y 2 t terms, the ∆ (i) x contributions, are manifestly gauge invariant and can be calculated independently of each other at LO and NLO. All the coeffi- , and ∆ (0) y b yt ) were determined and studied already in ref. [31]. Our focus here is therefore on the calculation of the contributions involving y t in the 4FS, i.e., ∆   The evaluation of such diagrams is beyond current technology. Hence, in this section, we introduce the heavy top-mass approximation that can be employed for the computation of these amplitudes, and we rearrange the SM cross section in section 2.2 in the HEFT. In this effective theory, the top quark is integrated out and yields an effective point-like interaction between the Higgs boson and gluons. This approximation has been used successfully to compute a number of observables in the Higgs sector, with the gluon fusion cross section through N 3 LO as the most notable example [58][59][60]. By substituting top loops with a point-like coupling, the HEFT allows for significant simplifications of Higgs-related observables at the price of a limited range of applicability: the approximation is expected to break down when one of the scales appearing in the process, and in particular in the massive loop integrals, becomes comparable with the top-quark mass. The case at hand corresponds to H+jet (H+g) production with g → bb splitting either in the initial or in the final state. It has been shown that the HEFT provides an excellent approximation in that case as long as the scales of the process remain moderate [61,62], for example as long as the Higgs transverse momentum (p T H ) is below ∼ 150 GeV. In section 2.3, we provide a detailed assessment of the goodness of the heavy top-mass approximation. As we will show, the heavy-top mass approximation works extremely well (with differences from the full computation below 10%) as long as the probed momentum scales (Higgs or leading b-jet transverse momentum, or invariant mass of the b-jet pair) do not exceed 200 GeV.
Working in the HEFT allows us to avoid the computation of the highly complicated amplitudes in figure 1d and figure 2, and evaluate instead the diagrams shown in figure 3, which have a much lower complexity, being at most at the one-loop level. In addition, the HEFT has been implemented in an Universal Feynrules Output (UFO) model [63], and this calculation can be performed using existing automated Monte Carlo tools. Nonetheless, 2 The contributions ∆   [57,58]. These calculations, however, cannot provide information on final states specifically containing b quarks. present implementations neglect power-suppressed corrections to SM parameters generated by the heavy-top mass approximation, which play a crucial role in the case at hand. In particular the bottom-quark Yukawa coupling must be corrected in the following way: which generates additional terms of O y b y t α 4 s and O y 2 t α 5 s entering the bbH cross section at the perturbative order we are interested in. As a result, we insert y b → y HEFT b into eq. (2.2) to yield the bbH cross section in the HEFT and rearrange it as follows: where top-quark loops have been replaced by the HEFT contact interaction in quantities with a hat. In this cross section, the only contribution that could not be directly calculated using automated tools is the power-suppressed bottom-quark Yukawa correction, which we derive in the Appendix B. We find where α s and y SM b are understood to be renormalised in the MS scheme at a scale µ R . We have implemented by hand this modification in the HEFT model at NLO. This enables a complete calculation of the QCD corrections ∆ (1) y b yt in the heavy-top mass approximation in a fully automated way. We therefore can employ MadGraph5 aMC@NLO [64] to perform the calculation of the bbH cross section in the 4FS at parton level. We use the recently-released version capable of computing a mixed-coupling expansion [65] of the cross section in order to compute all six contributions (y 2 b , y b y t and y 2 t both at LO and NLO) with the appropriate MS renormalisation of y b simultaneously. 3 Besides computing eq. (2.4) in the HEFT, we also calculate the LO y 2 t contributions in the full theory in order to rescale the y 2 t contributions and to provide the best approximation of the bbH cross section in eq. (2.2). We refer to this approach as the Born-improved HEFT (BI-HEFT) in the following: (2.6) For differential distributions, eq. (2.6) is applied bin-by-bin.

Assessment of the HEFT approximation
In this section we assess the accuracy of the heavy-top quark approximation. To this end, we compare the LO y 2 t cross section σ SM y 2 t against its approximation in the HEFT σ HEFT . We use the same input parameters as for our phenomenological results in section 3, and refer to section 3.1 for details. We perform a validation for both the inclusive cross section and differential distributions. Since the topology of the process at LO is very similar to that of the H+jet process, we expect the HEFT to provide a good description in the relevant phase-space regions, in particular concerning the shapes of distributions. We stress again that in our best prediction, the BI-HEFT, we use the the HEFT only to determine the radiative corrections in terms of the NLO K-factor. Total cross section and kinematic distributions, obtained in the HEFT, are reweighted (bin-by-bin) by a factor equal to the ratio between the full theory and the HEFT, both evaluated at LO. This has been shown to be an excellent approximation for H+jet production as long as the relevant scales do not become too large [61,62].
We start by reporting the result for the inclusive cross section: The results lie within 5% of each other. Considering that the perturbative uncertainties are one order of magnitude larger, we conclude the inclusive cross section is well described by the heavy-top quark approximation. Furthermore, the accuracy of the BI-HEFT result can be assumed to be considerably better than this value, since top-mass effects are included at LO by the rescaling in eq. (2.6). As in the case of H+jet production, the dominant configurations are with the Higgs at low transverse momentum, which explains the quality of the approximation. Let us now turn to differential cross sections in figure 4. The main frame shows the SM (blue dash-dotted) and HEFT (green dotted) predictions. The lower inset shows their binby-bin ratios. The first three plots, figures 4a-4c feature the transverse-momentum spectra of   the Higgs boson, the leading and the subleading b jet respectively. As expected, we find that the HEFT provides a good description of the SM result, especially in terms of shapes. Only at large transverse the two curves start deviating with the HEFT result becoming harder. This happens after transverse momenta of ∼ 200 GeV for the Higgs and the leading b jet, and a bit earlier for the second-hardest b jet.
The Higgs rapidity distribution in figure 4d is hardly affected by the HEFT approximation, with the HEFT/SM ratio being essentially flat. Also for the invariant mass of the two b jets in figure 4e, the heavy-top quark result provides a good description as long as M (bb) 200 GeV. Finally, for the separation in the η-φ plane between the two b jets, shown in figure 4f , the agreement between HEFT and SM is very good up to ∆R = 4. Above this value, the distribution is dominated by large invariant-mass pairs, and the HEFT/SM ratio follows what happens for the invariant-mass distribution.
Overall, the heavy-top quark approximation used in the HEFT results works extremely well for this process over a large fraction of the phase space and in particular where the majority of events are produced. For the goals of our study, it is especially important to verify that the comparison of the angular separation of the b jets and of their invariant mass is well reproduced, as it indicates that we can safely explore the regime in which the two bottom jets merge into a single one. This regime is particularly interesting to study for the y 2 t terms as we will see in section 3. Furthermore, there is a reasonable range of b-jet transverse momentum where the process is correctly described, so that we can trust the prediction to study the impact of b-jet requirements on the relative importance of the y 2 b and y 2 t contributions. It should be noted, however, that in the two b-jet configuration, the HEFT prediction is rather poor over a larger range of transverse momenta for the subleading b jet. Nevertheless, this is not expected to have an impact on our phenomenological study in the upcoming section.

Phenomenological results
In this section we present differential results for bbH production at the 13 TeV LHC including all contributions proportional to y 2 b , y b y t , and y 2 t at NLO QCD, see eq. (2.2). We analyze the importance of radiative corrections and the relative size of the three contributions. Although we work in the SM, thanks to the separation of the cross section by the Yukawa coupling structure, our predictions are directly applicable to 2HDM-type extensions of the SM (for bbφ with a neutral Higgs boson φ ∈ {h, H, A}) by an appropriate rescaling of the top and bottom-quark Yukawa. Even for the MSSM such rescaling has been shown to be an excellent approximation of the complete result [67,68].

Input parameters
Our predictions are obtained in the four-flavour scheme throughout. We use the corresponding n f = 4 NNPDF 3.1 [69] sets of parton densities at NLO with the corresponding running and α s values. 4 The central values of the renormalisation (µ R ) and factorisation (µ F ) scales are set on an event-to-event basis to where the index i runs all the final-state particles, possibly including the extra parton from the real emission. Scale uncertainties are computed without extra runs using a reweighting technique [71], and correspond to independent (nine-point) variations in the range  [73]. Jets are reconstructed with the anti-k T algorithm [74], as implemented in FastJet [75], with a jet radius of R = 0.4, and subject to the condition p T j > 30 GeV and η j < 2.5. Results with a larger jet radius, R = 1, are available in appendix A. Bottom-quark flavoured jets (b jets) are defined to include at least one bottom quark among the jet constituents. A b jet containing a pair of bottom quarks is denoted as a bb jet. Within a fixed-order computation, we will use the word B hadrons to identify bottom quarks (the notation B will refer to bottom quarks, while the notation b to bottom-tagged jets). Bottom-quark observables are infrared safe owing to the finite bottom-quark mass in the 4FS.

Predictions for bbH production in the SM
We start by discussing integrated cross sections in table 1, both fully inclusive and within cuts. As far as the latter are concerned, we have considered various possibilities: the requirement that there be at least one or two b jet(s); that there be at least one jet containing a pair of bottom quarks (bb jets); and that the transverse momentum of the Higgs boson be larger than 50 GeV, 100 GeV, and 150 GeV (boosted scenarios). The residual scale uncertainties are computed by varying the scales as indicated in section 3.1. We present separately the results for terms proportional to y 2 b , y 2 t , and y b y t . The y 2 t contributions are provided in two approximations: using the HEFT, on the one hand, and our BI-HEFT prediction, computed by rescaling the HEFT result at NLO by the LO evaluated in the full theory, on the other hand. For completeness, we also quote the BI-HEFT prediction for the sum of all individual contributions. Besides LO and NLO of the cross sections we also provide the NLO/LO Kfactor to assess the importance of QCD corrections. Inside the bracket after the LO and NLO cross sections we quote the acceptance of the respective scenario, defined as the ratio of the cross section within cuts divided by the inclusive one. We refrain from quoting the acceptance for the y b y t interference terms since this quantity is meaningless on its own. The conclusions that can be drawn from the table are the following: • Already at LO the y 2 t terms yield a significant contribution to the SM bbH cross section. Due to sizable QCD corrections to the y 2 t terms (K ≈ 2.5), the inclusive NLO cross section is a factor of three larger after including the loop-induced gluon-fusion component than when considering only y 2 b contributions. Hence, the bbH cross section in the SM is substantially larger than generally assumed from y 2 b computations, which could make its observation much easier. At the same time, however, the sensitivity to the extraction of the bottom-quark Yukawa coupling is diminished. Below, we discuss how suitable phase-space cuts can be used to enhance the y 2 b over the y 2 t contributions and to retain sensitivity to the extraction of y b . Table 1. Cross sections (in pb) for different b-jet multiplicities or minimum p T H cuts.
• The relative size of y 2 t contributions further increases when considering the various scenarios with additional phase-space cuts. The reason is that the loop-induced gluonfusion component generates harder (b-)jet activity and the cuts favour hard configurations. For example, tagging one b jet has the effect of decreasing the y 2 b NLO cross section by −79%, while for y 2 t it is only −59%, and the y 2 t NLO cross section is four times as large as y 2 b in the ≥ 1b-jet scenario, to be compared to the factor of two in the inclusive case. Tighter b-jet requirements or the p T H requirements only have the effect of further increasing the relative size of the y 2 t contributions.
• It is interesting to notice that the ≥ 1bb jet category, which requires one jet containing two bottom quarks, receives contributions essentially only from y 2 t terms. This can be understood easily: a major part of the events for the loop-induced gluon fusion component features the Higgs recoiling against a hard gluon, which splits into a bb-pair. The two bottom quarks in these configurations are boosted and generally close together, which makes it more likely for them to end up inside the same jet.
• Two opposed effects render the measurement of the bottom-quark Yukawa coupling in bbH production complicated: as pointed before the relative size of terms proportional to y 2 b decreases as soon as b jets are tagged. Nonetheless, tagging at least one of the b jets is essential to distinguish bbH production from inclusive Higgs production, which predominantly proceeds via gluon fusion. 5 Therefore, it is necessary to select suitable phase-space requirements which increase the relative y 2 b contribution even in presence of at least one b jet without loosing too much statistics. Given our findings for the ≥ 1bb-jet category, one can require at least one b jet and veto all bb jets. This decreases the ≥ 1b-jet rate for y 2 t by roughly 20%, while having a negligible effect on the y 2 b rate. Below, we study differential distributions in order to find further requirements to enhance the relative size of the y 2 b contributions.
• The LO contribution of the mixed y b y t terms is negative in all scenarios, as has been observed already in ref. [31]. At NLO it yields a positive contribution only to the ≥ 1bbjet scenario. The NLO/LO K-factor of the y b y t term strongly depends on the scenario under consideration, which is expected given its interference-type contribution. By and large the impact of the y b y t terms is minor though, reaching at most a few percent at NLO.
• Overall, QCD corrections have an even larger impact on the y 2 t terms than on the y 2 b terms, but they are quite sizable in either case. This can be understood as follows: as pointed out in ref. [31] potentially large logarithmic terms of log(m 2 b /m 2 H ) enter the perturbative expansion of y 2 b contributions in the 4FS and cause large perturbative corrections. Contributions proportional to y 2 t , on the other hand, feature a logarithmic enhancement of g → bb splittings. These logarithms are taken into account for the first time up to NLO QCD in this paper, and yield an important correction to the cross section.
• Given the large QCD corrections, it is not surprising that perturbative uncertainties estimated from scale variations are relatively large as well. As expected they are largest for y 2 t terms. The inclusion of NLO corrections reduces the uncertainties significantly, but they are still at the level 30% to 40%. Their main source is again the logarithmic enhancement of the individual contributions pointed out above.
We now turn to discussing differential distributions. We first consider the NLO/LO Kfactor of the different contributions to the cross section in figures 5 and 6. These figures are organised according to the following pattern: there is a main frame, which shows histograms of the LO y 2 t (green dashed), NLO y 2 t (blue solid), LO y 2 b (purple dash-dotted), and NLO y 2 b (red dotted) predictions as cross section per bin (namely, the sum of the values of the bins is equal to the total cross section, possibly within cuts). In an inset we display the K-factor for each contribution by taking the bin-by-bin ratio of the NLO histogram which appears in the main frame over the LO one. The bands correspond to the residual uncertainties estimated from scale variations according to section 3.1. Figure 5a shows the transverse momentum distribution of the Higgs boson. As expected from the general hardness of the two different production processes leading to y 2 t and y 2 b (the Higgs being radiated off a top-quark loop, and the Higgs boson being coupled to a bottomquark line), the latter features the significantly softer p T H spectrum. The K-factors for both contributions grow with the value of p T H similarly, and become quite flat at large transverse momenta. However, as observed before, the size of QCD corrections is larger for the y 2 t terms, ranging from K ≈ 2 at small p T H to K ≈ 3 for p T H 150 GeV. For y 2 b contributions they are K ≈ 1.5 and K ≈ 2 in the same regions.
Also the transverse-momentum distribution of the leading b jet in figure 5b displays a harder spectrum for the y 2 t contributions. Note that for the p T b 1 distribution, as we select the leading b jet, the integral of the distribution corresponds to the ≥ 1b-jet rate. The behaviour of the K-factor is quite different in this case. While it is essentially flat and about K = 1.5 for the y 2 b terms, it is about K = 2.5 for y 2 t at low p T b 1 , decreases to K ≈ 2 for p T b 1 = 150 GeV, and turns flat afterwards.
In figure 6 we consider two observables which require the presence of at least two b jets: the left panel, figure 6a, shows the invariant-mass distribution of the b-jet pair, M (bb), and the right panel, figure 6b, shows their distance in the η-φ plane, ∆R(bb). It is interesting to notice the very different behaviour of y 2 b and y 2 t terms in the main frame of these two distributions. While y 2 t clearly prefers small invariant-masses and small separations between the b jets, y 2 b peaks around M (bb) = 100 GeV and ∆R(bb) = 3. The reason is clear: the dominant contribution for y 2 t originates from the g → bb splittings, which generate bottomquark pairs that are hardly separated and, hence, also have a rather small invariant mass. Looking at the K-factors in the lower inset, the one of the y 2 b terms turns out to be rather  close to one for a large part of the phase space. It slightly increases with both M (bb) and ∆R(bb). For y 2 t , on the other hand, the K-factor is around K = 2, and shows an even milder increase with M (bb) and ∆R(bb).
The scale-uncertainty bands in all four plots show the same features as the scale uncertainties discussed in table 1: Their size decreases upon inclusion of higher-order corrections, but overall they are rather large even at NLO. The y 2 t contributions feature a stronger scale dependence due to the logarithmic enhancement of g → bb splittings. By and large, LO and NLO at least have some overlap in most cases. Nevertheless, y 2 b contributions show the better converging perturbative series in that respect, which of course is directly related to their smaller QCD corrections.

Accessing the bottom-quark Yukawa coupling
We now return to the question of how to improve the sensitivity to the bottom-quark Yukawa coupling in bbH production. The goal is to increase the relative contribution of the y 2 b terms by suitable selections, while keeping the absolute value of the cross section as large as possible. First, in order to be able to distinguish bbH from inclusive Higgs production, we require to observe at least one b jet. Second, we have already noticed that by removing all b jets containing a pair of bottom quarks, we can decrease y 2 t rate to some extent, with a negligible impact on the y 2 b rate. The combination with additional phase-space requirements provides the most promising approach to further improve the sensitivity to the bottom-quark Yukawa coupling and to recover bbH production as the best process to measure y b directly. To this end, we consider the relative contribution of y 2 b and y 2 t terms to the bbH cross section for various differential observables in figures 7-9. All the plots in these figures have a similar layout: the main frame shows NLO predictions for the y 2 t contribution (blue dash-dotted), the y 2 t contribution (red dotted), and the sum of all contributions, including the interference (black solid). The ratio inset shows the relative contributions of the y 2 b and y 2 t terms to bbH cross section. The bands reflect the residual uncertainties estimated from scale variations according to section 3.1.
We start in figure 7 with the transverse-momentum distribution of the Higgs boson. The three panels show this distribution with different requirements: figure 7a displays the inclusive spectrum, figure 7b is in the ≥ 1b-jet category, and figure 7c is in the same category, but vetoing bb jets. This observable constitutes one of the strongest discriminators between y 2 b and y 2 t terms. The reason is the significantly softer spectrum of the terms proportional to y 2 b , which we already observed before. By looking at the three plots in figure 7 one can infer that the relative y 2 b contribution is maximal in the inclusive case and at low Higgs transverse momentum, even exceeding 50% in the lowest part of the spectrum. If we require at least one b jet the situation becomes worse, with the y 2 b term reaching at most 40%, again in the lowest transverse-momentum bins. If we require at least one b jet and veto bb jets, the relative contribution of y 2 b at low transverse momentum is mildly increased. In the three cases (inclusive, ≥ 1 b jet and ≥ 1 b jet without bb jets) the relative y 2 b contribution quickly decreases with p T H , being less than 20% already at p T H = 50 GeV. At the level of the cross  section, in the ≥ 1b−jet category, the relative y 2 t and y 2 b contributions are respectively 81% and 19%. In the ≥ 1b−jet and no bb-jet category their relative contributions become ∼ 77% and ∼ 23%, respectively. All in all, the gain coming from vetoing bb jets is moderate. Another strategy, which can be combined with the bb-jet veto, consists in discarding events with the Higgs transverse momentum larger than a given value. For example, with an upper cut on p T H at 50 (100) GeV, in the category with at least one b jet and no bb jet, we can increase the relative contribution of y 2 b terms to about 36% (27%), while keeping about 50% (90%) of its rate. Hence, restricting the phase space to small p T H values allows us to increase the relative size of y 2 b terms, while the impact on the rate is moderate due to the quite strong suppression at large p T H . We continue in figure 8 with the transverse-momentum distribution of hardest b jet. The general features of the p T b 1 spectrum are similar to the ones of p T H . However, as this observable clearly does not help very much in distinguishing between y 2 b and y 2 t contributions, we do not suggest any additional cut on p T b 1 . It becomes clear from these plots, though, that a lower p T b threshold used in the definition of b jets would increase the relative size of the y 2 b terms. In the present study jets are defined with a p T j threshold of 30 GeV. A value of 25 GeV or even 20 GeV could be feasible at the LHC, and would further increase the sensitivity to the bottom-quark Yukawa coupling in bbH production. We note that additional modifications of the b-jet definition, for example the usage of a different jet radius (as shown in appendix A), or of jet-substructure techniques, can provide further handles to improve the discrimination of the y 2 b contribution. Finally, we consider figure 9, where we show the invariant-mass distributions of the two b jets, figure 9a, and their distance R, figure 9b, and the corresponding distributions for B  hadrons, figures 9c and 9d. Note that the M (bb) and ∆R(bb) distributions by construction require the presence of two b jets, while M (BB) and ∆R(BB) are shown with at least one b jet and no bb jet. Clearly, all of these observables, especially those related to B hadrons, could in principle provide information to discriminate between y 2 b and y 2 t contributions. However, in practice, their usefulness is limited, due two main reasons, both related to statistics. First, the two b-jet distributions require the presence of at least two b jets and the corresponding rate is significantly reduced (by roughly one order of magnitude) with respect to the ≥ 1b-jet one. Second, the bulk of the B-hadron distributions feature B hadrons which are quite soft, and therefore possibly not accessible in the measurements. We therefore conclude that, despite showing useful features, neither of these distributions can significantly help in obtaining additional sensitivity to the bottom-quark Yukawa coupling.

QCD corrections to g → bb splitting
Besides its phenomenological relevance for the extraction of the bottom-quark Yukawa or as a background to other Higgs production processes at the LHC, bbH production induced by the top-quark Yukawa coupling offers a clean and simple theoretical setting to study the dynamics of g → bb splitting in presence of a hard scale. Cases of interest at the LHC where such splitting plays an important role, and is in fact one of the main sources uncertainties, are ttbb [77,78] and bbZ [50,79], i.e., irreducible backgrounds for Higgs production in the ttH and ZH production modes, respectively. In these cases, however, the production of  a pair of b quarks proceeds through different mechanisms and it is difficult to study them independently. 6 In the previous sections, we have found that bbH production is dominated by the topquark Yukawa contribution. At the lowest order of the y 2 t contribution the Higgs boson recoils against the bb pair coming from a gluon splitting, either in the initial or in the final state. When the splitting occurs in the initial state, a gluon typically produces a b quark going forward and the other interacting at high Q 2 , while when the splitting happens in the final state, a gluon has already been scattered at high Q 2 . Therefore, asking a pair of b quarks at high p T mostly selects the mechanism of gluon splitting in the final state.
This LO picture is modified at NLO, in particular in presence of real radiation where additional configurations can appear: most importantly, the Higgs can recoil against a hard light parton, with the bb pair being soft/collinear. Such configurations can give a large contribution as they are possibly enhanced by soft and/or collinear logarithms. In fact, rather than treating them as NLO corrections to bbH, such contributions can be thought as higher-order corrections to Higgs + jet production. In this case, they can be described either with a parton-shower or by employing a gluon fragmentation function and its evolution. Both approaches resum (with different accuracy) large logarithms of the form log(p T /m b ), with p T being the transverse momentum of the light parton.
The y 2 t contribution to bbH production enables a direct assessment of the importance of these configurations. This can be done by studying very simple observables. To this aim, we consider the fraction of energy (or equivalently of transverse momentum) of the b jet which is carried by particles other than the b quark. At NLO accuracy, where only one extra light parton, dubbed g, can be emitted, this fraction can be defined for the i-th (b) jet as Jets featuring hard b quarks and soft gluon emissions are characterised by z g 1, while b jets with a hard light parton and soft b quarks yield z g 1. In the latter case, one would rather consider the b quark to be originated from the evolution of the light parton. The limiting cases are easily identified: z g = 0 means that the jets are constituted only by one or two b quarks, while z g = 1 corresponds to light jets where g is the only constituent.
In the following, we consider the momentum fraction carried by light partons in the hardest b jet, z g (b 1 ). We start by showing, in figure 10, the (normalised) z g (b 1 ) distribution for the y 2 b (red dotted) and y 2 t (blue dash-dotted) contributions to bbH production, as well as the complete bbH cross section (black solid). The different behaviour in the two cases is manifest: for the y 2 b contribution, z g (b 1 ) is monotonically decreasing, and configurations with a hard gluon inside the b jet are very suppressed. On the contrary, the y 2 t contribution shows a plateau in the range 0.6 < z g (b 1 ) < 0.95. The integrated cross section for z g (b 1 ) > 0.6 amounts to 1.8% of the ≥ 1 b-jet one.   Figure 11. The momentum fraction of the first b jet carried by light partons, z b1 g , for the y 2 t contribution to bbH production, for different b-jet acceptances (11a) and different Higgs-p T cuts (11b).
Next, we focus on the y 2 t contribution, and consider the z g (b 1 ) distribution with different acceptance cuts. In figure 11a, we show the jet acceptances considered in section 3.2 and two b jets while in figure 11b the Higgs transverse-momentum cuts are shown. From the first figure, we conclude that requiring a bb jet leads to a suppression of configurations with a hard light parton. This is reflected in the mild shape enhancement for z g (b 1 ) 1 when a second, separate b jet is required. In the second figure we appreciate how, at large Higgs transverse momenta, configurations with a hard light parton are more and more enhanced. When a minimum p T H cut of 100 GeV is required, z g (b 1 ) even starts to increase as z g (b 1 ) > 0.8. In this case, the fraction of cross section for z g (b 1 ) > 0.6 is 2.3%, and it reaches 3% for p T H > 150 GeV. Although the relative importance of such configurations with respect of the total rate is marginal, the enhancement is manifest.
Being aware of the limitations of a fixed-order approach in this context, our findings could motivate a more complete study to explore the possibility of enhancing the y 2 b contribution to bbH production over the y 2 t one by requiring, for instance, an energy threshold for the B hadron inside the hardest b jet, and/or vetoing double b-tagged jets. Alternatively, substructure techniques could be employed to reveal the internal details of jets and classify events more efficiently.

Summary and outlook
The precise determination of the Higgs-boson couplings will be one of the main goals of the LHC programme of the coming decades. In this work we have presented for the first time the computation of the contributions proportional to the top-quark Yukawa coupling to bbH production at NLO in QCD using the 4FS. Given that the exact NLO computation is beyond the reach of the current multi-loop technology, we have employed an effective field theory approach, where the top quark is integrated out and the Higgs boson couples directly to gluon fields via a dimension-five operator. In order to reach NLO accuracy in the HEFT, in addition to the usual tree-level, virtual and real terms, we have also determined for the first time the finite O(y t ) corrections to the bottom-quark Yukawa, by matching the HEFT to the full theory at two loops. We have argued that the HEFT approximation is suitable to describe the phase space region where the bulk of the bbH cross section resides, i.e., for the Higgs boson, up to p T H 200 GeV. The main result of our study is that the bbH final state is largely dominated by the topquark Yukawa contributions in all regions of the phase space. This is contrary to the common lore and intuition that the bbH final state gives direct access to y 2 b , as much as ttH gives access to y 2 t . The failure of such simple-minded approach is mostly due to the large hierarchy between y 2 t /y 2 b which makes up for the relative α 2 S and loop-squared suppression of the topinduced contribution. While being expected to some extent by existing LO computations of the y 2 t terms, such conclusion becomes glaring (and robust) at NLO due to the very large K-factor (almost 3) associated to the y 2 t contribution. Apart from the decreased sensitivity to the bottom-quark Yukawa coupling, this result also entails a larger cross section of associated bbH production and henceforth a better chance of measuring bbH at the LHC. We have then investigated in detail how the two main contributions, those proportional to y 2 b and y 2 t , can be disentangled by using suitable observables and selection cuts. By systematically studying kinematical distributions of the final states at NLO accuracy, we have identified two main handles: first, the largest relative contribution from y 2 b terms resides at small Higgs transverse momentum. Second, the y 2 t terms are strongly dominated by configurations with gluon splitting into a bb pair, appearing already at LO, often leading to a high-p T jet consisting of two bottom quarks. Our results indicate that the y 2 t contribution will always provide a significant fraction of the irreducible background to the y 2 b one, and therefore the y 2 t contribution should be included in global fits, such as those based on the rescaling of SM couplings (kappa framework) [80].
Finally, predictions for bbH total and differential rates at NLO in QCD have a wide range of phenomenological relevance and applications, which go beyond what was was discussed in this work and are worth exploring in the future. The first natural extension of our work will be to promote the NLO results to fully exclusive ones by matching to parton showers. While technically straightforward, this step entails understanding and controlling the interplay between fixed-order and resummed radiation from the bottom quarks as well as the gluon-splitting mechanism, a topic which has been the subject of several recent investigations [77][78][79]81]. As briefly explored in this work, the y 2 t contribution is very sensitive to g → bb splittings and therefore could provide an optimal testing bench for further studies. The second extension will be a careful re-assessment of bbH as potential background to other final states featuring the Higgs boson, in the SM measurements as well as in Beyond the SM (BSM) searches, mostly due to the large enhancement of the cross section at NLO. For instance, in the search for ttH with H → γγ, only very weak constraints on additional (b-)jet activity beyond the two photons (on the Higgs mass shell) are required, and bbH production can contribute to the signal region. Similarly, HH searches where at least one of the Higgs bosons decays into a bb pair will be affected by associated bbH production, as the bbH rate with m bb > 100 GeV is comparable to the HH cross section and it is dominated by y 2 t terms as soon as H has a moderate p T . Other channels, currently searched for in 3b-jet final states, such as associated heavy-quark and double-scalar production in extensions of the SM like the 2HDM, could also be affected by a sizable bbH background. and 694712 'pertQCD'. ND further acknowledges the hospitality of the CERN TH department while this work was carried out. FM has received fundings from the European Union's Horizon 2020 research and innovation programme as part of the Marie Sk lodowska-Curie Innovative Training Network MCnetITN3 (grant agreement no. 722104) and by the F.R.S.-FNRS under the 'Excellence of Science' EOS be.h project n. 30820817. The work of MW is supported by the ERC Consolidator Grant 614577 HICCUP. The work of MZ is supported by the Netherlands National Organisation for Scientific Research (NWO).

A Jet rates with R = 1
In this appendix we show results for the jet categories considered in section 3 using R = 1 as jet-radius parameter. The choice of such a value is motivated by boosted-Higgs searches. More in general, a larger jet-radius enhances the ≥ 1bb-jet category, thus making a more efficient reduction of the y 2 t contribution possible. Apart from the jet-radius parameter, we employ exactly the same setup as described in section 3.1.
−34% (0.8%) 2.13 · 10 −3 +1% −19% (0.5%) 1.0 Our results are shown in table 2. For convenience of the reader, we also quote in the upper part of the table the jet rates computed with R = 0.4 (taking the results directly from table 1), while the bottom part of the table displays results with R = 1. In order to keep the table minimal, we do not show the y b y t interference and the BI-HEFT contribution, nor the sum of all contributions to the total cross section. As displayed in table 1, the effect of including the full top-mass dependence in the jet rates is small, in particular on the acceptance.
The conclusions that can be drawn from the table are the following: • Choosing R = 1 has the effect of mildly increasing the ≥ 1b-jet category, with a slightly more pronounced effect for the y 2 t contribution than for the y 2 b one. This can be easily understood as R = 1 makes it possible to cluster the radiation more inclusively. Events with both b quarks slightly below the jet-p T threshold may more easily lead to a b jet with a larger jet radius. However, one should also keep in mind acceptance effects: if one of the b quark is outside the jet-acceptance rapidity window, the resulting jet may not fall inside the acceptance, and hence be discarded. The fact that the cross section is larger with R = 1 than with R = 0.4 hints that acceptance effects should be less important than the inclusive clustering of the radiation. However, the former effects explain why the exclusive one-b jet category (obtained by subtracting the ≥ 2b and ≥ 1bb from the ≥ 1b one) is larger for R = 0.4 than for R = 1, as it can be trivially computed from the numbers in the table.
• Concerning the ≥ 2b-jet category, the larger jet radius leads to a relative 30 − 40% reduction of the y 2 t rate. The y 2 b term rate is slightly reduced at LO while it is increased of about 10% (relative) at NLO. Again, two competing effects should be considered in order to explain the behaviour: on the one hand, a larger jet radius requires the b jets to be more separated, leading to a reduction of the ≥ 2b-jet rate; on the other, it leads to a more inclusive clustering of the QCD radiation, thus instead giving an increase of the rate. Given the tendency of the two b jets to lie closer in the y 2 t than in the y 2 b contribution to the cross section (see also figure 6b), the first effect will be more pronounced on the former contribution, and the second on the latter.
• Finally, the ≥ 1bb category is where the effect of using R = 1 is most pronounced. The relative increase of the y 2 b contribution is very large, more than a factor 6 (8) at LO (NLO). However, the absolute rate remains negligible to all practical purposes, with an acceptance below 1%. For the y 2 t contribution the relative increase is still large, about a factor 2, and the acceptance now exceeds 20%.
The above remarks support vetoing fat bb jets as a way to further reduce the effect of the y 2 t term and thus increase the sensitivity on the bottom-quark Yukawa. As we already mentioned in the conclusions of our work, the natural follow-up of these findings would be to perform a more detailed study based on fully exclusive final state, which includes matching with parton showers.

B The HEFT bottom-quark Yukawa at O(1/m t )
In this appendix, we derive the power-suppressed correction to the bottom-quark Yukawa coupling in the HEFT, required to compute the complete NLO coefficients ∆ ) in the heavy-top mass approximation. As mentioned in section 2.2, the matching of the HEFT to the SM requires not only the introduction of an effective point-like coupling between the Higgs boson and gluons, but also corrections, suppressed by inverse powers of the top-quark mass, to renormalisable couplings. In particular, as we shall see in the following, the bottomquark Yukawa coupling is modified at O(1/m t ) by a term scaling like y t α 2 s m b /m t . We obtain this correction by evaluating a form-factor contribution to the amplitude of the H → bb decay at O y t α 2 s , first in the HEFT and then as an expansion in inverse powers of the top-quark mass in the SM, and by matching the two results.
The amplitude for the H → bb decay can be expressed as follows: whereū and v are the spinors associated to the bottom and the anti-bottom quarks, respectively, σ and σ are their spin indices, h and h are their helicities, and i and i are color indices. We consider the following form factor where p b and pb denote the bottom and anti-bottom momenta, respectively, and C A is the SU (N ) adjoint Casimir (C A = N ).
In the HEFT, two types of diagrams contribute to M at O α 2 s y t /m t , the contribution to the tree-level diagram from the bottom-quark Yukawa correction δy b and the one-loop diagram with an effective Higgs-gluon coupling, shown in figure 12. The one loop diagram is UV-divergent and is rendered finite by the renormalisation of the bottom-quark Yukawa. In the MS scheme, we find that we need to define the following counterterm for the HEFT bottom-quark Yukawa: where C F is the fundamental representation Casimir of SU (N ), is usual dimensional regularisation parameter in d = 4 − 2 spacetime dimensions. We can therefore write the following expression for the bare HEFT bottom-quark Yukawa where ∆ F is the finite contribution to y HEFT b that we seek to obtain. We find the following expression for the O(y t α 2 s ) part of the renormalised form factor in the Euclidian region (m 2 H > Figure 13. Diagrams contributing to M at O y t α 2 s in the SM.

0):
A HEFT . The form factor is expressed in terms of multiple polylogarithms G [82,83] defined iteratively by G(a 1 , . . . a n ; x) = To derive ∆ F , we need to compute the heavy-top quark limit (the leading term in 1/m t ) of the SM expression for M. At the perturbative order of interest two two-loop diagrams contribute, shown in figure 13.
The SM form factor can be evaluated using modern multi-loop calculation techniques. We generate FORM [84] expressions for these form factors using our own QGRAF [85] interface. After the spinor and tensor algebra is performed with FORM, we obtain a scalar expression in terms of kinematic invariants involving the external momenta and the loop momenta. We define the following family of integrals: where the denominators are defined as (B.10) The form factor is expressed as a linear combination of integrals in this family using Mathematica. We reduce the set of integrals required for the expression of the amplitude to a basis of master integrals using LiteRed [86]. We obtain a set of 23 master integrals shown in eq. (B.11), which we need expand in 1/m t :  1, 1, 1, 1, 1, 0) .
These master integrals can be expressed in the parametric Feynman representation and their expansion for a heavy top-quark mass 7 is done using the Mathematica package ASY [87] to perform an expansion by region [88]. The expansion significantly simplifies the integrals and most can readily be identified with combinations of Euler Γ, B integrals and logarithms, with the exception of J(1, 1, 1, 1, 1, 0, 0) and J(1, 1, 1, 1, 1, 1, 0). These two integrals can be expressed in terms of multiple polylogarithms and can be straightforwardly evaluated using the algorithm described in Appendix C of [89], whose application is significantly simplified by the package PolylogTools [90] which implements the reduction of multiple polylogarithms to the canonical form defined in [89] and many useful automated tools to integrate multiple polylogarithms in canonical form. The expression for all master integrals in the Euclidean region (m 2 H < 0) can be found in appendix C. These integrals have been checked by comparing their evaluation with GINAC [91] to the numerical integration provided by FIESTA [92] in the Euclidean region, showing excellent agreement.
By insertion of the expanded master integrals into the form factor and expansion of their coefficients for large top-quark masses, we obtain the expression for the SM version of M at order 1/m t . 8 The form factor is UV-finite in the SM, we therefore do not need to consider renormalisation effects and can take all parameters to be MS-renormalised and evaluated at a scale µ R . We ultimately obtain the following expression for the form factor in the SM: In the last step of our calculation we match the form factor in the two theories. We find that the polylogarithmic dependence of the two expressions is exactly the same, leaving us 7 This is technically achieved by rescaling the invariants with a spurious parameter ρ as m b → ρ m b and mH → ρ mH and taking the limit ρ → 0. 8 We verified that the master integrals were all expanded to a sufficient order in 1/mt by adding dummy higher order terms to their expressions and checking that they vanish in the 1/mt term of the form factor.
with the condition for the two renormalised expressions to be equal: The fact that all of the non-trivial transcendental functions appearing in each form factor drop out of their difference provides a strong check of the calculation.