N$^{3}$LO predictions for the decay of the Higgs boson to bottom quarks

We present a fully-differential calculation of the $H\rightarrow b\overline{b}$ decay at next-to-next- to-next-to-leading order (N$^3$LO) accuracy. Our calculation considers diagrams in which the Higgs boson couples directly to the bottom quarks, i.e. the perturbative order we consider is $\mathcal{O}(\alpha_s^3y_b^2)$. In order to regulate the infrared divergences present at this order we use the Projection-to-Born technique coupled with N-jettiness slicing. After validating our methodology at next-to-next-to-leading order (NNLO) we present exclusive jet rates and differential distributions for jet observables at N3LO accuracy using the Durham jet algorithm in the Higgs rest frame.


Introduction
The discovery of a Higgs boson [1,2] at CERN's Large Hadron Collider (LHC) represents the most significant result in high energy physics in recent history. Over the next couple of decades continued measurements of the properties of the Higgs will result in increasinglystringent tests of the predictions from the Standard Model (SM). These studies will continue to take place at the LHC (including the future high-luminosity upgrade) and putative future colliders, which are currently in the early design phases [3][4][5]. From a Higgs precision viewpoint, one strongly-motivated future accelerator is a lepton collider, capable of producing a large data set with small experimental uncertainties and thus allowing precision studies of the Higgs boson akin to what was successfully performed at LEP for the Z boson. In order to achieve these goals, it is vital for the theoretical community to provide precise predictions for Higgs-related observables with accuracies at the few-percent to per-mille level.
For the 125-GeV Higgs boson the predominant decay mode is to a pair of bottom quarks (bb), whose partial width accounts for around 60% of the total. An accurate measurement of H → bb is therefore crucial, since the Higgs-bottom Yukawa coupling (y b ) enters every LHC Higgs measurement through the total width. In a hadronic environment the measurement of H → bb is particularly challenging due to the presence of large QCD backgrounds. In order to overcome these obstacles, experimental analyses typically focus on associated (V H) production modes, which have more manageable backgrounds [6,7]. However, using jet-substructure techniques it is also possible to access H → bb through the gluon-fusion production mode (at high transverse momenta) [8].
Given its importance to Higgs physics, the H → bb decay has been studied in the literature for many years [9][10][11][12][13][14][15]. Currently, higher-order corrections from QCD are known up to N 4 LO (i.e. up to order O(α 4 s )) [16]. Additionally, the electroweak (EW) corrections have been known for some time [17,18], as well as the mixed QCD×EW corrections (O(αα s )) [18,19] 1 . It is thus fair to say that the theoretical knowledge of the inclusive partial decay width for H → bb is at an advanced level, with accuracies in the desired per-mille range. In order to study the Higgs in a collider setting it is also desirable to have theoreticallyprecise differential predictions, which allow for the application of experimental phase-space cuts for arbitrary infrared-safe observables. In this case our knowledge is not as advanced as at the inclusive level. Fully-differential predictions at NNLO in QCD were computed several years ago [21][22][23], while more recent studies [24,25] have focused on interfacing the decay at this order to V H production, which is also known at NNLO in QCD [26][27][28]. The principal aim of this paper is to extend the knowledge of the H → bb decay differentially to N 3 LO accuracy.
Significant progress has been made over the past five years in regards to the computation of differential predictions at NNLO accuracy in QCD. For most 2 → 2 LHC processes NNLO predictions have been computed, and currently the frontier lies in the computation of the challenging 2 → 3 two-loop corrections. A crucial aspect of this advancement has come from an increased ability to deal with the infrared (IR) divergences which affect the component parts of a NNLO calculation (but cancel upon summation in an IR-safe observable). A novel way of dealing with IR divergences at NNLO was presented in Ref. [29] and is now known as the Projection-to-Born (P2B) method. This method, initially applied to vector boson fusion (VBF), uses the knowledge of the inclusive cross section of the process under consideration and of the exclusive cross section of the process with one extra final-state jet to construct local counter-terms for the matrix elements, projected onto a LO phase space. At NNLO this method has since been applied to VBF production of two Higgs bosons [30].
An alternate approach to pursuing NNLO calculations is to utilize physical observables and factorization theorems to construct non-local counter-terms. One such approach, known as N -jettiness slicing [31,32], uses the N -jettiness [33] variable together with a factorization theorem derived from Soft Collinear Effective Field Theory (SCET) [34][35][36][37] to perform NNLO calculations.
Compared to NNLO, very few processes are known differentially at N 3 LO accuracy, although significant progress has been made over the last year. One of the flagship LHC processes, Higgs production, has recently been computed differentially at this order [38] (using a non-local q T -based subtraction method [39]) and analytic results for the pseudorapidity distribution have also been computed [40,41]. These results are built upon our knowledge of the inclusive Higgs-production cross section at this order [42,43]. The P2B method has also been deployed at N 3 LO, specifically for jet production in deep inelastic scattering [44,45] and, for certain differential distributions, VBF and VBF di-Higgs [46,47].
The aim of this paper is to provide, for the first time, fully-differential predictions for the H → bb decay at N 3 LO accuracy. Herein we focus on the contributions with the most challenging infrared structure, namely those that are proportional to y 2 b . We will deploy the P2B method mentioned above and present a first application of this method in conjunction with a non-local subtraction mechanism (N -jettiness slicing in our case) at both NNLO and N 3 LO. Our paper is constructed as follows. In Section 2 we present a discussion of the general framework for our calculation. We detail the P2B+SCET method in Section 3 and first validate our results using the H → bb process at NNLO . We use our calculation to make predictions for a variety of physical observables at N 3 LO accuracy in Section 4 and draw our conclusions in Section 5.

Overview of the calculation
A general overview of our theoretical setup is included in our companion paper on the calculation of H → bbj at NNLO accuracy [48]. Here we provide a short summary for completeness. Representative Feynman diagrams included in our calculation of H → bb at N 3 LO are shown in Fig. 1. At this order there are four phase-space configurations that contribute. The two-body phase space includes terms of up to three loops (which have been computed in Ref. [49]), while the remaining phase spaces correspond to those with three or more partons in the final state and are the component pieces needed for the calculation of H → bbj at NNLO. In our calculation we will set the b-quark mass to zero kinematically, but retain it in the Yukawa coupling. A comparison of the radiative corrections at NLO with or without the b-mass phase-space effects was first performed nearly forty years ago [9]. It was shown that the sizable differences between the full and "massless" theories arising from the b-mass terms can be compensated by running the b-mass to the Higgs scale (and thus recapturing some of the missing logarithms of the form log (m 2 b /m 2 H )). Dropping the b-quark mass kinematically results in dramatic simplifications in the calculation of the inclusive partial width, which in the case of H → bb is known up to O(α 4 s ) in the massless theory.
In this work our primary interest lies in computing the H → bb process differentially at N 3 LO. At this order, the partial width can be written as follows: where we have explicitly expanded in terms of both α s and the Yukawa couplings to the bottom and top quark y b and y t respectively. The dependence on the top-quark mass first comes in at NNLO and corresponds to diagrams in which the Higgs boson couples to a closed loop of top quarks. These diagrams can then interfere with the LO diagram to create a mixed y b y t term at O(α 2 s ). In our theoretical framework this interference is exactly zero due to the requirement of a helicity flip between the massless bottom quarks (since the bottom quarks couple to a spin-1 gluon in the y t term and to the scalar Higgs in the y b term). Such an interference term mandates a mass inclusion kinematically to be non-vanishing and is therefore not present in our calculation. In other words, the interference terms are suppressed by a power of m b /m H . However, since the ratio y t /y b is large, this mixed y b y t term is phenomenologically relevant. It is IR finite, and a commonly-used approximation is to integrate out the top-quark loop and thus work in an effective theory in which there is a clear hierarchy of scales m b << m H << m t [12,14]. In this approximation the mixed term accounts for around 30% of the NNLO correction. Given that m H is not dramatically lighter than m t , one may also worry about missing terms that are formally of order (m H /m t ) 4 and could therefore result in a significant correction. Such a study was recently undertaken [50] keeping the exact dependence on m b , m t , and m H , and found that the difference with respect to the exact form of the NNLO partial width are indeed small and can be neglected at the inclusive and differential level to good accuracy. At O(α 3 s ) a second class of diagrams enters. This contribution corresponds to diagrams in which the Higgs does not couple to the final-state b quarks at all, but instead is proportional to the closed loop squared, thus creating a term proportional to y 2 t at this order in Eq. (2.1). Additionally, the interference term which arose at NNLO now receives corrections and develops a more intricate IR structure. The y 2 t term has particularly troublesome IR behavior since it does not factor onto the tree-level H → bb , but instead factors onto H → gg. For this term there is also no helicity suppression and therefore this contribution is large and relevant for phenomenology. The Higgs coupling to partons through a top-quark loop, integrated out via an EFT approach, has been well studied in the literature [51][52][53] and is not the principal aim of this paper (where we focus on the y 2 b term which has a more complicated IR structure at N 3 LO). However, we note that these terms should be included before a full phenomenological study at N 3 LO can be completed. We leave this work to a future study, stressing that the terms that we neglect are at most NLO (for y b y t ) and therefore readily amenable using existing tools to implementation in a future Monte Carlo generator.

Regulation of infrared divergences at N LO
In this section we discuss the methods we utilize to regulate the IR singularities present in our N 3 LO calculation. We primarily focus on the P2B method, since the N -jettiness slicing method is discussed in more detail in our companion paper [48]. Firstly, we recap the inclusive partial width, which is a prerequisite for the P2B method we use here.

The inclusive partial width
An ingredient for our calculation is the inclusive decay width for H → bb at N 3 LO . This was originally computed over two decades ago [15] and is now known up to N 4 LO accuracy [16]. At O(α 3 s ) the inclusive partial width Γ H→bb can be written as follows The LO partial width is defined as with y b ≡ y b (µ) the bottom Yukawa coupling at the renormalization scale µ, m H the Higgs mass, and N c the number of colors, while the corrections at each order can be written as H→bb up to n = 3 are: where L = log (µ 2 /m 2 H ) and the explicit expressions for s i , β i and γ i m are presented in Appendix A. For reference, at µ = m H the inclusive partial width numerically evaluates to

Γ N3LO
H→bb (µ = m H ) = Γ LO H→bb 1 + 5.66667 Finally, we will employ the following definition of the N 3 LO coefficient for the inclusive width, which reinstates the dependence on the LO phase space (evaluated in d = 4 dimensions):

Projection to Born at N 3 LO
The H → bb differential decay width at N 3 LO is constructed as follows where dΓ V V V H→bb represents the triple-virtual contribution to the decay width, dΓ RV V H→bb the real double-virtual contribution, dΓ RRV H→bb the double-real virtual contribution, and dΓ RRR H→bb the triple-real contribution. Each parton-level contribution belongs to a different phase space Φ i (with i = 2, . . . , 5 respectively) over which it is integrated. The measurement function F m i (Φ i ) uses an IR-safe jet algorithm to cluster the i final-state partons onto m final-state jets and thus defines the observable O m . The triple-virtual contribution contains explicit poles in the dimensional regularization parameter = (4 − d)/2 (with d the number of space-time dimensions), whereas the triple-real term contains only implicit poles that become manifest as at least one and at most three particles become unresolved. The RVV and RRV contributions consist of mixtures of explicit poles and implicit phasespace singularities. The triple-virtual piece can be obtained from the results presented in Ref. [49], and real double-virtual in Refs. [48,54], while the calculation of H → bbj at NNLO accuracy is discussed in our companion paper [48]. This means that all the individual terms in Eq. (3.10) are known, but need IR regulation to be combined in a physically-meaningful way.
We define the Born-projected inclusive partial width as follows, where Φ B = Φ 2 corresponds to the LO phase space and O B m represents the observable O m evaluated for LO kinematics. We note the insertion of the two-body measurement function F m 2 (Φ B ) into the integrand in relation to Eq. (3.9). Expanding out the various component pieces of the Born-projected inclusive width yields the following The fully-differential N 3 LO coefficient can then be written as and Eq. (3.13) represents the master equation for the Projection-to-Born technique [29,44] and is equivalent to Eq. (3.10) by explicitly substituting Eqs. (3.12), (3.14), and (3.15). It can finally be rearranged as follows Inspection of the above formula reveals that the P2B subtraction regularizes singularities which cancel when an implicit pole turns to an explicit one via phase-space integration, i.e. this subtraction accounts for the "last emission". Based on the above equation, the full N 3 LO H → bb coefficient can be readily computed provided that the NNLO H → bbj differential partial width is available in a suitable format. More specifically, since the P2B method above regulates the singularities associated with the last emission, all the other IR divergences present in the last three lines of Eq. (3.16) (namely in the construction of the differential cross section of the process with one extra final-state jet) have to be previously regulated and canceled by means of a different subtraction scheme. Thus far, applications of the P2B method have utilized Catani-Seymour dipoles [55] (for applications at NNLO) and antenna subtraction [56] (for applications at N 3 LO) for this purpose. Both these regulators are clearly a good fit for the method, since neither explicitly requires a jet in the construction of the local counter-terms. Thus far no method that employs a jet-based physical observable to regulate divergences at NNLO has been applied to P2B. We address this in the subsequent section.

P2B with N -jettiness slicing
At first inspection the application of Eq. (3.16) with N -jettiness slicing seems problematic, since the application of N -jettiness slicing requires the definition of a jet observable (in this case 3-jettiness) in order to operate. Here we address this issue, starting with a brief summary of the method which is by now well established for NNLO calculations. The central idea of any slicing-based method is to consider an observable which allows one to separate the computation into two parts. At NNLO, the first part will contain all of the doubly-unresolved regions of the phase space and will be computed using a simplifying approximation (typically a factorization theorem). The second region will capture all of the singly-unresolved and fully-resolved regions of phase space and thus corresponds to a NLO calculation with one additional parton in the final state. In N -jettiness slicing, the separating variable is the N -jettiness variable τ N [33]. For an n-parton event it is defined as where p j represent the momenta of the n partons, while q i represent the momenta of the N most energetic jets clustered with any IR-safe jet algorithm (in our case the Durham jet algorithm [57,58]). Q i are the hard scales in the process, which we take as Q i = 2E i with E i the energy of the i-th jet. In order to separate the phase space into two regions, we introduce a variable τ cut N . In the region τ N > τ cut N at least one of the n partons is resolved (so that the term 2q i · p j in Eq. (3.17) is non-vanishing). The NNLO decay width for a generic H → N j process can be then computed in this region as the NLO calculation of the H → (N + 1)j process. On the other hand, in the region τ N < τ cut N no parton is resolved and the NNLO decay width can be approximated with the following convolution, derived from SCET [33,59]: In the above equation the terms J i represent the jet functions [60,61], S denotes the soft function for N colored partons, and H is the process-specific hard function. In our application of N -jettiness slicing we consider N = 3 and therefore we need the NNLO 1jettiness soft function with arbitrary kinematics [62] 2 and the hard function computed in our companion paper [48]. We also note that Eq. (3.18) is accurate up to terms of O(τ cut N ), which formally vanish in the limit τ cut N → 0. One should therefore set τ cut N as small as possible to ensure the validity of the factorization formula.
In order to apply N -jettiness slicing in conjunction with Eq. (3.16), let us consider the types of partonic configurations that can occur in our calculation. As an example, let us focus on the five-parton phase space (the triple-real contribution in Eq. (3.16)). In the Higgs rest frame, after jet clustering each phase-space event will belong to one of four possible topologies: a two-, three-, four-, or five-jet topology. We assume now that we are calculating an observable that requires the complete N 3 LO technology and thus we fix the measurement function to demand exactly m = 2 jets (any observable with three or more jets requires at most a NNLO calculation). In the triple-real contribution to Eq. (3.16) there are two measurement functions: F 2 5 (Φ 5 ) and F 2 2 (Φ B ). The latter will always produce two jets (in the rest frame) since it acts on the LO phase space Φ B . It is therefore unaffected by the number of jets obtained upon clustering of the five-parton phase space (assuming for now that no p T or rapidity cuts are applied to the LO phase space). On the other hand, F 2 5 (Φ 5 ) will pick out the various jet topologies given an input jet algorithm, in this case vetoing any event with more than two jets (since we fixed m = 2). This means that upon generation of a phase-space event there are two possibilities: a) the five-parton event corresponds to a ≥ 3-jet topology, is vetoed by F 2 5 (Φ 5 ) and therefore only the P2B subtraction term is non-zero, or b) the parton-level event produces two jets. In the latter case both terms in the last line of Eq. (3.16) survive, producing events with exactly-opposite weights, with the measurement functions applied on different phase spaces (which match in the triple-unresolved limit producing the desired subtraction).
For events belonging to category a) it is straightforward to compute the 3-jettiness Figure 2. The dependence of the H → bb NNLO coefficient for the two-jet partial width on the N −jettiness slicing parameter τ cut 2 . The physical jet cut is set to y cut = 0.1. The coefficient is normalized to the prediction obtained from the difference of the inclusive result and the NLO (inclusive) three-jet rate.
variable τ 3 and apply the cut τ cut 3 since there are (at least) three jets in the event (this is indeed simply a rephrasing of the existing NNLO methodology). Attention must be given to category b) two-jet events for which it is in principle unclear how a 3-jettiness cut can be constructed. In other words, in this case we must extract a three-jet observable from events with a two-jet topology. In order to achieve this, we first decluster the jets (in a similar spirit to the ideas behind jet-substructure techniques). Specifically, we reverse the last stage of the clustering algorithm, resulting in exactly three sub-jets. We then apply "N -subjettiness" slicing, taking the momenta of the three sub-jets as the momenta q i in Eq. (3.17). Crucial to the success of this approach is the lack of explicit dependence on the jet algorithm in the factorization formula of Eq. (3.18). Furthermore, since events in category b) have zero weight as explained above, the total two-jet rate at N 3 LO inherits the overall τ cut N -dependence of the parent NNLO calculation. In this regard, we do not expect significant worsening of the power corrections when applied to our N 3 LO calculation relative to our NNLO application. We investigate this behavior more carefully in the next section. Finally, the same line of reasoning can be applied to the double-real virtual and double-virtual real contributions.
We conclude this section by defining the Born phase-space events that enter the P2B subtraction terms. For each event we simply define the following Born phase-space point: where n j is the three-dimensional unit vector pointing in the direction of the leading jet (defined as the jet with the largest energy component).

Validation at NNLO
In order to validate our implementation of the P2B method at NNLO we have implemented an independent calculation at this order using the N -jettiness slicing approach. As discussed in previous sections, this method uses the predictions of SCET to establish a factorization theorem which can be used at small values of the physical N -jettiness observable τ N (which Figure 3. The dependence of the differential distribution for the maximum jet energy in the NNLO two-jet rate on the N −jettiness slicing parameter τ cut 2 . The physical jet cut is set to y cut = 0.1. in this instance corresponds to a 2-jettiness cut, τ 2 ). One therefore must ensure that the τ cut 2 variable is taken to small enough values that the missing power corrections in Eq. (3.18) are negligible. Our parameter choices are as follows. We take the mass of the Higgs boson to be m H = 125 GeV. As input we take the mass of the b-quark to be m b = 4.7 GeV, which enters into the Yukawa coupling y b (and is set to zero kinematically). In order to compensate for higher-order effects arising from the b-quark mass we run the mass to the Higgs scale. At NNLO we use the three-loop running, resulting in an effective b-quark mass of m b (m H ) = 2.94 GeV. Our remaining electroweak inputs are G F = 0.116639 × 10 −4 GeV −2 and m W = 80.385 GeV. We take α s (m Z ) = 0.118 and we evolve the coupling using three-loop running. For our subsequent predictions at N 3 LO we keep the three-loop running of α s and m b for simplicity (the difference between three-loop and four-loop running is very small [64]). All of the results for partial widths in this paper are in units of MeV. Our results presented herein have been produced using a fully-flexible Monte Carlo code, for which we have extensively used the existing structure of MCFM 8.0 where applicable (specifically for phase-space generation, Catani-Seymour dipoles [65], N -jettiness slicing [66], and OMP and MPI compatibility [67]). Our subsequent extended Monte Carlo is thus in a suitable format to be interfaced with MCFM and be released publicly in the future.
As a first check on the correctness of our results we compute the NNLO coefficient for the two-jet rate for jets clustered with the Durham algorithm [57,58] with y cut = 0.1. This algorithm starts from a parton-level phase-space point and computes the following quantity y ij for all pairs of objects i and j: where E i is the energy of particle i, θ ij is the angle between particles i and j, and Q is the hard scale of the process, which in our case is Q = m H . If y ij < y cut , the two objects are combined into a new one with four momentum p µ i + p µ j . The procedure is then iterated until no more clustering is possible and the final objects are classified as jets. In addition to the independence on the slicing parameter, a further check of our implementation of  Figure 4. Comparison of three different methodologies for computing the differential NNLO partial width. Shown are results obtained using Projection-to-Born with Catani-Seymour dipoles (P2B+CS), Projection-to-Born with N -jettiness slicing (P2B+SCET), and N -jettiness slicing (SCET). Results are normalized to those obtained using P2B+CS. The left-hand plot shows the pseudo-rapidity, while the right-hand plot shows the transverse momentum of the jet.
the N -jettiness slicing calculation of the NNLO two-jet rate can be constructed by taking the difference between the NNLO total inclusive rate and the inclusive three-jet rate at NLO. We compare this prediction to our results obtained with N -jettiness slicing in Fig. 2 observing excellent agreement in the asymptotic region τ cut 2 < 0.1 GeV. In order to ensure that the dependence on τ cut 2 in the differential distributions is also small we present the differential ratio for two different choices of τ cut 2 for the E max /m H observable in Fig. 3. Again, we observe excellent agreement for different choices of τ cut 2 . We use the prediction with τ cut 2 = 0.05 GeV for our subsequent comparisons with the P2B method.
We now compare the predictions from N -jettiness slicing to our implementation of P2B at NNLO. We have implemented the P2B method at NNLO using two different subtraction methods for the NLO part of the calculation: one with Catani-Seymour dipoles, and a second one using N -(sub)jettiness slicing. In the Higgs rest frame the most physicallyrelevant observables are delta functions at LO (for example the jet energy or the jet mass). In general, there is no special direction in momentum space with which to construct more elaborate observables. In order to fully test the cancellation of IR singularities it is most useful to construct an observable which has a non-trivial distribution at LO. In this paper we therefore introduce the following two quantities: the transverse momentum of the leading jet (the jet with highest energy) p max T,j and the pseudo-rapidity of the jet |η max j |. These two jet observables are measured with respect to the "z"-axis which we take to be a fictitious beam axis (i.e. we imagine that the Higgs was formed in a µ + µ − collision with an operating energy √ s = m H ).
The calculation of these observables at NNLO is presented in Fig. 4. We set µ = m H for these predictions and maintain the same parameter choices as before. We choose a value of τ cut 2 = 0.05 GeV for both of the calculations which require N -jettiness slicing. We observe excellent agreement within the sub-percentage Monte Carlo uncertainties for all three predictions. Our proposed method of P2B+N -jettiness slicing is thus validated at NNLO and we proceed to use this method to obtain results at N 3 LO accuracy in the next section. Figure 5. Jet fractions at orders α s , α 2 s , and α 3 s . Each prediction is normalized to the total partial width at that order.

Results
The results presented in this section are obtained using the same parameter choices as discussed in Section 3. We begin by computing jet rates at O(α 3 s ). At this order, possible topologies consist of two-, three-, four-, or five-jet events, which are accurate respectively to N 3 LO, NNLO, NLO, and LO in perturbation theory. Since the inclusive partial width is known at N 3 LO, the two-jet rate can be inferred directly from the knowledge of the other components at their respective orders. Therefore, we can use the NNLO three-jet results taken from our companion paper [48], compute the exclusive NLO four-jet and LO five-jet rates as a function of the y cut parameter, and obtain the two-jet rate at N 3 LO.
Our results are presented in Fig. 5, where we present the fractional jet rate at different orders in O(α s ), each prediction being normalized to the total partial width at that order. As it may be expected, the characteristics are broadly the same as similar calculations for e + e − → Z → jets computed at the same order [68,69]. For Z → jets, copious data from LEP is available for a comparison between theory and data. A future lepton collider should therefore be able to make the same sort of plot and compare to our predictions here. Expecting similarities with the Z data, as the order in perturbation theory increases the agreement with data for the jet rate is expected to improve. At smaller y cut the two-jet rate turns negative at each order in perturbation theory (beyond LO). However, for O(α 3 s ) the fractional rate is very small and negative for the smallest values of y cut considered here. Specifically, at y cut = 10 −4 the two-jet fractional rate at NNLO is −24%, whereas at N 3 LO the rate is only −4%. One may therefore optimistically hope that at N 4 LO the two-jet rate will remain physical to even very small values of the jet-clustering parameter. The change in slope for small values of the jet-clustering parameter is clearly visible when comparing the NNLO plot (middle plot, red line) to the N 3 LO one (right-hand plot, purple line). For the remainder of this section we will turn our attention to N 3 LO predictions which cannot simply be inferred from the NNLO three-jet inclusive rate. We will focus on the choice y cut ∼ 0.1, since a) this is the value for which perturbation theory should do a good job at describing collider data, and b) this value corresponds to jets that are somewhat similar to LHC anti-k T jets (assuming transverse momentum scaling of the form p T ∼ y cut m 2 H ). Before proceeding further we first quantify the residual dependence of our N 3 LO predictions on the 3-(sub)jettiness slicing parameter τ cut 3 . We present the τ cut 3 -dependence of the N 3 LO coefficient for y cut = 0.1 in Fig. 6. We have normalized the coefficient to the total inclusive correction ∆Γ N3LO H→bb at this order. To illustrate the size of the power corrections we additionally show the function −2.35 − 0.00289 τ cut 3 ln 3 (τ cut 3 /m H ) in the plot. We observe that the τ cut 3 -dependence for this jet clustering is not dramatic, only changing 10% over the range [0.02 − 0.3] GeV. The dependence between τ cut 3 ∼ 0.02 − 0.05 GeV is around one percent. Our differential predictions obtained at this order have MC uncertainties around a few percent (on the N 3 LO coefficient) and therefore our results are insensitive to τ cut 3 when τ cut 3 ≤ 0.03 GeV. We predominately use τ cut 3 = 0.02 GeV for the subsequent differential predictions in this section (supplemented by additional runs with τ cut 3 = 0.03 GeV to improve MC uncertainties in some distributions) . The two-jet rate is around a factor of −2 times the inclusive correction at this order, illustrating that there is a large cancellation at this order across jet bins and reminding us that, when exclusive jet quantities are considered, the smallness of an inclusive correction does not necessarily transfer to all distributions and all regions of phase space.
Our final state consists of two jets clustered with the Durham jet algorithm. We distinguish the two jets based upon which has the largest energy component (and refer to them as the max and min jets hereafter). As discussed previously, the dynamics of the rest-frame observables is somewhat limited, since physically-relevant distributions such as the energy of the jet and the mass of the jet are delta functions at LO. Therefore, higher-order corrections factorize onto corrections to LO observables O LO which contain contributions from every phase-space region and to observables O = O LO which contain (at most) corrections from one order lower and lack of the two-body phase space. This restricts the ability to study the delicate cancellations that must occur at N 3 LO. To overcome this, we reintroduce the fictitious collision axis of Section 3, and assume that the z-direction is special and corresponds to a beam axis. We then measure the transverse momentum p T and pseudo-rapidity η with respect to this axis. This defines non-trivial observables at LO, allowing us to test our predictions more stringently. These predictions also confirm that we can compute jet observables relevant for LHC physics (i.e. if desired we could impose phase-space cuts on these observables).
Our results for |η max j | and p max T,j /m H are shown in Fig. 7. We present the NLO, NNLO, and N 3 LO predictions (suppressing LO for clarity). In each case the upper panel presents the differential distribution, while the middle panel illustrates the ratio to the NLO prediction and the lower panel the ratio to the NNLO prediction. Since a scalar particle at rest decays isotropically, the rapidity distribution is sculpted only by the phase-space integration of the final-state jets. For this reason the higher-order corrections are flat and do not noticeably alter the shape of the distribution. As the order in perturbation theory increases, the scale variation drops considerably (we vary the scale between m H /2 ≤ µ ≤ 2m H ). This observable inherits the scale variation from the total jet rate and is similar to the scale variation presented in Appendix A for the total width. At NLO the scale variation is around {+3.5, −5}% across the entire distribution. For NNLO and N 3 LO the rate obtained with the scale choice µ = m H is close to the maximum rate (again as in the inclusive rate in Appendix A), and as such the scale variation band is set by µ = m H and µ = m H /2. At NNLO the variation is around −1.2% and at N 3 LO it drops by a factor of two to around −0.7%. The p T distribution is more dynamic, especially in the region p T ∼ m H /2. Here the kinematics of the region is sensitive to the emission of additional soft radiation and thus experiences sizable corrections in the perturbative expansion. At NLO for p T ∼ m H /2 an artificial cancellation of the scale dependence occurs, resulting in essentially no scale dependence in this bin at this order. As the order increases to NNLO and N 3 LO the corrections are around −10% and −15% compared to NLO. Across the remaining phase space the corrections are positive and between 5% in the softest bin increasing to around 15% in the penultimate bin. Comparing N 3 LO to NNLO in the lower panel we see that the N 3 LO corrections reside at the very edge of the scale variation band at NNLO, which corresponds to around a 2% to 5% correction to the NNLO rate in the bulk region and −8% correction in the p T ∼ m H /2 bin. This bin has the largest scale variation at N 3 LO corresponding to around ±4%. Away from this bin the scale variation at N 3 LO is much smaller, around 1%.
We now turn our attention to the more physically-relevant observables that do not require the introduction of an arbitrary reference direction, namely the energy and invariant mass of the maximum-energy jet. Our results for the (m H -rescaled) energy distribution are presented in Fig. 8. This observable can broadly be classified into three regions: the δcomponent defined by the LO phase space at E max j = m H /2, the "bulk" region defined by 0.5 < E max j /m H < 0.6, and the "tail" defined by E max j > 0.6 m H . We discuss the δcomponent first, which corresponds to the first bin of our histogram. As can be seen from the middle and lower panels, there is a large (negative) correction in going from NLO to NNLO (∼ −30%), while the correction in going from NNLO to N 3 LO is much smaller (around −2%), indicating a good convergence of the perturbation series here. The major change in this region at N 3 LO is the dramatic reduction in scale variation compared to NNLO, which has gone from ±15% to +3%. In the bulk region the observable is one order lower in the perturbation theory, i.e. NLO behaves like LO etc. In our case the N 3 LO correction acts like a NNLO calculation, with the scale variation growing as a function of E max j from a few percent at the softer end to around 10 − 15% at the more energetic range of the region. The tail region corresponds to a region of phase space which is inaccessible to two-and three-parton phase-space configurations. Therefore in this region the observable behaves like a calculation two orders lower in perturbation theory. As such, the NNLO calculation becomes LO-like (the scale variation in the tail at NNLO is flat since we are merely comparing the overall factor Since the observable is "LO", we see large corrections > 2 and large scale dependence in going from NNLO to N 3 LO. We note that there exists a "super-tail" region not shown in the figure in which E max j > 0.65 m H . In this region only the five-parton phase space contributes and therefore the N 3 LO prediction behaves like a LO prediction. We present the invariant mass of the jet (with the largest energy) m j max , divided by the Higgs mass, in Fig. 9. At LO all jets are made of single partons and therefore have zero mass 3 . The region near the LO boundary is highly sensitive to soft and collinear radiation, and this observable should be resummed (for instance in a parton-shower prescription) to fully capture the physics. In this region of phase space one demands that the most energetic jet be almost massless, which pushes the calculation into the region of phase space in which the two jets are almost-massless partons scattering back to back. In order to obtain a physically-sensible prediction at fixed order one must ensure that the bin near m j = 0 is inclusive enough to carry out an adequate cancellation of IR singularities into an IR-safe observable. In other words, if the prediction is binned too finely, the perturbation theory breaks down and undesirable effects (such as a negative differential cross section) can occur. We therefore combine the first four bins into one larger bin in our differential prediction shown in Fig. 9. This is actually insufficient to ensure a physically-reliable prediction for all scale choices at NNLO, but is sufficient at N 3 LO (in which we are primarily interested here).
To ensure a positive-definite prediction at NNLO the first five bins need to be combined. We note in passing that at NLO no combination is necessary since the prediction consists only of a three-body phase space (which diverges to +∞ at δ(m max j /m H )) and of the two-body phase space (which diverges to −∞ at δ(m max j /m H ))). Given the poor convergence of the perturbation series in this region, both higher-order corrections, and the subsequent scale variations, are large. Away from the troublesome δ-region the observable behaves much in the same fashion as the E max j observable discussed previously. Specifically, we observe a bulk region in which the observable is NNLO and the corrections are (reasonably) small and a tail region in which the three-body phase space is not present and the observable becomes NLO, resulting in large corrections at N 3 LO.

Conclusions
In this paper we have presented N 3 LO predictions for the H → bb decay process. We focused on the piece with the most intricate infrared structure, corresponding to diagrams in which the Higgs boson couples directly to the final-state bb pair. In order to regulate the IR divergences present at this order we used the Projection-to-Born (P2B) method, employed for the first time with N -jettiness slicing as the IR regulator for the NNLO+j contribution.
We developed a method of dealing with the requirement of observing a jet direction in the N -jettiness slicing approach, namely effectively declustering the last stage of the jet algorithm and using the substructure of the jets to produce three (sub)jet directions. We validated our method at NNLO using three different methods to regulate the IR divergences.
We used our calculation to present jet rates at O(α 3 s ) and differential distributions for several physical observables using the Durham jet algorithm with y cut = 0.1. The method discussed in this paper is readily applicable to more complicated Higgs processes, such as associated production of a Higgs boson with a vector boson at the LHC or future collider. We demonstrated this by computing jet observables with respect to an artificial collision axis. Our calculation can also be used outside of the Higgs rest frame. Indeed, since the Higgs is a scalar particle, there is no correlation between decay and production mechanisms. One can therefore always boost any event into the Higgs rest frame, perform the N -jettiness regulation (which need not match exactly the requirement of the measurement function, i.e. one could still employ Durham clustering if desired), then boost back to the laboratory frame and impose additional selection criteria. We leave this study, together with the inclusion of the remaining top-induced contribution to the H → bb process at O(α 3 s ), to future work.

A The inclusive H → bb decay width
We present the explicit expressions for the coefficients s i , β i and γ i m of Eqs. (3.4)-(3.6) following the notation of Ref. [15]. The coefficients s i read: 2Nc , and N f the number of quark flavors. The coefficients of the QCD β function explicitly read: with T R = 1 2 . The coefficients γ i m are taken from Eq. (12) of Ref. [70] and their expressions are:  Fig. 10. The values of α s and y b at different scales are obtained using the Mathematica package RunDec [64]. As expected, the inclusion of higher-order corrections stabilizes the inclusive decay width, which shows very small scale dependence at N 3 LO in the primary region of interest {1/2, 2}m H .