N$^3$LO+N$^3$LL QCD improved Higgs pair cross sections

We report a new calculation of the soft-gluon threshold resummation for the Higgs boson pair production in the dominant production mode -- gluon-gluon fusion -- up to the next-to-next-to-next-to-leading logarithmic (N$^3$LL) accuracy. After matching N$^3$LL to the next-to-next-to-next-to-leading order (N$^3$LO) QCD calculation in the infinite top quark mass approximation, we show that the central values of the inclusive cross sections are quite stable with respect to N$^3$LO, while the conventional renormalisation and factorisation scale uncertainties are reduced by a factor of two, reaching to the subpercent level. Our study further consolidates the good asymptotic perturbative convergence. After combining with the full top-quark mass dependent next-to-leading order QCD results, our most advanced predictions are presented for both the inclusive total cross sections and the differential invariant mass distributions of the Higgs pair.


Introduction
The studies of the 125 GeV Higgs boson h at the Large Hadron Collider (LHC) turn into the second decade since its discovery [1,2]. Many Higgs properties, such as its mass, width, spin and CP, have been thoroughly investigated. Its interactions with the force carriers (Z, W, γ, and gluon g) and the third-generation charged fermions (t, b, τ ) have all been established, which are compatible with the Standard Model (SM) predictions within ∼ 10% accuracy [3,4] 1 . The observation of the Yukawa coupling between the Higgs boson and the first second-generation fermion muon is in reach. One of the central questions remaining to be answered at the LHC is that whether the Higgs boson interacts with itself. A remarkable prediction of the SM is that, analogous to the way of Higgs boson interacting with other massive elementary particles, the self-interaction strength of the Higgs boson scales with its mass. The latter has been measured to be around 0.1 to 0.2% accuracy. The knowledge of the Higgs self interactions is essential to scrutinize the Higgs field potential, a key to understand the electroweak spontaneous symmetry breaking mechanism and on how the subatomic particles acquire their masses. The only thing we know about the potential so far is from the mass m h of the Higgs boson, which is the second derivative of the potential with respect to the Higgs field around the (local) minimum of potential, at which the Higgs field averagely takes the non-zero vacuum expectation value v ≈ 246.2 GeV.
The shape of the Higgs potential has in fact the far reaching influences on cosmology (see e.g., refs. [5,6]). The distinct fates of our universe are determined by the vacuum stability or instability of the potential. In the former case, the current minimum at v ≈ 246.2 GeV is the global minimum, while it is just a local minimum in the latter case. Thus, our universe will decay through quantum tunnelling at some point in the future if the vacuum is unstable [7]. This is the case predicted in the SM [8][9][10][11]. Such an instability can be cured in many beyond the SM (BSM) theories. The role of the Higgs field is also crucial in the early universe. In particular, at around a nanosecond after the big bang, the universe underwent from the electroweak symmetry unbroken phase to the broken phase [12]. Such a phase transition could be a first or second order as well as a crossover depending on the Higgs potential. Lattice calculations [13,14] show that it should be a crossover in the setup of the SM. The (strong) first-order phase transition provides one necessary condition [15] for generating matter-antimatter asymmetry at the electroweak scale. Besides, the first-order transition may leave a few detectable cosmic imprints, such as primordial 2 gravitational waves [16,17] and magnetic fields [18], and topological defects.
As aforementioned, the crucial input to pin down the Higgs potential is by measuring the Higgs self interactions. These self interactions are accessible by studying the multiple-Higgs boson production from high-energy particle collisions, such as at the LHC [19] or the future colliders [20,21]. The most viable production mode is the production of a pair of Higgs bosons. We will concentrate on the process of the Higgs boson pair production at hadron colliders in this paper. Such a process is vital in directly probing the Higgs trilinear self coupling λ hhh [22] and in indirectly constraining the Higgs quartic self coupling [23,24], where in the SM λ hhh is uniquely fixed by the Higgs mass and the vacuum expectation value via λ SM hhh = m 2 h 2v 2 ≈ 0.129. Due to the pattern of the couplings between the Higgs boson and other SM particles, Higgs boson is very hard to be experimentally detected (partially) because of its low production rates and the small branching fractions of the background-clean decay modes. The di-Higgs is more than doubling the challenges. The di-Higgs production cross sections are three orders of magnitude smaller than the single Higgs counterparts, in which the two Higgs bosons are predominantly produced in the fusions of two initial gluons emitted by protons (see figure 1).
The paper will focus on quantifying the high precision Higgs pair production cross sections via the gluon-gluon fusion (ggF) channel. We want to stress that the target precision is strongly tied to determine how large portion of the BSM scenarios can be probed (cf., e.g., figure 1 in ref. [25]). A question of "if we measure a deviation of the trilinear Higgs coupling with respect to the SM value δ λ ≡ κ λ − 1 ≡ λ hhh The existing direct measurements of the Higgs boson pair cross sections at the LHC only loosely bound on the trilinear Higgs self-coupling λ hhh so far. The current best constraints −1.24 < κ λ < 6.49 and −0.6 < κ λ < 6.6 at 95% confidence level were placed by the CMS [4] and ATLAS [28] collaborations respectively. 3 Although the results are very encouraging, the limit does not yet reach to most of the interesting regions of the vast BSM scenarios (see, e.g., refs. [29][30][31][32][33]). It is even just touching the range |δ λ | = |κ λ − 1| < 5 set by the perturbativity argument [34]. The precision of κ λ will be largely improved at the HL-LHC with 3000 fb −1 integrated luminosity [20] and the envisaged future hadron colliders [21,35,36]. δ λ is estimated to be achievable better than ±50% (around ±5%) at the HL-LHC (a future 100 TeV pp collider). Before moving on, we briefly describe how the cross sections of the ggF process gg → hh are calculated in the SM. The perturbative calculations for the process are quite challenging due to the absence of the tree-level interactions between the Higgs boson and the gluon. The leading order (LO) cross section is derived from the square of the one-loop amplitude dominantly via the virtual top quark loop, where two representative Feynman diagrams can be found in the upper row of figure 1. The higher-order QCD perturbative calculations are very difficult due to the complicated multi-loop multi-scale Feynman integrals. Thanks to the new technology development of the numerical approaches [37][38][39], the next-to-leading order (NLO) results are presented in refs. [40][41][42][43][44][45]. The NLO calculation has been further 3 In ATLAS paper [28], the best constraint by combining the Higgs pair cross section and the loop effects from the single Higgs process is −0.4 < κ λ < 6.3. The electroweak loop effects from the single Higgs production can indirectly probe λ hhh . However at the HL-LHC, the indirect limit would be harder to be improved since it is dominated by the systematic errors. 4 There is, however, a caveat in order to interpret these numbers. The values of δ λ were derived based on several assumptions extrapolated from what have been understood or used in the LHC Run 2 analysis, where the theoretical uncertainties have been assumed to be reduced by a factor of two with respect to the state of the art at the time.
improved either by soft-gluon resummation [46,47] or after matching to parton showers [48][49][50]. These are the best results achieved so far by taking into account the full top quark mass m t dependence. They leave a few theory challenges to be tackled in the future. First of all, the NLO results are plagued with the large theoretical uncertainties both from the scale variations [40] and the top-quark mass scheme dependence [42]. Some differential distributions may differ significantly by adopting different matching schemes [48] or using different shower scales [49].
A second useful way of improving the cross sections is applying the so-called infinite top quark mass approximation, 5 denoting as m t → ∞. In such an approximation, the top quark loops shrink to the contact points (cf. the bullets in the lower row of figure 1) by taking the heavy top quark mass approximation m t m h in the amplitude. Under this working approximation, we have circumvented the aforementioned complicated multi-loop Feynman integrals. One is therefore able to go beyond much further than the loop-induced case, albeit still with a lot of theoretical efforts needed. The state-of-the-art perturbative QCD calculation in the m t → ∞ approximation is next-to-next-to-next-to-leading order (N 3 LO) [51,52], which was certainly impossible without the decades of theoretical efforts [53][54][55][56][57][58][59][60][61][62][63][64][65][66][67][68][69][70][71][72] by the whole community. Although N 3 LO QCD corrections only enhance the central values of next-to-next-to-leading order (NNLO) by 3%, the renormalisation and factorisation scale uncertainties are dramatically reduced by a factor of 3 to 4 to reach to the percent level accuracy. The soft-gluon resummation effects have been studied up to NNLO plus next-to-next-to-leading logarithmic (NNLL) accuracy in ref. [73]. 6 The finite top quark mass corrections can be included either by taking into account more terms in the  [43,[75][76][77][78][79] or by rescaling m t → ∞ amplitudes (cross sections) with the known full-m t dependent amplitudes (cross sections) [52,[80][81][82]. Moreover, there are also many interesting attempts to compute the two-loop ggF di-Higgs amplitudes in the analytic forms by taking various approximations [83][84][85][86][87]. Recently, partial results of the NLO electroweak corrections, namely from the top Yukawa coupling induced, become available [88,89].
In this work, we further improve the perturbative QCD calculations of gg → hh with the threshold resummation up to next-to-next-to-next-to-leading logarithmic (N 3 LL) accuracy in the infinite top quark mass approximation. We discuss the theoretical uncertainties that are stemming from the ambiguities in resummation. The final results, ready for phenomenological applications, are reported as N 3 LL matched to N 3 LO (N 3 LO+N 3 LL), which are also augmented with the known full m t -dependent NLO calculation.
The remainder of the paper is organised as follows. We describe our theoretical framework in section 2. Our results are reported in section 3. The conclusions are drawn in section 4. The concrete expressions of the universal coefficients appearing in the N 3 LL resummation for gg → hh can be found in appendix A.

General structure
For a general hadronic process with hadrons h 1 and h 2 collide to produce a heavy colourless final state with the invariant mass square M 2 , the differential invariant mass distribution in QCD improved parton model takes the form of a convoluted integral: Here, s is the square of the hadronic center-of-mass energy, µ F is the factorisation scale and the dimensionless quantity τ is defined as τ ≡ M 2 s . The partonic luminosity φ ab is defined as the convolution of parton distribution functions (PDFs) f a/h 1 and f b/h 2 : While the partonic coefficient function ∆ ab is perturbatively calculable in QCD, and is expanded in terms of a s ≡ α s /(4π) with α s being the UV renormalised strong coupling constant. In the collinear factorisation, the partonic coefficient function is expressed in terms of the bare partonic cross sectionσ ab and the splitting kernels Γ at the scale µ F : with all the initial collinear singularities encapsulated in Γ, in terms of the regularised Altarelli-Parisi (AP) splitting functions 7 . The µ F dependence, in principle, should cancel order by order at the hadronic level after convoluting ∆ ab (z, M 2 , µ 2 F ) with the partonic luminosity. µ in eq. (2.3) is the regularisation scale in the dimensional regularisation with d = 4 − 2 spacetime dimensions. The symbol ⊗ refers to the convolution operator defined for arbitrary functions f i (x), i = 1, 2, . . . , n as: (2.4) The bare partonic cross sectionσ ab can be expanded inâ s ≡α s 4π with the bare strong coupling constantα s : where b is the power ofâ s at the lowest order and S ≡ (4πe −γ E ) in the spacetime dimension d = 4 − 2 with γ E being the Euler-Mascheroni constant. Beyond the lowest order inâ s , the coefficientσ (k) ab involves UV and IR divergences. The UV renormalisation 7 The lowest order is Γ −1 ab = δ ab where δ ab is the Kronecker delta function.
procedure removes the UV singularities and introduces a dependence on the renormalisation scale µ R . In the MS scheme, the renormalised coupling constant a s is related to the bare couplingâ s viaâ which satisfies the renormalisation group running If the amplitude depends on other bare couplings like the Higgs Yukawa couplings or the other effective couplings, it is understood that the similar UV renormalisation should be carried out. For simplicity, we ignore the dependence on other couplings in the following context. The α s series expansion given in eq.(2.5) is usually referring as the fixed order perturbative calculations. However, the perturbative expansion has limitations in its applicability if the coefficients at each order involve large logarithms. Such cases could arise from the soft-gluon emissions. Despite the cancellation of soft singularities between virtual and real radiative contributions, soft gluon effects can yield potentially large logarithms in the kinematic configurations where high imbalance between real and virtual gluons happens. In such cases, the threshold resummation is suitable to be performed by resuming these large logarithms to all orders in α s , which is exactly the subject of the present paper.

Soft-virtual limit
Let us define the threshold limit in terms of the partonic variable z ≡ M 2 s → 1, withŝ being the square of the partonic center of mass energy. As the name implies, in this phase space region, almost all the incoming partonic center of mass energy goes to produce the Born-like final state, and the residual energy is only sufficient to generate soft gluons.
In this limit, the partonic coefficient function can be organised into two parts: In such a decomposition, the term ∆ SV ab (z, M 2 , µ 2 F ) involves only the singular terms when z = 1 with the distributions of the following forms: where δ(1 − z) is the Dirac delta function and D i (z) is a plus distribution. The latter is defined as that for any test function f (z), there is The superscript SV denotes the soft-virtual part, accounting for the real emission corrections at the threshold limit and the pure virtual corrections. The remaining regular term ∆ Reg ab (z, M 2 , µ 2 F ) contains either ln i (1 − z) terms or the polynomials of (1 − z). Although the logarithms ln i (1 − z) could also be large, they are subdominant by a factor of (1 − z) compared to the soft-virtual contribution. In the following discussion, we ignore these subleading power terms.
Since we focus on soft-virtual part in the context, only the flavour diagonal terms of Γ in eq.(2.3) contribute. Hence, the parton flavour sum in eq.(2.3) can be safely omitted, giving rise to a simpler relation: Again, the superscript SV indicates that we only keep those terms which give rise to δ(1−z) and D i . The soft-virtual partonic cross section can be further factorised into the soft and hard (or virtual) parts. In this respect, the partonic coefficient function further takes the form: The form factor F ab captures pure loop corrections and starts from 1 as the lowest term. The implicit dependence of the form factor on external particle momenta, colours and spins is left understood. The soft function S ab embeds the real soft-gluon corrections. Each piece on the right-hand side of eq. (2.13) is not UV renormalised yet. Hence they involve UV divergences, which should be removed by the UV renormalisation. F ab and S ab contain IR poles as well. ∆ SV ab , on the other hand, is both UV and IR finite, the details of which will be discussed shortly.
In dimensional regularisation, the IR divergences in the form factor stemming from the soft and collinear loop-momentum modes can be renormalised as 8 : (2.14) 8 In the following, if an object has the dependence on the renormalisation scale µR, it means we have carried out the UV renormalisation for the coupling constant αs unless we explicitly write out its dependence onâs orαs. After the UV renormalisation, there is no explicit µ dependence anymore, because the regularisation scale in the dimensional regularisation is always in the form of (µ 2 ) for each loop-momentum integration.
A few comments are in order. The coupling constant is renormalised on the right-hand side. Both Z IR ab and H ab are perturbatively expanded in terms of the renormalised coupling constant a s . Z IR ab factorises all the IR divergences coming from the virtual amplitudes. The hard function H ab is finite and depends on the Born kinematics, such that the lowest order equals to the Born amplitude squared. From the IR factorisation of virtual amplitudes, it is well-known that Z IR ab is a universal quantity [90][91][92][93][94][95][96][97][98][99]. For colourless particles production, it is insensitive to the hard process under study and only depends on if the process is initial quark or gluon induced. More specifically, the IR singularities of the virtual contribution are solely determined by the nature of the initial particles. This in turn leads to the fact that the combined contributions from the soft function S ab and the kernels Γ −1 must exhibit the same universal IR divergences. Absorbing Z IR ab into the second line of eq.(2.13) and performing the coupling constant renormalisation at the scale µ R yields: both H ab and S Γ,ab are free of IR poles. The hard function H ab is obtained from the IR subtracted loop corrections 9 . On the other hand, the structure of soft-collinear part S Γ,ab is obtained by computing real corrections and the AP splitting kernels in the soft limit. Alternatively, one can also predict the structure of soft-collinear part S Γ,ab by employing its universal behaviour. This leads to a framework to resum the large logarithms present at the threshold limit, which we discuss briefly in the following.
We begin with the ingredients in eq.(2.13). The structure of splitting kernel Γ aa is well-known. It satisfies the renormalisation group evolution equation with respect to µ F : where P aa is the regularised flavour diagonal (unpolarised) AP splitting function 10 . Note here that in the above equation, we have dropped the contributions from off-diagonal splitting functions since we focus on the soft-virtual contributions only. The diagonal splitting function at the production threshold takes the form: in terms of the light-like cusp A a 11 and the virtual B a 12 anomalous dimensions, which depend only on the nature of incoming particle a. In particular, we notice that A a (a s ) = The expressions of Γ (k) aa z, up to k = 4 in the MS scheme can be found in eq. (33) of ref. [107] by replacing ε with −2 . In particular, Γ The structure of the soft function S ab could be determined from the behaviour of the Sudakov form factor through its evolution under the momentum transfer M 2 , which leads to its exponentiation [108][109][110], and the renormalisation group evolution of the splitting kernels Γ aa and Γ bb . The universal factorisation of IR singularities guarantees that the soft function could be expressed in terms of a first order differential equation. From eq.(2.13), we can derive where we have defined the finite remainder of the soft function as S fin ab . The coefficient K ab is resulting from the IR pole part, which only involves the soft divergences in the real contribution. It can be related to Z IR ab as following: Expanding them in the bare couplingâ s gives: ab ( ) solely depends on the cusp (A a ) anomalous dimension and the beta function. The relationship is given in eq.(35) of ref. [111] by replacing ε with −2 and A I k 11 The 4-loop cusp anomalous dimension has been computed analytically in refs. [103,104]. 12 The complete 4-loop virtual anomalous dimensions are firstly known in ref. [105] for quarks and in ref. [106] for gluons.
a . Further, the renormalisation group invariance of S ab (i.e., d This is obtained from the peculiar structure of ln at every order in the perturbative expansion: Up to three loops in QCD, the cusp and soft anomalous dimension satisfies a Casimir scaling relationship given by The collinear anomalous dimension, which is analytically known up to 4 loops [113], can be related to the virtual and soft anomalous dimensions via γ collinear From eq.(2.23), G ab takes the form: where the first term on the right-hand side is the boundary term at µ R = M , whose explicit form is determined from the soft limit of real corrections. In light of the differential equation (2.20) and the renormalisation group invariance eq.(2.23), the solution takes the form 14 : Alternatively, with an understanding on the structure of G The four loop quark soft anomalous dimension is given in ref. [105], while the 4-loop gluon counterpart is obtained in ref. [106] from the quark one with the conjecture that they satisfy the generalised Casimir scaling rule [112]. 14 The M 2 independent term in ln S ab (z, M 2 , µ 2 , ) from the general solution of the first-order differential one can also formulate a better functional form for ln S ab (z, M 2 , µ 2 , ): which satisfies all the aforementioned differential equations. Here we have used the relationship The exact form of G ab ( ) is given in eq.(37) of ref. [111] by replacing ε with −2 and identifying the subscript P as S (therefore δ P,SJ = 0). In particular, we have G . Equation (2.28) obviously shows that when one takes µ ∼ O(M (1 − z)), the soft function is free of any large logarithms, which should take the initial scale of its renormalisation group evolution. The proper initial scale is just at the order around M (1 − z), thus suffering from the ambiguities of the initial scale choices. In eq. (2.28), the are from two body phase space and the real matrix element squared respectively. For more details on this structure and for explicit results, we refer readers to refs. [107,111,114]. The form of the soft function as given in eq. (2.28) is same or universal for any arbitrary production process of colourless final states. This is evident since the soft enhancements only depend on the nature of the external coloured particles, i.e., the initial partons. It is also interesting to find that S ab satisfies maximally non-abelian property up to third order in the strong coupling constant [107,111,115,116], by which the quark and gluon lines are related by the Casimir constants in the adjoint and fundamental representations. In other words, we have S gg S qq = C A C F . However, whether the validity of this property holds beyond third order with the so-called generalised Casimir scaling relation [112] needs to be addressed in future. In the present paper, we only need the knowledge of the 3-loop soft function S gg .

Threshold resummation
Threshold logarithmic corrections dominate when the partonic scaling variable z approaches to unity. This is manifested in eq. (2.28) in terms of {δ(1 − z), D i (z)} which is evident by using the relation eq. (2.29). In this kinematic limit, the large logarithm ln(1 − z) multiplying with the small coupling a s produces O(1) terms, i.e., a s ln(1 − z) ∼ O(1) . This potentially spoils the perturbative convergence if we truncate at a fixed order in a s . In such a case, the threshold resummation procedure [117,118] provides an alternative perturbative expansion.
In order to construct the resummation framework, we employ the structure of soft function S ab , AP splitting kernel and the master equation (2.15) that we have discussed in the last subsection. Using the relation eq. (2.29), the soft function can be decomposed into the Dirac delta part and the plus distribution part: ln S ab = ln S ab,δ δ(1 − z) + ln S ab,D , (2.30) where ln S ab,δ is proportional to δ(1 − z) and ln S ab,D involves only plus distributions of the form D i . Rearranging eq. (2.28), we get: The above expression involves poles and finite terms, where the former will cancel against the IR poles of the virtual correction Z IR ab and of the splitting kernels Γ −1 aa , Γ −1 bb . Similarly, the distribution part of the soft function is given by: (2.32) This gives us an integral representation for ln S ab,D : In eq. (2.33), we have used the relation The IR poles in the second line in eq. (2.33) cancel against the poles of the D 0 term of the splitting kernels exactly. The finite remnants of the splitting kernels depend on ln(µ 2 F /µ 2 R ) when expressing eq. (2.19) in the renormalised coupling constant a s at the scale µ R .
Convoluting the soft function with the AP splitting kernels and Z IR ab produces the IR finite function S Γ,ab defined in eq. (2.16): As mentioned above, the poles of ln S ab,δ and ln S ab,D cancel against the respective terms from Z IR ab , Γ aa and Γ bb . Decomposing the splitting kernels into the Dirac delta and plus distribution parts: where we have defined Here, the first line comprises of the finite part of ln S ab,δ and the finite remnants arising from the δ(1 − z) terms of the splitting kernels. Finally, taking into account the finite hard function H ab , as in eq. (2.15), it gives us the partonic coefficient function: The symbol P refers to the operator for the "path-ordered exponential", which has the following expansion for an arbitrary function f (z): (2.43) The second line in eq. (2.41) involves enhanced logarithms in the limit z → 1 which have been resummed. Since this term is in the form of convolutions, it is convenient to solve it in the Mellin N -space, where all the convolutions turn into normal products. The Mellin transform of a function f (z) is generally defined as: In the Mellin space, eq. (2.41) reads as: Solving the above integral in the Mellin space with an expansion over the resummation parameter ω ≡ 2β 0 a s (µ 2 R ) ln N results into: are also implicitly understood. Taking out all the N -independent terms from the exponent gives us: The total N -independent prefactor is = H ab M 2 , µ 2 R × exp 2 ln S fin ab,δ − ln Γ fin aa,δ − ln Γ fin bb,δ +C 0,ab . (2.48) As stated earlier, the resummation coefficients g k,ab (ω) are universal and depend only on the nature of incoming partons. However,g 0,ab is process dependent. It can be expanded in a s (µ 2 R ) asg Each term in the exponent in eq. (2.47) has been resummed to all orders in α s . At the N k LL accuracy, we should take into account both the logarithmic terms up to g k+1,ab (ω) in the exponent and the N -independent terms up tog (k) 0,ab ing 0,ab . It is also implicitly understood that the hard function, as well as g 0,ab andg 0,ab , may depend on the other underlying Born kinematic variables, which we have not yet explicitly written out.
When the resummation calculations match to the fixed-order computations, we need to subtract the double counting between the two. At the N k LO + N k LL accuracy, the double counting is just the N k LO soft-virtual term encoded in the partonic coefficient function. In other words, it is the a s (µ 2 R ) series of ∆ res ab 16 up to the O(a b+k s ) term.

Ambiguities in resummation and resummation schemes
Up to a given logarithmic accuracy, the right-hand side of the resummation formalism eq. (2.47) is not uniquely defined. One has the freedom to choose which N -independent piece should be absorbed into the exponent. In other words, we can rewrite the resummed N k LO partonic coefficient function in the Mellin space as where the functionC 0,ab (M 2 , µ 2 F , µ 2 R ) is an arbitrary N -independent function that can be expressed in an a s (µ 2 R ) series. The subscript N k LO means that the whole expression in the bracket should be kept up to the first k + 1 terms in the a s series. This essentially leads to the ambiguities in the threshold resummation formalism at a given logarithmic accuracy. 17 In this subsection, we present a few concrete forms ofC 0,ab to discuss such ambiguities. First of all, in eq. (2.47), we resum the large logarithms of the form ln N , with the resummation parameter given by ω. However, it is observed that in the large Nexpansion, ln N is always associated with the Euler-Mascheroni constant γ E . Alternatively, one usually takes the resummation parameter as ω ≡ 2β 0 a s (µ 2 R ) ln N with N ≡ N e γ E . In this paper, we consider four different resummation schemes as follows. 16 Note that ω also depends on as(µ 2 R ), and it should be expanded into the as series too. 17 Such resummation ambiguities have already been studied in the Drell-Yan process at N 3 LL accuracy in ref. [119].
• N 2 scheme : In this scheme, we have the coefficient (2.55) Notice that in this scheme the γ E -dependent pieceC 0,ab,γ E has been absorbed into the transforms of N → N and ω → ω in the exponent.

N 3 LL resummation for Higgs boson pair production
With the general formalism for the threshold resummation of any colourless final state, we are now in the position to say something specifically for the Higgs pair production in ggF. We begin with the master formula for the threshold resummation given in eq. (2.47). The quantities in the exponent given by g k,gg (ω) are known up to k ≤ 4, 18 which were first applied for the single Higgs production up to the N 3 LL accuracy [121]. These expressions can be fully recycled for the Higgs boson pair process. Their expressions up to the fourth order are provided in appendix A for the completeness.
The N -independent coefficientg 0,gg is not universal, whose process dependence is only stemming from the hard function H gg . The universal N -independent contributions from the soft function and the splitting kernels could be found in appendices D and E of ref. [114]. For the convenience of reader, we also provide their explicit results for the ggF processes up to the three loops in appendix A. Combining them together with the hard function returnsg The hard function H gg up to 3 loops for the N 3 LL threshold resummation of the di-Higgs process in the infinite top quark mass approximation has been documented in appendix A of ref. [52]. For the sake of completeness, we still briefly mentioned it here. The effective Lagrangian given in eq.(2.1) of ref. [52]  with L t ≡ ln(µ 2 R /m 2 t ) and m t is the on-shell top quark mass. It is important to notice that the Wilson coefficients are at least O(a s ). Thus, we can organise the amplitude square into three parts dubbed as class-a, b, c, which feature 2, 3, and 4 insertions of the effective vertices respectively. This means that the hard function can be written as whose expressions in terms of the amplitudes are given in eq. (A.10) of ref. [52]. The class-a hard function H a gg has the similar topologies as the single Higgs process, known up to four loops [122]. The class-b and -c hard functions need the knowledge of other topologies. In eq. (A.6) of ref. [52], besides the trivial typo M  is too lengthy to be directly used in the numerical code. In our calculation, for the practical usage, we have to generate the numerical grids in Mathematica with 100-digit precision and with the help of the package PolyLogTools [123] first. By knowing the hard function, the N -independent coefficientg 0,gg in eq. (2.48) becomes: 0,gg , andg (2) 0,gg respectively. We have carefully checked our implementation. For example, the hard function has been checked against ref. [52], and its scale dependence satisfies the renormalisation group equation. In order to verify our numerical setup, we have also tested the N 3 LL resummation calculation of the single Higgs process with ref. [124]. In the N 1 scheme, we find good agreements with ref. [73] at the NNLL accuracy for the di-Higgs process.

Results
Before we present our results, we fix our setup for the numerical calculation. In order to recycle the N 3 LO results that we have calculated before, we take the same setup as in ref. [52]. Namely, we take the Higgs mass m h = 125 GeV and the vacuum expectation value of the Higgs field v = ( √ 2G F ) − 1 2 = 246.2 GeV. The top-quark pole mass m t = 173 GeV enters only into the Wilson coefficients C h and C hh of the effective Lagrangian. The same PDF PDF4LHC15 nnlo 30 [125][126][127][128], along with its α s renormalisation group running, provided by LHAPDF6 [129] will be used regardless of the α s or logarithmic order we are considering. The default central scale is chosen as the half of the invariant mass of the Higgs boson pair, i.e., µ 0 = m hh /2. The scale uncertainty, for estimating the missing higher order terms, is evaluated through taking the envelope of the 7-point variations of the factorisation scale µ F and the renormalisation scale µ R in the form of µ R/F = ξ R/F µ 0 with ξ R/F ∈ {0.5, 1, 2} after excluding the two extreme points (ξ R , ξ F ) = (0.5, 2), (2, 0.5).

Impact of QCD corrections in the infinite top quark mass approximation
We study the cross sections in the infinite top quark mass limit in this subsection. Both the inclusive cross sections and the Higgs pair invariant mass distributions will be reported.
Let us first focus on the inclusive cross sections that have been integrated with the whole phase space. As aforementioned, in the resummation calculations, we have the additional theoretical uncertainties stemming from the ambiguities in the resummation formalism discussed in subsection 2.4. Figure 2 reports the resummed results from NLO+NLL to N 3 LO+N 3 LL in the four typical center-of-mass energies covering the energy ranges of the LHC ( √ s = 13, 14 TeV) and future colliders ( √ s = 27, 100 TeV). Concrete numbers can also be read from table 1. The four resummation schemes, dubbed as N 1 , N 2 , N 1 , N 2 , have been defined in subsection 2.4. We have checked our NLL and NNLL calculations against those in ref. [73], where the latter only provides the N 1 results. As anticipated, such a scheme dependence reduces from NLO+NLL to N 3 LO+N 3 LL. However, the different schemes give us both the different central values and error bars due to 7-point scale variations. In particular, the differences are more significant between N 1 and N 2 schemes than N 1 vs N 1 or N 2 vs N 2 . The scale uncertainties are reduced only marginally from N k LO to N k LO+N k LL in the N 1 and N 1 schemes, while a factor two reductions have been observed in the N 2 and N 2 schemes. For example, the N 3 LO error bars are almost same after including the N 3 LL resummation in the N 1 scheme. Besides the scale uncertainties, it is interesting to notice that the central values of N k LO+N k LL in the N 2 and N 2 schemes are much closer to N k+1 LO than the other two. This potentially tells us that our best results N 3 LO+N 3 LL in the former two schemes can be extrapolated to predict the next order calculation, i.e., the unknown next-to-next-to-next-to-next-to-leading order (N 4 LO). Although it seems that the N 2 and N 2 schemes are equivalently good from the above perspectives, we decide to choose the N 2 scheme in the following context as our best predictions, whose scale errors are more symmetric than the N 2 scheme. We want to stress that it is just our choice without any preference from first principles. An alternative attitude that is more conservative is to quote the theoretical uncertainties due to the scheme dependence instead of picking up one scheme. For instance, if we take the envelope of the N 3 LO+N 3 LL central values in the four schemes, we have σ N 3 LO+N 3 LL = 38.70 +0.85% −0.87% scale +0.08% −0.39% scheme fb at 14 TeV, where the second error due to the resummation scheme ambiguities is in anyway subdominant. Alternatively, if one takes the envelope of all the predictions in different schemes including the corresponding scale uncertainties, the cross section at 14 TeV becomes σ N 3 LO+N 3 LL = 38.70 +0.89% −2.6% scale+scheme fb. Nevertheless, according to the aforementioned arguments, we will stick to the N 2 scheme for the following discussions.    of N 3 LO+N 3 LL are a factor 2 smaller than N 3 LO and almost a factor 4 smaller than NNLO+NNLL, reaching to the sub-percent level now. The cross sections in a wider energy interval from √ s = 7 TeV to √ s = 100 TeV are displayed in figure 3. In the left plot, we compare various resummation orders from LO+LL to N 3 LO+N 3 LL. The error band of N k LO+N k LL, representing the 7-point scale uncertainties, resides completely in the band of the previous order when k ≥ 2. 19 Such an observation is analogous to the 19 Without surprising, LO+LL clearly underestimates the missing higher order by using the manner of fixed order results [51,52], with the exception that in the fixed order the N 3 LO bands are outside the NLO counterparts. NLO+NLL increases LO+LL by 40% (50%) at 13 (100) TeV. NNLO+NNLL further enhances NLO+NLL by 6% and 8% respectively at these two energies. Furthermore, N 3 LO+N 3 LL improves the NNLO+NNLL cross sections by only 0.4% and 1% at √ s = 13 and 100 TeV respectively. The right plot of figure 3 compares our best prediction N 3 LO+N 3 LL to the previous state-of-the-art results known in the literature (NNLO+NNLL and N 3 LO). Because the (partial) higher orders beyond N 3 LO have been included in the NNLO+NNLL resummation calculation, strictly speaking, N 3 LO does not supersede NNLO+NNLL. In contrast, N 3 LO+N 3 LL is superior to both of them. This is why the comparison in the right plot would be also interesting. From the ratios in the lower panel, the difference in the central values of N 3 LO and N 3 LO+N 3 LL is almost invisible, while the difference between NNLO+NNLL and N 3 LO+N 3 LL is visible but within a percent. However, the N 3 LO error band is completely enclosed in the NNLO+NNLL band, and the N 3 LO+N 3 LL error band largely lies in the N 3 LO band. This again points to the good asymptotic perturbative convergence. are computed only at some approximation at this order [52] so far. The invariant mass distribution is a typical observable for demonstrating the impact of threshold resummation calculations. In the m t → ∞ approximation, we report the dσ dm hh distributions in figure 4 at four distinct perturbative orders N k LO+N k LL (k = 0, 1, 2, 3). The inclusion of higher order QCD radiative corrections dramatically stabilises the perturbative calculations of the invariant mass differential distributions. In particular, the shape of the distribution is almost unchanged from NLO+NLL to N 3 LO+N 3 LL except the m hh 2m h regime, while the scale uncertainties are significantly reduced as what have been observed in the inclusive cross section case. The results in four nominal energies in the figure show that the m hh spectrum is harder from 13 TeV to 100 TeV because of the wider phase space with larger √ s. From k = 1, the error bands of N k+1 LO+N k+1 LL are completely enclosed within the previous-order N k LO+N k LL uncertainty bands. We compare our best N 3 LO+N 3 LL predictions to N 3 LO and NNLO+NNLL in figure 5. The message is again similar. The N 3 LL QCD corrections only marginally modify the shapes, and the N 3 LO+N 3 LL results with sub-percent scale uncertainties are in both the N 3 LO and NNLO+NNLL error bands.

N 3 LO+N 3 LL improved cross sections with full top-quark mass dependence
Unlike the single Higgs case, it is widely acknowledged that it would be crucial to include the finite top quark mass corrections in gg → hh for the realistic phenomenological applications. However, as aforementioned, the calculations in the full top quark mass dependence are very challenging, which are only best known at NLO+NLL [47]. In our paper, we will only use the NLO full m t -dependent results (dubbed as "NLO mt " along the notation used in ref. [52]), which can be directly computed with the public code [48,50]. In order to provide the best theoretical predictions, a common way people usually do is to combine the m t → ∞ results, which are precisely known, with the full m t results. However, there are ambiguities on how to combine the two, featuring different underlying theoretical working assumptions. Several combination approaches have been studied in refs. [52,80,82]. In this paper, we will opt to improving the (differential) cross sections by multiplying the (differential) QCD K factors obtained in the m t → ∞ calculations, i.e.
Except the so-called "finite top (FT)" approximation [80,82], which requires additional work that is beyond the scope of our paper, it is arguable that (N k LO + N k LL) ⊗ NLO mt (and N k LO ⊗ NLO mt ) is the best combination among those presented in section 3 of ref. [52]. Therefore, we refrain ourselves from presenting our results in various combination schemes here, but instead just show (N k LO + N k LL) ⊗ NLO mt and N k LO ⊗ NLO mt . For NLO mt and N k LO ⊗ NLO mt , the results are essentially identical to ref. [52] except that we use 7-point scale variations in the current paper, while 9-point scale variations have been adopted in ref. [52]. The cross section in each scale choice is evaluated following eq.(3.4) in ref. [52]. In other words, the relative scale errors in (N k LO + N k LL) ⊗ NLO mt and N k LO ⊗ NLO mt are identical to those in N k LO+N k LL and N k LO in the infinite top quark mass approximation. Though we have only taken into account NLO mt , we want to emphasize that the most advanced theoretical predictions can be easily obtained from our calculation even after the full m t dependent NNLO result was available. In particular, we suggest the current recommendation values proposed by the LHC Higgs Cross Section Working Group [22] can be improved with our results.
The total cross sections after including NLO mt are listed in table 2 at the 4 centerof-mass energies √ s = 13, 14, 27 and 100 TeV. The results are obtained after applying eq.(3.1) to the total cross sections at NNLO+NNLL, N 3 LO and N 3 LO+N 3 LL respec-tively. In the table, we show the original NLO mt results without any further improvement too. The residual scale uncertainties range from 10% to 15%. There is another source of big theoretical uncertainties due to the top-quark mass scheme and the scale arbitrariness in the MS top quark mass that we have not yet quoted in the table, because such uncertainties are not expected to be reduced in our calculation with the infinite top quark mass approximation. In the inclusive total cross sections, these uncertainties are amount to +4%, −18% at the above four energies [42,45]. In the following, we will only focus on the conventional renormalisation and factorisation scale uncertainties. In the N 2 resummation scheme, (NNLO + NNLL) ⊗ NLO mt enhances the central values of the NLO mt cross sections by around 20%, while the scale uncertainties have been reduced to ∼ ±3%. Meanwhile, N 3 LO ⊗ NLO mt increases the NLO mt cross sections by a similar amount, but reduces the relative scale uncertainties to −0.5%, +2.5%, being very asymmetric and almost a factor 2 smaller than (NNLO + NNLL) ⊗ NLO mt . Our most advanced predictions (N 3 LO + N 3 LL) ⊗ NLO mt shrink such uncertainties by another factor of two, reaching to the sub-percent level. Our best predictions for the di-Higgs cross sections are 33   In principle, the infinite top quark mass approximation works better in the lower m hh regime. This is indeed indicated in figure 6 that the m t → ∞ improved bands are only in the NLO mt bands when m hh < 300 GeV. Beyond 300 GeV, the K factors are almost constants around 1.2, which can be evidently seen in the lower panels. Due to the manner of our combination, the remaining messages are essentially similar as what have been discussed in the previous subsection.
Before closing this section, we assess the missing top quark mass uncertainties beyond NLO. This kind of uncertainty is stemming from the lack of full m t -dependent calculations at NNLO and beyond. Since we are working in the infinite top quark mass approximation, we do not expect the inclusion of higher order in α s will improve such an uncertainty. In other words, the missing top quark mass uncertainty at (N 3 LO + N 3 LL) ⊗ NLO mt should be as large as those at NNLO ⊗ NLO mt and N 3 LO ⊗ NLO mt . The latter has been discussed in ref. [52] by comparing NNLO ⊗ NLO mt with respect to the so-called FT approximation at NNLO [82]. This missing top quark mass uncertainty is found to be around 5% at the LHC energies to 9% at 100 TeV. Our best predictions at (N 3 LO + N 3 LL) ⊗ NLO mt is envisaged to be further improved by combining with the NNLO FT approximation calculations, which we leave it for the future work.

Summary
We have improved the (differential) cross section calculation of the Higgs boson pair production via gluon-gluon fusion with the threshold resummation up to N 3 LL in the infinite top quark mass approximation. Such a resummation calculation has been consistently matched to the known N 3 LO results. We first study the theoretical uncertainties at various perturbative orders due to the ambiguities in the resummation formalism. Though such resummation scheme uncertainties are in anyway sub-dominant at N 3 LO+N 3 LL, we argue that the N 2 and N 2 schemes should give us the best predictions. With the N 2 scheme, the residual renormalisation and factorisation scale uncertainties are less than one percent, which are a factor of 2 smaller than N 3 LO and a factor of 4 smaller than NNLO+NNLL, while the central values are almost identical to N 3 LO. This further consolidates the observation that the asymptotic convergence in the strong coupling constant α s series has reached at the fourth order [51,52]. Such a conclusion holds for both the inclusive total cross sections and the differential invariant mass m hh distributions.
Our results have been finally combined with the full top quark mass dependent calculation at NLO QCD. Such combinations are essential for phenomenological applications. The new predictions of the di-Higgs hadroproduction cross sections have been reported after taking into account the finite top quark mass corrections.
Regarding the prospects for the future, it is clear that how to reduce the large topquark mass scheme uncertainty remains an open question. In addition, the NNLO QCD and NLO electroweak calculations in the SM would be quite desirable, which are however quite technically challenging. The top-quark mass scheme uncertainty is expected to be reduced if the full NNLO QCD calculation in the SM was available. The minor percent level bottom-quark-loop contributions would turn out to be interesting too given the subpercent level scale uncertainty we have reached in this paper.

A The universal N 3 LL resummed coefficients
In this appendix, we provide the explicit expressions of the process-independent ingredients in the resummed partonic coefficient function ∆ res gg (N, M 2 , µ 2 F ) [eq. (2.47)] for the gluongluon induced colourless final states in QCD with n q = 5. For the purpose of the discussions in subsection 2.4, we would like to splitC 0,gg into two pieces C 0,gg =C 0,gg,γ E +C 00,gg,ζ , (A.1) where the first pieceC 0,gg,γ E is proportional to the Euler-Mascheroni constant γ E and the second term solely depends on the Riemann zeta function which can be obtained by setting γ E = 0 inC 0,gg . The first N -independent universal part has the following a s series 2 ln S fin gg,δ − 2 ln Γ fin gg,δ +C 0,gg,γ E = k=1 a k s G