A fragmentation-based study of heavy quark production

Processes involving heavy quarks are a crucial component of the LHC physics program, both by themselves and as backgrounds for Higgs physics and new physics searches. In this work, we critically reconsider the validity of the widely-adopted approximation in which heavy quarks are generated at the matrix-element level, with special emphasis on the impact of the collinear logarithms associated with final-state heavy quark and gluon splittings. Our study, based on a perturbative fragmentation-function approach, explicitly shows that neglecting the resummation of collinear logarithms may yield inaccurate predictions, in particular when observables exclusive in the heavy quark degrees of freedom are considered. Our findings motivate the use of schemes which encompass the resummation of final-state collinear logarithms.


Introduction
The production of heavy quarks in association with other particles at hadron colliders represents a crucial testing ground for our understanding of perturbative Quantum Chromodynamics (QCD) in the presence of several energy scales. This class of processes is governed by at least two scales, namely the heavy-quark mass m and the (invariant) mass M of the particle(s) produced along with the heavy quark. In these cases, large collinear logarithms of the ratio M m may jeopardise the convergence of the perturbative expansion of relevant theoretical predictions. Fortunately, the impact of these logarithmic contributions can be controlled by resumming them to all orders in α s , via a scheme in which the heavy quark mass m is neglected at the level of the matrix element. Such a scheme is often referred to as massless or five-flavour scheme (5FS), in case the heavy quark is identified with the bottom quark. As far as heavy quarks in the initial state are concerned, this procedure amounts to introduce a suitable parton distribution for the heavy quark. An analogous JHEP01(2020)196 procedure for heavy quarks in the final state involves the use of fragmentation functions, and is the subject of the present work. A scheme in which the heavy quark is produced at the matrix-element level and is not treated on the same footings as the light quarks is dubbed as massive scheme or four-flavour scheme (4FS). The resummation of powers of log(M/m) in a 5FS is performed by solving the evolution equations (usually referred to as Dokshitzer-Gribov-Lipatov-Altarelli-Parisi, or DGLAP, equations), at the price of discarding power corrections of O(m 2 /M 2 ), and thus of yielding less accurate theoretical predictions for the observables related to the heavy-quark degrees of freedom.
In [1,2], it has been shown that, for processes in which the heavy quarks (more specifically bottom quarks) are dominantly produced via initial-state (spacelike) splittings, the theoretical predictions in 4FS are not spoiled by initial-state collinear logarithms. This is due to two main factors, one of dynamical and the other of kinematical origin. The first is that the effects of the resummation of the initial-state collinear logarithms are relevant mainly at large x and, in general, keeping only the explicit logs appearing at NLO in the 4FS is a good approximation. The second reason is that the scale which appears in the collinear logarithms turns out to be proportional to the hard scale of the process but is suppressed by universal phase space factors that, at hadron colliders, reduce the size of the logarithms for processes taking place. This result makes it not only possible, but also advisable -owing to the better perturbative description of the differential observables involving the heavy quark(s) -to employ the 4FS for the exclusive description of these processes. This has been shown explicitly to be the case in single-top production [3,4], bbH [5][6][7][8][9][10][11][12][13][14][15][16] and bbZ/γ production [16][17][18][19][20][21][22][23], and also for processes predicted by extensions of the Standard Model (SM), such as heavy charged Higgs boson production in a two-Higgs doublet model or in supersymmetry [24][25][26][27][28][29][30][31][32][33][34]. On the other hand, the calculations of the total rates in the 5FS display a faster perturbative convergence and exhibit a smaller scale uncertainty associated with missing higher orders. Methods that combine the 4F and the 5F schemes, retaining the advantages of both, are actually available, but they are generally tailored to a few specific observables. The FONLL scheme, first proposed for the transverse momentum spectrum of bottom quarks produced in hadronic collisions [35], has the advantage of being universally applicable and of allowing one to combine 4FS and 5FS calculations performed at any perturbative order. The formulation of the FONLL scheme has been extended to deep-inelastic scattering (DIS) [36] and adapted to the computation of the total cross section for Higgs and Z production in bottom-quark fusion [37][38][39]. Various recent attempts to consistently include both the resummation of initial-state collinear logarithms and mass effects also for differential and parton-shower matched observables have been recently put forward, see for example the five-flavour-massive scheme proposed in [16,40] or a similar approach based on multi-jet merging [41]. Improvements at the inclusive and at the differential level are on-going. Finally, consistent b-quark PDFs to be used in association with massive initial states have also been defined [43], thus allowing the bottom quark to be endowed with a standard PDF satisfying DGLAP evolution equations, yet treating it as massive in hard matrix elements.
While initial-state collinear logarithms have been studied in details in the abovementioned literature, the situation is much less clear for processes in which final-state JHEP01(2020)196 (timelike) splittings into heavy quarks contribute significantly to the process. It has to be mentioned that, for what concerns the production of flavoured jets, higher-order corrections are generally not finite (or they are logarithmically enhanced in a massive scheme) unless dedicated jet algorithms are employed, see ref. [42]. Processes featuring final-state splittings into heavy quarks include bbW production [18,19,[44][45][46][47][48][49], the top-mediated contribution to bbH production [50], ttbb production [51][52][53][54][55][56] and multi-b final states [57,58] (mostly relevant for di-or triple-Higgs searches [59][60][61]). While the importance of the resummation of collinear logarithms has been partially investigated for Q → Qg splittings [35,62,63], no assessment of the impact of the logarithms of M m exists to date, as far as the g → QQ splittings are concerned.
An interesting process which involves bottom quarks in the final state is the production of ttbb. This process is an important background to Higgs and top associated production, a unique probe of the Yukawa coupling between the Higgs scalar and top quarks, and therefore it is of great relevance for present-day analyses [64,65]. Different tools are available to simulate this process in a 4FS, including NLO QCD corrections and matching with parton showers. However, even when tuned comparisons are performed [66], the predictions obtained by different tools display rather large differences, which have a dominant impact on the systematic uncertainty in the determination of the top Yukawa coupling. 1 The work necessary to improve this situation by increasing the perturbative order of the computation is not straightforward. Given the high multiplicity and the number of scales involved in this kind of processes, the NNLO corrections in the 4FS are very hard to compute. On the other hand, NLO QCD predictions for ttbb plus one light jet have recently become available [67]. Assuming that the distortion due to parton showers is small, these calculations could help to validate the light-jet spectrum. If the resummation of collinear logarithms associated with the final-state splittings of gluons and bottom quarks is found to have a strong impact on this observable, then a matched calculation could solve the observed discrepancies. It is the purpose of the present work to make a first step in this direction.
In this paper, we assess the impact of missing powers of log M m associated to final-state splittings by means of fragmentation functions (FFs). Heavy quark FFs can be computed in perturbation theory in QCD, starting from initial conditions at a reference scale µ 0 and employing the timelike DGLAP evolution equations to evolve them up to any other scale. Initial conditions for the gluon-and heavy-quark-initiated fragmentation into a heavy quark are known at order α s [62,68] and have been computed at order α 2 s [69,70], while the DGLAP evolution equation is implemented in public codes such as QCDnum [71], ffevol [72], APFEL [73] or MELA [74], up to NNLL logarithmic accuracy. The codes have been benchmarked in [74]. An approach based on FFs will enable us to study the dynamics of the bottom fragmentation in details and in an isolated environment. In particular, the importance of the resummation of potentially large logarithmic contributions can be assessed by comparing resummed predictions to the truncation of the FF at a given order in α s , and the impact of the resummation of sub-leading logarithms can be studied up to JHEP01(2020)196 NNLL accuracy. It must be stressed that, while the importance of resumming collinear logarithms in bottom-quark initiated fragmentation has been known for a long time at NLL [35,62,63], much less attention has been devoted to the role of gluon-initiated fragmentation to heavy quarks (one exception is ref. [75] in which the case of charm meson production was studied). Such a negligence was justified the past by the sub-dominant importance of this mechanism at LEP and at Tevatron, but this is no longer the case at the LHC for the g → bb splitting, and will not be the case for the g → tt splitting at future colliders.
The paper is organised as follows. We review the details of timelike DGLAP evolution in section 2, where we also discuss how to truncate the evolution at a given order in α s . In section 3, we discuss the setup employed for this computation, while results are presented in section 4. In the light of our results, in section 5 we comment on how to simulate processes in which b quarks are dominantly created in final-state splittings. In section 6 we link our findings to those obtained in the context of heavy quark multiplicity estimates. We draw our conclusions in section 7, where we also discuss future outlooks of our work. In appendix A, we provide supplementary material, namely the explicit expressions of the truncated FFs up to order α 3 s together with the discussion of their numerical validation.

Timelike DGLAP evolution
In this section we review the formalism of scale evolution for fragmentation functions, with the main purpose of fixing notations and conventions. We also set the ground for the derivation of explicit formulae for heavy-quark fragmentation functions at fixed order up to order α 3 s , which are reported in appendix A.

Strong coupling constant
We adopt the following notation for the evolution of the running coupling constant α s (µ) of strong interactions: , µ 0 is a fixed reference scale, and As usual, T F = 1 2 , C A = 3 and C F = 4 3 for three-colour QCD. The number of active flavours n f will always be set to 5.
We will need the expansion of α s (µ) in powers of α s (µ 0 ) truncated at O(α 3 s ), which is given by The truncated expansion of α s (µ 0 ) in terms of α s (µ) can be trivially obtained by swapping µ and µ 0 in the above equation obtaining

Fragmentation functions
We consider the differential cross section dσ dx for a generic process with a heavy quark Q of mass m in the final state, where x is the energy fraction carried by the heavy quark: where E Q is the energy of the heavy quark, and E the energy of the parton originating from the matrix element. Then, standard factorisation implies where dσ i dz is the partonic cross section for parton i in the final state with energy fraction z, and D i (x, µ, m) is the fragmentation function of parton i into the heavy quark.
The fragmentation functions depend on the factorisation scale µ according to the evolution equations The timelike splitting functions P ij have a power expansion in α s , whose coefficients have been computed up to NNLO, and can be found in [76][77][78]. Note that, in the case of timelike evolution, we have contrary to what happens in the spacelike case. The timelike splitting functions are the same as the spacelike ones at LO, while they differ at NLO and higher. The DGLAP evolution equations are conveniently solved for Mellin-transformed quantities, because Mellin transformation turns the integro-differential DGLAP equations into ordinary differential equations. We define the Mellin transform f (N ) of a generic function f (x) by (2.9) We will use the same symbol for a function and its Mellin transform; this does not lead to confusion, as long as functional arguments are explicitly indicated. We rewrite the timelike DGLAP evolution equations as (2.11)

Initial conditions
We will need suitable initial conditions for the fragmentation functions. The perturbative initial conditions have been computed at order α s in refs. [63,68] and α 2 s in refs. [69,70]: Initial conditions at order α 3 s are currently unknown, and will be neglected in the following. It is interesting to notice that the initial condition for the b quark fragmentation function contains a non-logarithmic term already at NLO, namely contrary to the case of the b quark PDF in the spacelike evolution [82,83]. The initialscale gluon fragmentation function has instead only a logarithmic term that vanishes when µ 0 = m:

JHEP01(2020)196
It is customary to separate singlet from non-singlet evolution in the DGLAP equations. To this purpose, we define the combinations with the valence contributions evolving according to the non-singlet (V ) timelike evolution equations and the triplet contributions evolving according to the non-singlet (+) timelike evolution equations. The evolution of the singlet combination D Σ is coupled with the gluon. The bottom quark fragmentation is given by 20) and the non-vanishing initial conditions are given by To determine D b and D g up to NNLO and their expansions up to O(α 3 s ), we need solutions of the evolution equations for both the singlet and the triplet combinations for D T 24 and D V b respectively, and d dt Note that eq. (2.26) differs from eq. (2.4) of ref. [74] only formally, because in the latter γ qg is the element of the spacelike singlet matrix evolution, that features a factor 2n f . 3

Truncated solution of DGLAP equation in Mellin space
The DGLAP equations are usually solved in order to resum large logarithmic contributions; the fragmentation functions are evolved from a reference scale µ 0 to a generic scale µ through an evolution operator, which in turn is given by an expansion in powers of fixed. Here, we would like to compare such a logarithmic (resummed) expansion with a truncated one, that is, a solution expressed as a power series in α s (µ), up to a certain order.
We rewrite eq. (2.10) in matrix form, and with some of the functional arguments omitted, to keep notation simple: and T is the time-ordering operator. The matrix γ has a Taylor expansion in α s , given in eq. (2.11), which starts at order α s . Therefore, it is easy to truncate the expansion of U(t, 0) to any given order. Given that we are interested in the solution up to NNLO, we keep terms up to order α 3 s in U(t, 0). We find . Next, α s (µ 1 ), α s (µ 2 ) are expanded in powers of α s (µ 0 ) as in eq. (2.3), and the integrals easily performed. Finally, α s (µ 0 ) is re-expressed in terms of α s (µ), according to eq. (2.4).
In the following, we will call LO, NLO and NNLO truncated FFs, respectively the expression obtained by evolving the initial conditions of eq. (2.16) with the evolution operator in eq. (2.30), and retaining terms up to order α s (µ), α 2 s (µ) and α 3 s (µ) respectively. 4 The full expressions are reported in appendix A.

Setup of the computation
The results presented in this paper are obtained by means of a private computer code which links the public MELA (Mellin Evolution LibrAry) library [74]. MELA is an evolution program in Mellin space, developed specifically to provide a simple and user-friendly framework complementary to (and also serving as a validation of) the code APFEL [73], which works in x space.
For the running of α s , we use the routines implemented in MELA with α s (M Z ) = 0.11856 and M Z = 91.187 GeV, that solve the renormalisation group equation for α s (µ) consistently with the DGLAP timelike equations. The charm and bottom thresholds are set to m c = 1.4142 GeV and m b = 4.7 GeV respectively. The top quark mass m t is set to infinity, so that n f = 5 at all scales. The timelike splitting functions at LO, NLO and NNLO in the N space are taken directly from MELA. Note that, due to the complexity of the expressions entering the NNLO splitting functions, MELA implements the approximate representation of ref. [85]. It was checked in [78] that, except for very small values of x, such approximate expressions deviate from the exact ones by less than one part in a thousand. The N -space solution of the timelike DGLAP evolution equation at LL, NLL and NNLL are also taken from MELA. MELA implements the analytical solutions of the DGLAP evolution equations as in PEGASUS [86], both the truncated solution and the iterated solution. In the former, the resummed solution to the DGLAP equations in N space is exact up to terms of higher orders in the perturbative expansion with respect to the order of the DGLAP solution.
In the iterated solution, all orders are kept in the solution of the DGLAP equation. The N m LL solutions differ in terms of order n > m. In our case, we have verified that the effect of the resummation of collinear logarithms does not depend on the settings of the solution of the DGLAP evolution equation.
The initial perturbative conditions have been implemented in our own code in the N space up to order α 2 s by numerically Mellin-transforming the x-space expressions of refs. [69,70] for real N . Di-and tri-logarithms appearing in these expressions are evaluated with Chaplin [87]. Since we are currently lacking an analytically-continued Mellin transform of the O(α 2 s ) terms of the initial conditions 5 the evolved expressions cannot be inverted to x 4 Note that D b starts at order α 0 s (µ). 5 More specifically: we succeeded to obtain analytically-continued Mellin transforms for d q , following the results and algorithms of refs. [88][89][90][91][92], (as implemented in the codes ancont and ancont1), while for d (2) g some terms are still missing. We plan to report on the complete analytic continuation of the initial conditions in a following publication.

JHEP01(2020)196
space if these initial conditions are included. As their impact in Mellin space is found to be mild and rather flat in the N space, as we will show explicitly in section 4, we argue that they will not play an important role in the x space. The numerical inversion of the N -space truncated and resummed fragmentation functions from N space back to x space is performed by means of an implementation of the Mellin inversion based on the Talbot-path algorithm [93]. Matching conditions are implemented in the treatment of flavour threshold crossing in the evolution of the fragmentation functions.

Impact of collinear resummation and results
In this section, we present results for the resummed and truncated FFs at different mass scales, in Mellin (N ) space as well as in the physical space of the energy fraction x carried by the heavy quark.
By comparing the truncated and resummed predictions for the FFs for different values of the scales µ and µ 0 and at different orders, we can: i) assess the typical size of the effects due to the resummation of final-state collinear logarithms, in particular with respect to an approximation in which only logarithms up to a given order in perturbation theory are included; ii) compare the behaviour of the bottom-quark and gluon initiated FF; iii) determine the importance of the inclusion of initial conditions at order α 2 s , computed in [69,70].
In particular, the last point and the importance of the gluon FF have been neglected so far in the literature, see e.g. [63].
We start by presenting results in N space, for the D b and D g fragmentation functions, in figures 1 and 2 respectively. The layout of the figures is the following. Shades of red (blue), from lighter and more finely dashed to darker and solid, are used for truncated (resummed) predictions of increasing perturbative orders, computed without initial conditions at order α 2 s . Symbols are used for the NLO (NLL) predictions which include the full initial conditions up to order α 2 s . Left panels show results for the FFs, while right panels show the corresponding ratios w.r.t. the NNLL prediction. In the top panels, an initial scale µ 0 = m = 4.7 GeV is employed, while in the bottom ones it is set to µ 0 = 2m. Finally, each panel shows results at four different value of the scale µ: µ = 10, 30, 100, 300 GeV, from left to right and from top to bottom.
First, we inspect the behaviour of D b (N ), displayed in figure 1: in the left plots, we can see how the resummed predictions are hard to distinguish one from another, and also how the NNLO curve is close to them. This is not the case for the LO Figure 2. Same as figure 1, for the gluon fragmentation function.
The N range in figure 1 is chosen to be 0 ≤ N ≤ 40 for illustrative purposes. However, this is not really consistent, because at very large values of N (which correspond to values of x close to 1) the initial conditions, computed at a fixed order in α s , get large corrections from higher order contributions due to the presence of large powers of log N in the perturbative coefficients. The effect of such large logarithms is that the Mellin transforms of the fragmentation functions in figure 1 become negative around N ∼ 20, and consequently the ratio plots display a peak in that region. This problem was pointed out in [94], where it is also argued that a resummation of large N logarithms in the initial condition would push the zero of the fragmentation functions toward much larger values of N . Even large-N resummation, however, would not make the fragmentation functions positive in the whole JHEP01(2020)196 range, due to non-perturbative effects, or equivalently due to the presence of the Landau pole in the strong coupling at very small energy scales. Perturbation theory, even in its resummed version, cannot provide a reliable description of fragmentation functions for x larger than approximately 1 − Λ QCD /µ.
In the ratio plots the features described above can be appreciated with more details. In particular, it can be appreciated how the NNLO-accurate prediction starts to depart from the NNLL at large scales and large values of N , while in general, with the exception of the aforementioned spike, resummed computations show a better agreement with each other at large N . Increasing the initial scale µ 0 , reduces the differences between the resummed and truncated predictions, as it is expected and as it was studied in details in [95]. However the global pattern is unchanged. Finally, we observe how the impact of initial condition at order α 2 s is rather mild (at or below the 10% level), regardless of the scale. Turning to D g (N ), in figure 2, we observe that the truncated expressions depart very quickly from the resummed ones, and how the perturbative series wildly oscillates between negative and positive values for N already as large as 8, with large differences from the resummed curves. On the other hand, resummed curves lie rather close to each other (again, with the exception of the spike at N = 20, induced by the initial conditions of the quark FF), with differences that decrease with the scale, because of the running of α s . Differences are reduced when the initial scale is doubled, and the impact of initial conditions is mild and rather constant with N .
In figures 3 and 4 the x-space results are displayed. They generally reflect the pattern of the Mellin-space results, giving a more direct feeling of the physics of the final-state splittings. Looking at D b (x), in figure 3, we appreciate how close the three resummed predictions are, for the four values of µ considered. Differences among the LL, NLL, and NNLL predictions are always within 10% and with flat ratios, with the exception of very small and very large x (x < 0.1 and x > 0.9). The first regime may only partially be accessible, since the physical regime is typically x > m µ . The behaviour in the second (large-x) regime may be improved by resumming large-x logarithms on top of the DGLAP ones [94]. As far as the truncated predictions are concerned, they generally show a harder shape than the resummed ones (more steeply peaked towards x = 1), and the hardness decreases as higher orders are included. This is consistent with the fact that higher order effects (i.e. extra radiations) soften the b quark during the fragmentation, and in the case of resummed predictions these effects are included to all orders. If we take µ = 100 GeV as a representative scale, µ 0 = 4.7 GeV, and consider the range 0.1 < x < 0.9, the NLO-truncated prediction undershoots the NNLL resummed one of -25% at small x, and overshoots it of +50% at large x. At NNLO, differences are much reduced, at the level of -10% and + 15%.
The gluon-initiated FF, D g (x), on the other hand, exhibits much larger differences between the truncated and resummed predictions. The most visible feature is that the LO FFs is symmetric around x = 0.5, while all the others are not. This is directly related to the symmetry of the P qg splitting function, which is the only term at LO, as it can be seen from the first line of eq. (A.4). 6 As a consequence, the shape of the LO-truncated 6 The initial condition d (1) g is also proportional to Pqg.  Figure 3.   prediction does not change with the scale. Again, higher-order predictions soften the shape of the splitting function, with rather dramatic effects both going from LO to NLO and from NLO to NNLO. In section 5 we will show that these effects are dominantly due to the radiation from the parent gluon. As we did for D b (x), considering the case µ = 100 GeV, µ 0 = 4.7 GeV, it is apparent how the (N)NLO prediction exceeds the NNLL baseline by 80% (15%) at large x, and undershoots it by -20% (-5%) at small x. Comparing resummed predictions among themselves shows, again, that the effect of sub-leading logarithms is rather mild and, as anticipated by studying the behaviour in Mellin space, it is reduced when the scale µ is increased. Finally, some pathologic behaviour is visible both at small  and large x. The latter can likely be cured by resumming large-x logarithms in the quark initial conditions [94], while for the former small-x resummation and coherence effects need to be considered, two ingredient which are crucial in order to obtain correct predictions for heavy-quark multiplicities [96], as we will discuss in section 6.

JHEP01(2020)196
We conclude this section by discussing the dependence of truncated and resummed predictions on the initial scale µ 0 . The effect of changing the initial scale in the case of perturbatively generated bottom PDFs has been studied in details in [95]. It is interesting to compare the effects in the case of perturbative bottom fragmentation functions. This is shown in figure 5, in x space only, both for D b (left panels) and D g (right panels). In this figure we plot, for each of the truncated and resummed predictions, the ratio D p (µ 0 = 2m)/D p (µ 0 = m), for the same values of µ as before. First, we observe that at LO, no µ 0 dependence is there, neither for D b nor for D g . This can be easily understood by looking at the initial conditions in eqs. (2.17) and (2.18), and at the truncated expressions in eqs. (A.5) and (A.4): the coefficient of the anomalous dimension, in both cases, will be log For the other predictions, both truncated and resummed, we observe how the µ 0 dependence is rather mild (less than 10% for D b and 20% for D g for µ ≥ 100 GeV) for intermediate values of x and it decreases with the scale, because of the DGLAP evolution. Truncated predictions exhibit a more unstable behaviour at large x, with a divergent structure in the pathologic region where D p (µ 0 = m) vanishes, and displaying larger uncertainties for higher perturbative orders. The same behaviour, albeit with reduced µ 0 dependence, is exhibited by resummed predictions. Overall, the µ 0 dependence cannot be advocated to explain the large differences between fixed-order and resummed predictions discussed earlier in this section.

JHEP01(2020)196
∼ P gg ∼ P qq Figure 6. The g → bb splitting, dressed with extra gluon radiation. In the collinear limit, radiation off the parent gluon (red) corresponds to factors of P gg , while radiation off the quarks (blue) to factors P qq .

On the simulation of processes with b quarks originated by timelike splittings
The results presented in section 4, in particular those regarding the gluon-initiated FF, can provide instructive information on the dynamics of final-state g → bb splittings. As mentioned in the introduction, such a mechanism is relevant for processes such as bbW , y t -induced bbH, ttbb and multi-b production. We can schematically represent the g → bb splitting, including extra gluon emissions, as in figure 6. In that figure, the radiation off the parent gluon is shown in red, while the radiation off the originating bottom quark is shown in blue. Given the large effects observed in section 4, a natural question to ask is whether the former or the latter type of radiation play a dominant role. At least two arguments can be used to show that the largest effects originate from the radiation off the gluon. The first argument is related to color factors: in the collinear limit, each splitting from the parent gluon corresponds to a factor P gg , proportional to C A . Conversely, radiation off the quark corresponds to P qq , proportional to C F ; since C A 2C F , one expects the former effect to dominate over the latter. The second argument is that, as it is visible in figure 3, higher order effects distort the LO gluon-initiated FF towards small x, and P gg is the only splitting function which is singular in that regime. We support these arguments by explicitly showing, in figure 7, the NLO and LL predictions for D g (x) when setting P gg = 0, and comparing them to the full predictions (note that at NLO -second, third and fourth line of eq. (A.4) -the logarithmically-dominant term has either a single emission from the parent gluon, or one from the bottom quark). We choose µ = 100 GeV, µ 0 = 4.7 GeV as a representative example. We can clearly infer the importance of the emissions from the parent gluon, particularly in the case of the NLO prediction. In that case, the single emission from the quark only mildly affects the symmetry of the FF. Also in the LL-resummed case, the prediction with P gg = 0 is much closer to the LO than to the complete LL prediction.
These findings bear quite important consequences for the simulation of exclusive observables or in general observables sensitive to the b-quark degrees of freedom, in particular when predictions matched to parton shower are considered. Since a parton shower radiates from external partons, if the bottom quarks appear in the hard-scattering process, only JHEP01(2020)196 the radiation off the bottom quarks (blue in figure 6) will be generated, while the radiation off the parent gluon will be included only at a given order in perturbation theory, typically NLO (with the exception of the results in [67] which may be considered partly NNLO.) This is clearly not optimal. In general, resummed predictions exhibit a better perturbative convergence with respect to finite-order calculations al ready at leading log. Hence, in regimes dominated by the splitting mechanisms, it may be more appropriate to generate b quarks by shower-evolving light partons, thus generating both kinds of radiation shown in figure 6, rather than to include them at the matrix-element level. 7 Of course, some caveats must be considered. The above statement holds in case of exclusive observables, for which larger effects are expected. More inclusive observables (typically those related to the b-jet degrees of freedom) will display smaller effects. In section 6, for example, we will show that this indeed the case for heavy quark multiplicities: the effect of the resummation of final state collinear logarithms is much milder. Furthermore, an important assumption we are making is that fixed-order computations with a timelike g → bb splitting follow the pattern of the FF at the corresponding order. This is certainly reasonable to assume, at least in those kinematic regions in which the g → bb splitting topology dominates. However other effects must be taken into account: for example, mass effects typically affect the endpoint (x → 1 and x → 0) behaviours of the FF, although not in a dramatic way. A further aspect is that the collinear approximation underlying a FF-based approach neglects the fact that the radiation recoil is spread among the other particles in the final state (see the extensive discussion in the case of ttbb in [67]). An assessment of the impact of these effects on physical observables requires a convolution of fragmentation functions with a suitable partonic cross section. This task is left for future work.

JHEP01(2020)196 6 Relation with heavy quark multiplicities in gluon jets
Theoretical predictions for jet multiplicities, and in particular of heavy quark multiplicities, has witnessed an important effort of the theory community during the 80's (see e.g. [96,97] and references therein). It is instructive to illustrate if, and in case how, FFs can be employed to predict such multiplicities. We start by considering the main result of ref. [96], namely the probability of a gluon with virtuality Q 2 to split into a pair of quark-antiquark with mass m: where the gluon multiplicity n g is given by 8 The exponent a has the value a = − 1 , but it can be set to zero as long as one's interest is restricted to the leading-logarithmic behaviour, which is the case we are considering in this section. The authors of [96] mention that, in order to get the correct multiplicity, coherence effects have to be accounted for in a systematic way. We will discuss the corresponding effects in the case of FFs. We proceed by expanding eq. (6.1) in powers of α s (Q 2 ) neglecting all m effects in the integrand, and by comparing such an expansion with the leading-logarithmic terms in our truncation for D g , eq. (A.4). The expansion of eq. (6.1) to order α 2 s (Q) reads: This expression should be compared with the first moment of the gluon fragmentation function, as given in eq. (A.4), expanded to second order in α s (Q). Keeping only the leading-log terms in eq. (A.4) we get where t = log Q 2 µ 2 0 , and we have used eqs. (2.12)-(2.15) for N = 1: We see that eqs. (6.3) and (6.4) actually coincide, with the choice µ 0 = 2m, apart from the term proportional to C A , which is singular as N → 1. This singularity arises from the 8 With respect to the original eq. 1.2 in [96], we have replaced log Q 2 Λ 2 by 1 b 0 αs(Q) (and similarly for log K 2 Λ 2 ).

JHEP01(2020)196
small-x behaviour of the fragmentation function, which diverges as 1 x . One may regularise this singularity by restricting the integration range to x min ≤ x ≤ 1, with x min of order m 2 Q 2 . This already provides the extra power of log m 2 Q 2 which appears in the C A term in eq. (6.3), but fails to reproduce the coefficient of the C A term in eq. (6.3). A refinement of this procedure is achieved by including a kinematical constraints in the form of a x dependence of the scale argument of the fragmentation function [98,99]. This is motivated by the observation that the virtuality of a particle scales approximately as x when it decays into two bodies with energy fractions x and 1 − x, in the x → 0 limit. The corresponding integral reads which is a factor 2 larger than the corresponding term in eq. (6.3). The origin of this residual discrepancy is due to dynamic (rather than kinematic) effects related to color coherence and angular ordering, as it was shown explicitly in [100]. In general, at order α p s (Q), an extra factor 2 p−1 will appear in the most singular term (∼ C A p−1 log 2p−1 Q 2 4m 2 ) of the FF-based prediction for the heavy quark multiplicity. 9 Color coherence effects are not included in our framework, and require matching with small-x resummation in order to be fully accounted for.
We conclude this section by showing, in figure 8, the comparison of the truncation of eq. (6.1) with the full result. The effect of the truncation can be appreciated both by considering the absolute multiplicity (left panel), and the ratio of the truncation up to order α s (Q), α 2 s (Q) and α 3 s (Q) over the complete result. We do not set the exponent a to zero in this case. Owing to the inclusiveness of this observable, effects are much milder than those observed in section 4: at Q = 100 GeV, the LO prediction is about 70% of the total, while the NLO one approximates the total by less than 10%. 10 This indicates how the importance of the effects described in this paper changes when inclusive observables are considered, and motivates further works assessing the impact of resummation of collinear logarithm on realistic cross sections.

Conclusions and outlook
One of the major obstacles to precision physics at the LHC and at future hadron colliders is currently given by our limited understanding of the associate production of heavy quarks (typically bottom quarks) and heavier objects. Because of their multi-scale nature, the description of such processes in perturbative QCD is highly non-trivial. In particular, processes where heavy quarks are dominantly produced via final-state splittings are JHEP01(2020)196 affected by the largest theoretical uncertainties, both due to missing higher orders and to parton-shower and matching systematics. In this paper, we make a step forward in the comprehension of the dynamics underlying these processes. We adopt a FF-based approach, which allows us to assess the impact of logarithms appearing at each order in perturbation theory, and to establish the importance of their resummation. By considering truncated FFs up to NNLO in QCD, we mimic the description of a fixed-order computation for processes involving the corresponding splittings at the same order. We investigated both bottom-initiated and gluon-initiated production of a bottom quark. In both cases, and particularly for the latter, a fixed-order description at LO or NLO turns out not to be adequate, and either NNLO effects must be included or collinear logarithms must be resummed to all orders in order to get reliable predictions for the bottom quark kinematics. When more inclusive observables are considered, these effects are (much) reduced, as it has been shown for the case of heavy-quark multiplicities.
While the limits of a fixed-order description have been known for some time in the case of bottom-initiated splitting, which is relevant e.g. for heavy-flavour production at large transverse momenta, to the best of our knowledge this has never been investigated for the gluon-initiated heavy quark production. We have discussed in details the implications of our findings in the choice of scheme to describe this kind of processes. Our analysis motivates the effort to develop techniques aimed at combining calculations matched to parton shower, which retain the advantages of the 4F and the 5F schemes in the appropriate kinematics region, as it is currently being developed for example in [41]. Furthermore, our study outlines both a similarity and a difference between the timelike and the spacelike regimes. In particular, the evolution of a gluon from a high to a low scale (timelike) or from a low to a high one (spacelike), is associated mostly with radiating gluons, and only eventually a gluon splits into a heavy-quark pair. This is true in both cases. The main difference is JHEP01(2020)196 that, in the case of spacelike splitting, the heavy quark line enters the scattering process and therefore gluon emissions are resummed by the evolution of parton distributions even in a 4FS. On the contrary, in the case of a timelike splitting, the gluon is the particle linked with the rest of the scattering, hence no resummation of these emissions is performed.
To conclude, for processes dominated by final-state g → bb splittings, the dominant contribution comes from the radiation off the parent gluon, rather than off the bottom (anti-) quark. A recent study on charm production corroborates these findings: in fact, in ref. [75] the authors comment on the importance of a proper treatment of final-state splittings and on issues related to NLO+PS simulations in a massive scheme (see figure 12 therein and the relative discussion). As a result, simulations where bottom quarks are treated at the matrix-element level might not be the most adequate to accurately describe these processes, at least in phase-space regions in which these splittings are dominant. Despite a massive scheme (or more generally a scheme where bottom quarks are generated at the hard matrix-element level) is often advocated as superior with respect to a massless one, thanks to the possibility it provides to describe the whole phase space, without cuts, we have shown that assuming the smallness of collinear logarithms, analogously to what happens with their initial-state counterpart, is not always correct and may yield serious flaws in the description of the kinematics of the b quark.
This work has several natural follow-ups. First we will improve the description of FFs by including the second-order initial-conditions in the x space, and assess the impact of large-and small-x resummation. While these improvements will make our results more consistent, we do not expect them to change the final picture in any dramatic way. Most importantly, we will assess the importance of final-state collinear logarithms on a realistic process and compare a FF-based description (both using resummed and truncated results) within a NLO-accurate computation (the inclusion of FFs in NLO subtraction schemes can be found in refs. [101,102]) with a description at fixed order in QCD and possibly with data. While typical analyses for processes like W bb and ttbb study b-jet observables (for the latter, see the recent analyses [103,104]), it is not unreasonable to expect that more exclusive quantities related to the bottom-flavoured hadrons will be measured, specially with the larger statistics of the upcoming LHC runs.

A Truncated expressions and validation
In this appendix we give explicit expressions for the truncation of the FFs up to oder α 3 s and discuss the method we used to validate numerically our analytic expressions.

A.1 Non-singlet solution
Expanding both eq. (2.30) and the initial conditions up to O(α 3 s ), and omitting the N dependence everywhere and the m dependence from the coefficients of the initial conditions, we obtain the following truncated solutions: and
11 Note that the products in eq. (2.30) are products of 2 × 2 matrices.

A.3 Bottom quark
Summing up the singlet and non-singlet combinations according to eq. (2.20), we get where we have definedγ

A.4 Validation
To conclude this appendix, we discuss how the above expressions were validated in our computer code. We base our validation on two arguments: first, given an initial condition, MELA can provide the evolution up to NNLL accuracy; second, the difference ∆D p,q ≡ |D p,N q LO − D p,N q LL | α q+1 s , (A.7) where p = b, g, . . . and q = 0, 1, 2, should be of O(α s ). Thus, by changing the value of α s (m Z ), ∆D p,q must display the same scaling. We show this scaling in figure 9, for D b (left) as well as D g (right), in N space, for q = 0, 1, 2 respectively in the top, central and bottom row. We fix the scales to µ = 200 GeV, µ 0 = 20 GeV, and the bottom mass to m = 4.7 GeV (in particular, by choosing µ 0 = m, all initial conditions are non-zero).  Figure 9. The scaling of the difference ∆D p,q , defined in eq. (A.7), w.r.t. α s , for p = b, g (left, right) and q = 0, 1, 2 (top, medium and bottom row).

JHEP01(2020)196
Open Access. This article is distributed under the terms of the Creative Commons Attribution License (CC-BY 4.0), which permits any use, distribution and reproduction in any medium, provided the original author(s) and source are credited.