Effective field theory approach to heavy quark fragmentation

Using an approach based on Soft Collinear Effective Theory (SCET) and Heavy Quark Effective Theory (HQET) we determine the $b$-quark fragmentation function from electron-positron annihilation data at the $Z$-boson peak at next-to-next-to leading order, with next-to-next-to leading log resummation of DGLAP logarithms, and next-to-next-to-next-to leading log resummation of endpoint logarithms. This analysis improves, by one order, the previous extraction of the $b$-quark fragmentation function. We find that while the addition of the next order in the calculation does not much shift the extracted form of the fragmentation function, it does reduce theoretical errors indicating that the expansion is converging. Using an approach based on effective field theory allows us to systematically control theoretical errors. While the fits of theory to data are generally good, the fits seem to be hinting that higher order correction from HQET may be needed to explain the $b$-quark fragmentation function at smaller values of momentum fraction.


Introduction
The production of heavy flavored particles in collider experiments has been of great interest since the discovery of the charm quark. Because the heavy quark mass m Q is much larger than the hadronization scale Λ QCD , aspects of heavy quark production can be calculated within perturbation theory, which provides a clean test of QCD. A process of particular importance is the single inclusive production of heavy flavored mesons such as B or D mesons. At energies large compared to the meson mass, the single inclusive cross section is factored into the convolution of a short distance cross section and a fragmentation function [1,2]. The fragmentation function describes the probability of a parton produced in the hard scattering to hadronize into a heavy meson with a fraction of the parton momentum, and an inclusive sum of other particles. The important observation of Ref. [3] was that the presence of the heavy quark mass allows the heavy meson fragmentation functions to be expressed in terms of partonic fragmentation functions, which describe the evolution of partons into heavy quarks and have a perturbative expansion in α s (m Q ). The partonic fragmentation functions are then convoluted with a universal factor for the hadronization of a heavy quark into a heavy meson of the same flavor. This feature greatly reduces the number of independent nonperturbative functions to be extracted from data [3,4]. Once these heavy quark fragmentation functions (HQFFs) are determined from e + e − data, through reliable QCD factorization formulae we can predict the heavy meson production cross section at hadron colliders without further nonperturbative input.
As HQFFs are an essential ingredient in calculations of inclusive heavy meson production at collider experiments, they must be determined with care. The importance of a precise extraction of the HQFF was made clear by the resolution of a fifteen year discrepancy between theory predictions [5][6][7] and data on the transverse momentum spectrum of bottom quarks in hadronic collisions [8][9][10][11][12][13][14][15][16][17][18][19], with data exceeding theory by a factor of 2-3. In Refs. [20,21] a prediction based on a combined next-to-leading order (NLO) calculation of b-quark production with next-to-leading log (NLL) resummation of p T /m b (with p T the b-quark transverse momentum) [22], and a similarly precise (NLO+NLL) extraction of the heavy quark fragmentation function from e + e − annihilation experiments [3,23,24] was shown to agree quite well with the data. Furthermore, it was noted that the accuracy of the fragmentation function was important for obtaining agreement with data, and that the transverse momentum spectrum of bottom quarks in hadronic collisions was particularly sensitive to the N = 5 moment of the fragmentation function, which previously had not been determined precisely.
Subsequent updates from DØ and CDF experiments [25] and new data from ATLAS [26], CMS [27] and LHCb [28] experiments on B-meson production continue to find good agreement between theory and experiment, as summarized in Ref. [29]. However, over a wide range of the heavy meson transverse momentum (0 ≤ p T 40 GeV), error bars from experiments are still overwhelmed by theory errors. This puts the onus on the theory community to provide a higher precision result, i.e., N 2 LO+N 2 LL improved calculation of B-meson production in hadronic collisions. While this is a daunting endeavor it is not beyond the realm of possibility.
Actually, for other processes such as top pair production, important progress has been made on pieces of the calculation [30][31][32][33].
In this paper we focus on the extraction of the b-quark fragmentation function at N 2 LO. A precise extraction of the HQFF from e + e − annihilation data is complicated by the presence of a number of disparate energy scales. Away from the endpoint x → 1 there are two relevant scales: the large center-of-mass energy of the collision Q, and the heavy quark mass, m Q . To achieve N 2 LO accuracy in this region we need the partonic fragmentation functions at O(α 2 s ), which have been computed in Refs. [34,35]. Large single logarithms of Q 2 /m 2 Q are resummed via DGLAP evolution [36][37][38]. The calculation of the time-like splitting functions at three loops, accomplished in Refs. [39][40][41], makes possible the resummation of these large logarithms at N 2 LL.
The HQFF, however, is dominated by contributions from the endpoint region x ∼ 1, where the heavy quark carries most of the energy of the parton emerging from the hard scattering. For x ∼ 1, three additional scales become relevant, the jet scale Q √ 1 − x, which describes the invariant mass of the particles against which the heavy quark recoils, the soft scale Q(1 − x), and the hadronic scale Λ QCD , which describe the hadronization of the heavy quark into a meson. In order to provide accurate theoretical predictions in this region it is necessary to resum double logarithms of (1 − x) that appear both in the fragmentation function and in the partonic cross section [4,42]. Furthermore, since the HQFF contains both perturbative and nonperturbative effects, a systematic approach is needed not only to separate them but also to maintain universality.
In this work we use an effective field theory (EFT) approach to study the HQFF, and to derive a factorization formula, valid in the full x range, that allows us to extract the HQFF from data on e + e − → B + X at the Z pole. In Section 2 we begin with a review of the EFTs we use in our analysis; namely Soft Collinear Effective Theory (SCET) [43][44][45][46][47][48][49], and boosted Heavy Quark Effective Theory (bHQET) [50,51]. In Section 3 we consider the single inclusive cross section away from the endpoint region x ∼ 1. In this regime we rederive the established perturbative QCD result that the differential cross section can be expressed as a convolution of a hard coefficient and the heavy meson fragmentation functions for various partonic species. The bHQET expansion at leading power in Λ QCD /m Q then lets us express the HQFF as a product of perturbative functions originating from physics at the heavy quark mass scale and an overall nonperturbative coefficient.
In Section 4 we consider the factorization theorem in the endpoint region. We use a three step procedure. In the first step, which is discussed in Section 4.1, we match QCD onto SCET I integrating out virtualities of order Q 2 . Near the endpoint, the heavy quark recoils against a collimated spray of particles of invariant mass squared Q 2 (1 − x), which is still dynamical in SCET I . In the second step, in Section 4.2, we match SCET I onto SCET M [48,49]. This integrates out the jet at virtuality Q 2 (1 − x), and introduces dependences on the heavy quark mass m Q . After discussing the crossing of the b threshold in Section 4.3, in Section 4.4 we integrate out the heavy quark mass m Q , by matching SCET M onto bHQET. We thus arrive at the final factorization formula in the endpoint, and express the cross section as the product of a hard coefficient H Q , a jet function J, a mass coefficient C m , and a shape function S H/Q . Each of these objects depends on a single scale, namely the hard scale Q, the jet scale Q √ 1 − x, the mass scale m Q and the soft scale Q(1 − x) which in the rest frame of the B-meson becomes m Q (1 − x) ∼ Λ QCD . In Section 4.5 we describe the renormalization group equations (RGEs) that govern the scale dependence, and resum large logarithms of 1 − x and m Q /Q by solving the RGEs. The results in Section 4 complete the analysis of Ref. [52], which first derives the factorization of the HQFF using SCET.
In Section 5 we discuss how to separate the perturbative and nonperturbative components of the shape function S H/Q , which describes the hadronization of the heavy quark into an heavy meson. In Section 6, to give a full description of the differential cross section, we combine the two factorization theorems for moderate and large x. Crucial to this is the notion of profile functions first introduced in Ref. [53]. Finally, in Section 7 we perform a fit to data from the LEP experiments ALEPH [54], OPAL [55], and DELPHI [56], and from the SLAC experiment SLD [57]. We discuss in detail the impact of theoretical uncertainties on the fits. We conclude in Section 8. In Appendix A we collect the fixed order expressions of the various functions that enter the factorized resummed cross section, the anomalous dimensions, and give the solution of the RGEs.

Effective Field Theories
One of our main goals is to derive factorization theorems for the inclusive production cross section of a heavy hadron H using a series of EFTs with degrees of freedom of progressively smaller off-shellness. In this section we briefly summarize the most important ingredients of each EFT, establish our notation, and refer to the original literature for more details.
In high energy collisions a hard scattering process is sensitive to several, well separated, physical scales. The short distance dynamics is governed by a hard scale Q, for example the center-of-mass energy in e + e − annihilation. After the creation of high energy partons, their evolution into hadrons or jets of hadrons happens on much longer distances, and is sensitive to collinear and soft scales. SCET takes advantage of this scale separation. Degrees of freedom with virtuality of order Q 2 are integrated out, leaving as dynamical degrees of freedom collinear quarks and gluons, with virtuality p 2 ∼ Q 2 λ 2 , and ultrasoft (usoft) quarks and gluons, with even smaller virtuality p 2 ∼ Q 2 λ 4 . The SCET expansion is governed by power counting in the parameter λ ∼ Q LO /Q 1, with Q LO the next relevant scale in the problem, e.g. a jet invariant mass. In SCET different collinear sectors can only interact by exchanging usoft degrees of freedom. An important property of SCET is that usoft-collinear interactions can be moved from the SCET Lagrangian to matrix elements of external operators through a field redefinition [46], which greatly simplifies derivations of factorized forms for observables. We now summarize some SCET ingredients needed in the rest of the paper. For more details, we refer to the original papers [43][44][45][46][47][48][49]. We introduce two lightcone vectors n µ and n µ , satisfying n 2 =n 2 = 0, andn · n = 2. The momentum of a particle can be decomposed in lightcone coordinates according to Particles collinear to the jet axis have (p + , p − , p ⊥ ) ∼ Q(λ 2 , 1, λ), while usoft quarks and gluons have all components of the momentum roughly of the same size (p + , p − , p ⊥ ) ∼ Q(λ 2 , λ 2 , λ 2 ). The SCET Lagrangian can be written as where L us is the usoft Lagrangian which has the same form as the QCD Lagrangian. Each collinear sector is described by a copy of the collinear Lagrangian L n , which for massless quarks is L n =ξ n in · D n + gn · A us + ( / where ξ n and A n are collinear quark and gluon fields, labeled by the lightcone direction n and by the large components of their momentump = (p − , p ⊥ ). We leave the momentum label mostly implicit, unless needed. The label momentum operator P µ acting on collinear fields returns the value of the label, for example The collinear covariant derivative D n is defined as The Wilson line W n in Eq. (2.3) is constructed from collinear gluon fields, and obeys the equation of motion [n · D n , W n (x)] = 0. Finally, A µ us in Eq. (2.3) is a usoft gluon field, and at leading order in λ couples to collinear quarks only through the n · A us term. This coupling between usoft and collinear fields can be eliminated from the Lagrangian via the BPS field redefinition [46]: where Y n is a usoft Wilson line in the n direction with P (P ) denoting path (anti-path) ordering. Note that we chose the integration path of the Wilson line to extend to positive infinity as this is the physical direction for hadronic production in e + e − annihilation. As is discussed throughly in Ref. [67,68] the choice of boundary condition for the soft Wilson lines in the BPS field redefinition is arbitrary, however non-physical choice induce boundary Wilson lines, which "force" the physical direction as we have chosen. If one were to insist on non-physical directions of the Wilson lines in the factorization theorem (not just in the BPS field redefinition), then he would have to include nonzero Glauber contributions in the matching calculation [69]. The effect of the field redefinition is to eliminate the usoft gluon field in Eq. (2.3), and to replace the collinear quark and gluon fields ξ n and A n with their noninteracting counterparts. The same field redefinition also decouples usoft gluons from collinear gluons [46]. From here on we always use decoupled collinear fields, and drop the superscript (0). Using the Wilson line W n it is possible to construct gauge invariant combinations of collinear fields. For example, the gauge invariant quark and gluon fields for particles moving in the n orn direction are defined as χ n,ω = δ ω,n·P W † n ξ n , χn ,ω = δω ,n·P W † n ξn, These fields serve as building blocks of gauge invariant operators, like jet or fragmentation functions, as we discuss later.

SCET M
For fast moving massive particles there are additional mass terms in SCET, which appear in the Lagrangian as [48] L m = m Qξn ( / Usually the theory with the mass terms is referred to as SCET M , a useful shorthand which we will adopt as well. We work with one massive quark with mass m Q , treat the remaining n f − 1 flavors as massless, and assume that quarks heavier than m Q have been integrated out. We use q to denote both heavy and light quarks when it is not necessary to specify the quark mass, and use Q (Q) exclusively for heavy quarks (antiquarks), while l (l) denotes the n l = n f − 1 light quark flavors. Depending on the power counting, mass terms can be either leading or subleading in λ.
The combination of collinear and usoft degrees of freedom we have discussed so far is usually referred to as SCET I . There are, however, additional degrees of freedom that can be included in the theory: those with soft momenta scaling as k µ ∼ λQ. Any interaction of soft and collinear particles would result in an object with momentum scaling as p µ ∼ Q(1, λ, λ). Such excitations are not part of the EFT since they have invariant mass p 2 ∼ Q 2 λ, which is much greater than the invariant mass of soft or collinear particles. Thus no soft-collinear interaction can appear in the Lagrangian, nor can usoft-soft interactions appear. Soft degrees of freedom can, however, appear in operators as polynomials of matter or gauge fields. Any soft field must be accompanied by a soft Wilson line in such a way as to make the combination gauge invariant under soft gauge transformations. SCET formulated with collinear and soft (but no usoft) degrees of freedom is usually referred to as SCET II . An interesting subtlety in SCET II is that a rapidity regulator must be introduced to maintain the separation between soft and collinear modes as both sit on the same invariant mass curve [70,71].

Boosted Heavy Quark Effective Theory
HQET describes a heavy parton bound in a hadron. The heavy parton momentum can be decomposed as where in the heavy hadron rest frame v = (1, 0) = n µ 2 + n µ 2 is the velocity of the heavy hadron, m Q is the heavy quark mass, and the residual momentum scales as k µ ∼ Λ QCD . HQET is an expansion in Λ QCD /m Q 1, and the leading HQET Lagrangian is where h v is a two component heavy quark field which satisfies v /h v = h v . See Ref. [72] for a detailed treatment of HQET. HQET is formulated in a frame independent manner, and while it has mainly been applied to the decay of heavy mesons in their rest frame, it is equally suited to describe a heavy hadron carrying high momentum. When the heavy quark in the hadron rest frame is boosted along the n-direction, the velocity becomes v µ = n · v n µ /2 + n · v n µ /2, where n · v = (n · v) −1 = n ·p/m Q , withp the large label momentum of the heavy quark. Thus the momentum of a heavy quark in a hadron moving in the n-direction is where n ·p ∼ Q. The residual momentum of the boosted quark scales like k µ = (m Q Λ QCD /Q, QΛ QCD /m Q , Λ QCD ) but still describes soft fluctuations of the quark in the hadron with virtuality Λ 2 QCD . The application of HQET to heavy quarks in a highly boosted frame has been referred to as boosted-HQET (bHQET) in the literature [50,51]. Though bHQET is no different from HQET it is a convenient shorthand to refer specifically to the case of highly boosted heavy quarks.
A new feature appears when considering highly boosted heavy quarks: the emergence of a Wilson line. This can be immediately seen when matching the SCET M collinear heavy quark-Wilson line combination, χ n , onto bHQET degrees of freedom: (2.14) The SCET M field ξ n matches onto the heavy quark spinor h v,n (where we add a second index n explained below), and the subset of gluon fields in W † n that are soft and collinear match onto a bHQET Wilson lineW with the gluon momenta scaling as k given above. The heavy quark field is indexed by the boosted velocity given in Eq. (2.13), which has a large component in the n µ direction; hence the second index n on the field.
3 Factorization away from the endpoint x → 1 We consider the single inclusive cross section for the production of a heavy hadron H in e + e − annihilation: e + e − → H(p H ) + X. H contains a heavy quark and has momentum p H which is measured, while all other final state particles are treated inclusively (as their properties are not measured) and are denoted by X. We consider the differential cross section with respect to the variable where q = p e + + p e − is the total momentum of the colliding electron-positron pair. In the center-of-mass frame q µ = (Q, 0), where Q = q 2 , and x equals the fraction of the beam energy carried away by the hadron H, so that x ≤ 1. The lower limit on x depends on the hadron mass We are only considering fragmentation into b-flavored hadrons at the Z pole, so Q = m Z , and the lower limit is x 0.05. A factorization theorem for single inclusive hadron production in e + e − annihilation was proven using QCD factorization methods in Refs. [1,2] (see Ref. [73] for a recent discussion).
Here we rederive the same factorization theorem using SCET. We start with the expression for the differential cross section in terms of currents of quarks and leptons: where cos θ is the cosine of the angle between the momenta of the identified hadron and the electron. The first sum is over final state quarks, at LEP energies q = u, d, c , s , b, and Here m Z and Γ Z are the Z boson mass and width, α em is the fine structure constant, e q is the quark charge in units of e, and the axial and vector couplings of a fermion to the Z boson are where T f 3 is the third component of weak isospin, and θ W is the weak mixing angle. The total tree level cross section for the production of a qq pair is given by as the parity-odd axial-vector interference term vanishes when integrated over cos θ. The leptonic tensor L i µν is where p e − (p e + ) is the electron (positron) momentum. The hadronic tensor is where the vector and axial currents are given by J µ v (y) =ψ q (y)γ µ ψ q (y) , J µ a (y) =ψ q (y)γ µ γ 5 ψ q (y) . For convenience we will adopt the short-hand notation J µ i =ψ(y)Γ µ i ψ(y), leaving the flavor label q and the Dirac structure implicit. The sum over the polarizations of the final state hadron H (if H has spin) is also left implicit.
If we restrict ourselves to the region of phase space away from the endpoint x → 1 then the final state has invariant mass of order Q 2 , and the hadronic tensor in Eq. (3.8) can be matched onto operators in SCET I [47] that only involve collinear fields in the direction of the observed hadron. The hadronic tensor with the insertion of two vector or axial currents of flavor q can be expressed in terms of a transverse and longitudinal component with respect to the hadron momentum, for i = a, v. We introduced the light-cone vectors n µ andn µ , aligned with and opposite to the hadron momentum n µ = (1, sin θ cos ϕ, sin θ sin ϕ, cos θ) ,n µ = (1, − sin θ cos ϕ, − sin θ sin ϕ, − cos θ) , (3.11) and g µν ⊥ is the metric on the transverse plane, g µν ⊥ = g µν − (n µnν + n νnµ )/2. The av component of the hadronic tensor, with one axial and one vector current, violates parity and can be expressed as where ε µν ⊥ = ε µναβn α n β /2. At leading order in the SCET power counting, and taking into account current conservation and the CP properties of the vector and axial currents, only a limited number of operators can contribute to the transverse, longitudinal and asymmetric functions. For a given quark flavor q, we can express W T,L,A as where k ∈ {T, L, A}, N c is the number of colors, ω ± = ω 1 ± ω 2 , the sum is extended over all active quark flavors, the trace is over spin and color, and the dots represent terms that are suppressed by O(m 2 H /Q 2 ) or higher. In Eq. (3.13), we pulled out factors of ω + in such a way that H f,f and H g are dimensionless.
The vacuum matrix element of the operators in this equation can be related to the standard unpolarized quark, antiquark and gluon fragmentation functions, that give the probability of finding in the parton a heavy meson state H moving in the n direction with large light-cone momentumn · p H : .
These definitions agree with those in Refs. [58,59,74]. Notice that the heavy quark, heavy antiquark, light quark and gluon fragmentation functions have the same scaling in the SCET power counting. In Eq. (3.13) the fragmentation functions are weighted by the coefficient functions H k,i , which depend only on the hard scale, and have a perturbative expansion in α s . Different terms in the hadronic tensor thus appear at different perturbative orders: for a given flavor q, H T,q and H T,q start at leading order, H L,q , H L,q , and H k,g at O(α s ), and H k,f =q and H k,f =q at O(α 2 s ) [75][76][77][78][79]. H A, g vanishes at all orders, because of the charge conjugation invariance of QCD [78].
Using the definitions of the quark and gluon fragmentation functions given above in the leading term of the SCET I hadronic tensor in Eq. (3.13) and then inserting this into Eq. (3.3) gives the leading SCET I differential cross section for inclusive heavy hadron production in e + e − collisions dσ H dx d cos θ with each term expressed as the convolution of a short distance coefficient and a fragmentation function: where i = a, v for the longitudinal and transverse cross sections, i = av for the asymmetric, and the tree level cross sections σ (0) i are given in Eq. (3.4). Integrating over cos θ, we obtain the differential cross section with respect to x, which is given by the sum of the transverse and longitudinal components dσ In the rest of the paper, we will focus on this observable. The coefficient functions H k,i describe the short-distance cross section for the production of a parton of species i. They are purely perturbative, and, for massless partons, they have been computed at N 2 LO [77][78][79]. Mass corrections to the production of heavy quarks have been considered [80][81][82], but are relevant only at small x, and we neglect them. It is instructive to look at the well known NLO expressions of H T, Q and H L, Q [75,76] where C F = 4/3. The hard scattering cross section is purely transverse at LO, while it has both transverse and longitudinal components at NLO and N 2 LO. Close to the endpoint z ∼ 1, a new scale Q 2 (1 − z) appears in the hard coefficients, and logarithms of 1 − z need to be resummed. The singular behavior is all encoded in the transverse coefficient H T, Q , while up to N 2 LO H L, Q has at most integrable singularities for z → 1 [77,78]. The N 2 LO expressions are given in Ref. [77][78][79], and are too lengthy to be reproduced here. At N 2 LO, it is convenient to separate the hard scattering coefficient into flavor singlet and non-singlet components. As we discuss in Section 7, the experimental analyses of b fragmentation at the Z pole focus on the fragmentation of a primary heavy quark into a heavy meson and reject events with more than one b-quark in a hemisphere (which are associated with gluon splitting). For this reason, we will consider only the contribution of q = b in Eq. (3.18), that is we will not consider either the gluon or the flavor singlet contributions, but will concentrate on the flavor non-singlet contribution.
The second set of ingredients in Eq. (3.18) are the fragmentation functions D H/i . In the case of light quarks, the hadronic matrix elements in Eqs. (3.14), (3.15) and (3.16) are purely nonperturbative, and need to be fit to data. Sets of fragmentation functions are available for pions, kaons, and other light hadrons [83][84][85], Recently, the first extraction of the light hadron fragmentation functions at N 2 LO has been performed [86].
In the case of heavy quarks, we can take advantage of the presence of a large scale m Q , and compute the fragmentation function in perturbation theory. Inverting the expression in Eq. (3.14) gives [58] D H/Q (z) = z 2N c Xn Tr 0| δ n · p H z − n · P χ n |HX n HX n |χ n n / 2 |0 , (3.24) and similar expressions for the antiquark, the gluon and the light quark fragmentation functions. Integrating out degrees of freedom with invariant mass ∼ m 2 Q , we can match the fragmentation functions onto bHQET operators. At lowest order in the bHQET power counting, we have In bHQET, the emission of bb pairs is no longer possible, and only matrix elements with the heavy fields h have nonzero overlap with the heavy-flavored state H. Thus, the quark, antiquark, gluon and light quark fragmentation function all match onto the same bHQET matrix elements, with different coefficients d Q/i , which are the partonic fragmentation functions for a parton i to fragment into a heavy quark Q. At tree level, this can be understood directly from Eq. (3.24). Away from the endpoint m Q (1 − z) is much larger than the hadronization scale Λ QCD and at the matching scale µ ∼ m Q the term n · P in the delta function of Eq. (3.24) becomes m Q n · v, which is fixed in bHQET. Furthermore, employing the matching condition of the χ n field, Eq. (2.14), we obtain For the the second equality we used the Lorentz invariant relation |H = √ m H |H v , and, to simplify the Dirac structure, the property (1 + / v)h v,n = 2h v,n of the HQET field. We ignored the mass differenceΛ = m H − m Q between the heavy hadron and the heavy quark. The necessity of the Wilson lines in the equation above was first discussed in Refs. [87,88] in the context of color-octet operators in quarkonium fragmentation, however, the same arguments hold here for the color-triplet operator. As mentioned before the matrix element in Eq. (3.26) describes purely nonperturbative effects and can be used as definition for a nonperturbative normalization factor which satisfies the sum rule H χ H = 1, where the sum is extended to all b-flavored hadrons. The perturbative fragmentation functions d Q/i in Eq. (3.25), describing the fragmentation of a parton i into the heavy quark Q, are known at N 2 LO [34,35]. It is again instructive to look at the NLO result. At NLO, the only possible processes are the fragmentation of a heavy quark into a heavy quark, d Q/Q , and of a gluon into a heavy quark, d Q/g , which are given by [89] rather than m Q , which points to the need of resumming logarithms of 1 − z, as we discuss in the next section. On the other hand, the gluon fragmentation function has a regular behavior for z → 1, and the only logarithms that appear are logarithms of the heavy quark mass, which are resummed by the DGLAP evolution. The N 2 LO corrections to the perturbative fragmentation function were computed in Refs. [34,35]. At this order, in addition to O(α 2 s ) corrections to Eqs. (3.28) and (3.29), the first contributions to d Q/l and d Q/Q , the fragmentation functions of a light quark l and a heavy antiquarkQ into Q, arise. d Q/g , d Q/l and d Q/Q have a regular behavior in the endpoint, while the N 2 LO expression of d Q/Q contains logarithms of µ 2 /m 2 Q up to log 2 µ 2 /m 2 Q , and plus distributions up to [log 3 To eliminate gluon splittings and events with more than one heavy quark in each hemisphere, as done by the experimental collaborations, we considered the non-singlet distribution d ns = d Q/Q − d Q/Q . At all scales, d ns satisfies the flavor sum rule that is, it keeps the number of heavy quarks fixed to 1. Furthermore, d ns does not mix with d Q/g or d Q/l . We thus arrive at the final expression for the differential cross section of a primary b quark fragmenting into a heavy meson H The non-singlet hard coefficient is given at O(α s ) in Eq. (3.22) and (3.23), and at O(α 2 s ) in Ref. [79]. The non-singlet fragmentation function coincides at one loop with the quark fragmentation function in Eq. (3.28), while the O(α 2 s ) correction is given in Ref. [34]. The hard function and the fragmentation function in the factorization formula (3.31) depend on the factorization scale µ. As we discussed, and as is explicitly demonstrated in Eqs. (3.22) and (3.28), away from the endpoint the coefficient function depends solely on the scale Q, while the fragmentation function only depends on m Q . Since Q m Q , at fixed order, for any choice of µ, large logarithms appear in Eq. (3.31). These are the standard "logs of Q" which can be resummed via the DGLAP equations [36][37][38]: where P ji (ξ) are the time-like splitting functions. The splitting functions are computed in perturbation theory with the one-loop expressions [36][37][38]: The color factors in Eqs. (3.34)-(3.37) are C F = 4/3, C A = 3, T F = 1/2, while β 0 is the leading order coefficient of the beta function, From Eqs. (3.32) and (3.34) -(3.37), it is easy to see that the non-singlet fragmentation function d ns does not mix with the gluon fragmentation function, and its lowest order evolution is governed by P (0) qq only. In particular, since the integral of P (0) qq vanishes, the normalization of d ns is unchanged by the evolution, confirming Eq. (3.30). These statements extend beyond the leading logarithmic evolution.
The time-like splitting functions at O(α 2 s ) have been known for some time [75,76], and are nicely summarized in Ref. [90]. The O(α 3 s ) corrections to the non-singlet components and the singlet splitting functions P (2) qq and P (2) gg are given in Refs. [39,40], and the nondiagonal entries of the singlet matrix, P (2) gq and P (2) qg , were determined in Ref. [41]. With the calculation of the O(α 2 s ) corrections to the fragmentation function [34,35] and of the non-singlet splitting function at O(α 3 s ) [39] all the ingredients for the resummation of the non-singlet distribution (d Q/Q (z) − d Q/Q (z)) at N 2 LL accuracy are now available.
We solved the DGLAP equation for d ns directly in z space, by discretizing Eq. (3.32), following the approach described in Ref. [91]. Some detail on the solution of the DGLAP equation are given in Appendix B.

Formalism in the endpoint x → 1
The discussion in Section 3 anticipates that care has to be taken in describing the endpoint region where the momentum fraction x approaches one. In this regime the heavy quark has large energy of order Q and is accompanied by a jet-like spray of particles with energy of order Q(Λ QCD /m Q ) Q, all of which recoils against a jet of energy of order Q and invariant mass squared of order (1 − x)Q 2 Q 2 . This occurs in a background of soft interactions which change momenta on the order of (1 − x)Q or less. As a consequence the heavy quark carries almost all of the energy in one hemisphere, while the jet carries most of the energy in the other hemisphere. The soft interactions can not change the order of the jet invariant mass squared, so that the invariant mass squared of all final state particles beside the heavy hadron is p 2 The observable dσ/dx is not sensitive to the details of the final state X, which can be composed of one or more jets, but it is sensitive to the momentum squared (p 2 X ) of all final state particles beside the heavy hadron. As x increases, the final state becomes more and more collimated, leaving only one jet in the endpoint region. The sensitivity to p 2 X gives rise to large logarithms of (1 − x) in the perturbative expansion, and the jet in the endpoint must be included in our description.
In a similar way, for x ∼ 1 large endpoint logarithms appear in the perturbative fragmentation function as well, as evidenced by Eq. (3.28). These logarithms are associated with the appearance in the endpoint of a new soft scale m Q (1 − x) ∼ Λ QCD , and threaten the convergence of the perturbative fragmentation function unless they are resummed.
We thus divide the x spectrum into two regions with different power counting: The tail region was described in Section 3. In this section we study the peak region. A factorization formula for the peak region can be derived in the framework of SCET. Here we state the final result which we derive in Sections 4.1 -4.4. The cross section is thus factored into a product of four different functions, each dependent on a single scale: H Q (Q, µ) is the hard function, which encodes hard momentum fluctuations with scaling µ Q ∼ Q; Jn(Qr, µ) is the jet function describing the collinear final state that recoils against the heavy hadron, with typical scale includes all the dependence on the heavy quark mass and thus has typical scaling µ M ∼ m Q ; the shape function S captures physics at the scale of the residual momentum of the heavy quark inside the heavy meson, ω ∼ m Qn · v(1 − x), which, in the heavy quark rest frame, is the nonperturbative scale µ S ∼ m Q (1 − x). 1 As explained in Section 5, we further divide the shape function in a perturbative (S Q/Q ) and nonperturbative (S hadr H/Q ) component. We obtain Eq. (4.1) by using a tower of EFTs. Hard momentum fluctuations of order O(Q) are removed by matching QCD onto SCET I , as detailed in Section 4.1. At lower momentum the heavy quark mass becomes nonnegligible whereas the invariant mass of the jet can be treated as a high energy scale and can be removed from the theory. This is done by switching to SCET M . In Section 4.2 the factorization in SCET M and its matching onto SCET I are discussed. The separation of the dynamics at the scale m Q and Λ QCD ∼ m Q (1−x) is achieved by matching SCET M onto bHQET, as we discuss in Section 4.4. Furthermore, bHQET allows the factorization of the nonperturbative dynamics of the heavy quark inside the hadron.
The factorization formula in Eq. (4.1) is equivalent to that derived in Refs. [4,42] in the framework of perturbative QCD. As discussed in more detail in Section 6, an important check of Eq. (4.1) is that, at fixed order, the product of the hard function H Q and the jet function Jn(Qr) and the product of the mass coefficient C m (m Q ) and the shape function S Q/Q reproduce, respectively, the soft, z → 1, limit of the coefficient function H ns (z) , and of the fragmentation function d ns (z) in Eq. (3.31).

Matching onto SCET I
To derive the SCET I factorization formula, we go back to the hadronic tensor introduced in Eq. (3.8), In the endpoint, the final state X has virtuality p 2 X Q 2 , and is still dynamical after the hard scale is integrated out. Thus, we match the QCD currents J µ i given in Eq. (3.9) onto the SCET current for the production of two back-to-back jets, The matching coefficient C has been computed to O(α 3 s ) [92], and it is the same for vector and axial current. In this work, we will need the two loop expression derived in Refs. [93][94][95][96], which we quote in Appendix A.

Using Eq. (4.3) in the hadronic tensor gives
where we have implicitly split the position integral in Eq. (4.2) into a sum over labels and an integral over the residual part of the coordinates. The time-ordering (T) and anti-timeordering (T) are relevant for the proper ordering of the usoft field in the Wilson lines Y n,n and Y † n,n [50]. The sum over labels fixes the label momenta of the currents to be ±Q. Next, we decompose the final state |HX into its collinear, anticollinear and soft components, |HX = |Xn |X s |HX n , and rearrange the color and spin indices in Eq. (4.4) to be singlets in each sector: In then-collinear and soft sector, the sum is over a complete set of states, and can be replaced by the identity. Then-collinear matrix element can be expressed in terms of the inclusive jet function The jet function depends only on the off-shellness p 2 X of the jet in then direction, and we choose to work in a frame where the jet moving in then direction has no transverse momentum with respect ton, r ⊥ = 0. In this frame, p 2 X = Qr, and the dependence of the jet function on y − and y ⊥ reduces to delta functions. Label momentum conservation forces the label ω = −Q. The jet function in Eq. (4.6) is the same as the one that appears in the thrust distribution [98], and it is known to two loops [99].
The delta functions in Eq. (4.6) force the remaining two matrix elements to depend only on y + . We define the soft function as and using this definition along with the definition for the jet function, Eq. (4.6), in Eq. (4.5) we arrive at However, as we noted previously, there can be no radiation that is collinear (with any softcollinear overlap removed) to the heavy quark as this would result in a final state with invariant mass of order Q 2 which is not part of the endpoint regime. Thus in the endpoint the sum over X n only has a nonvanishing contribution from X n = 0, and we can make further simplifications: x where we usen · p res H = −Q(1 − x), and expanded x around 1 in the last line. Using this in Eq. (4.8) we arrive at the factorized expression for the hadronic tensor in SCET I where H Q (Q, µ) ≡ |C(Q, −Q)| 2 is the hard matching coefficient.

Matching onto SCET M
Matching SCET I onto SCET M removes virtualities of order Q 2 (1 − x). Since the ultrasoft contributions from SCET I describe fluctuations of order There is no ultra-soft function in SCET M since all ultra-soft Wilson lines are contracted to the same point in space-time and therefore cancel. While the factored form of the differential cross section in SCET M looks nearly identical to the one in SCET I the power counting in the two theories is different. In SCET M , λ = m Q /Q, and in the definition of the soft functioñ S the ultra-soft Wilson lines Y n in Eq. (4.7) have to be replaced by soft Wilson lines S n in which the gluon momenta scale as (λ, λ, λ) Q ∼ (m Q , m Q , m Q ) and are completely decoupled from the collinear degrees of freedom. While Eq. (4.11) is formally correct it is neither convenient for calculating the running of the SCET M differential cross section, nor for calculating the matching coefficient when the scale m Q is integrated out. While both the running and matching can be determined indirectly, it is both edifying and gratifying to have a direct calculation of each. Towards this end we reorganize SCET M by explicitly separating out the collinear quark mode that has label momentum p µ = m Q v µ , where v µ is the heavy quark velocity. We will call this mode the massive-bin. In addition, we will separate out soft-collinear modes from soft modes. After subtracting UV divergences from which the anomalous dimension can be extracted, the combination of the massive-bin and soft-collinear modes matches directly onto the HQET shape function, while the remainder gives the matching coefficient.
To be specific, once again consider the scaling of the SCET M degrees of freedom: collinear momenta scale as p µ c ∼ (m 2 Q /Q, Q, m Q ) and soft momenta scale as k µ s ∼ (m Q , m Q , m Q ). Below the scale m Q , the correct EFT is HQET in a boosted frame, where the heavy quark has momentum The residual momentum sets the scaling for the gluonic and light quark degrees of freedom. Note that the heavy-quark degree of freedom has In the endpoint there can be no real collinear radiation into the final state due to momentum conservation, and as a consequence the collinear factor C H/Q (µ) is purely virtual. Using dimensional regularization (DR) the naive one-loop virtual collinear contribution (including the heavy quark self-energy contribution) is: To find the virtual soft piece V s we take the naive soft contributionṼ s and subtract the overlap with the soft- Finally we need to include the massive-bin virtual contribution V m . At one-loop in DR this is zero. Thus the total virtual contribution is (4.14) The collinear contribution to real radiation is zero, which leaves only soft and softcollinear radiation. Once again in DR the real soft contribution is zero. Thus the total real contribution is given by real soft-collinear radiation in the massive-bin, R tot = R m : Adding Eq. (4.12) to Eq. (4.15) gives the total SCET M amplitude. First let us consider the pieces that are singular in the → 0 limit: This expression is the z → 1 limit of P qq in Eq. (3.34), and satisfies the consistency condition is the jet-function counter-term, and is the counter-term for the hard coefficient H Q (Q, µ).
Including the counterterm results in a finite expression for the NLO SCET M amplitude, and, as we will see in Section 4.4, the finite part of the real diagrams R m in Eq. (4.15) is exactly reproduced by the bHQET shape function. Thus what is left over in the matching between SCET M onto bHQET is the finite part of Eq. (4.12), which gives the matching coefficient C m (m Q , µ).

Flavor threshold
In SCET M we can run the theory down to the b mass m b . At this scale we match the theory with five active flavors to a theory with four flavors, where b quarks are frozen out. The matching can be done at any scale µ M ∼ O(m b ), and, in our error analysis, we varied the flavor threshold µ M between m b /2 and 2m b . To implement the flavor threshold we set n f = 4 at scales smaller than µ M , including the HQET matching coefficient at µ M 2 , and n f = 5 above. Especially for the running it is crucial to run with n f = 4 below µ M and n f = 5 above µ M to preserve the consistency relations.
In addition, two loop diagrams in which the heavy quark emits a gluon, and the gluon splits in a QQ pair are not present in the four flavor theory, and are reproduced by including a matching coefficient C thr (z) [52] starting at O(α 2 s ). The two-loop coefficient c thr 2 is given in Eq. (A.3) and was obtained from Ref. [34] by taking the most singular terms of the expression The expression in Eq. (A.3) suggests that the threshold coefficient c thr 2 (z) depends both on the scales m Q and m Q (1 − z), giving rise to unresummed large logarithms of 1 − z. As shown in Refs. [101][102][103], these large logarithms are rapidity logarithms, that arise because of the rapidity separation of collinear and soft secondary massive quark modes, and can be resummed by solving a rapidity RGE. Because of the limited numerical impact of the threshold matching coefficient, we chose not to resum this class of logarithms.
The switch from n f = 5 to n f = 4 at µ M has also to be implemented in the running of α s . We follow the procedure described in Ref. [104]: we run with five flavors from m Z to µ M , then we use the pole mass decoupling relation given in Refs. [104,105], and continue the running with four flavors below µ M 3 . This procedure leads to a discontinuity in α s (µ) at the flavor threshold. We neglect the effects of other flavor thresholds. Note that for programming reasons we choose to use α s (µ M , n f = 5) in the HQET matching coefficient, even though n f = 4 should be used. We have checked that the error introduced through this approach is negligible.

Matching onto bHQET
The final step in obtaining the factorization formula in Eq. (4.1) requires integrating out the mass of the heavy quark, m Q Λ QCD . This is achieved by matching the product of the soft and collinear functions in SCET M onto a bHQET shape function 2 The matching to HQET is formally done after the matching at the flavor threshold. This can be reversed, but one has to be very careful in recalculating the matching coefficients. 3 Even though we utilize the pole mass relations, the mass used in the calculation is the 1S-mass. This leads to logarithms of m pole /m1S, which can be neglected. where the fragmentation shape function S H/Q (ω, µ) is defined as [52] S Here ω corresponds to the residual momentum of the heavy meson and of the soft particles moving collinear to its direction, and is of order O(n·v Λ QCD ). The factor of m H in Eq. (4.22) arises from the normalization of bHQET states. In order to create a heavy meson H with mass m H > m Q , the residual momentum needs to be larger thann · vΛ, withΛ = m H − m Q , implying that the shape function has support in the region ω/n · v ∈ [Λ, +∞). For simplicity, in what follows we will use the variableω = ω −n · vΛ, with support in [0, +∞). The residual momentumω is related to the momentum fraction byω = m Qn · v(1 − z)/z ∼ m Qn · v(1 − z).
(Recall the boost from the center-of-momentum frame to the heavy quark rest frame induces the factorn · v = Q/m Q .) The one-loop matching onto SCET M requires the computation of the diagrams in Fig. 1, which, subtracting the poles in the MS scheme, yield The only physical scale appearing in C m is the scale of the heavy quark mass. The one loop calculation also allows us to extract the anomalous dimension of the shape function S H/Q (ω, µ) and of the matching coefficient C m (m Q , µ), which we give in Appendix A. In Ref. [52] it was shown that the perturbative expression of the fragmentation shape function equals the shape function that appears in B decays at all orders. Using this result, the two-loop shape function can be extracted from Ref. [106], and the two-loop mass coefficient C m is obtained by subtracting the two-loop shape function from the z → 1 limit of the perturbative fragmentation function in Ref. [34]. Substituting Eqs. (4.22) and (4.11) in the differential cross section (3.3) and integrating over cos θ, we arrive to an expression that closely resembles our final factorization formula (4.1). The final step consists in expressing the bHQET shape function as a convolution of a perturbative piece S Q/Q and a nonperturbative hadronization model, S hadr H/Q . We discuss this step in Section 5.

Resummation
The endpoint factorization formula, Eq. The µ dependence of the hard, jet, soft and mass functions is governed by RGEs that resum these large logarithms, a resummation that, as we will see in Section 7, is crucial to achieve a good description of the data.
The hard and mass coefficients satisfy the renormalization group equations  Table 1. Order counting for the resummation in the endpoint, x ∼ 1. L denotes the resummed logarithms, e.g. ln(1 − x). In the third to sixth columns we indicate the loop order at which each ingredient is needed to achieve a given logarithmic accuracy.
with anomalous dimensions given by  x), respectively. We describe our choice of scales, and the scale variations we used to assess the residual scale dependence in Section 6.
In Table 1 we summarize the ingredients needed to achieve approximate N 3 LL accuracy in the endpoint. We count logarithms in the exponent of the RGE kernels U I (µ, µ I ), with I ∈ {H, J, M, S}, and in the second column of Table 1 we indicate the logarithmic series that we resum at each order. In the third to sixth columns we show the loop order at which the cusp and non-cusp anomalous dimensions, the fixed order expressions of the hard, jet, mass and shape functions, and the QCD β function are needed. The difference between the primed and the unprimed counting schemes is that in the primed scheme all fixed order series are considered at one order higher with respect to the unprimed [53,110]. All ingredients to achieve N 2 LL and N 2 LL resummation, namely the three-loop cusp anomalous dimension and QCD β function, the two-loop non-cusp anomalous dimension and the two-loop fixed order expression of each function, are known [34,99,106,109]. Most ingredients for N 3 LL resummation are also known, in particular the four-loop QCD β function, and the threeloop non-cusp anomalous dimension of the hard and jet function [99,109]. The missing ingredients are the four-loop cusp anomalous dimension Γ 3 , and the three-loop non-cusp anomalous dimension of the mass and shape functions, γ S 2 and γ M 2 , of which only the sum γ S 2 +γ M 2 is known. In our analysis, we use the Padé approximation for the unknown coefficient Γ 3 , where Γ 2 and Γ 1 are the three-and two-loop cusp anomalous dimension, and we vary e Γ between −2 and +2. For γ S 2 and γ M 2 , we use and fix the coefficients c M and c S by imposing that γ M 2 + γ S 2 equals the z → 1 limit of the anomalous dimension of the fragmentation function, Eq. (A. 35), both for n f = 5 and n f = 4. In our theory error budget, we vary the parameter e γ between −2 and 2. As discussed in Section 7, by varying e Γ and e γ and by comparing with the exact N 2 LL results we find the errors induced by missing orders in the cusp and non-cusp anomalous dimensions to be negligible.
The counting discussed so far applies to the resummation of double logarithms that appear in the endpoint. Away from the endpoint the only relevant logarithms are single logarithms of m Q /Q, which are resummed by solving the DGLAP equation. For these logarithms, we adopt the standard nomenclature, that is we denote by N k LL the resummation of terms of the form α n s L n−k , achieved with (k + 1)-loop splitting functions and k-loop initial condition. The maximum order we work at is N 2 LL, which requires three-loop time-like splitting functions, and two-loop fragmentation function.

Nonperturbative effects
The shape function (4.23) encodes physics at the scale Λ QCD , and is a nonperturbative object, which, like the parton distributions or the light quark fragmentation functions, needs to be extracted from data. Following Ref. [111], we express the HQET shape function Eq. (4.23) as a convolution of a partonic piece S Q/Q , which is computed perturbatively, and a hadronic nonperturbative piece S hadr H/Q S H/Q (ω +n · vΛ, µ) = Note that in the partonic picture the hadron and heavy parton masses are equal, m H = m Q . This means thatΛ = 0 and S Q/Q has support on [0, ∞). On the other hand S H/Q still has support on [Λ, ∞), as we made explicit by using the variableω = ω −n · vΛ. The nonperturbative function S hadr H/Q (ω) is then expanded in a complete set of orthonormal functions, as described in Ref. [111]: where N H is the overall normalization of the shape function, the parameters c i are normalized i c 2 i = 1 and the basis functions are orthonormal. It is convenient to express f n in terms of the Legendre polynomials P n , where the change of variables y(x) maps the interval [0, ∞) into [−1, 1], The shape function (5.2) is thus parametrized by the dimension one parameter λ, the dimensionless parameter p, and N independent coefficients c n . For N = 0, which is the model studied in Ref. [52]. The advantages of Eqs. (5.1) and (5.2) are many, and are discussed in detail in Ref. [111], where this representation was devised for the B meson shape function that appears in B decays. The most important are: • the shape function has, by construction, the correct dependence on the renormalization scale µ. This is captured by the perturbative function S Q/Q (ω, µ), which manifestly satisfies the RGE, • the moments of S hadr H/Q (ω) are finite, and are related to matrix elements of local HQET operators, • after renormalon subtraction, the perturbative and nonperturbative components of the shape function are clearly factorized, • the uncertainty related to the unknown functional form of the shape function can be estimated by increasing the number N of terms in the basis.
The arguments of Ref. [111] were developed for the decay shape function, but can be easily extended to the fragmentation shape function. Here we briefly discuss the relation between the moments of the shape function and the matrix elements of local HQET operators, and the subtraction of renormalon ambiguities.
Forω in the perturbative range,ω/n · v Λ QCD , the shape function can be expanded as a series of local operators For n ≤ 2, the only possible operators in the operator product expansion are of the form while more complicated structures, like four-fermion operators with heavy and light quark fields, can appear for higher n. The matching coefficients C n (ω, µ) can be computed by taking the matrix element of both sides of Eq. (5.6) between quark states with zero residual momentum. For n ≤ 2, one can prove that In a similar way, forω ω we can expand Eq. (5.1) as The relation to matrix elements of local operators is extremely useful in B decays, since it relates the first two nontrivial moments of the hadronic shape function to well known matrix elements,Λ and the matrix element of the kinetic operator λ 1 = B|h v (iD) 2 h v |B [111]. In the case of fragmentation, Eq. (5.10) fixes the normalization of S hadr H/Q to the nonperturbative parameter χ H defined in Eq. (3.27), Since we will be fitting to data for the production of all possible b-flavored hadrons, we can use the sum rule H χ H = 1 and normalize the hadronic shape function to 1. The first moment of S hadr H/Q is related to the matrix element which is an unknown nonperturbative quantity. Thus, Eq. (5.10) does not put strong constraints on the fit to e + e − data that we perform in Section 7, but rather the fit allows to extract unknown matrix elements like O 1 . Eq. (5.1) hints at a separation of perturbative and nonperturbative contributions to the shape function. The former are captured by S Q/Q , whose expansion in α s we give in Appendix A, the latter by the parameters of S hadr H/Q , which are fit to data. However, it is known that in schemes like the pole mass scheme the shape function and its moments suffer from renormalon ambiguities [111,112]. For example, in the pole mass scheme the heavy quark mass m pole Q , and thus the first moment of the decay shape functionΛ pole = m H −m pole Q , are sensitive to infrared dynamics [113,114]. Using the pole mass in the perturbative calculations therefore introduces an ambiguity into the factorization of long-and short-distance physics in Eq. (5.1), which makes the position of the peak in the shape function, and the parameters in S hadr H/Q , unstable with respect to the perturbative expansion in α s . To remove this ambiguity one has to switch to a suitable short-distance mass scheme. This is done by introducing an additional scale at which perturbative short-distance and nonperturbative long-distance physics are separated, the subtraction scale R. The pole mass can then be replaced by m pole Q =m Q (R, µ)+δm Q (R, µ) withm Q (R, µ) independent of long-distance effects below R.
The renormalon subtraction amounts to shifting perturbative corrections between S Q/Q and S hadr H/Q . Here we closely follow the prescription of Ref. [111], in which a renormalonfree perturbative kernelŜ Q/Q is achieved by demanding that the moments of S hadr H/Q are free of renormalon ambiguities. At O(α 2 s ), this can be accomplished by defining m Q and λ 1 in short distance schemes. We use the 1S scheme for the heavy quark mass [115,116], and the "invisible scheme" introduced in Ref. [111] for the B meson kinetic energy. We summarize the relevant formulae in Appendix A.3.
Note that in Ref. [111] the renormalon subtraction was derived for B decays, not fragmentation. However, Neubert [52] argues that the perturbative expression of the HQET fragmentation shape function is identical to the perturbative shape function in B decays, at all orders of perturbation theory. Since the renormalon is subtracted by a shift of perturbative terms between S Q/Q and S hadr H/Q , the same subtraction terms which fix the renormalon ambiguity in B decays also fix it in fragmentation.
The effect of the renormalon, and of using a consistent short distance scheme for m Q , is most important in the endpoint. Away from the endpoint, the perturbative fragmentation function of Ref. [34,35] was computed in the pole mass scheme. In the numerical evaluations we nevertheless use m Q = m 1S Q = 4.66 GeV [117]. This formally induces an error at O(α 2 s ), which we checked and found to be extremely small.
Finally, we notice that, in order to combine the theoretical predictions in the endpoint and the tail, it is important to use the same nonperturbative model for both regions. This involves some ambiguity, since the convolutions in the endpoint are naturally done in momentum space, while for x away from the endpoint it is convenient to retain a form in which the Mellin moments of the perturbative cross section and of the model factorize. To achieve this, we replace Eq. (3.25) with For z ∼ 1, this convolution is equivalent to (5.1), as we illustrate in the next section. As discussed in Section 3, away from the endpoint the nonperturbative physics should be described by a single parameter, not a shape function. However, for x 1, replacing Eq. (3.25) with Eq. (5.13) amounts to include a series of power corrections, suppressed by powers of Λ QCD /m Q . Since we are working at leading power in Λ QCD /m Q , using Eq. (5.13) is justified.
In our analysis we set all c i≥1 to 0, and use Eq. (5.5) as the model function. Thus we fit for only one parameter: λ. We checked that adding more coefficients c i has a negligible effect and choose to use variations in p, instead of c i , to estimate the hadronic uncertainty since this appears to be the more conservative choice. We vary p between 3 and 5 with a default value of 4. The first moment of the model function (5.5) is given byn · vλ, and, using Eq. (5.10), we find Since the residual momentum is always greater thanΛ, λ is a positive number of O(Λ QCD ). As discussed in Section 7, we find good fits to e + e − data for λ ∼ 0.5 GeV.
6 Extending the description to the full x spectrum In Sections 3 we discussed the factorization of single inclusive hadron production in e + e − annihilation away from the endpoint, and how large logarithms of the ratio of the quark mass and center of mass energy Q are resummed by the DGLAP evolution of the fragmentation function. We then discussed how for x ∼ 1 two new scales arise, the jet scale Q √ 1 − x and the nonperturbative scale m Q (1 − x) ∼ Λ QCD . An accurate description of the endpoint region thus requires the resummation of logarithms of 1 − x, and the inclusion of a nonperturbative component of the fragmentation function.
In this section, we discuss how to combine the two descriptions, in order to describe the differential cross dσ/dx in the full x range. We express the differential cross section as: The first term in Eq. (6.1) is the QCD cross section discussed in Eq. (3.31), and includes the resummation of single logarithms of m Q /Q through the DGLAP evolution. This term gives an accurate description of the intermediate x region, but it lacks the resummation of logarithms of 1 − x and the nonperturbative effects that are needed to reproduce the peak.
The last term is the endpoint cross section of Eq. (4.1). In this expressions, single logarithms of m Q /Q and double logarithms of 1−x are correctly resummed, and a nonperturbative shape function describes the hadronization of the heavy quark into a hadron. However, dσ E /dx does not contains powers of 1 − x, which, as one moves away from the peak region, become more and more important. In order to obtain a correct description in the whole range, and to avoid double counting between the endpoint and the QCD regions, we have to subtract the second term in Eq. (6.1), which is equal to the endpoint cross section, with the resummation of logarithms of 1 − x turned off. In practice turning off the resummation is accomplished by setting the soft scale equal to the mass scale µ S = µ M , and the jet scale equal to the hard scale µ J = µ H . As is possible to explicitly verify using the formulae in App. A, when µ J = µ H the product of the hard coefficient H Q and the jet function Jn reproduces the x → 1 limit of the QCD hard coefficient, given at one loop in Eqs. (3.22) and (3.23), and at two loops in Ref. [79]. The anomalous dimension for the product H Q × Jn is the x → 1 limit of the QCD splitting functions. In a similar way, when µ S = µ M the product of the mass coefficient C m and the bHQET shape function gives the z → 1 limit of the QCD fragmentation function, and the anomalous dimension of C m × S equals the z → 1 limit of P qq . These properties guarantee that the subtraction term in Eq. (6.1) will exactly cancel dσ QCD /dx as x approaches 1, leaving only the resummed endpoint cross section.
In order to ensure that away from the endpoint only the QCD cross section contributes we need the subtraction and endpoint terms to cancel for small x. We accomplish this by using profile functions; that is by choosing x-dependent jet and soft scales. To determine the profile functions, we first choose the x-independent hard and mass scale as follows where e H and e M are free parameters that we vary between 1/2 and 2. The jet and soft scales must approach µ H and µ M in the limit x 1, and must have the correct scaling, namely µ J ∼ Q √ 1 − x and µ S ∼ m Q (1−x), in the resummation region. At very large x, we also need to make sure that all scales remain perturbative. To satisfy these requirements, we choose the jet and soft scale as 3) where the coefficients a j,s , b j,s , c j,s and d j,s are chosen so that the profile functions are continuous, and with continuous derivatives. An illustration of the behavior of the profile functions is shown in Fig. 2.
Our choice of scales is thus determined by six parameters, e H , e M , x 1 , x 2 , µ s 0 , and µ j 0 , that we vary in order to assess the dependence of the theoretical cross section on missing  x). For x < x 1 the resummation is quickly turned off, and µ J and µ S become equal to µ H and µ M for x ∼ 0.5. In the fits to data, we choose x 1 = 0.8, close to the peak of the heavy quark fragmentation function, and vary x 1 between 0.7 and 0.9. Our default choice for x 2 , is x 2 = 0.96, and vary it between 0.9 and 1.
In Fig. 3 we plot: the N 2 LO endpoint cross section dσ E /dx with N 3 LL resummation of logarithms of 1−x (blue curve) and without resummation of logarithms of 1−x (orange curve), the N 2 LO QCD cross section dσ QCD /dx (green curve), which includes only the resummation of DGLAP logs, and the combined cross section, dσ/dx, given by Eq. (6.1) (red curve). One can see that for x ∼ 0.5, the resummation is turned off, and the contribution of dσ E /dx is completely canceled by the subtraction term in Eq. (6.1), leaving only the QCD contribution. On the other hand, at large x, the QCD cross section is dominated by terms singular in 1 − x, which are captured by dσ E /dx fix , so that the combined cross section lines up with the endpoint resummed cross section. These curves are produced using the model function in Eq. (5.5), with p = 4 and λ = 0.5 GeV. As discussed in Section 5, in our calculations we apply the nonperturbative model to all x. While this prescription is correct in the endpoint, it introduces an error in the tail region where the nonperturbative shape function should reduce to a single normalization constant χ H . However, the size of the mistake we are making by multiplying the nonsingular terms by the model is of order O(Λ QCD /m Q ). This is of the same size as other power corrections which we neglect, and therefore justifies our treatment.
7 Fits to e + e − data at the Z pole The inclusive production of b-flavored hadrons in e + e − annihilation at the Z pole has been measured by ALEPH [54], SLD [57], OPAL [55] and DELPHI [56]. All experimental collaborations give the normalized distribution 1/N dN/dx, where x is the momentum fraction of the weakly decaying B + , B 0 d and B 0 s mesons. The weakly decaying mesons are either produced directly after the hadronization phase, or result from the decay of a primary B * * and B * meson. All experiments require the B meson to be measured in a bb event, with one b orb quark in each hemisphere (the hemispheres are defined with respect to the event thrust axis). This requirement effectively eliminates the contributions of gluon or light quarks splittings into bb pairs. Some experiments (e.g. DELPHI) assign events with four b-quarks, which also require g → bb splittings, to the background.
We fit the theoretical cross section simultaneously to all available data. Our fits include the correlation matrices given in the experimental papers. SLD only provides statistical correlations, not systematic correlations [57]. We therefore treat the SLD data as having no systematic correlations, though for the other experiments the systematic correlations are larger than the statistical ones. 4 OPAL quotes asymmetric systematic errors and correla-tions [55]. To simplify the analysis we symmetrized the correlations by using the arithmetic mean of the positive and negative correlations. While the correlation matrices should be positive semidefinite, with zero eigenvalues in the case of complete correlations between two measurements, some of the experimental correlation matrices contain negative eigenvalues. Since the experimental bins are highly correlated especially in the far-tail of the distribution (in the case of the OPAL experiment the bins are completely correlated at small x), the errors due to rounding the entries of the correlation matrix can cause negative eigenvalues. In a situation where some bins are highly correlated it is not sensible to treat them as independent degrees of freedom. Thus we fit only to the most significant eigenvalues of the data. We follow the prescription of DELPHI [56] and take the effective number of degrees of freedom for ALEPH, OPAL, and DELPHI as 7, 5 and 7, respectively. For SLD we use all 22 bin values since, as mentioned earlier, the systematic correlation matrix is not given and the statistical correlations are modest. The inclusion of OPAL data, even with the reduced weight assigned to them by the DELPHI procedure, leads in general to worse fits.
We perform fits to the theoretical cross section computed at different orders. We denote by N 2 LO + N 3 LL the cross section with O(α 2 s ) fixed order expressions of the hard coefficient and perturbative fragmentation function, N 2 LL resummation of DGLAP logarithms, and N 3 LL resummation of endpoint logarithms, ln(1 − x). N 2 LO + N 2 LL differ from N 2 LO + N 3 LL only in the endpoint, where the resummation is carried out at N 2 LL . Finally, NLO + N 2 LL denotes the cross section with O(α s ) matching, NLL resummation of DGLAP logs, and N 2 LL resummation in the endpoint.
The N 2 LO + N 3 LL theory description includes 10 theoretical parameters, beside the nonperturbative parameter λ that we fit to data. Table 2 lists all theory parameters together with their default values used for the best fit result and the range of variation we used in the error analysis. The six parameters e H , e M , µ j 0 , µ s 0 , x 1 and x 2 govern the scale dependence of the theoretical prediction. µ H = e H Q, with Q = 91.2 GeV at the Z pole, is the hard scale in the process at which the hard scattering coefficient H Q is evaluated. µ M = e M m Q is the scale associated with the heavy quark mass. µ j 0 , µ s 0 , x 1 and x 2 enter the parameterization of the profile functions, discussed in Section 6. µ j 0 and µ s 0 are the minimal values that the jet and soft scales can assume. x 1 and x 2 determine the region where the resummation of logarithms of 1 − x is important.
A complete N 3 LL resummation requires the knowledge of the four-loop cusp anomalous dimension Γ 3 and of the three-loop non-cusp anomalous dimension of the shape function. At the moment, these two quantities are not known. To estimate them, we use the Padé approximation, and allow for 200% variations. For α s we use as our central value the world average, α s (m Z ) = 0.1185, and vary it within the error quoted by the PDG [117]. The final parameter in Table 2, p, is used to gauge the dependence of the fits on the hadronization model, Eq. (5.5).
To estimate the theory uncertainty we vary the parameters in Table 2 and calculate the best fit for each parameter set. We do this for all possible combinations of each parameter's  default value and up and down variation. 5 For the N 2 LO + N 3 LL analysis, we consider 45927 theory settings. For the lower order fits, N 2 LO + N 2 LL and NLO + N 2 LL, it is not necessary to include Γ 3 and γ 2 S which reduces the number of theory settings to 5103. To make our theory comparable to the experimental results we normalize the theoretical distribution by the integral of the hard coefficient To compute the χ 2 , we bin the theoretical distribution, that is we integrate the theory distribution over the extent of bins used by the experimental collaborations, and divide by the size of the bins. The best fit value of λ is the minimum of the χ 2 interpolating function. The 1σ statistical uncertainty is obtained by finding the values of λ for which χ 2 = min(χ 2 ) + 1. When working at NLO + N 2 LL, we have to extend the χ 2 -grid to include λ = {0.575, 0.625, 0.650, 0.700} GeV.

N 2 LO + N 3 LL fits
We obtain the central value of the parameter λ by setting the theoretical parameters to their default settings, summarized in Table 2. We consider two cases: 1. we perform simultaneous fits to all data over the complete x range available, 2. we exclude data from OPAL. 5 We did constrain µ j 0 to always be larger than µM = eM mQ and µ s 0 to always be smaller than µM . The best fit value of λ, the statistical error and the χ 2 per degree of freedom (χ 2 /dof ) in these two scenarios are Eqs. (7.2) and (7.3) make it clear that including data from the OPAL experiment pushes λ to higher values and noticeably worsens the χ 2 /dof . A closer look at fits to individual experiments reveals that the fits to ALEPH, DELPHI and SLD are good, with χ 2 /dof between 1.5 and 0.8. However, the fit to OPAL data only is poor, with χ 2 /dof = 13, and does not ruin the simultaneous fits to all experiments only because of the small weight assigned to OPAL by the DELPHI prescription [56]. The disagreement between the theoretical cross section and the OPAL data is mostly in the tail, and the effect on the χ 2 is amplified by the small error on the data points in this region. As we will discuss in greater detail later in this section, we have indications that the theoretical setup we have adopted is overestimating the tail of the distribution, and that power corrections of O(Λ QCD /m Q ), which we have not included, could lead to a better description of the data. Since our theoretical cross section describes the OPAL data poorly, we will first focus our discussion on fits to ALEPH, DELPHI and SLD, and include OPAL data afterwards. The data from the LEP experiments ALEPH and DELPHI, and the SLAC experiment, SLD, show some tension in the peak region of the differential cross section. We checked that excluding SLD data from the fit has a negligible effect on the value of λ and on the quality of the fits.
We now turn to the discussion of the theoretical error, that we estimate by varying the theory parameters in the ranges described in Table 2. In Fig. 4 we show the distributions of χ 2 /dof and best fit value of λ for the N 2 LO + N 3 LL fits to ALEPH, DELPHI and SLD data. The λ and χ 2 /dof obtained with the default theory setting is denoted by a black dot. While in principle the best fit value of λ and the goodness of the fit should not be impacted by (reasonable) choices of the theory parameters, we notice that at N 2 LO + N 3 LL the cross section is still quite sensitive to the theory settings, in particular to the scale µ M , where we evaluate the QCD fragmentation function and start the DGLAP evolution. Changing µ M from 4.66 GeV to 9.32 GeV or 2.33 GeV has the effect of both considerably shifting the best fit parameter, and changing the quality of the fit. The choice µ M = m Q /2 gives noticeably worse fits. In this case, the lowest value of χ 2 /dof is 2.2, and the average χ 2 /dof is much larger, 3.75. On the other hand, 86% of the fits with µ M = 4.66 or 9.32 GeV have χ 2 /dof < 2.
In Fig. 5 we show the distribution of λ and of σ λ , the statistical error on λ, in the N 2 LO + N 3 LL fits. We see that, for each value of µ M , varying the remaining theory parameters has little effect, resulting in a distribution with width of 10 MeV around the default values. However, the three distributions for different µ M have little or no overlap, signaling a spurious dependence of the best fit value of λ on the truncation of the perturbative expansion of the cross section. The second histogram shows that the statistical error on λ is roughly the same, at least for the values of µ M which give good fits. Notice that the distance between the values of λ obtained with different choices of µ M is always within the statistical error on λ.
In Fig. 6 we show the differential cross section with theoretical error bands. The bands are obtained by considering, for each point in x, the maximum and the minimum values of the differential cross section over all fit results. The black dashed curve is obtained by setting the theory parameters to their default value. In the left panel we highlight the dependence of the fit on the choice of µ M : the blue band has µ M = m Q , the yellow band has µ M = m Q /2, and the green band has µ M = 2m Q . It is clear that lowering the scale has the effect of raising the tail of the distribution, resulting in poorer fits. This feature is relatively independent of the model function. In the right panel we highlight the effect of excluding fits with decreasing χ 2 : the (nearly hidden) blue band is the theoretical uncertainty obtained from all the fits, the red band is the theoretical uncertainty from fits with χ 2 /dof < 3 (which corresponds to including 80% of all fits), and the purple band is the theoretical uncertainty from fits with   Here the blue band includes all fits with χ 2 < 3. The light blue band shows the combination of the statistical and theoretical uncertainties. It is obtained by considering, for each point in x, the maximum and the minimum values of the differential cross section over all fit results, with λ set to the central value, or λ ± σ λ . In Fig. 8 we show the ratio of the theoretical prediction, obtained with default theory settings, to the data of the ALEPH and DELPHI experiments (left panel) and OPAL and SLD experiments (right panel). The error bars in Fig. 8 include only the uncertainties of the data points. The theoretical curves give a good description of the data. The broadness of the peak reflects differences between the experiments. In the tail of the distribution, x 0.5, the theory starts to overshoot some of the data points. While, with the exception of OPAL, the χ 2 remains good, this effect might indicate the need to include power corrections in the matching of the QCD fragmentation function onto bHQET. In our theoretical framework, the tail of the distribution is not very sensitive to the nonperturbative model function S hadr H/Q . As discussed in Section 5, the convolution with the model S hadr H/Q corrects the partonic QCD fragmentation function d ns by including a series of power corrections of order O(Λ QCD /m Q ). Therefore, the tail region of the differential cross section is a genuine QCD prediction, and cannot be easily adjusted by changing the parameters or the functional form of the hadronization model. Nonetheless, power corrections can be relevant, as they may be as large as λ/m Q ∼ 10%. While the convolution with the hadronization model includes some of the power corrections, the small discrepancy we see in the tail suggests that it is important to systematically include all of them. We will further investigate the issue in future work.
In the far tail, x 0.05, the differential cross section becomes unphysical. Here the approximation of massless b quark breaks down, and power corrections of order m 2 Q /Q 2 need Data Sets λ (GeV) σ exp (GeV) σ th (GeV) (χ 2 /dof ) def  Table 3. Best fit value of λ, with statistical and theoretical errors. The procedure used to assess the theoretical error is described in the text. (χ 2 /dof ) def and χ 2 /dof denote the χ 2 obtained with default theory settings, and the average of the χ 2 over all fits.  Table  2.
to be included. Since there are no experimental points in this region, we neglect this class of power corrections. In Table 3 we summarize the best fit values of λ, the statistical and theoretical errors in the two cases described at the beginning of the Section. The best fit is obtained by setting the theory parameters to their default values. The theoretical error is given by taking the difference between the best fit λ, and the maximum (or minimum) value of λ obtained by varying the theory settings. The fifth and sixth columns of Table 3 give the χ 2 /dof in the case of default theory settings, and, as a measure of the quality of the fits when varying theory settings, the average χ 2 /dof for the 45927 settings we considered.

Convergence
To study the convergence of the perturbative series, we repeated the fits with lower order expressions, namely N 2 LO + N 2 LL and NLO + N 2 LL. The ingredients included at each order data sets  Table 4. Best fit value of λ, with statistical and theoretical errors, and α s (m Z ) = 0.1135. The fits are performed with the N 2 LO + N 3 LL formulae. The procedure used to assess the theoretical error is described in the text. (χ 2 /dof ) def and χ 2 /dof denote the χ 2 obtained with default theory settings, and the average of the χ 2 over all fits.
are summarized in Section 4.5. As shown in Table 3, the results at N 2 LO + N 2 LL are very close to the full analysis. This stresses the importance of including higher order corrections to the matching coefficients, in particular to the fragmentation and shape functions. Once the matching corrections are included, performing a complete N 3 LL resummation or limiting ourselves to N 2 LL makes little difference. As Table 3 shows, the NLO + N 2 LL fits give larger values of λ, with noticeably worse χ 2 /dof . The NLO + N 2 LL cross section includes only terms that are strictly of O(α s ), that is we discard O(α 2 s ) terms from the product of the fragmentation function and short-distance cross section. We notice that including these terms, as was done in Ref. [4], improves the agreement with the data. Since the full N 2 LO expressions are available [34,35], we decided not to include spurious O(α 2 s ) in the NLO cross section, and perform a complete N 2 LO analysis. Notwithstanding the poor χ 2 /dof , we can use the NLO + N 2 LL fits to test the convergence of the perturbative expansion. Fig. 9 shows dσ/dx at NLO + N 2 LL (red band) and N 2 LO + N 3 LL (blue band). In the left panel, we show the results of the fits. The envelopes are obtained by considering, for each x, the maximum and the minimum values of the differential cross section over all fit results. We can see that the N 2 LO + N 3 LL band is narrower than and lies within the NLO + N 2 LL band. In the right panel, we show the comparison of the differential cross section at different orders, but for fixed value of p = 4 and λ = 0.525 GeV, so that the comparison in not contaminated by the effects of the experimental uncertainties and/or the poor agreement with data of the NLO cross section. Once again, it can be appreciated that the N 2 LO + N 3 LL band is significantly narrower than the NLO + N 2 LL band, which indicates a reduction of the theory uncertainties. Furthermore, the right panel of Fig. 9 shows that, for fixed λ, the inclusion of N 2 LO corrections (in particular the corrections to the QCD fragmentation function) causes an increase of the differential distribution in the region x ∈ (0.5, 0.7), compensated by a slightly lower peak. The N 2 LO shape follows the data more closely, with a consequent improvement in χ 2 .

Dependence on the value of α s (m Z )
For our best fit in Section 7.1 we used the world average α s (m Z ) = 0.1185 ± 0.0006 of the Particle Data Group [117]. There are by now several extractions of α s using event shapes in e + e − data that point to lower values of α s (m Z ) [53,[118][119][120][121]. To assess the impact of a lower value of α s , we repeated the analysis of Section 7.1, but with α s (m Z ) = 0.1135 ± 0.0011 [53]. For simplicity, we did not vary Γ 3 and γ S 2 , which have little effects on the fits. We also did not include the error on α s quoted in Ref. [53]. The variations of the other theory parameters produced a total of 1701 theory settings. In this case, we calculated the χ 2 for λ between 0.400 and 0.700 GeV, in steps of 0.025 GeV. The results of the fits, with statistical and theoretical errors, the value of χ 2 /dof for the default theory settings, and the average χ 2 /dof for all the fits are shown in Table 4. Comparing Tables 4 and 3, we see that the different value of α s (m Z ) has three effects. First of all, a smaller α s requires a larger value of λ. Secondly, we find that the fits improve, even when including OPAL data. The effect is mainly due to a lower tail. Finally, the theory error is decreased as regards the discussion in Section 7.1.
In Fig. 10 we show the theory bands obtained with α s (m Z ) = 0.1185 (blue band) and α s (m Z ) = 0.1135 (red band). The dashed blue and red lines are obtained with the default theory parameters. The red band is narrower as a consequence of the better fits and smaller theory error, and it is higher in the peak and lower in the tail relative to the blue band. Even if the effects are not dramatic, they lead to better agreement with the data, as shown in Fig.  11.

Comparison to the literature
The most recent extraction of the b-quark fragmentation function from e + e − data has been performed by M. Cacciari, P. Nason and C. Oleari in Ref. [4]. In this paper, the inclusive differential cross section for the production of a b-flavored hadron is considered at NLO.
The parameters a and b were determined by fitting to data from the ALEPH [54] and SLD [57] experiments. The best fit parameters are a = 24 ± 2 and b = 1.5 ± 0.2, with χ 2 /dof = 2.3. Away from the endpoint x ∼ 1, the main novelty of our work is the inclusion of N 2 LO corrections to the hard coefficient [77][78][79] and to the perturbative heavy quark fragmentation function [34,35], and the use of the three-loop time-like splitting functions [39][40][41] in the solution of the DGLAP equations. This allows us to reach N 2 LO + N 2 LL accuracy in the x < 1 region.
In the endpoint, using SCET and bHQET techniques, we are able to extend the resummation of soft logarithms of 1 − x by an additional order, reaching (approximate) N 3 LL, in the counting delineated in Section 4.5. Without going into a detailed comparison of the relation between resummation in perturbative QCD and SCET (which is carefully discussed elsewhere, for example in Ref. [122] in the context of threshold resummation, and in Ref. [110] in the case of event shapes), here we simply observe the very good agreement between the extraction of the HQFF carried out in this paper and in Ref. [4]. This can be seen in Figure 12, where we compare the fragmentation function we obtained by fitting the N 2 LO + N 3 LL cross section to data from the ALEPH, DELPHI and SLD experiments (blue band) to the fragmentation function extracted in Ref. [4] (red band). The bands only include the statistical errors of the fits, which is determined by the experimental errors. The fragmentation functions are evaluated at the scales µ = 91.2 GeV (left panel) and µ = 182.4 GeV (right panel). We verified that the agreement is good in a wide range of factorization scales. As we have discussed in Sections 7.1 and 7.2, the inclusion of an additional perturbative order allows for a reduction of the theoretical errors due to residual dependences on the scales µ H , µ J , µ M and µ S .
Another advantage of the EFT framework discussed in Sections 4.4 and 5 is a more immediate physical interpretation of the parameters of the nonperturbative model, due to the relation to local HQET matrix elements. Furthermore, the bHQET expansion can be systematically extended to include power corrections of O(Λ QCD /m Q ). bHQET power corrections constitute a large theoretical uncertainty in the determination of the HQFF, possibly as large as 5 − 10%. A separate investigation of these effects is needed.

Conclusion
In this paper we have derived a factorization theorem for the single inclusive production of heavy flavored hadrons in e + e − annihilation, using SCET and bHQET. We split the differential cross section into the tail region comprising moderate values of x and the peak region where x approaches 1. In each region we use a hierarchy of EFTs to systematically control theory errors, sum logarithms, and organize perturbative corrections. We then "sew" the two regions together using a prescription that smoothly goes from one region to the other. A crucial ingredient for this is the use of profile functions which allow us to change scales in various parts of the calculation, with the result that certain resummations are turned on and off depending on the value of x. The EFT approach allows us to achieve a clean separation of the perturbative and non-perturbative aspects of the fragmentation of a heavy quark. The non-perturbative information is parametrized by the hadronic shape function S hadr H/Q , whose moments are related to matrix elements of local bHQET operators. We eliminate renormalon ambiguities in the factorization of long-and short-distance physics contribution to the shape function by a suitable renormalon subtraction.
Using state-of-the art results for the fixed order expression in Eqs. (3.31) and (4.1), and for the anomalous dimensions in Eqs. (3.32), (4.26), (4.27), (4.30), (4.31), we evaluate the cross section at N 2 LO, with N 2 LL resummation of logarithms of the ratio of the heavy quark mass m Q and the center-of-mass energy Q, and N 3 LL resummation of logarithms of 1 − x in the endpoint. By fitting the theoretical cross section to e + e − annihilation data at the Z pole, we extract the b-quark fragmentation function at N 2 LO, one order higher than in the existing literature. We repeat the fits at NLO + N 2 LL and find that, by going to higher order, the size of theoretical errors is reduced. As shown in Fig. 12, the fragmentation function we extract is in good agreement with previous extractions [4].
One of the advantages of the EFT approach is a systematic control of the theoretical uncertainties, which stem from missing orders in the perturbative expansions, and from missing power corrections in the EFT. We study the former by varying the scales µ H , µ J , µ M and µ S . We find that at N 2 LO the cross section still has a noticeable dependence on µ M , the scale at which the heavy quark fragmentation functions is evaluated and the DGLAP evolution is started. This dependence leads to a 15% theoretical uncertainty on the fit parameter λ, as big as the statistical uncertainty. The cross section is much less sensitive to the remaining scale variations, which induce an error on λ of a few percents. The most important power corrections originate from the bHQET expansion, and are of order Λ QCD /m Q ∼ 10%. Though the fits to the e + e − data are in general very good, the inclusion of these power corrections might be important to achieve a better description of the tail of the distribution, where our prediction slightly overshoots the data.
The b-fragmentation function extracted in this work can be used in high precision calculations of B-meson production in other processes, such as hadronic collisions [22], and top quark decays [123]. Given the copious amounts of high quality data being produced by the experimental collaborations at the LHC, such a study is of the foremost interest.
The N 2 LO b-fragmentation functions extracted in this work, tabled in the LHAPDF format [124], are available from the authors upon request.
(PRISMA -EXC 1098) and DFG grant NE 398/3-1. CK was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT and Future Planning (Grants No. NRF-2014R1A2A1A11052687). EM acknowledge support by the US DOE Office of Nuclear Physics and by the LDRD program at Los Alamos National Laboratory. EM thanks D. Kang, Z. Kang, A. Hornig, Z. Ligeti, and F. Ringer for several interesting discussions. We thank in particular C. Lee for detailed comments on the manuscript. We thank P. Pietrulewicz for pointing out a mistake in Eq. A.3 in the first version of the paper, and for illuminating discussions on the role of rapidity logarithms in the threshold coefficient c thr 2 .

A Perturbative results
In this Appendix, we collect the fixed order expressions of the hard function, H Q , mass coefficient, C m , jet function, Jn and shape function, S Q/Q , which enter the factorization formula of the endpoint cross section in Eq. (4.1). These functions are known to O(α 2 s ). In Section A.2 we give the anomalous dimensions of these functions, and the solution of the RGEs.

A.1 Fixed order results
The hard function H Q , which encodes dynamics at the hard scale and it is obtained by matching QCD onto SCET I , is related to the quark time-like form factor, which was computed up to three loops [92]. For our analysis, it is enough to work at O(α 2 s ). At this order, H Q is given by [93][94][95][96]125] The color factors in Eq. (A.1) are C F = 4/3, C A = 3, and T F = 1/2. n f is the number of light flavors, n f = 5 above the bottom threshold.
The derivation of the one-loop matching coefficient between SCET M and bHQET was discussed in Section 4.2. The two-loop expression for C m can be obtained by comparing the singular terms of the perturbative fragmentation function in Ref. [34] to the two loop shape function [52], and it is given by , and n f = 4, since we are below the bottom threshold.
The mismatch of n f is made up by the 2-loop matching coefficient at the flavor threshold, discussed in Section 4.3 To express the jet and shape function, we need to introduce the distributions dr θ(r) log n (r) r + ϕ(r) = ∞ 0 dr log n (r) r ϕ(r) − θ(κ − r)ϕ(0) + 1 n + 1 log n+1 (κ)ϕ(0), (A.4) where r is a dimensionful variable, taking values in the (0, +∞) interval. In the case of the jet function, r has dimension two, and represents the virtuality of the jet, while for the shape function, r has dimension one. κ is an arbitrary cutoff, with the same dimensionality as r. The expression in Eq. (A.4) is independent of κ. It is convenient to define θ(r) log n (r/µ a ) r where a is the dimension of the variable r. The quark jet function has been computed to two loops in Ref. [99]. In terms of the distribution (A.5), we can express the jet function as with coefficients with β 0 the first coefficient of the QCD β function. The perturbative expression of the HQET shape function is also known to two loops [106].
The coefficients of the expansion in Eq. (A.8) are

A.2 Anomalous dimensions and solutions of the RGEs
Here we give the anomalous dimension and the solution of the RGE for each ingredient of the endpoint factorization formula, Eq. U S (ω, µ, µ 0 ) = exp (V (µ, µ 0 ) − 2K S (µ, µ 0 )) (µ 0n · ve γ E ) 2g(µ,µ 0 ) Γ (−2g(µ, µ 0 )) θ(ω) ω 1+2g(µ,µ 0 ) . (A.21) The first coefficients of the quark cusp anomalous dimension are [107][108][109] Γ 0 = 4C F , We use the Padé approximation for the unknown coefficient Γ 3 , where e Γ is one of the theory parameters we vary in our error analysis. We take e Γ from −2 to 2. Note that Γ 3 depends on the number of flavors and hence is different below and above the flavor threshold. For the default e Γ = 0 The non-cusp anomalous dimensions γ M and γ S are known at two loop [52,106]  (A.34) The three-loop non-cusp anomalous dimension γ S 2 and γ M 2 are not known. Their sum is constrained to be equal to the x → 1 limit of the anomalous dimension of the fragmentation function. From Ref. [

A.3 Renormalon subtraction
The renormalon subtracted perturbative shape function is given by [111] S Q/Q (ω) = 1 + δm Qn · v d dω + (δm Q ) 2 2 − δλ 1 6 n · v 2 d 2 dω 2 S Q/Q (ω). (A. 38) In the code, it is convenient to integrate by parts, and have the derivatives acting on the model S hadr H/Q . As we remarked in Section 5, Eq. (A.38) was originally derived for the shape function in B decays, but it can be applied to the fragmentation shape function. δm Q and δλ 1 are the shifts from infrared sensitive to infrared safe quantities. Here we used the 1S scheme for the heavy quark mass [115,116], and the "invisible scheme", introduced in Ref. [111], for the B meson kinetic energy. At the order we are working with R 1S = m 1S Q C F α s (µ S ) and m 1S Q = 4.66 GeV [117].
where R is a dimensionful quantity. We take R = 1 GeV, and do not vary it in the analysis of theoretical errors. The number of flavors in β 0 is n f = 4.

B Python code
Along with this publication, we release a python program for the DGLAP evolution of the b-quark fragmentation function 6 The program was written for python 2.7 and consists of five files. The computation is executed by running the main file "QCDcalc.py" with the python interpreter. To use the parallel capabilities of the program the package SCOOP [127] has to be loaded together with python, python -m scoop QCDcalc.py . This will automatically distribute the computation among all available local cpus. For more options and how to include remote machines we refer to the documentation of SCOOP [127]. The file "QCDcalc.py" also contains a section "OPTIONS" where all theory parameters and some numerical switches are collected to configure the computation. In addition, "QCDcalc.py" contains routines to do the calculation described in Section 3 and a routine for the solution of the DGLAP equation based on the brute force approach in Ref. [91].
"from fortran" is a python module created with f2py [128] from the Fortran code for the numerical evaluation of harmonic polylogarithms [129], the Fortran code [39] of the exact 2loop MS non-singlet coefficient functions for the fragmentation function F L [77] and F T [79], the Fortran code for the exact 3-loop MS non-singlet splitting functions P (2) N S [109,130] and the Fortran code for the differences between the time-like and space-like non-singlet splitting functions at second and third order in α s from Ref. [39].
"physics.py" contains all physics expressions including the hard function, soft function and renormalization group evolution kernel. Hence, all equations used for the calculations in Section 3 can be found in this file.
"convolution.py" contains integration routines and a function to numerically calculate convolutions in momentum fraction space. These routines can be used for convolutions of expressions other than the ones in "physics.py".
"mytools.py" contains some useful tools like a routine for parallelization using the package SCOOP [127] and a simple progress counter.