Parton distribution functions in the context of parton showers

When the initial state evolution of a parton shower is organized according to the standard"backward evolution'' prescription, ratios of parton distribution functions appear in the splitting probabilities. The shower thus organized evolves from a hard scale to a soft cutoff scale. At the end of the shower, one expects that only the parton distributions at the soft scale should affect the results. The other effects of the parton distributions should have cancelled. This means that the kernels for parton evolution should be related to the shower splitting functions. If the initial state partons can have non-zero masses, this requires that the evolution kernels cannot be the usual MSbar kernels. We work out what the parton evolution kernels should be to match the shower evolution contained in the parton shower event generator Deductor, in which the b and c quarks have non-zero masses.


Introduction
In a companion paper [1], we have introduced a parton shower event generator, Deductor [2], that is designed to be amenable to improved treatments of spin and color. This event generator is based on our earlier work [3,4]. Methods for an improved treatment of spin are described in ref. [5] and methods for an improved treatment of color 1 are described in ref. [6]. This shower generator contains features that differ from other parton shower event generators even when one uses the leading color approximation and averages over spins, as we do in ref. [1]. Two of these features are important for this paper.
The first feature is that we use a shower evolution variable defined by the virtuality in a splitting divided by the energy of the mother parton. In a second companion paper [7], we argue that this choice is advantageous because, at leading order in QCD, it factors hard interactions from softer interactions at the amplitude level.
The second feature is that we allow initial state partons to have nonzero mass. We regard up, down, and strange quarks to be effectively massless. The top quark is so heavy that we do not treat it as a possible constituent of the proton. This leaves the bottom and charm quarks, which do appear in the initial state as constituents of the proton. We take m b and m c to be non-zero. Do parton masses matter? We are interested in using parton shower evolution to examine what happens when there is a hard process with a momentum scale Q of 100 GeV or more. At this scale, m b and m c do not matter. However, a b or c quark that participates in the hard interaction at scale Q ultimately arose from an initial state g → b +b or g → c +c splitting. If the virtuality scale for this splitting was somewhere around the heavy quark mass, then the mass does matter.
These two features of the parton shower evolution have implications for the parton distribution functions used in the shower. This paper concerns these implications.
The general analysis of ref. [3] gives the equations for parton shower evolution with a full quantum treatment of color and spin. The analysis in ref. [1] makes the leading color approximation and averages over spins. The issues of how masses enter the evolution equation for an initial state shower apply both with and without color and spin. Accordingly, in this paper we address these issues using the definitions of ref. [3] with full color and spin.
We begin the analysis of this paper in section 2 by outlining some of the important features of the shower evolution used in Deductor and then defining the general structure of the evolution equations needed for parton distributions used in the shower. This is not completely straightforward because of the presence of quark masses and because the shower evolution uses parton distribution functions at fixed shower time rather than fixed MS renormalization scale. In section 3, we review the definitions from refs. [3,4] of the operators that generate shower evolution and relate these to "perturbative" versions of these operators, which differ from the full versions by not containing factors of ratios of parton distribution functions. In section 5, we use this analysis to argue that the kernels in the evolution equation for parton distributions must bear a simple relationship to the splitting kernels in the shower evolution operators. Some work is needed to derive the needed functions from the shower evolution operators of refs. [3,4]. This analysis is placed in an appendix A. With the needed functions from shower evolution, one determines the part of the parton evolution kernels that involve splitting variables z not equal to 1. There are δ(1 − z) terms that we find in section 6 by using flavor and momentum sum rules. We state the results for the parton evolution kernels including masses in section 7. If one starts parton evolution at a low scale Q fit with fixed input distributions, then at a high scale the parton distributions defined with shower evolution will differ from those defined with MS evolution. In section 8, we derive a lowest order perturbative relation for this difference. In section 9, we display numerical results for the difference between shower parton evolution and MS parton evolution. In section 10, we record a modification at next-to-leading order to the parton evolution that is used in Deductor. We offer some concluding remarks in section 11. 2 Parton evolution and shower evolution Consider the following scenario. Two hadrons, A and B, collide to produce a final state system, for example a W boson plus a jet. The final state system has momentum Q 0 . Now, the parton shower evolution simulates the development of the final state and also the development of the initial state. In each case, the development works from relatively hard interactions to softer interactions. In the case of the initial state, this means going backwards in physical time [8,9].
At each stage in the shower, the incoming partons are defined to be on shell with zero momentum transverse to the beam directions. Of course, in the exact Feynman diagrams that describe the shower, the initial state partons are not exactly on shell. It is part of the shower approximation that we treat them as being on shell. It is also an approximation to treat the incoming partons as if they had zero transverse momenta. However, this approximation is not as drastic as it seems. At each initial state splitting, the momenta of the final state particles are adjusted as described in section 7.3 and appendix A of ref. [7] to account for the recoil from the transverse momentum of the initial state splitting. For this reason, the transverse momentum of a Z-boson produced in the Drell-Yan process is correctly generated [10].

Initial state parton splitting
Let us look at the kinematics of initial state parton splitting. We define momenta p A and p B associated with the hadrons. These obey (p A + p B ) 2 = s, but we do not take p A and p B to be exactly the hadron momenta. Rather, it is convenient to define p A and p B so that p 2 A = p 2 B = 0. 2 At any time in the shower, the incoming partons have momenta p a and p b . These are defined to be on shell, with flavors a and b, with masses m(a) and m(b), and with zero momentum transverse to the beam. We define momentum fractions η a and η b so that the momenta p a and p b are (2.1) Now suppose that parton "a" splits in the sense of backwards evolution. Before the splitting, suppose that there were m final state partons plus the two initial state partons. We denote momenta after the splitting by momentum vectors with hats,p. The momentum 2 We never need the exact hadron momenta, but note here that in the case that both hadrons are protons, of parton "b" remains the same:p b = p b . Parton "a" after the splitting has a new momentum fractionη a and possibly a new flavorâ: (2. 2) The splitting creates a new final state parton with label m+1, flavorf m+1 , and momentum p m+1 . Parton m + 1 is on shell:p 2 m+1 = m(f m+1 ) 2 and typically has some transverse momentum. Momentum is not locally conserved in the splitting:p a −p m+1 = p a . Rather, we conserve momentum globally by making a small Lorentz transformation on the final state spectator partons:p j = Λp j for j = 1, . . . , m. (See section 7.3 and appendix A of ref. [7].) The momentum fraction splitting variable is z = η a /η a . The spacelike virtuality in the splitting is We divide the virtuality by 2η a p A · Q 0 to define the shower time t of the splitting, 3 is a parameter that is fixed throughout the initial state shower. Thus µ 2 A is twice the energy in the hard scattering process times the energy of hadron A, both as measured in the c.m. frame of the hard scattering process. 4

The strong coupling
The probability for an initial state splitting is proportional to α s . What should be the argument of α s ? We use α s (λ R k 2 T ), where λ R ≈ 0.4 is given in eq. (10.3) below and k 2 T = (1 − z)µ 2 . These choices are helpful for improving the summation of large logarithms arising from the emission of soft gluons [10,12]. However, for the analysis of the evolution of parton distributions in this paper, it is more convenient to use α s (µ 2 /z). These are related by (2.6) where β 0 = 33 − 2n f /(12π) is the first coefficient in the QCD β function. If we used this order α 2 s correction in the analysis of this paper, it would suggest order α 2 s corrections to parton evolution. However, higher order corrections to the shower splitting function would 3 In ref. [7], we use the dimensionful variable Λ 2 = Q 2 0 exp(−t) to express the shower ordering definition, but in this paper it is more convenient to use the dimensionless variable exp(−t). 4 Compare this to the scale variable ζA = (2pA · Q0) 2 /Q 2 0 used in ref. [11] to aid in factoring transverse momentum dependent parton distributions from the hard process. also lead to order α 2 s corrections to parton evolution. We do not know what the shower splitting function should be beyond the leading order, so we ignore these corrections to parton evolution with one exception: since the β 0 log(λ R ) correction to parton evolution is so simple, we add it in section 10 below.

The role of parton distributions
What parton distribution function describes the mother parton at the time of the splitting? We take it to be a function f a/A (η a , µ 2 ). If all partons were massless, we could use the MS definition [13] of parton distribution functions. These functions, f MS a/A (η a , µ 2 ), obey the standard DGLAP evolution equations [14]. However, the partons are not all massless, so f a/A is a possibly different function from f MS a/A . We assume that the first order evolution equation for f a/A has the form When all of the partons are massless, P aâ (z, µ 2 /z) is the standard DGLAP kernel, which does not depend on the scale parameter in this case. When some quarks have masses and one uses MS parton distribution functions, then one conventionally switches between an (n − 1) flavor scheme and an n flavor scheme when µ 2 becomes large enough. Specifically (working to order α s in the evolution equations), if m is the mass of one of the quarks q, then for µ 2 < m 2 one sets f MS q/A (η a , µ 2 ) = 0, while for µ 2 > m 2 one lets f MS q/A (η a , µ 2 ) = 0, with evolution determined by the normal DGLAP splitting functions with f MS q/A (η a , m 2 ) = 0 as a boundary condition. Thus effectively the g → q splitting kernel is (2.8) We will see in this paper that, with masses, we will need some extra terms that depend on the relevant squared parton mass m 2 . The mass dependence will not be the same as in eq. (2.8). It will be a convenient convention for us to take the second argument of P aâ to be µ 2 /z. With our choice of shower time, the dimensionful variable µ 2 A e −t defines the shower time for an initial state splitting in hadron A. We have Inside of shower evolution, we use the function f a/A (η a , µ 2 ) to describe the parton distribution, but with a different notation that emphasizes the separate roles of the momentum fraction η a and the shower time t: (2.10) This function represents the probability to find a parton with flavor a and momentum fraction η a at shower time t given by eq. (2.9).
Using eq. (2.10), the corresponding evolution equation forf a/A η a , µ 2 /η a is (2.11) This is the exact evolution equation forf a/A (η a , µ 2 /η a ). For our purpose of analyzing shower evolution, we will want to drop some contributions that are higher order in α s so that we write On the right hand side of the equation, we have changed the scale argument of α s from µ 2 to µ 2 /z and we have changed the scale argument off from zµ 2 /η a to µ 2 /η a . Using the renormalization group equation for α s and the evolution equation eq. (2.11) forf , we see that these scale changes correspond to higher order adjustments to the evolution equation, denoted by the +O(α 2 s ) notation in eq. (2.12). Since we will be dealing with parton shower evolution only at leading order in α s , these higher order terms will not concern us. We will find that when the partons have mass, terms beyond those of the customary DGLAP evolution kernel are needed in the evolution kernel P aâ (z, µ 2 /z) in eq. (2.12). These extra terms appear at leading order in α s .
The evolution kernel P aâ z, µ 2 /z is not an ordinary function but a distribution, with singular behavior as z → 1. We can specify part of the structure of the kernel and write the same equation using ordinary functions by writing (2.13) Here the lower limit of the z-integration is z = 0. However, we definef a/A (η a , µ 2 /η a ) = 0 for η a > 1, so that in the first termfâ /A (η a /z, µ 2 /η a ) = 0 unless z > η a . The upper limit is infinitesimally less than z = 1. The kernel P aâ z, µ 2 /z in the first term is an ordinary function. We anticipate that forâ = a, P has a singularity as z → 1 of the form 2C a /(1−z).
Here, as we will find later, C a is C F or C A for quarks and gluons, respectively. The same constant C a appears in the second term, so that the singular behavior is cancelled. There is also a term γ a , which we allow to depend on quark masses and on µ 2 . We will have to determine γ a .

The perturbative splitting operators
The shower evolution of ref. [3] is based on the evolution equation Here ρ(t) represents the state of the system at shower time t and H I (t) and V(t) are operators on the space of states; H I (t) describes splitting, increasing the number of partons by one, while V(t) describes virtual graphs and unresolved splittings, leaving the number of partons unchanged. See ref. [1] and ref. [3] for a more complete description. The splitting operator H I (t) contains a factor with a ratio of parton distribution functions. Specifically, suppose that we start with a basis state 5 Here there is a ratio of parton distribution functions after the splitting to parton distribution functions before the splitting. That is because before the splitting, the probability for the system to be in the specified state is proportional to parton distribution functions . After the splitting the probability is proportional to parton distribution functions with the new variables. Thus we need to cancel the old parton distribution functions and introduce the new ones. Coming along with the parton distribution functions there is a ratio of kinematic factors η a η b and there is a ratio of color factors n c (a)n c (b), where n c (a) is the number of colors of a parton with flavor a. The rest of the matrix element of H I (t), denoted here as a matrix element of a new operator H pert I (t), is rather complicated but contains no factors of parton distribution functions. It is precisely the ratio of parton distribution functions in eq. (3.2) that interests us in this paper. This ratio is standard in modern parton shower event generators and it is needed for an efficient generation of parton splittings. However, there is a sense in which a dependence on parton distribution functions should not be there. The very splittings described in H pert I (t) are the splittings that generate the evolution of the parton distribution functions. Thus we should not need parton distribution functions to describe the splittings. The only parton distribution functions that we should need consist of one factor of parton distributions at the low virtuality end of the parton shower. Indeed, roughly this idea was present from the beginning of the development of parton showers with backwards evolution [8,9]. The formulation used in Deductor follows most closely that of ref. [8].
In order to investigate this idea, let us define an operator F(t) that multiplies by the parton distribution factor that relates the cross section to a squared matrix element, Then the operator H pert How should we define the corresponding operator V pert (t)? To find out, first define a shower state vector ρ pert (t) that has the parton distribution factor removed: (3.5) The evolution equation for ρ pert (t) can be determined from the evolution equation (3.7) We can write this as Here H pert Here we have noted that F(t) commutes with V(t) since V(t) does not change momenta or flavors. Now we can make a couple of observations. First, the starting value of ρ pert (t) at the time t 0 that corresponds to the hard interaction does not contain parton distribution functions because we removed this factor from ρ pert (t) . Second, if we let ρ pert (t) evolve to some late shower time t f , then we can recover the full shower state at t f using (3.10) Thus ρ(t f ) contains the proper product of parton distribution functions as long as ρ pert (t f ) , like ρ pert (t 0 ) , does not depend on parton distribution functions. This means that the evolution from t 0 to t f should not have introduced any dependence on parton distribution functions. Now, the operator H pert I (t) in the evolution equation (3.8) for ρ pert (t) does not contain any parton distribution functions by construction. However, in eq. (3.9) for V pert (t), the operator V(t) does contain explicit parton distribution function factors. Additionally, F(t) −1 dF(t)/dt contains parton distribution functions. Because of the differentiation with respect to t, it also contains the evolution kernel for the parton distribution functions. Thus, what needs to happen is that the evolution kernel for the parton distribution functions has the right form compared to the functions in V(t) so that the dependence on parton distribution functions cancels between the two terms in eq. (3.9), at least after applying suitable kinematic approximations that correspond to the parton splittings in the shower being approximately collinear or soft. This is the issue that we will investigate in the following sections.

Shower kinematics
We will want to examine the evolution of V pert (t). For this purpose, we need some kinematic variables for the initial state shower.
At shower time t, an initial state parton from hadron A with momentum fraction η a can become a new initial state parton with momentum fractionη a with the emission of a new final state parton with momentump m+1 . The ratio of momentum fractions is η a /η a = z.
It is useful to define a dimensionless virtuality variable That is, y is the virtuality in the splitting divided by the total current squared c.m. energy of the colliding partons, η a η b s. At the first initial state splitting, y is much smaller than 1 as long as the first splitting is close to being collinear or soft. At each subsequent initial state splitting, t is larger than in the previous splitting and η b is the same or larger. Thus y gets smaller at each splitting. For this reason, in a parton shower it is a good approximation to assume y 1. It is also useful to define a dimensionless mass squared variable For u, d, and s quarks we can take ν(f ) = 0. For c and b quarks, ν(f ) = 0. However, we are interested in hard processes for which the scale is much greater than squared quark masses: Thus ν(f ) 1 at the start of the shower. At each subsequent initial state splitting, η a and η b are the same or larger than they were at the start of the shower. For this reason, in a parton shower it is a good approximation to assume ν(f ) 1.
5 Determining P aâ (z, µ 2 /z) at finite z As argued in the section 3, we want to arrange that the virtual splitting operator does not involve parton distribution functions after suitable kinematic approximations are applied.
Let us look at the second term in V pert (t). Our partonic basis states are eigenfunctions of this operator: with a corresponding expression for λ F b . Using the parton evolution equation (2.13), this is 3) The first term in λ F a involves a ratio of parton distribution functions. We need to somehow make this term go away.
We now look at the first term in V pert (t), namely V(t). This operator has a contribution for each initial state or final state parton, The contributions V l (t) from final state partons do not contain any factors of ratios of parton distributions, so we can ignore these terms. There are two choices for initial state partons, but V b (t) has the same structure as V a (t), so we can concentrate on V a (t). The operator V a (t) contains two kinds of terms, The term V aa (t) is derived from parton splittings in which parton "a" splits in the ket state and in the conjugate bra state. We will return to it shortly. The terms V ak (t) are derived from interference graphs in which parton "a" emits a gluon in the ket state but a different parton, k, emits the gluon in the bra state or in which parton "a" emits a gluon in the bra state and parton k emits the gluon in the ket state. The action of V ak (t) on a basis state has a simple form, That is, V ak (t) leaves momenta, flavors, and spins unchanged but acts according to a matrix in color space. The color-space matrix has the form (We choose the scale arguments of α s andf a/A as discussed in sections 2.2 and 2.3.) We need to understand the structure of the function g ak . It contains a color matrix that need not concern us here and a function A ak that defines how much of the interference graph is attributed to a splitting of parton "a" and how much is attributed to a splitting of parton k. The only important feature of A ak is that it is everywhere finite. The essential factor in g ak is the eikonal approximation to the Feynman graph, where D(p m+1 ) µν is the polarization sum for the emitted gluon in Coulomb gauge. This factor is singular in the region of wide angle soft gluon emission, but it is not singular when gluon m + 1 becomes collinear with p a or p k . Now, at small y, we integrate over z. There are three integration regions to consider: the collinear region y 1 is important. For that reason, in eq. (5.7) we can approximate z by 1 in the parton distribution functions (and also elsewhere). This gives There is no factor of the ratio of parton distribution functions in the contribution to V(t) from λ V ak (t), so this contribution does not help to cancel the ratio of parton distribution functions in eq. (5.3).
To avoid confusion, let us note that the functions g ak are important in the parton shower. They help determine the part of the development of the parton shower that comes from soft gluon emissions. However, they do not play a role in the present analysis because, in the limit of small y, they do not multiply parton distribution functions.
Next, we examine V aa (t), which contains the functions that we will really need. The states {p, f, s , c , s, c} m are eigenvectors of V aa (t): The eigenvalue λ V aa is made of a factor of α s , a ratio of parton distribution functions, and a certain function g aâ : Again, we choose the scale arguments of α s andf a/A as discussed in sections 2.2 and 2.3. The function g corresponds to parton splittings in which a is the flavor index of the parton after the splitting andâ is the flavor index of the parton before the splitting (thinking of the process going forward in time). We need to understand the structure of the functions g aâ . The function g aâ is rather complicated, but it is simple when y 1 and ν(f ) 1 for f = a,â, b, with no requirement on the ratio of y to ν(f ). This function appears inside an integration over z and both the regions of finite (1 − z) (the collinear region) and of (1 − z) ∼ y 1 (the wide angle soft region) are important in the integration. The intermediate region, (1 − z) y 1, is also important. In the wide angle soft region, the structure of g aâ is particularly simple, The constraint (1 − z) > y arises from the kinematics. It is useful to write this as (5.14) With this notation, we find that when y 1 and ν(f ) 1 we have The function G aâ is an ordinary function of its arguments and is independent of the arguments in {p, f } m other than η a and a. We leave a detailed determination of this function for the appendix A. Inserting eq. (5.15) into eq. (5.11), we have + O(y, ν(a), ν(â)) .

(5.16)
Now, consider the second term on the right hand side of eq. (5.16). The only region of the z integration that matters for y 1 is the region (1 − z) y. We can make further use of the approximation y 1 by replacing z by 1 in the factor 1/z, the argument of α s , and, more importantly, in the argument of the parton distribution functions. This removes the ratio of parton distribution functions from this term. We are left with + O(y, ν(a), ν(â)) + O(α 2 s ) .

(5.18)
We see that the ratio of parton distribution functions disappears from V pert (t) if P aâ z, µ 2 /z = G aâ z, µ 2 /z .
We compute G directly from the splitting functions in the shower and this determines the evolution kernel P for the parton distributions at finite values of (1 − z).
6 Determining γ a (μ 2 ) Eq. (5.19) gives us P aâ z, µ 2 /z at finite (1 − z) from the small y limit G aâ of the initial state splitting functions in the shower. However the full splitting function is actually a distribution, with singular behavior at z → 1, as indicated in eq. (2.13). We need to determine the constants γ a (μ 2 ) that appear in eq. (2.13). Essentially, these constants multiply δ(1 − z) in the evolution kernel and are thus not present at finite (1 − z). However, we can determine the constants γ a (μ 2 ) from the momentum and flavor sum rules that guarantee that the total longitudinal momentum of the partons sums to the total longitudinal momentum of the proton and that the total flavor quantum numbers of the partons sums to the total flavor quantum numbers of the proton.
To proceed in a unified fashion, consider the quantity If we take c a = 1 for all a , then the momentum sum rule implies that this quantity should be zero. If we let q be a quark flavor and take then the flavor sum rule for flavor q implies that this quantity should be zero. Using eq. (2.13), we see that for either kind of sum rule In the first term we change variables from η a toη a = η a /z, giving .

(6.4)
With a little manipulation, this is .

(6.5)
The coefficient offâ /A (η a , µ 2 A e −t ) must vanish. Thus we need (setting η a µ 2 A e −t =μ 2 ) We now write this out in detail. The only nonzero functions Pâ a are those for which there is a first order splitting graph forâ → a + f for some flavor f . Thus for any quark or antiquark flavors q and q, P q q = 0 unless q = q. Let us examine the flavor sum rule for flavor q. Takingâ = q, we have Thus eq. (6.8) is automatically satisfied. This leaves eq. (6.7), which determines γ q . Now let us examine the momentum sum rule. Takingâ = g, we have Here we sum over quark flavors Q = {u, d, c, s, b}, not antiquark flavors, and then multiply the quark term by 2. Takingâ to be a quark or antiquark flavor q, we have Now, eq. (6.10) determines γ g . Then eq. (6.11) would determine γ q except that we have already determined γ q in eq. (6.7). For these equations to be consistent, we need 0 = 1− 0 dz z P qq z,μ 2 − P qq z,μ 2 + 1− 0 dz z P gq z,μ 2 . (6.12) Changing variables from z to 1 − z in the second integral, this is The two functions P qq and P gq both describe the splitting q → q + g and differ by whether it is the quark or gluon that goes on to the hard interaction. We will find that these two functions are related by P qq z,μ 2 = P gq 1 − z,μ 2 . (6.14) Because of this relation, the two formulas for calculating γ q give the same result.

(7.2)
Note that the relation (6.14) that allows a consistent calculation of γ q (µ 2 ) does indeed hold.
The theta functions that provide a lower limit on µ 2 for a given z in eq. (7.1) are easy to understand. For P qq we consider a splitting of a quark with momentump a = (p + a ,p − a ,p a ) given byp A daughter gluon is emitted into the final state with momentum This leaves a daughter quark heading toward the hard interaction carrying momentum p a −p m+1 . The daughter quark has virtuality µ 2 = −(p a −p m+1 ) 2 + m(q) 2 given by The minimum virtuality occurs when the transverse momentum k vanishes and we find (1 − z)m(q) 2 < µ 2 , as in the first equation in eq. (7.1). The other cases follow similarly. The momentum sum rule constant γ g (µ 2 ) is of special interest. When µ 2 is very large, each of N f flavors of quark contributes and we have However, when µ 2 decreases to close to 4m(q) 2 for some flavor of quark, the contribution of that flavor begins to turn off because the splittings g → q +q turn off. For µ 2 < 4m(q) 2 , the contribution from quark q turns off entirely.

Difference between pdf 's with and without mass
The parton distribution functions f a/A (η a , µ 2 ) evolve according to eqs. (2.7), (7.1), and (7.2). The MS parton distribution functions evolve according to the same equation with all of the quark masses set to zero, but with boundary conditions that set the quark distributions for heavy quarks to zero for µ 2 < m 2 , as in eq. (2.8). For the purposes of this section, let us choose a modification of the MS scheme in which the boundary condition is at µ 2 = λm 2 for some λ that is possibly not 1. We can call this the MSλ prescription. Thus the effective g → q evolution kernel is Let us work in a five flavor theory with the charm and bottom quark masses non-zero, while other quark masses are set to zero. Let us suppose that we set parton distribution functions f a/A (η a , µ 2 ) equal to the MSλ parton distribution functions when the scale is smaller than the charm mass squared: It is of interest to calculate the order α s contribution to these differences. Consider, for example, the change in the bottom quark distribution. Evidently where ∆P bâ (z, µ 2 ) = P bâ (z, µ 2 /z) − P MSλ bâ (z, µ 2 ) .

(8.5)
We first note that the contribution fromâ = b can be neglected because f MSλ b/A (η a /z, µ 2 ) is nonzero only for µ 2 > λm(b) 2 and the kernel is significantly nonzero only for µ 2 ∼ m(b) 2 .
In this region f MSλ b/A (η a /z, µ 2 ) is itself of order α s , so thatâ = b contribution to eq. (8.4) is of order α 2 s . This leaves the contribution fromâ = g. If we integrate the differential equation, we have Because of the structure of ∆P bg (z,μ 2 /z), the most important contributions for large µ 2 to the integration overμ 2 come fromμ 2 somewhere around m(b) 2 . A reasonable estimate of the most important integration region isμ 2 ∼ 4m(b) 2 . Thus at order α s we can set µ 2 = 4m(b) 2 in the argument of f MSλ a/A (η a /z, µ 2 ) and α s (µ 2 ). This gives We learn three things. First, ∆f b/A (η a , µ 2 ) = 0 for µ 2 < min(1, λ) m(b) 2 because both versions of the parton distribution for b quarks vanish there. Second, ∆f b/A (η a , µ 2 ) changes as µ 2 increases; if λ = 1, it becomes negative because the splittings g → b +b turn on more slowly with a physical treatment of the threshold than with the MS treatment. Third, for very large µ 2 , the difference stays finite because the integral in eq. (8.8) is finite in the limit µ 2 → ∞. In fact This function is negative with a logarithmic singularity for z → 1. For small z it is positive. If we choose the standard MS prescription λ = 1, then the relevant convolution with the gluon distribution is negative, so that there are fewer bottom quarks with the shower evolution of the partons than with MS evolution, as we will see in section 9. Increasing λ makes the difference with the MSλ bottom quark distribution smaller. Evidently, analogous results apply for the charm quark distribution. For the gluon distribution, similar reasoning gives The evolution kernel P gg is the same as the MSλ version except for the term γ g (µ 2 ) δ(1−z). Thus After performing the integration, the result for large µ 2 is very simple The sum simply counts the number of quarks treated as massive, which is normally 2. The coefficient of δ(1 − z) is generally not large. It is positive for λ = 1 and vanishes when λ = e 5/3 ≈ 5.3.

Behavior of the parton distributions
The parton distribution functions introduced in this paper have a different definition from the conventional MS parton distributions. Thus one should fit them to data using perturbation theory for deeply inelastic lepton scattering and other hard scattering processes that help to determine parton distributions. Needless to say, this is a very big project and we have not attempted it. However, parton showers are, at least at present, accurate only to lowest order in QCD perturbation theory. At this order, we may hope that the following scheme suffices. We take a standard set of MS parton distributions. For this paper, we have used the MSTW 2008 leading order central fit [15]. 7 These are defined by applying ordinary MS evolution to the parton distributions at a starting scale Q fit . For the MSTW 2008 set, the starting scale is Q fit = 1 GeV. Instead, we can define shower parton distributions by applying the evolution equation (2.7) to the parton distributions at the starting scale Q fit . In this section, the parton distributions thus defined are labelled simply as "shower." We also display distributions labelled as MS, which are defined by applying the standard MS lowest order evolution to the parton distributions at the starting scale Q fit . This is the same as the MSTW 2008 LO set. Finally, we display distributions labelled as MSλ, which are defined by letting the partons evolve from the starting scale Q fit using the standard MS lowest order evolution kernels but with the boundary condition that heavy quark evolution (for c and b quarks) starts at µ 2 = λm 2 , as discussed in the previous section. This amounts to redefining the renormalization prescription for the heavy quark distribution functions so that one subtracts not only an ultraviolet pole term and a conventional finite term proportional to (log(4π) − γ E ), but also a finite term proportional to log(λ). Naturally, this would entail a corresponding change in the factorization subtraction for next-to-leading order hard scattering graphs. In this section, we choose λ = 4, so that the heavy quark threshold is at µ 2 = 4m 2 .
Consider first what happens at the hard scattering that serves as the starting point for parton showers. Suppose that the hard scattering has scale µ 2 =ŝ. Then the partons that produce the hard scattering have momentum fractions η given byŝ = η a η b s. Assume that the hard scattering is at central rapidity, so that η a ≈ η b . Then η a ≈ η b ≈ µ/ √ s. Thus we use parton distribution functions f a/A (µ/ √ s, µ 2 ). We take √ s = 14 TeV. In figure 1, we plot the ratio of f a/A (µ/ √ s, µ 2 ) for b quarks to the corresponding b-quark distribution function in the MS prescription. We see that in the interesting range 100 GeV < µ < 2000 GeV this ratio is around 0.8. The reason, of course, is that physical b-quark evolution starts more slowly than MS evolution. This is a perturbative effect, as analyzed in the previous section. We show also the ratio of f b/A (µ/ √ s, µ 2 ) in the MSλ prescription with λ = 4 to the b-quark distribution function in the standard (λ = 1) MS prescription. This ratio is also around 0.8. Now we look at the parton distributions from the point of view of the shower. We considerf a/A (η, q 2 ) as a function of the momentum fraction η at fixed shower time t, with q 2 = µ 2 A e −t . These functions are related to the functions f a/A (η, µ 2 ) by eq. (2.10), which givesf a/A (η, q 2 ) = f a/A (η, η q 2 ) . (9.1) In figure 2, we plot the b-quark distribution in a proton, f b/p (η, η q 2 ), as functions of η at fixed q 2 . If we imagine starting at a hard interaction at central rapidity with a scale Q 2 0 ≈ (640 GeV) 2 , then η ≈ Q 2 0 /s ≈ 0.05. With ηq 2 = Q 2 0 we have q ≈ 3000 GeV. In the first panel of figure 2, we show f b/p (η, η q 2 ) versus η at q = 3000 GeV. We also show the b-quark distributions in the MS convention and in the MSλ convention with λ = 4. Now with "backward evolution" for the initial state, we move to smaller q 2 and larger η. In the second panel of figure 2, we show the b-quark distribution versus η at q = 1000 GeV. The value of the b-quark distribution has started to decrease, which means that shower evolution will often turn a b quark into an incoming gluon. In the third panel, we show the b-quark distribution versus η at q = 300 GeV. The value of the b-quark distribution has now decreased dramatically: a substantial fraction of the b quarks have been turned into incoming gluons. Finally, in the fourth panel, we show the b-quark distribution versus η at q = 20 GeV. This is very close to the threshold. All but a small fraction of the b quarks have disappeared and only a limited range of η is allowed for those that remain. With MS evolution, many more b quarks would remain. That is, the discrepancy is substantial between evolution that follows the Feynman diagrams for g → b +b with m b > 0 and MS evolution.
One may wonder what happens to the gluon distribution. In figure 3, we show the distribution f g/p (η, η q 2 ) for gluons at fixed q 2 = (3000 GeV) 2 for shower, MS, and MSλ parton distributions for λ = 4. We note that there is hardly any difference.

A small modification
If we were to add one more order of perturbation theory to our parton evolution, we would have the evolution equation aâ z, µ 2 /z fâ /A (η a /z, µ 2 /η a ) . In our analysis, we have regularly dropped contributions to P (2) aâ , since we have only a leading order shower. However, we find it helpful to use a limited version of P (2) aâ in our parton evolution: P (2) where β 0 = (33 − 2n f )/(12π) is the first coefficient in the QCD β function and where [12] λ The terms in eq. (10.2) proportional to 1/(1 − z) appear in the exact MS parton evolution kernel at next-to-leading order. This modification of the evolution kernel amounts to changing µ 2 in the argument of α s in the evolution equation to λ R µ 2 . In fact, we do include a factor λ R in the argument of α s in shower evolution, as discussed in section 2.2. Accordingly, we also use α s (λ R µ 2 ) in place of α s (µ 2 ) in the evolution equation for the parton distributions. Thus we effectively include the term given in eq. (10.2) in the evolution of the parton distributions used in Deductor. However, we have not used this modification in the comparisons of massive and massless evolution presented in the sections 8 and 9.

Conclusions
When the initial state evolution of a parton shower is organized according to the standard prescription of ref. [8], the probabilities for parton splittings involve ratios of parton distribution functions. We have argued that in order for this to be physically consistent, the kernels of the evolution equation for the parton distributions need to be consistent with the splitting functions in the shower. In the case that the initial state partons can have non-zero masses, as in Deductor [1], this means that the parton evolution kernels cannot be the standard MS kernels.
In this paper, we have deduced what the revised parton evolution kernels should be in order to match the shower evolution in Deductor to first order in α s .
Numerical investigations presented in section 8 show that the modification of the evolution strongly affects the distribution functions for heavy quarks at evolution scales comparable to the square of the heavy quark mass. This effect shrinks as the evolution scale increases. The gluon distribution function is not much affected at any scale.
There is work to be done to understand these issues better. We would like to see what happens, for instance, if we keep non-zero masses but use k T ordering for the shower evolution instead of the ordering specified in eq. (2.4). We would also like to have an operator definition of the modified parton distribution functions, analogous to that for MS parton distribution functions [13].

Acknowledgments
This work was supported in part by the United States Department of Energy and by the Helmoltz Alliance "Physics at the Terascale." We thank Voica Radescu of the HeraFitter group for providing the parton distribution functions that we use in the Deductor code.

A The splitting functions
We have argued that the parton splitting functions P aâ z, µ 2 /z should be given by eq. (5.19), which equates these functions to functions G aâ z, µ 2 /z that are defined in eq. (5.15) to be the small virtuality limits of parton splitting functions g aâ z, µ 2 /z, {p, f } m . In this appendix, we calculate the functions G aâ z, µ 2 /z .

A.1 The virtual splitting operator and the splitting functions
In order to find G aâ z, µ 2 /z , we seek the splitting function g aâ that appears in eq. (5.11), which we repeat here: Here λ V aa is the eigenvalue of a part V aa (t) of the virtual splitting operator, as defined in eq. (5.10), As explained in section 5, the subscript "aa" here means that we are to take the part of H I (t) that describes splittings of incoming parton "a" and comes from graphs (in a physical gauge) in which parton "a" splits in both the quantum ket state and the quantum bra state. There are other contributions V ak that come from interference graphs. As explained in section 5, these are soft gluon contributions and do not contain a ratio of parton distribution functions, so we can ignore them. To analyze eq. (A.5), we begin with eqs. (12.20) and (12.21) of ref. [3]. We construct H I,aa (t) by keeping only the terms corresponding to "aa" graphs: This formula requires a bit of explanation. We examine the integration measure dζ p in the following subsection. The delta function defines the shower time according to eq. (2.4), which is different definition than that used in ref. [3]. The parton flux factor is described in eq. (3.2). The splitting functions w aa are given in ref. [3]. The symbol g ak represent an operator on the color space 8 that obeys the color identity , (q,q), (g, q) or (g,q) C A (â, a) = (g, g) T R (â, a) = (q, g) or (q, g) . (A.9) 8 We hope that these color operators g ak ({f }m+1), defined in ref.

A.2 Initial state splitting kinematics
In order to proceed, we need to specify in some detail the kinematics of an initial state splitting in our version of a parton shower, in which partons can have non-zero masses and in which we implement momentum conservation in a somewhat different way from other parton shower algorithms.
We consider a splitting of initial state parton "a" with momentum p a . In general, we denote the momentum of parton i before the splitting by p i and after the splitting byp i . Before the splitting, there are m final state partons. The splitting creates a new final state parton with momentump m+1 . Here "before" and "after" are in the sense of backward evolution, so that the initial state parton with momentump a evolves going forward in physical time to the partons with momenta p a andp m+1 . The two initial state partons have momenta given by eq. (2.2), in which η a and η b are the respective momentum fractions and p A and p B are the initial hadron momenta modified slightly so that they are lightlike, with 2p A · p B = s.
It will prove convenient to define lightlike vectors n a and n b by Then 2n a · n b = η a η b s . (A.12) We also define dimensionless mass squared variables by This is a somewhat more compact version of the definition in eq. (4.2). With this notation, the incoming parton momenta are p a = n a + ν a n b , (A.14) After the splitting, parton "b" remains the same, However, parton "a" gets a new momentum, p a =η a η a n a + η â η aν a n b . (A.16) We define a momentum fraction for the splitting, We define a virtuality variable y by We can express this usingp a ·p m+1 as It is well to note here that we will later use approximations based on y 1,ν a 1, ν a 1, ν b 1, andν m+1 1. However, we do not take y to be either much larger or much smaller than the dimensionless mass variables. When shower evolution reaches a stage near to heavy quark thresholds, these variables are comparable.
At this point, we need to remind ourselves about a subtle point that affects shower kinematics. Before the splitting, we know p a . At the splitting, parton m+1 with momentum p m+1 is emitted. Sincep m+1 has four components butp 2 m+1 = m(f m+1 ) 2 , the momentum p m+1 can be described using three splitting variables. Knowingp m+1 should then determinê p a . However, we cannot simply setp a to p a +p m+1 because thenp a will not be on-shell and it will not have zero components transverse to the beam. Instead, we need to take a small amount of momentum from elsewhere in the event and supply it top a . The method chosen in ref. [3] is to apply a Lorentz transformation to all of the final state partons:p i = Λp i for i = 1, . . . , m. For this to work, we need In one way of proceeding, this condition determines z in eq. (A.18) in terms of the three free components ofp m+1 . We will follow a slightly different alternative as follows. We need three splitting variables. Let one of them be z. Let the second be y. Let the third be the azimuthal angle φ ofp m+1 around the beam axis. Then eq. (A.21) determinesp m+1 as a function of y, z, and φ.
To proceed with this program, we writep m+1 aŝ Here the direction of k ⊥ defines the azimuthal angle φ and the magnitude of k ⊥ is given by Here ζ p stands for the splitting variables and ζ p ∈ Γ a means that the variables are within their kinematic bounds. The bounds are determined by 0 < x a , 0 < x b , and ν m+1 < x a x b . The delta function that defines the shower time t serves to eliminate the integration over y. Also, as we will see, w aa does not depend on φ so we can immediately perform the integration over φ. This gives × n c (a) n c (â)fâ Here µ 2 us the virtuality in the splitting, µ 2 = y 2n a · n b .

A.4 The splitting functions w aa
Let us begin with (â, a, f m+1 ) = (q, q, g). The spin averaged splitting function is given in eq. (2.26) of ref. [4], w aa = 4πα s 2(n a ·n b ) 2 1 (2p a ·p m+1 ) 2 D µν (p m+1 ,Q) Here P a =p a −p m+1 and D µν is the polarization sum for the emitted gluon, using timelike axial gauge with gauge fixing vectorQ =p a + p b : The genesis of this splitting function is described in ref. [3]; evidently it is quite directly derived from the Feynman rules. The function w aa is a rather complicated function of y, z, ν a and ν b , withν a = ν a and ν m+1 = 0. However it is simple in the collinear limit λ c → 0 with y ∝ λ c , ν a ∝ λ c , ν b ∝ λ c . It is also simple in the soft limit, λ s → 0 with y ∝ λ s , (1 − z) ∝ λ s , ν a ∝ λ s , ν b ∝ λ s . A straightforward calculation gives the following form, which contains the leading behavior in both limits: (A.35) In the collinear limit, the first, second, and third terms in the parentheses are important.

(A.37)
Here the three gluon vertex is v αβγ (p a , p b , p c ) = g αβ (p a − p b ) γ + g βγ (p b − p c ) α + g γα (p c − p a ) β (A.38) and D µν (p,Q) is given in eq. (A.34). The numerator function D γν (p a −p m+1 ; n l ) projects onto the physical polarization states for the off-shell gluon. It is given by eq. (A.34), where now the gauge fixing vector is n b , the lightlike vector in the direction of hadron B, as in the quark splitting function in eq. (A.33). The genesis of this splitting function is described in ref. [3]; evidently it is quite directly derived from the Feynman rules. The function w aa is a rather complicated function of y, z, and ν b . However it is simple in the collinear limit λ c → 0 with y ∝ λ c , ν b ∝ λ c . It is also simple in the soft limit, λ s → 0 with y ∝ λ s , (1 − z) ∝ λ s , ν b ∝ λ s . A straightforward calculation gives the following form, which contains the leading behavior in both limits: yz w aa ∼ 16πα s 2 n a ·n b (A.39) In the collinear limit, the first four terms in the parentheses are important. In the soft limit, the first and fifth terms in the parentheses are important. In eq. (A.32), there is also a theta function that gives the limits of integration over y and z. This is, in the soft or collinear limits Θ(ζ p ∈ Γ a ({p} m , ζ f )) = Θ(y < (1 − z)) . (A.40) The restriction y < (1 − z), which applies in the soft limit, comes from x a > 0. Let us turn next to (â, a, f m+1 ) = (g, q,q). The spin averaged splitting function, derived from the definitions in ref. [3], has a form similar that in eq. (A.33). In terms of dot products, it is given in eq. (A.3) of ref. [4], Here we use the limiting forms for w aa and for the theta function that we worked out in the previous section. Using our results, g aâ has the form g aâ z, µ 2 /z, {p, f } m ∼ G aâ z, µ 2 /z − δ aâ 2C a y (1 − z) 2 Θ(y < (1 − z)) , (A.48) where y(1 − z)) , a y Θ(z 2ν a < y(1 − z)) . (A.49) These are the results used in eq. (5.12) for the soft limit and eq. (7.1) for the collinear limit.