Quarkonium production in high energy proton-nucleus collisions: CGC meets NRQCD

We study the production of heavy quarkonium states in high energy proton-nucleus collisions. Following earlier work of Blaizot, Fujii, Gelis, and Venugopalan, we systematically include both small $x$ evolution and multiple scattering effects on heavy quark pair production within the Color Glass Condensate (CGC) framework. We obtain for the first time expressions in the Non-Relativistic QCD (NRQCD) factorization formalism for heavy quarkonium differential cross sections as a function of transverse momentum and rapidity. We observe that the production of color singlet heavy quark pairs is sensitive to both"quadrupole"and"dipole"Wilson line correlators, whose energy evolution is described by the Balitsky-JIMWLK equations. In contrast, the color octet channel is sensitive to dipole correlators alone. In a quasi-classical approximation, our results for the color singlet channel reduce to those of Dominguez et. al. [1]. We compare our results to those obtained combining the CGC with the color evaporation model and point to qualitative differences in the two approaches.


Introduction
Quarkonium production in proton-nucleus collisions provides an excellent laboratory for studying the interaction of colored heavy quark probes with an extended colored medium. The large mass scale provided by the heavy quarks suggests that their interactions can be computed systematically in a weak coupling framework. However, the use of heavy quarks as a probe of colored media has been bedeviled by the complexities encountered in understanding the production of heavy quark states in more elementary collisions. The development of the Non-Relativistic QCD (NRQCD) framework [2] provided a systematic power counting to organize this complexity, and there has been a tremendous amount of work since in making this a quantitative framework-for recent summaries of the state of the art, see for example [3][4][5]. Specifically, we should point to recent next-to-leading order studies which find that the yield of all quarkonia states in proton-proton collisions can be described in NRQCD factorization, including the J/ψ [6,7], ψ [8], χ cJ [9] and Υ(nS) [10,11] states.
At the same time, a systematic weak coupling framework, the Color Glass Condensate (CGC), was developed to describe the high parton density effects of small x QCD evolution and coherent multiple scattering [12][13][14][15]. At high energies, the typical momentum transfer from partons in the medium to the probe is no longer soft and is characterized by a semihard "saturation" scale Q 2 s Λ 2 QCD . This scale [16][17][18][19] separates highly occupied gluon transverse momentum modes from perturbative dynamics at large transverse momentum. The saturation scale is dynamically generated from the fundamental scale of the theory; it is proportional to the density of partons in the transverse radius of the nucleus, and grows with energy. Because the running of the coupling is controlled by this scale, asymptotic freedom tells us that the coupling of the colored partonic probe should be weak and will become weaker at higher energies. The hope therefore is that with some effort one can compute systematically the many-body structure of hadrons and nuclei at high energies.
In particular, the CGC has been widely applied to study a number of final states in proton-nucleus collisions-for reviews, see [20,21]. For other approaches to quarkonium production in proton-nucleus collisions, see [22][23][24][25][26][27]. An attractive feature of the CGC effective theory is that one can quantify what one means by dilute or dense scatterers as a function of energy and mass number [28]. Typically in proton-nucleus collisions we encounter a "dilute-dense" system. To be more precise, the "dilute" limit is a systematic expansion of amplitudes to lowest order in the ratio of the saturation momentum of the proton to the typical transverse momentum exchanged by the proton in the reaction (Q s,p /k ⊥,p 1). In turn, the "dense" limit corresponds to keeping in the amplitude all orders in the ratio of the saturation momentum of the nucleus relative to the momentum exchanged by the nucleus (Q s,A /k ⊥,A ∼ 1). At very high energies, the power counting in proton-nucleus collisions may be closer to that in proton-proton collisions. Further, at rapidities far from the proton beam, the power counting in proton-nucleus collisions may be closer to that in nucleus-nucleus collisions.
Quarkonium pair production was first studied in the CGC framework in the limit of small x and large transverse momentum [29]. It was shown explicitly that in this limit one recovers the k ⊥ -factorization results 1 of Collins and Ellis [31] and Catani, Ciafaloni and Hautmann [32]. However, for k ⊥ ≤ Q s , it was shown 2 in [33] that k ⊥ -factorization is broken explicitly in quark pair production, even at leading order in proton-nucleus collisions 3 . The magnitude of the breaking of k ⊥ -factorization for single inclusive quark production and quark pair production was quantified respectively in [37] and [38].
The results in these papers were derived for heavy quark pair production but the projection of these results for specific quarkonium states were not considered. In the same general framework, J/ψ production from quark pairs in color singlet and color octet configurations were previously considered in [1,[39][40][41][42]. However, these derivations were performed in a quasi-classical approximation, and the effects of QCD evolution were only included heuristically through energy evolution of the saturation scale. The formalism for heavy quark pair production developed in [33,38] was recently combined with the color evaporation model to compute J/ψ and Υ production in high energy proton-nucleus collisions [43].
In this paper, we project the amplitude for heavy quark pairs computed in [33] on to color singlet and color octet configurations. Interestingly, the energy/rapidity evolution of the corresponding short-distance cross-sections, as we shall discuss further, is described by different combinations of multi-gluon correlators in the CGC framework. These short distance cross-sections are matched on to long distance vacuum NRQCD matrix elements to provide detailed expressions for the cross-sections for all common S and P wave quarkonium states in proton-nucleus collisions 4 . In a follow up paper, we will compare our results to data on quarkonium production in deuteron-nucleus collisions at RHIC and proton-nucleus collisions at the LHC. The large amount of data now available at different energies, and for a variety of quarkonium states promises to provide sensitive tests of both the CGC and and the NRQCD formalisms.
The paper is organized as follows. In section 2, we provide a brief recap of the CGC framework and key results for heavy quark pair production. In section 3, we discuss the matching of these results to the NRQCD formalism. We describe simplifications of our results that occur in the limit of large N c , the collinear limit, and at high p ⊥ of the quarkonium. A comparison of our results to previous results obtained in the quasi-classical approximation is presented in section 4. In this section, we also compare our results to results obtained by combining the CGC framework with the Color Evaporation model (CEM). We end with a brief summary and outlook on ongoing work. Some essential details of the computations are presented in two appendices.
2 Quark pair production in the Color Glass Condensate

General discussion
In the CGC formalism, the proton-nucleus collision is described as a collision of two classical fields originating from color sources representing the large x degrees of freedom in the proton and the nucleus. The color source distribution generating the classical field in each projectile is evolved from initial valence distribution at large x to the rapidity of interest in quarks can be found in [36]. 4 In very high energy proton-nucleus collisions, at small x, the hadronization of heavy quark pairs into quarkonium states happens well after the collision. It is therefore reasonable to expect that the vacuum NRQCD matrix elements accurately represent the hadronization physics in these collisions.
the collision. The gauge fields of gluons produced in the collision are determined by solving the Yang-Mills equations Here J ν is the color current of the sources, which can be expressed at leading order in the sources as where ρ p is the number density of "valence" partons in the proton moving in the +z direction at the speed of light. Likewise, ρ A is the number density of "valence" partons in the nucleus moving in the opposite light cone direction. To solve these equations, one needs to impose a gauge fixing condition. Further, covariant current conservation requires that 3) The latter equation in general implies that eq. (2. 2) for the current receives corrections that are of higher order in the sources ρ p and ρ A , because of the radiated field. The solution of eqs. (2.1), (2.2) and (2.3) has been determined to all orders in both sources only numerically [44][45][46][47]. To lowest order in the proton source (as appropriate for a dilute proton source) and to all orders in the nuclear source, analytical results are available and an explicit expression for the gauge field to this order, in Lorentz gauge, is given 5 in ref. [28]. The amplitude for pair production to this order is obtained by computing the quark propagator in the background corresponding to this gauge field [33]. The probability for producing a single qq pair for a given distribution of color sources (ρ p in the proton and ρ A in the nucleus) is where M F (q ⊥ , p ⊥ ) is the amputated time-ordered quark propagator in the presence of the classical field generated by the sources. The expression, as it stands, is not gauge invariant. To convert this probability into a physical cross-section, we first average over the initial classical sources ρ p and ρ A respectively with the weights W p [x p , ρ p ] and W A [x A , ρ A ]. These weight functionals are gauge invariant by construction. We subsequently integrate over all impact parameters b ⊥ , to obtain the cross section to produce a heavy quark pair: This formula incorporates both multiple scattering effects and those of the small x quantum evolution. The multiple scattering effects are included in i) the classical field obtained from solving the Yang-Mills equation in eq. (2.1) with the current in eq. (2.2), ii) in the propagator of the quark in this classical field, as well as iii) in the small x renormalization group evolution of the color source distribution of the nucleus.
The leading logarithmic small x evolution is included in the evolution of the weight functionals, W p and W A , of the target and projectile with x. The arguments x p and x A denote the scale in x separating the large-x static sources from the small-x dynamical fields. In the McLerran-Venugopalan model [18,19], the functional W A that describes the distribution of color sources in the nucleus is a Gaussian in the color charge density 6 in ρ A . A Gaussian distribution of sources is equivalent to the QCD Glauber model of independent multiple scattering [28]. We shall address this point further later in our discussion of the quasi-classical limit of quarkonium production. In general, however, this Gaussian distribution of color sources is best interpreted as the initial condition for a non-trivial evolution of W A [x A , ρ A ] with x A . The evolution of the W 's is described by a Wilsonian renormalization group equation, the JIMWLK equation; the corresponding hierarchy of equations for expectation values of multi-gluon is called the Balitsky-JIMWLK hierarcy [53][54][55]. We will discuss the Balitsky-JIMWLK hierarchy further in the following section.

Heavy quark pair production amplitude
For our purpose here, the relevant quantity is the heavy quark pair production amplitude computed in [33]. We begin with the kinematic notations for the process 7 We will assume that the proton moves in the +z direction with momentum p p = (p + p , 0 − , 0 ⊥ ) and the nucleus in the −z direction with momentum p A = (0 + , p − A , 0 ⊥ ). Here p and q correspond respectively to the total momentum of the heavy quark pair and one half of the relative momentum of the quark and anti-quark constituting the pair. The on-shell constraints on the quark and the anti-quark (p/2 + q) 2 = m 2 and (p/2 − q) 2 = m 2 imply that p · q = 0 and p 2 = 4(m 2 − q 2 ) , (2.7) with m the heavy quark mass.
Within the CGC formalism, the amplitude to produce a heavy quark pair has two contributions. One of these, illustrated in fig.1 (a), is where a gluon from the proton emits a heavy quark pair before the collision with the target, while the other, illustrated in fig.1 (b), is where the gluon emits the heavy quark pair after the collision with the target [33]. We denote k 1 = (x p p + p , 0, k 1⊥ ) as the momentum of the gluon from the proton, ) as the total momentum of gluons from the nucleus, and ρ p and ρ A as the densities of color sources in the proton and nucleus, respectively. The heavy quark pair production amplitude then reads [33] M F ss;iī (p, q) = where s and i (s andī) are spin index and color index of quark (antiquark), respectively, and k ⊥ ≡ d 2 k ⊥ , x ⊥ ≡ d 2 x ⊥ . The functions T qq (p, q, k 1⊥ , k ⊥ ) and T g (p, k 1⊥ ) are defined to be with C µ L (p, k 1⊥ ) the well-known Lipatov effective vertex, where P + denotes the "time ordering" along the z + axis, and t a (T a ) are the SU (N c ) generators of the fundamental (adjoint) representation. We note that the amplitude in eq. (2.8) agrees exactly with the k ⊥ -factorized result derived in [29] when the Wilson line correlators are expanded to first order in ρ A /∇ 2 ⊥ . In general, however, k ⊥ -factorization is explicitly broken for pair production in proton-nucleus collisions 8 .

Quarkonium production cross section
In this section, we will discuss the matching of the results of the previous section to the NRQCD formalism. We will derive explicit expressions for the short distance cross-sections, and the associated small x multi-gluon correlators in the large N c limit. We shall also discuss the limit when the transverse momentum of the gluon exchanged by the proton is small, and demonstrate that collinear factorization is recovered on the proton side to leading order. Finally, we will discuss the power counting of the color singlet and color octet channels in the large p ⊥ limit of our computation.

Quarkonium production within the NRQCD factorization formalism
We begin with a brief review of the NRQCD factorization formalism [2]. The inclusive production of a heavy quarkonium state H in the process p + A → H + X is expressed in this framework as J are the quantum numbers of the produced intermediate heavy quark pair, where S, L and J are the spin, orbital angular momentum and total angular momentum, respectively. The symbol C here denotes the color state of the pair, which can be either color singlet (CS) with C = 1 or color octet (CO) with C = 8. In eq. (3.1), dσ κ are the short distance coefficients 9 for the production of a heavy quark pair with quantum numbers κ. These can be calculated perturbatively and can be factorized from the nonperturbative NRQCD long distance matrix elements (LDME) 10 O H κ . Specifically, the 8 This is to be contrasted to the result, shown by several authors, that k ⊥ -factorization holds at leading order for single inclusive gluon production in proton-nucleus collisions. 9 Readers should note that these coefficients for different channels have differing mass dimensions, as do of course then the long distance matrix elements. Further, for our convenience we shall use a definition for CS LDMEs [56], which is different from the original Quarkonium contributing states Table 1. Essential heavy quark pair states for quarkonium production. The contribution of color singlet states for each quarkonium production is at leading power in v. The color octet contributions for P -wave quarkonium production, say h c,b and χ cJ,bJ , are also at leading power in v. The color octet contributions to S-wave quarkonium production are power suppressed.
LDMEs describe the hadronization of a heavy quark pair with quantum numbers κ to the quarkonium state H. They are universal and can be determined by fitting experimental data [3]. The LDMEs are organized by powers of v, the relative velocity of heavy quark pair in the heavy quarkonium bound state. As v is a small non-relativistic velocity in the heavy quarkonium system, one needs only a few LDMEs in practice. For example, there are four independent LDMEs which are important for phenomenological study of J/ψ production 11 , There are two other P -wave CO LDMEs that contribute to J/ψ production with the same power counting as the O J/ψ ( 3 P [8] 0 ) . However, one can use heavy quark spin symmetry to relate P -wave operators with J = 1, 2 to the operator with J = 0 [2], For completeness, we list essential heavy quark pair states for common heavy quarkonia production in table 1. The CGC enters the quarkonium framework in the derivation of the perturbative crosssection dσ κ . We begin with the heavy quark pair production amplitude in eq. (2.8) and project it on to a definite quantum configuration κ [58] of the produced intermediate heavy where R(0) is the J/ψ wavefunction at the origin. 11 The magnitude of the CS LDME O J/ψ ( 3 S 1 ) is largest in powers in v, while the three CO LDMEs listed in eq. (3.3) are relatively power suppressed by v 3 , v 4 and v 4 , respectively. For J/ψ production with a large transverse momentum p ⊥ at hadron colliders, one finds that the contribution of the CS channel at leading order in αs is suppressed by m 2 /p 2 ⊥ compared to the 1 S 1 channel [57]. Therefore, although suppressed by powers of v, CO contributions are important for J/ψ production, especially at large p ⊥ . We refer interested readers to ref. [8] for further discussion. quark pair, .
(1, 8c) gives 1 if κ is CS, and 8c if κ is CO. The color and spin quantum numbers for the heavy quark pair are projected out by the sums over the respective SU (3) and SU (2) color and spin Clebsch-Gordan coefficients 3i;3ī|1 = δ iī / √ N c , 3i;3ī|8c = √ 2t c iī and 1 2 s; 1 2s |SS z . The coefficients LL z ; SS z |JJ z account for the spin-orbit LS coupling. As we normalize the Dirac spinors asūu = −vv = 2m, and normalize the heavy quark pair composite state as QQ(κ)|QQ(κ) = 4m, we have the extra normalization 2m . To simplify our notation, we will suppress the color index in the rest of the paper by introducing the matrix notation where 1 is a unit 3 × 3 matrix. Then distinguishing the color structure from the spinor structure, we can rewrite eq. (3.5) as where the functions F κ,Jz qq (p, k 1⊥ , k ⊥ ) and F κ,Jz with covariant spin projectors given by [59,60] Π SSz = 1 m s,s After these color and spin projections, the probability P κ 1 (b ⊥ ) to produce a heavy quark pair at an impact parameter b ⊥ can be obtained as follows. One first squares the spin and color projected amplitude. Next, averages are performed over all possible color charge densities in both proton and nucleus. Finally, the degrees of freedom of the heavy quark pair with quantum number κ are averaged over 12 .
For the complex conjugate amplitude, we will denote all Lorentz, color and spin indices, as well as unobserved momenta and coordinates, by a prime in their top right corner. Thus 12 To understand why one averages over the states of the heavy quark pair, let us go back to the NRQCD factorization formula in eq. (3.1). Assume that there are N κ possible states for each configuration κ. We can denote these by λ1, · · · , λNκ . Then the factorization formula is Heavy quark spin symmetry as well as rotational invariance in color space imply that the matrix elements O H κ,λκ are independent of λκ. If we then define the LDMEs as the summation of all possible states, can be written as (3.11) Here y p = ln(1/x p ) is the rapidity of the gluon that comes from the proton, and y A = ln(1/x A ) is the rapidity at which the Wilson line correlators of the target nucleus are evaluated. In this expression, · · · y p(A) denotes the average over color charge densities where O here generically denotes the average over the projectile charge density ρ p or the target color charge density ρ A in eq. (3.11). Further, the summation over color degrees of freedom after the second equal sign has been taken care of by our default rule: any repeated indices are assumed to be summed over. N κ = (2J + 1)N color are the number of states for a given κ, with N color = 1 or N 2 c − 1 if κ is color singlet or color octet, respectively. For convenience, we will use in the rest of the paper. All transverse coordinates in eq. (3.11) are defined with respect to the center of the proton. To convert these to the coordinates with respect to the center of nucleus, one simply has to shift all coordinates by the impact parameter b ⊥ . (For example, Translational invariance guarantees 13 that the averaged values in · · · y A are unchanged under such a shift. Therefore such a shift only leads to the extra phase factor e i(k 1⊥ −k 1⊥ )·b ⊥ .
When we derive the cross section dσ κ for a minimum bias proton-nucleus collision, we have to integrate P κ 1 (b ⊥ ) over the impact parameter b ⊥ . This generates the factor Using the delta function to integrate out the k 1⊥ , we find that the average over color density on the proton side to be ρ p,a (x p , k 1⊥ )ρ † p,a (x p , k 1⊥ ) yp . Following [33], we define the unintegrated gluon distribution inside the proton to be 14 With this substitution, the differential cross section of production of heavy quark pair with quantum number κ can be written as  13 This assumes that the size of nucleus is large enough for translational invariance to apply. 14 The unintegrated gluon distribution in eq. (3.16) is normalized such that the leading log gluon distribution in the proton satisfies See eq. (3.36) and ref. [38] for further discussion.
we will work out the explicit simplifications of this general result for the color singlet and color octet channels in the large N c limit. The phenomenological applications of this result will be left for future publications.
3.2 Complete results for quarkonium cross-sections in the large N c limit

Color singlet contributions
If κ is a color singlet intermediate state, only the first term ∝ F κ,Jz qq F †κ,Jz qq in eq. (3.17) survives; all other terms vanish. This is because all other terms involve F κ,Jz g (or F †κ,Jz g ), in which a gluon naturally transforms into a color octet heavy quark pair state. Taking where we have used the identity a t a ij t a kl = Further, in the large N c limit and for large nuclei (α 2 s A 1/3 >> 1), the expectation value for the second term in eq. (3.18) can be factored as the product of the expectation values of the traces within as (3.21) Using translation invariance for large nuclei, one can express the well known dipole correlator as Thus in the large N c and large A limit, the expectation value over color charge densities in the nucleus in eq. (3.18) can be expressed as Henceforth, for simplicity of notation, we will not write out explicitly the rapidity index on the quadrupole and dipole correlators.
It is convenient to express our result in terms of the variables r 0⊥ , ∆ ⊥ , r ⊥ , and r ⊥ which are expressed in terms of the co-ordinates x ⊥ , x ⊥ , y ⊥ , and y ⊥ as (3.24) Translation invariance implies that eq. (3.23) is independent of r 0⊥ . The r 0⊥ integration can therefore be performed trivially, giving as a result πR 2 A , the transverse area of the nucleus.
With these coordinate transformations, we obtain the cross-section for the production of color singlet heavy quark pairs to be where Γ κ 1 are defined as which are listed in appendix B.1. Note that, if Γ κ 1 ∝ δ(r ⊥ ) or δ(r ⊥ ), the quadrupole correlator in eq. (3.25) collapses to a single dipole correlator and cancels the second term exactly. Thus the terms in Γ κ 1 that are proportional to δ(r ⊥ ) or δ(r ⊥ ) do not contribute to the heavy quarkonium cross section and shall be neglected.
In the limit of N c → ∞ and α 2 s A 1/3 → ∞, the dipole correlator in eq. (3.25) satisfies the Balitsky-Kovchegov (BK) equation [53,61], (3.27) In the low density limit |x ⊥ − y ⊥ |Q s << 1, this equation reduces to the well known BFKL equation [62,63], which describes the leading logarithmic behavior of perturbative QCD at small x. The BK equation is the simplest equation of high energy QCD describing both small x QCD evolution and coherent multiple scattering and is used widely in phenomenological applications in both deeply inelastic scattering and hadron-hadron scattering.
The quadrupole correlator in eq. (3.25) is less well known but is an equally fundamental object in high energy QCD. Evolution equations in the JIMWLK framework for the quadrupole have been derived [64]. Their evolution can be computed numerically [65] and analytic results obtained in different limits [66]. It has been argued that in the large N c limit, dipole and quadrupole operators are the only universal multi-gluon correlators that appear in the "dilute-dense" final states [67]. This theorem certainly appears to hold for quarkonium production in the color singlet channel and, as we shall shortly discuss, in the color octet channel.

Color octet contributions
For the color octet state κ, , the first term in eq. (3.17) gives (3.28) Here we have used the identity in eq. (3.19) repeatedly. The expression in eq. (3.28) can be significantly simplified if we take the large N c limit. In this limit, the first term in eq. (3.28) dominates since it scales as O(N 2 c ) while all the other terms scale as O(1) in color space. We thus obtain Defining the dipole unintegrated gluon distribution in momentum space as one can integrate out all the coordinate variables in eq. (3.17) straightforwardly, and obtain As a result, the first term in the braces in eq. (3.17) gives (3.32) Likewise, we can work out the color algebra for the remaining three terms in eq. (3.17) making use of the identity Adding up all these terms together, we find With the spin projectors in eq. (3.10), the calculations of Γ κ 8 are straightforward, and we list the results in appendix B.2. Note that unlike the case in the color singlet channel, only dipole correlators appear in the color octet channels.
Eqs. J . Once these results are multiplied by the corresponding NRQCD LDMEs O H κ , one obtains the differential cross-section for the production of heavy quarkonium states in high energy proton-nucleus collisions. The results collected in the appendix provide a complete set for phenomenological studies of all the common heavy quarkonium states.

The proton collinear limit
When the gluon momentum fraction x p in the proton is not very small, the typical transverse momentum of the gluons in the proton is much smaller than the mass and the transverse momentum of heavy quarkonium, Q s,p (x p ) k 1⊥ m and Q s,p (x p ) k 1⊥ p ⊥ . We can then take the limit k 1⊥ → 0 in both the hard part and in the Wilson lines. Then one can integrate out k 1⊥ and arrive at a collinear gluon distribution function in the proton, thereby restoring collinear factorization from the proton side. Using d 2 k 1⊥ = 1 2 dθ 1 dk 2 1⊥ and defining we find for the color singlet channel, .

Small p ⊥ limit
For simplicity, we will discuss the small p ⊥ behavior only in the proton collinear limit. Small p ⊥ behavior for general case can be obtained similarly. The kinematic regime we are considering here is p ⊥ m. Then eqs. (3.37) and (3.39) imply that the leading contribution region should be k ⊥ ∼ k ⊥ ∼ p ⊥ m. We will derive the power law of p ⊥ /m for each channel. VariablesX l ⊥ andX l ⊥ have the behavior For color singlet channels, we begin with eq. (B11). It is easy to find ∝ δ 2 (r ⊥ ) into eq. (3.37), we find the expression vanishes. Therefore, to obtain a nonzero contribution,X l ⊥ must be expanded to next-to-leading power. Similarly,X l ⊥ also needs to be expanded to next-to-leading power.
Power law forW can be derived in the same way. We thus get For color octet channels, starting with eq. (B19), and realizing The power law of differential cross sections is complicated by different correlators between color singlet channel and color octet channel. If we assume the correlators do not contribute any power behaviors, then eqs. (3.42), (3.43), (3.45) and (3.46) are also the power law of differential cross section of each channel. We thus can discuss the relative importance of each channel. Taking J/ψ production as an example, if we normalize the contribution of 3 S where v is the typical relative velocity between charm quark pair inside J/ψ. Therefore, the color singlet channel 3 S is dominant for J/ψ production as long as m p ⊥ mv 2 . Conversely, for p ⊥ < mv 2 , the color octet contribution will dominate. The latter regime was studied recently in ref. [68] -our results are in agreement with those presented there.

Large p ⊥ limit
In the kinematic region p ⊥ Q s , additional contributions come from a higher order in α s process where a recoiling particle with large transverse momentum in the final state is needed to balance the quarkonium's p ⊥ [69,70]. Nevertheless, we can still study the limit p ⊥ ∼ Q s m, because Q s can be larger than m. To expand the hard matrix element in powers of m in this limit, we need to know the relative size of typical values of k 1⊥ .
Let us first consider the case where p ⊥ ∼ Q s m k 1⊥ . In this case, all the results obtained in the previous subsection (where we took the collinear limit for the proton side) are still valid. Normalizing the 3 S 1 channel behaves as m 4 /p 4 ⊥ . All the other channels behave as m 2 /p 2 ⊥ if we restrict ourselves to the regime where p ⊥ ∼ l ⊥ ∼ l ⊥ . The inclusion of other kinematic regions gives logarithm enhancements for some channels; however, the power laws governing the p ⊥ dependence are not changed.
Thus we find that at the perturbative order in our work color octet channels will dominate large p ⊥ quarkonium production. This is similar to the LO calculation for quarkonium production in proton-proton collision using collinear factorization [57]. In particular, for J P C = 1 −− quarkonia such as J/ψ, contributions from the color singlet channel are suppressed by m 4 /p 4 ⊥ , implying that color octet contributions may be large even if p ⊥ is not too large.
From eqs. (B2) and (B18), we find that the above power counting is unchanged if the typical value of k 1⊥ is a little larger, p ⊥ ∼ Q s m ∼ k 1⊥ . However, in the regime m, although the p ⊥ power counting of of all other channels is unchanged, that for the 3 S 1 channel is proportional to k 2 1⊥ + 4m 2 . This can be seen in eq. (B2a).

Comparison with other approaches
In this section, we discuss the relation between our complete NRQCD results and those from related theoretical works in the literature. In particular, we compare our results for the color singlet channel with those based on a quasi-classical saturation approach [1,[39][40][41][42] and to results matching the CGC computations of [33] to the color evaporation model [38,43].

Quasi-classical saturation model
Within the framework of a quasi-classical approximation to the QCD dipole model [71][72][73], Dominguez et. al. investigated cold nuclear matter effects of J/ψ production in pA collisions in a series of papers [1,[39][40][41][42]. Within the NRQCD factorization formalism, we naturally have both color singlet and color octet contributions. We will compare here our color singlet contribution with recent results in [1,42].
Since the works of [1,42] are performed in the limit of collinear factorization on the proton side, we will compare their results to our results for the color singlet channel in collinear limit of eq. (3.37). In the quasi-classical approximation, the color sources in the nucleus are assumed to be the Gaussian distributed sources of the McLerran-Venugopalan model. As noted previously, this is a Glauber-like multiple scattering approximation [28]. In this quasi-classical approximation, the quadrupole correlator in the large N c limit reads [33,74] If we further change the integration variable ∆ ⊥ → −∆ ⊥ and choose a Gaussian distribution for the dipole correlator we arrive at a much simpler expression where the wave-function Φ(r ⊥ ) is given by Remarkably, the above differential cross section is equivalent 15 to the result of eq. (27) of Kharzeev et. al. in [42] once we define the function φ T (r, z) in that paper to be φ T (r, z) = . When we integrate our results over p ⊥ , we recover the result in ref. [1] for the total J/ψ cross-section.
We conclude therefore that results for J/ψ differential cross section in high energy proton-nucleus collisions derived by Dominguez et. al. in Refs. [1,42] correspond to our color singlet results when we work in the quasi-classical approximation of the McLerran-Venugopalan model for the dipole/quadrupole correlators 16 .
We note however, that our expressions [for instance eq. (3.37)] allow for a full JIMWLK treatment of quarkonium production, including small x evolution and coherent multiple scattering in a consistent way. Another advantage of our formalism is that we also have color octet contributions which as we have discussed are important when p ⊥ ≥ Q s .

Comparison to the Color Evaporation model
The Color Evaporation model (CEM) is often employed in the literature to study heavy quarkonium production in high energy proton-nucleus collisions. For recent work relating the CGC framework to the CEM, see [38,43]. In this model, heavy quarkonium production is factorized into two steps: the perturbative (weak coupling) production of a heavy quark pair with invariant mass M followed by a non-perturbative hadronization process. The latter is assumed to have a universal transition probability for the pair to become a bound quarkonium state. It is assumed that the transition probability is the same for all heavy quark pairs whose invariant mass is less than the mass threshold of producing two open flavor heavy mesons.
Taking J/ψ production as an example, the cross section can be written as where F J/ψ is a constant non-perturbative transition probability and is independent of the color and spin of the heavy quark pair, m c (M D ) is the charm quark (D-meson) mass, and M is the invariant mass of the charm quark pair. If we decompose the expression in eq. (4.6) into color singlet and color octet contributions, the latter will be larger than the former by an factor of N 2 c − 1. This corresponds to 15 A careful reader will observe that the term 4r ⊥ ·r ⊥ in eq. (4.4) is a little different from the corresponding term in [42]. The reason is that the calculation in [42] effectively used Dr ⊥ = e − 1 8 Q 2 s r 2 ⊥ ln 1 µr ⊥ instead of eq. (4.3) to calculate dipole gluon distributions. The expression used in [42] is the correct expression in the framework of the McLerran-Venugopalan model. We used the Gaussian form of eq. (4.3) for convenience to efficiently check how our results reduce to those of [42]. 16 Note that the model for J/ψ wave function in [1,42] is different from ours. However, using the power counting in NRQCD, one finds that the difference is suppressed by v 2 . Thus the equivalence holds to leading order in v accuracy.
the ratio of the color states for both contributions. As a result, in the large N c limit, only color octet contributions remain in the CEM. This simple analysis agrees with the explicit calculations in [38,43]. In these papers, the CEM expressions for J/ψ production involve only the dipole gluon distribution. This can be contrasted with our NRQCD framework. In our case, while the the color octet channel in eq. (3.34) involves only the dipole gluon distribution, the color singlet channel in eq. (3.25) involves the quadrupole gluon distribution as well.
The power counting in NRQCD gives color octet contributions that are suppressed by v 4 relative to the color singlet contributions to J/ψ production. As v 4 < 1 N 2 c for both charmonium and bottomonium states, the color octet contributions are generally less important than color singlet contribution in this case. Exceptions exist for special kinematic region ( such as at large p ⊥ ), where the color octet mechanism may be dominant. Even so, though the color octet channels may dominate, the predictions of NRQCD factorization and the CEM can be different. This is because NRQCD factorization assigns a different parameter for each color octet channel, while the CEM assumes all these parameters to be the same.

Summary and outlook
The Color Glass Condensate (CGC) is a powerful formalism to systematically compute the final states in deeply inelastic scattering and hadron-hadron scattering experiments at high energies. In proton-nucleus collisions, it allows one to compute both the small x QCD evolution of the projectile and target wavefunctions, as well as multiple scattering effects due to the large number of color charges in the nuclear target. The CGC formalism was used previously to derive the cross-sections for the production of heavy quark pairs in [33]. However, only the Color Evaporation Model (CEM) was used previously to compute the production of quarkonium bound states [38,43].
The production of quarkonium bound states can be quantified within the Non-relativistic QCD (NRQCD) framework. The magnitude of long distance color singlet and color octet bound state matrix elements in different spin and angular momentum configurations can be categorized in powers of the relative velocity between the heavy quark-antiquark pair. Further, these universal matrix elements can be determined independently by experimental measurements. The short distance hard partonic cross sections however have to be computed in perturbative QCD.
In this work, we combined for the first time the CGC and NRQCD formalisms for quarkonium production. The former is used to compute the short distance matrix elements in weak coupling and the latter to describe the hadronization of the produced intermediate color singlet and color octet heavy quark pairs. Interestingly, we find that the intermediate color states are sensitive to different universal multi-gluon correlators in high energy QCD. The color singlet channel is sensitive to the QCD evolution of dipole and quadrupole Wilson line correlators while the color octet channel is sensitive to those of the dipole correlators alone. The fact that we were able to reproduce non-trivial results for color singlet J/ψ production in a quasi-classical approximation gives us confidence in the power and validity of our results.
Because the dipole and quadrupole correlators are universal, they can be measured in other final states (such as inclusive photon-hadron and di-hadron correlations) in protonnucleus collisions, and used to predict the production cross-sections of a number of quarkonium states. Conversely, the extraction of these correlators from combinations of production cross-sections of quarkonium states compared to data, can be used to predict cross-sections for other final states in high energy proton-nucleus collisions.
One thus has the possibility to further systematically test and extend the NRQCD framework, as well as the CGC effective theory describing the behavior of multi-gluon correlators in hadron wavefunctions. Understanding these "cold" nuclear matter crosssections then provide a benchmark for the interpretation of the same in nucleus-nucleus collisions. The recently demonstrated ability of LHC and RHIC experiments to compare final states in vastly different systems with the same bulk properties (such as events with the same number of charged hadrons) make such studies especially compelling in order to study the transition from cold matter to hot matter effects in the production of different quarkonium states.
We have not attempted in this work to perform the numerical computations necessary to compare our results to those from collider experiments. This work is numerically challenging (particularly for the color singlet channel) but feasible. Work in this direction is in progress and will be reported in the near future.
we find J channels, because of CO LDMEs are related, we sometimes only need the expression by summing over J, which gives J,Jz * αβ (J, J z ) α β (J, J z ) =P αα P ββ . (A5)

B Calculation of the hard part
In this appendix, we give results of hard part for all S-wave channels and P -wave channels. These results are sufficient for phenomenological study of common heavy quarkonia production in pA collision using NRQCD factorization.
B.1 Hard part for color singlet channels

B.1.1 Complete results
To calculate Γ κ 1 defined in eq. (3.26), we first calculate the following quantities We find where and The "· · · " in W 3 S represents terms that are independent of either l ⊥ or l ⊥ , which will eventually contribute to Γ 3 S [1] 1 1 in terms of δ(r ⊥ ) or δ(r ⊥ ). Let us denote the following abbreviations where K 0,1 are the modified Bessel functions. Then, Γ κ 1 can be obtained by 1 , we obtain Γ κ 1 from W κ by doing the replacement 0 , we obtain Γ κ 1 from W κ by doing the replacement For κ = 1 P