Toward initial conditions of conserved charges. Part I. Spatial correlations of quarks and antiquarks

In this paper, we study the spatial correlations among quarks and antiquarks produced at mid-rapidity by gluon pair production in the color glass condensate framework. This paper is the first part of a series in which we calculate a complete set of quark/quark, quark/antiquark, and antiquark/antiquark spatial correlation functions in heavy-light ion collisions, with the goal of incorporating their conserved charges into the initial conditions of hydrodynamics. The physical mechanisms captured in this calculation include geometric, entanglement, and interaction-mediated correlations. In this first paper, we construct the building blocks for the correlations arising from single- and double-pair production, studying in detail the single-pair case and the general features of the double-pair case. We find a rich correlation structure in transverse coordinate space, with different mechanisms dominating over different length scales, and we present explicit results for the quark-antiquark correlations in the single-pair production regime. We reserve a detailed discussion of the double-pair production regime for the next paper in this sequence.


I. INTRODUCTION
In recent years, fluctuations in the initial stages of heavy ion collisions have become understood as central drivers of the overall spectrum of produced particles.For instance, while the phenomenon of elliptic flow v 2 can be understood mainly as an average geometric effect in peripheral heavy ion collisions, the triangular flow v 3 seems to be driven entirely by spatial fluctuations in the initial energy density of the fireball [1].In addition to the nucleon-scale fluctuations arising from the distribution of nucleons within the colliding ions in Monte-Carlo-Glauber models (see for instance Ref. [2] and references therein), sub-nucleonic fluctuations also leave an important imprint on the spectrum of produced hadrons [3,4].Unlike the larger-scale fluctuations due to nucleonic degrees of freedom, these sub-femtometer fluctuations due to QCD degrees of freedom are driven largely by color charge.
In heavy ions at high energies, the Lorentz contraction of multiple nucleons into a short longitudinal thickness enhances the RMS two-dimensional color-charge density.These parametrically large charge densities radiate intense, classical gluon fields described by the effective theory known as the color glass condensate (CGC) (see [5] and references therein).The sub-nucleonic fluctuations in energy density due to these classical gluon fields are incorporated into, for instance, the IP-Glasma model [6,7], with the sub-femtometer fluctuations having a substantial effect on particles observed in the final state [4,[8][9][10].
Fluctuations and correlations in the initial stages of proton-proton, proton-nucleus, and nucleus-nucleus collisions have been the subject of tremendous activity within the CGC community (see for instance the reviews [11] and [12] and the references therein).Much of this work focuses on the applications of the CGC formalism as a natural explanation for the long-range-in-rapidity "ridge" correlations in nucleus/nucleus collisions [13][14][15] as well as high-multiplicity proton-proton [16,17] and proton/nucleus [17][18][19] collisions.Similar long-range-in-rapidity azimuthal modulations of the correlation function are also natural candidates for CGC effects in the initial state [20].Other works such as Refs.[21,22] study how such correlations produced in the initial stages of hadronic collisions may be modified by the subsequent strong-coupling classical Yang-Mills dynamics of the gluon fields.Other approaches to studying the role of initial-state correlations include the AMPT model [23,24], the longitudinally-extended source model [25], and others.These studies of the initial conditions of heavy ion collisions have so far focused predominantly on the fluctuations of the initial energy density, which is dominated by the production of gluons.However, new state-of-the-art hydrodynamic codes have been produced which are capable of preserving other conserved charges, such as flavor and baryon number, throughout their evolution [26].Moreover, recent ab initio CGC calculations have computed certain momentum-space correlations between quarks and made strides toward extending the dilutedense formalism to the dense-dense limit [27][28][29][30][31].In light of these simultaneous advances, there is a timely opportunity to compute the full set of quark and antiquark correlations in the CGC, along with their associated conserved charges such as flavor, baryon number, and electric charge.These can then be combined with the latest hydrodynamic simulations to assess their impact on the spectrum of final-state particles.
It has long been known that baryon stopping, the production of net baryon number at mid rapidity, is suppressed at high energies [32][33][34][35].The same is true of other quantum numbers, like spin and flavor, which are carried by valence quarks.Most of the valence quantum numbers brought by the colliding ions are carried down the beam pipe or to very far forward rapidities, with the dynamics at mid rapidity being dominated by the soft gluon fields of the CGC.While these soft gluons themselves carry no net flavor or baryon number, they do produce a spectrum of characteristic fluctuations through quark-antiquark pair production.One may attempt to identify the momentum-space correlations among quarks and antiquarks with the interesting correlations [36] observed in experiment among baryons and antibaryons.Some caution in such an identification is warranted, however, because nonperturbative hadronization effects may muddle the connection implied by parton-hadron duality.In our case, we are interested primarily in the spatial correlations at the parton level which serve to specify the initial conditions for the energy-momentum tensor and conserved currents in hydrodynamics.These initial-state effects are free from hadronization corrections, and their impact on final-state particle production can only be determined by coupling these initial conditions to subsequent hydrodynamic evolution.
The production of a single q q pair in proton-nucleus collisions in the CGC has been studied in the past by various authors [37][38][39][40][41][42].Although these calculations focused on the production cross-sections in momentum space, the same essential ingredients they derived are responsible for local fluctuations and quark/antiquark correlations in coordinate space.
An important recent calculation by Altinoluk et al. [27] examined the effect of Pauli block-ing by studying quark/quark correlations via double pair production in heavy ion collisions.
The correlations obtained in [27] neglect the contributions in which the radiated gluon first scatters in the target field and later fragments into a q q pair.This contribution was knowingly omitted by the authors [27] based on the expectation that gluon scattering followed by pair production will not lead to the Pauli blocking correlations of interest.However, as we will show explicitly for the case of single-pair production, it is precisely the interferences of these terms which generate the dominant contribution to quark / antiquark correlations.It is therefore reasonable to expect that contributions of this type may also generate significant interaction-mediated correlations in the quark/quark case, although these are separate from the statistical Pauli blocking effects that were the focus of [27].We will explore these aspects in detail in our future work.
An important extension of the CGC formalism for dilute-dense collisions to "semi-dilutedense" collisions was recently developed by Kovchegov and Wertepny [43].This formalism, intended to describe asymmetric "heavy-light ion collisions" such as copper-gold collisions, completely re-sums the high-density effects in the heavy ion while incorporating the highdensity effects of the light ion order-by-order in perturbation theory.Such effects enhance the probability to radiate a second q q pair, enabling additional types of correlations through quantum entanglement.
Building on these recent advancements, we will undertake the calculation of the leadingorder quark/quark, quark/antiquark, and antiquark/antiquark spatial correlations at midrapidity, along with their associated charges, in the CGC formalism.We will first analyze in detail the correlations arising from a single q q pair production at mid-rapidity; this contribution describes the leading-order correlations in proton-nucleus collisions and remains the dominant short-distance contribution in heavy-light ion collisions.We will also study at length the correlations arising from the production of two q q pairs at mid-rapidity; these correlations, while still sub-leading compared to single-pair production, are enhanced in heavy-light ion collisions and exist over longer distance scales.The length and complexity of this undertaking requires us to divide the calculation into multiple parts.In this work ("Part I") we will set up the formalism, construct the correlation functions and cross sections, and identify the wave function and Wilson line building blocks for these correlations.By studying the long-distance asymptotics of these correlations, we will identify the mechanisms and associated length scales which characterize different physical regimes.
In subsequent publications, we will present the detailed set of Wilson line operators that describe the interactions in these regimes, solving for the correlation functions numerically and analytically.
The rest of this paper is organized as follows.In Sec.II we construct the definitions of the correlation functions and the coordinate-space cross-sections.In Sec.III, we construct the elementary q q production amplitude from the fundamental building blocks of light-front wave functions and Wilson line interactions, and we analyze how these building blocks are combined to form the single-pair production and double-pair production cross-sections.In Sec.IV we analyze the long-distance asymptotics of various channels in the single-and double-pair production cross-sections to determine the length scales associated with the various types of correlations.In Sec.V we present explicit results for q q correlations and baryon number correlations for single-pair production, summarize the overall physical picture of both single-and double-pair correlations, and conclude by outlining the relation of the analysis presented here in "Part I" to future work to follow.

A. Definitions: Quark, Antiquark, and Baryon Number Correlations
Let us construct the definitions of the quark/quark, quark/antiquark, antiquark/antiquark, and baryon number correlation functions following the treatment of [44].
Let Ω denote a kinematic window about mid-rapidity; the single-and double-inclusive probability densities ρ i 1 and ρ ij 2 to produce particles of species i, j with kinematics ω 1 , ω 2 ∈ Ω are given by Here σ inel is the total inelastic cross-section to produce any particles in the mid-rapidity window Ω; at lowest order in perturbation theory, this corresponds to single-inclusive gluon production.Integrating these probability densities over the phase space Ω yields the event-averaged number of each type of particle, such that ρ 1 and ρ 2 simply correspond to multiplicities: where • • • ev represents an average over events.
With the help of the generic formulas (3), we can write down expressions for the average number of quark/quark (qq), quark/antiquark (q q), and antiquark/antiquark (q q) pairs produced at mid-rapidity.Throughout this paper we will denote the longitudinal degrees of freedom in terms of light-front components The average number of particles (which may be either quarks or antiquarks) with longitudinal momenta k + 1 , k + 2 and transverse positions B 1 , B 2 are then given by In the same way, we can use the expressions (4) to construct the expectation values for conserved charges such as baryon number1 B ≡ 1 3 f (n q f − n qf ), with f a sum over relevant quark flavors: , the expectation values above describe nonlocal correlations in three dimensions, with only the double-inclusive cross-sections contributing.There is also an additional delta function contribution at (B , with the weight of that delta function reflecting the strength of the local fluctuations (variance).We define the associated correlation functions for these various quantities as with n i , n j ∈ {n q , n q, B}, where we assume that (B 1 , k + 1 ) = (B 2 , k + 2 ) to exclude the delta function term.The first term in (6) describes the correlated production of two particles, from which the uncorrelated background in the second term is subtracted.While this definition of the correlation function is perhaps the simplest, it is by no means unique.Some references [43,45], for example, rescale the correlation function to make it dimensionless, and others [46] do not subtract the uncorrelated background.We will explore the features of different definitions of the correlation function in future work, but for now, the simple definition (6) will suffice.
The production of net baryon number at mid-rapidity is suppressed [27,[32][33][34][35] by the center-of-mass energy squared s corresponding to baryon stopping, in which valence quarks lose nearly all of their energy in the collision and are produced at mid-rapidity.Instead, it is far more preferable at high energies for these valence degrees of freedom in the colliding nuclei to punch right through each other and continue down the beam pipe, with the mid-rapidity region instead being populated by soft gluon bremsstrahlung.With eikonal accuracy, then, the inclusive probabilities to produce a quark or antiquark at mid-rapidity are equal and given at leading order by integrating out the spectator in q q pair production: This single-inclusive cross-section defines the uncorrelated baseline for the correlation functions.The correlation functions of interest are therefore given by 1: Illustration of the high-energy scattering process.The valence degrees of freedom from the light projectile ion are produced in the forward direction, the remnants from the heavy target ion are produced in the backward direction, and a number of soft particles are produced at midrapidity.The momenta and coordinates of the valence quarks are denoted by p i and b i respectively.
The momenta and coordinates of the particles produced at mid-rapidity are denoted by k i and x i respectively.

B. Structure of the Cross Sections
Next we will explicitly construct the two-particle-inclusive cross-sections which enter the correlation functions (9).These cross-sections which define the spatial correlations are somewhat unusual in that they are differential in the transverse positions of the produced (anti)quarks, rather than their momenta, so it is useful to construct them directly from textbook principles (e.g.[47]).
Consider the collision shown in Fig. 1 between a light "projectile" nucleus with a nucleons moving with large momentum P + a along the light-cone ⊕ direction and a heavy "target" nucleus with A nucleons moving with large momentum P − A along the light-cone direction.The paradigm of "heavy-light ion collisions" corresponds to the regime in which the target nucleus is so heavy that its density-enhanced corrections must be resummed to all orders, , while the density-enhanced corrections from the projectile nucleus are included order by order in perturbation theory.More precisely, the region of interest is α s α2 s a 1/3 1, such that corrections enhanced by the density ∼ a 1/3 of the light ion are more important than genuine quantum loops, but not so large that they must be simultaneously resummed.In the inclusive final state, we will identify a number of particles traveling in three distinct regions: the "forward" fragmentation region of the light nucleus a, the "backward" fragmentation region of the heavy nucleus A, and the mid-rapidity region between the two.
In the forward region, we consider a set of n active valence quarks in the light nucleus a with momenta {p j } n j=1 . 2 For our purposes, we will only need up to n = 2 independent valence quarks from the light nucleus.In the mid-rapidity region, we consider a set of N particles with momenta {k i } N i=1 which will generate the correlations of interest, and in the backward region we consider a generic remnant state with momentum P A .The standard momentum-space expression for the cross section is then With eikonal accuracy, the particles traveling in the forward direction have negligible minus momenta, P − a , p − j ∼ 0, particles traveling in the backward direction have negligible plus momenta, P + A , P + A ∼ 0, and particles at mid-rapidity have both negligible plus and minus momenta: k + i , k − i ∼ 0. This allows us to approximate the momentum-conserving delta function by the product of δ(P + a − n j=1 p + j ), δ(P − A − P − A ), and δ 2 (P A + N i=1 k i + n j=1 p j ) and performing the d 2 P A dP − A integrals directly, we obtain where A aA ≡ M aA /2s with s = 2P + a P − A and z j = p + j /P + a .Next we Fourier transform the amplitude to transverse coordinate space in the produced particles; that is, we insert and perform the transverse momentum integrals to obtain Note that, having integrated out the transverse momenta, the coordinates of the produced particles are the same in the amplitude and complex-conjugate amplitude; this simplification makes the problem of multi-particle correlations much more tractable in coordinate space than in momentum space.
In the eikonal approximation, neither the transverse coordinates {b j } nor the longitudinal momenta {z j P + a } of the light-nucleus valence quarks are changed by the interaction with the target; they therefore correspond to the many-body wave function Ψ a of valence quarks in the light nucleus: where ) is the scaled amplitude for a set {N } of valence quarks (or nucleons) to scatter on the target nucleus.Note that, in the quasi-classical approximation (where the collision energy is not so large that quantum evolution need be considered), the amplitude Ã{N}A is independent of the momentum fractions {z j }.We can also insert a Fourier transformation of the cross-section to impact parameter space d 2 B e +iPa•B , with P a = 0 in the center-of-mass frame.Using this, we have and by integrating the z j dependence out of the wave functions, we can replace the wave functions of the light nucleus in terms of the nuclear density profile T a (b): The nuclear density profiles are normalized such that d 2 b T a (b) = a, and since the integral in this normalization is proportional to the transverse area of the light nucleus, this implies that T a (b) ∼ a 1/3 .Each explicit factor of T a above therefore reflects the combinatoric enhancement by a 1/3 associated with independent nucleons participating in the scattering process.
With the general form of the coordinate-space cross section ( 16), we can construct the two-particle-inclusive cross-sections entering the correlations (9).There are two fundamental cases of interest: single-pair production, in which one active nucleon (n = 1) from the light projectile nucleus radiates a soft q q pair (N = 2); and double-pair production, in which two active nucleons (n = 2) from the light projectile nucleus independently radiate two q q pairs (N = 4).These different processes contribute to different correlations, and the dominant process may change as a function of distance.Single-pair production, for instance, contributes to C q q but not C qq , while double-pair production contributes to all correlations but is sub-leading compared to single-pair production where applicable.
For single-pair production, which contributes only to q q correlations, we can straightforwardly write where we have performed the integral over the impact parameter B to generate a factor of a.For double pair production, it is convenient to formulate the cross-sections by integrating over all the (anti)quarks in the final state, assigning the tagged particles through the use of delta functions as illustrated in Fig. 2: Note the inclusion of the symmetry factor for identical tagged particles.Then the crosssection for double-pair production can be written in a standard form for all three observables FIG. 2: Coordinate labels for double-pair production.Quarks are produced at x 1 , k + 1 and x 3 , k + 3 and antiquarks are produced at x 2 , k + 2 and x 4 , k + 4 .The crossed vertices represent the tagging delta functions (18) which assign two of the particles to the external coordinates B 1 , K + 1 and B 2 , K + 2 .The tagged combination shown here corresponds to Z (qq) , for example.Note that the pairs to which the (anti)quarks belong can become entangled in the complex-conjugate amplitude. as Comparing the single-and double-pair cross-sections ( 17) and ( 19), we see the relative power-counting of these two processes, where we recall that T a (b) ∼ a 1/3 and the amplitude for each radiated pair brings in a factor of α s .These considerations imply that, in the heavy-light regime, double-pair production is suppressed relative to single-pair production by a factor of α 2 s a 1/3 1.Note also that, for double-pair production, the d 2 B integral is no longer trivial and itself can induce geometric correlations between the pairs [43].Finally, to convert the cross sections ( 17) and ( 19) into the correlation functions (9), we need to normalize by the total aA inelastic cross-section σ inel .At lowest order in perturbation theory, σ inel corresponds to the total cross-section for single-inclusive gluon production at mid-rapidity, which is a standard textbook result [5].
FIG. 3: The light-front wave functions for the radiation of a q q pair, depending on the placement of light-front time t LF = x + = 0. Ψ 1 describes the q q emission before x + = 0, Ψ 3 describes the radiation after x + = 0, and Ψ 2 describes the gluon emission before x + = 0 and the q q pair production after.The transverse positions for the Fourier transform are also shown for Ψ 1 .

III. CONSTRUCTION OF THE AMPLITUDE
The core of the calculation is the construction of the scattering amplitudes Ã{N}A for the production of one or two q q pairs.We will perform this calculation using the formalism of light-front perturbation theory (LFPT) [48][49][50] and choose the A + = 0 light-cone gauge, in which the q q pairs are radiated from the light projectile nucleus a.In this framework, the amplitude Ã{N}A is given by a product of light-front wave functions Ψ describing the emission of the q q pair and Wilson lines describing SU (N c ) color rotations (with N c the number of colors) in the fundamental or adjoint representations.Here we will construct these building blocks of the amplitude and contract them in all possible ways to obtain the cross-sections (17) and (19).

A. Building Blocks: Wave Functions and Wilson Lines
In a time-ordered formalism like LFPT, there are three distinct time orderings and three distinct wave functions for the production of a single q q pair, as shown in Fig. 3.The wave functions in this language were previously calculated in [37], and we rederive them in Appendix A. The results for the momentum-space wave functions are with the momenta labeled as in Fig. 3 and α ≡ k + 1 q + the fraction of the pair momentum carried by the quark.Here we explicitly keep the quark mass to consider the possibility of producing heavy quarks, since this mass characterizes the typical size of the q q separation.We then Fourier transform into coordinate space using where r ≡ x − y and w ≡ u − b with u ≡ αx + (1 − α)y the position of the gluon at the pair center of momentum, as shown in Fig. 3.This choice of variables makes use of the fact that the wave functions depend only on the center-of-mass momentum q of the pair and the relative momentum (k − αq) of the pair when boosted to its rest frame.Performing the Fourier transform we obtain the coordinate space expressions where the functions F i (w T , r T , α) are defined as in [37], Note that F 0 and F 2 have dimensions of M 2 , while F 1 has dimensions of M 1 .
We further note that the spin structure of the wave functions can be conveniently decomposed [51] in terms of the Pauli matrices [ τ ] as Ψi (w, r, α) with U , L , and T denoting unpolarized, longitudinally polarized, and transversely polarized quarks, respectively.Note that, in order to get a Pauli representation which transforms as a vector under 2D rotations, it is necessary to represent the antiquark quantum number as (−σ), similar to the idea of an antiquark helicity used in [37].This construction yields a transparent physical interpretation of the spin state of the q q pair, is independent of the spinor basis, and is explicitly invariant under rotations in the transverse plane [51].The corresponding spatial wave functions are given by The building block (A q q) (ij) (kk ) (σσ ) of the scattering amplitude: a sum over time orderings for a single q q emission.
Note that the unpolarized and transversely polarized wave functions are real, while the longitudinally-polarized wave function is pure imaginary, such that The wave function for each time ordering is subsequently dressed by the Wilson line color rotations of the partons which scatter in the field of the target nucleus at t LF = x + = 0, as shown in Fig. 4. The Wilson line V x in the fundamental representation is defined as and the adjoint Wilson line U x is defined analogously.In the case of the three time orderings shown in Fig. 4, there are two color matrices built from these Wilson lines: one for the color rotation of the valence quark (with indices kk ) and one for the color rotation of the produced q q pair (with indices ij).For the time-ordering 1 in which the pair is radiated before the interaction, the wave function Ψ1 is dressed by the color rotations ( For the time ordering 2 in which the gluon is radiated before the interaction but pair produces after, the wave function Ψ2 is dressed by the color rotations (V b t b ) kk (U u ) ab (t a ) ij .But for the time ordering 3 in which the valence quark scatters first and then radiates the q q pair after the interaction, the situation is slightly different.The wave function Ψ3 = − Ψ1 − Ψ2 is not independent of the others, and the color rotation of the valence quark is different from the other two time orderings: (t b V b ) kk vs. (V b t b ) kk .This different color matrix for time ordering 3 can be brought into the same form as the ones for time orderings 1 and 2 by where in the last step we relabeled the dummy indices a ↔ b.This trick, which is visualized in the transition of Fig. 1(d) → Fig. 3 of [52], brings all three time orderings into the same form, allowing us to write the total amplitude for a valence quark to radiate a soft q q pair as ( ÃNA where time ordering 3 has been absorbed into time orderings 1 and 2 by shifting the q q color rotations: The fundamental building block ( 30) is an operator matrix in the spins and colors of the various partons, and by squaring and tracing out these quantum numbers in the appropriate ways, we can form all contributions to the single-and double-pair production cross-sections.

B. Single Pair Production
As a first application of the building block (30), let us compute the single-pair production cross-section (17) contributing to the q q correlation function (9b).Squaring the building block (30) and averaging over the color and spin of the valence quark, we obtain where tr C denote a trace over colors in the fundamental representation and tr τ a trace over spins and the remaining angle brackets denote an average over color configurations of the target nucleus. 3Note that, because we have expressed the spin structure of the wave functions in terms of Pauli matrices, the spin trace is trivial (using tr[τ α τ β ] = 2δ αβ ) and can be straightforwardly generalized to more complex structures.The nontrivial aspect of the calculation is the color algebra for each of the time orderings i, j ∈ {1, 2}.Inserting the Wilson line interactions (31), we obtain where 2Nc is the quadratic Casimir factor and we recall that Here we have introduced the operator D2 for the quark dipole scattering amplitude, As seen in (34), the average of D2 and of products of D2 determine the interactions' contribution to the q q correlation function.At higher orders, more and more complex Wilson line traces will become relevant, including the quadrupole D4 , sextupole D6 , and octupole D8 operators: Inserting these operators into the cross-section (17) gives the lowest-order expression Note that the product of the wave functions ( 26) is manifestly real and that, since D2 (x, y) * = D2 (y, x), the interactions come in pairs which are also manifestly real.This expression is simply the coordinate-space analog of the momentum-space q q cross-section calculated previously [37][38][39][40][41][42].By integrating out either the quark or the antiquark from (37) we can obtain the uncorrelated background (8) which needs to be subtracted to obtain the correlation function (9b).
Finally, the correlation function is normalized by the inelastic cross-section, which we take to be single-inclusive gluon production at lowest order.The standard textbook result [5] is with x ij ≡ x i − x j , and the corresponding total cross-section is with ∆Y ≡ A the total rapidity interval of the collision.Note that the scale to which the center-of-mass energy squared s is compared in the rapidity is not uniquely fixed; with eikonal accuracy one could equally well define ∆Y = ln s MaM A instead.Once an explicit model is chosen for the target averaging, one can calculate the correlations explicitly, using the same model for both the q q production and for σ inel .One simple and widely-used model which can be employed is the McLerran-Venugopalan (MV) model [53][54][55].The average for the dipole amplitude in the MV model is well known, but the average of products of dipole amplitudes is more esoteric.Still, the MV model evaluation of the double-dipole is known in the literature, with the explicit form [56]  for topologies without fermion entanglement, which form two independent fermion loops.The source gluons are kept free to be contracted in all possible ways, and the free indices i, j, k, ∈ {1, 2} denote possible time orderings for each pair emission.
These expressions are valid at finite N c (that is, without taking the large-N c limit).In general, these expressions may contain an additional logarithm ln 1 r T Λ which recovers the recover the perturbative power-law tail at asymptotically large transverse momentum (and hence asymptotically short distances); that logarithm has been neglected in these expressions from [56].This is also known as the Gaussian approximation or the Golec-Biernat-Wusthoff (GBW) model [57].

C. Double Pair Production: Wave Functions
The calculation of the double-pair production cross-section ( 19) is enormously more complex than the calculation of the single-pair production cross-section (17), for many reasons.
One is that, when two q q pairs have been produced, the pairs can become entangled in highly nontrivial ways: between the amplitude and complex-conjugate amplitude, the pairs may swap ownership of the quark ("quark entanglement"), the antiquark ("antiquark entanglement"), or neither ("no fermion entanglement").This flow of the q q fermion loop determines the structure of the spin trace tr τ over the wave functions, with each topology leading to a different wave-function structure.
The simplest case for the wave functions in double-pair production is the case of no fermion entanglement, as shown in Fig. 5.Each pair is radiated from a valence quark in the light nucleus with a particular time ordering: i, k ∈ {1, 2} in the amplitude and j, ∈ {1, 2} in the complex-conjugate amplitude.This reflects another source of the greatly increased complexity for the double-pair case: now there are 2 4 = 16 different combinations to consider (for each channel) due to the various time orderings, compared to 2 2 = 4 combinations in the single-pair case.
Each pair is also produced with a set of kinematic arguments: the positions of the quark and antiquark, the center-of-momentum position of the pair corresponding to the position of the gluon, the position of the valence quark which radiated the pair, and the fraction of the pair momentum carried by the quark.The details of these kinematics for the various pairs will depend on the precise diagram, but for our purposes it is convenient to denote the set of these kinematic arguments collectively by g 1 , g 2 , g 3 , and g 4 .
With this compact notation shown in Fig. 5, we can perform the spin trace tr τ on the pair wave functions to determine the spatial structure associated with this channel: with i, j, k, ∈ {1, 2}.In this simplest case, the fermion flow simply factorizes into two bubbles, giving the square of the single-pair production case.Likewise, the color traces over the Wilson lines will also factorize into a product of two traces, but the precise color flow will depend on which valence quark sources each pair.
A significantly more complicated wave function structure arises when the fermion flow is entangled between the pairs, as illustrated in Fig. 6.Here, instead of forming two independent loops, the fermion flow is connected into one larger loop.Depending on which valence quark radiated which pair, this may correspond to the entanglement of either the quark or the antiquark.Using the same compact notation, we can write the wave function structure for this topology as for fermion entanglement topologies which form a single fermion loop.The source gluons are kept free to be contracted in all possible ways, and the free indices i, j, k, ∈ {1, 2} denote possible time orderings for each pair emission.
This time, the Pauli trace contains four matrices, so several new structures are generated according to the algebraic properties given in (B1) of Appendix B. Nontrivial results occur for traces of two, three, and four Pauli matrices, but although this result is significantly more complex, its calculation in the Pauli matrix notation remains straightforward: FIG. 7: Quark entanglement channel of double-pair production.
structure, each of these color structures will generate very complex Wilson line multipoles.
To illustrate the ingredients and the complexity of such channels, let us consider the specific example of quark entanglement shown in Fig. 7. Immediately, the entanglement of the pairs leads to an entanglement of their kinematic arguments: the momentum fractions of the pairs become scrambled, such that which subsequently scrambles the positions of the gluons lying at the pairs' centers of momentum: With these definitions, the abbreviated notation for the various pairs can be spelled out explicitly as where the arguments denote the position of the quark, antiquark, emitting valence quark, gluon, and quark momentum fraction, respectively, for each pair.
For the quark entanglement channel of double-pair production, the complete color trace is given in terms of the building block (31) by There are 2 4 = 16 such terms corresponding to the different time orderings i, j, k, ∈ {1, 2} for this channel alone, and each of these color traces is quite long and generally involves high-order Wilson line multipoles.To illustrate this, consider just the first contribution This operator, which must yet be averaged over color configurations of the target, involves products of one, two, and three dipoles, quadrupoles, and the product of one or two dipoles with a quadrupole.Clearly, an expression of this complexity will be difficult to compute even in a simple analytic model like the MV / GBW model, and some resort to approximate or numerical techniques will be necessary.For this reason, we defer a detailed analysis of all the ensuing color structures to a dedicated future publication, in which we will study aspects of these full correlations analytically and numerically.Still, there are general physical features of the correlation functions we can understand by studying the single-pair production result (37) and the wave functions ( 41) and ( 43), which we will pursue next.

IV. CORRELATIONS OVER VARIOUS LENGTH SCALES
A. Single-Pair Production: Long-Distance Asymptotics One important quantification of the various correlation functions C ij is their characteristic correlation length L ij which controls the exponential falloff of the correlation function at To understand the physical picture embodied by the correlation functions C ij , let us compute their long-distance asymptotics.
Let us start by focusing on the single-pair production cross-section ( 17).This crosssection only generates one q q pair, and as such can only contribute to the quark-antiquark correlation function C q q given in (37).As seen clearly in (37), there are two sources of spatial dependence: the wave functions U i , L i , and T i which are given in ( 26) and the interactions.The interactions can be modeled in different ways, but for definiteness let us consider the expressions in the MV / GBW model given in (40).Many of these interactions are exponentially suppressed at large distances due to dynamical color screening effects embodied in the saturation scale Q s .The single dipole scattering amplitude, for instance, is and the majority of the double-dipole amplitudes have the same long-distance asymptotics: Thus we immediately see that the full structure of the single-pair C q q correlation exists on On length scales larger than 1/Q s , these terms die off, but the overall correlation function C q q is not similarly suppressed.In addition to the constant terms from the interactions, which are obviously unsuppressed for Thus for |B 1 − B 2 | T 1/Q s in the MV / GBW model, the q q cross-section is given by the simplified form This analysis shows that there are two interesting regimes of C q q in the case of single-pair production: The former is sensitive to very short-distance correlations driven by the interactions, while most of these effects have died away for the latter.This also shows that Q s does not set the overall correlation length L q q, since the correlation function overall does not decay exponentially when The correlation length is instead set by the long-distance asymptotics of the wave functions, which are insensitive to Q s .
To determine the overall correlation length L q q, then, we need to compute the longdistance asymptotics of the wave functions and the integrals F 0 , F 1 , F 2 given in (24) from which they are built.In the regime mr T 1, the asymptotics of these integrals are Propagating this forward to the wave functions, we find that all of the terms at long distances all have the same limiting behavior: The wave function squared and hence the single-pair contribution to the correlation function C q q correspondingly decay as such that the overall correlation length for C q q via single-pair production is set by the mass of the quark pair: L (single) q q = 1 2m .If the produced quarks are heavy, such as charm quarks, then the associated correlation length is a perturbative scale.If the quarks are light (or considered massless), on the other hand, then the correlations can extend to nonperturbatively long distances |B 1 − B 2 | T > 1/Λ QCD .Our perturbative calculation ceases to be applicable over such long distances, however, so for light quarks one should cut off the correlation function by hand at In the heavy-light regime, double-pair production is suppressed relative to single-pair production by a factor of α 2 s a 1/3 1, so the quark-antiquark correlation function C q q is therefore dominated by single-pair production (37) wherever it contributes.However, as we saw in (56), single-pair production can only accommodate correlations for distances ; at distances larger than this, the exponential suppression ( 56) of the single-pair correlations begins to compete with the suppression by α 2 s a 1/3 of the double-pair production mechanism.Thus for heavy quarks at distances larger than double-pair production becomes the dominant source of q q correlations C q q.These doublepair mechanisms will have correlation lengths L (double) q q which extend beyond L (single) q q = 1/2m.
For C qq and C q q, single-pair production does not contribute, and double-pair production is the leading mechanism at all length scales.

B. The Double-Pair Regime
While we defer a full analysis of the lengthy double-pair production cross-section for a dedicated future publication, there are some general features of double-pair correlations which we can understand already based on the wave function structures (41) and (43).
Consider first the fermion entanglement topology shown in Fig. 6.Any correlation function will tag on the production of quarks or antiquarks at two positions B 1 = x i and B 2 = x j , and the asymptotic limit the correlation corresponds to sending these coordinates to infinity in opposite directions: x i = + 1 2 r and x j = − 1 2 r.Because each coordinate x 1 , x 2 , x 3 , x 4 appears exactly twice in the fermion trace and all wave functions have the same asymptotic decay, all contributions from Ψ (loop) have the long-distance behavior This exponential suppression is exactly the same as for the single-pair production channel, as one would intuitively expect from a topology in which all the pairs are entangled.Thus we conclude that, whenever single-pair production is negligible for C q q, so is the fermion entanglement contribution from double-pair production.For C qq and C q q, fermion entanglement contributes only at distances shorter than 1/2m.
Similarly, for the topology with no fermion entanglement shown in Fig. 5, the wave function structure leads to the same suppression if the size of either pair However, combinations which tag on one (anti)quark from each pair are not suppressed in this way; in these cases, large |B 1 −B 2 | T requires large separations between the pairs' centers of momentum |u 12 − u 34 | T , rather than that the q q splitting itself be long-distance.This applies to C qq and C q q, which necessarily select the quarks or antiquarks from each pair, as well as the specific combinations , we can replace the delta functions Z (q q) from ( 18) with just the unsuppressed combinations, For heavy quarks, the region Here the distance is large enough that the dominant production process is double-pair production (with no fermion entanglement), but it is also short enough that correlations can arise from perturbative QCD.These correlations can occur either through the interactions of the pairs with the same correlated color domains in the heavy nucleus, or through the entanglement of the gluons which produce the two pairs, as illustrated in Fig. 8.
For nonperturbatively large distances Λ QCD , perturbative degrees of freedom cannot mediate the correlations.Entanglement of gluons, as in Fig. 8, not possible this because the range of perturbative gluons is limited to 1/Λ QCD .In this case, the only contribution to the double-pair cross-section (19) occurs when the large separation in |B 1 − B 2 | T is due to a large separation between the nucleons |b 1 − b 2 | T which radiate the independent pairs.At these distances, the two pairs are truly independent: they are not connected by the exchange of any perturbative degrees of freedom, and they interact with disjoint, uncorrelated color fields in the heavy nucleus.Because of this, both the wave functions and the interactions of the two pairs factorize; for example, for C qq we have where we integrated out the spectators in the last line.Still, even in this regime the correlated part (62) does not fully factorize into the product of two uncorrelated cross-sections, because the two pairs are constrained to arise from nucleons in the same light nucleus T a (b).These are the "geometric correlations" observed in [43], which contribute at distances |B 1 − B 2 | T up to the size R a of the light nucleus.
In the absence of geometrical structure from the light ion, say with S ⊥ A the transverse area of the heavy nucleus, a full factorization of the correlated part (62) does occur, such that the overall correlation function reduces to Even though there are no correlations of any kind, the correlation function itself remains nonzero, and the situation is similar for C q q and C q q.This is a consequence of the simple definition (6); other definitions [43] can be chosen such that the correlations go to zero here instead.

V. RESULTS AND OUTLOOK
A. Results: Single-Pair Production As a concrete illustration, let us directly compute the single-pair contribution to the quark-antiquark correlation function C q q (9b).The crux is the evaluation of the cross-section (37) which gives the correlated part of C q q, for which we will employ the MV / GBW model (40).For simplicity, we will take the large-N c limit, in which Organizing the terms based on the interactions, the correlated part is where we recall that u = αB 1 + (1 − α)B 2 is the center of momentum of the q q pair and is the longitudinal momentum fraction of the pair carried by the quark.We note that the saturation scale Q s contains an impact parameter dependence which can in principle be different from term to term.However, Q 2 s (b) ∝ T A (b) varies only over macroscopic scales proportional to A 1/3 , over which the perturbatively short-distance changes in coordinates are negligible; we can therefore choose to evaluate Q s at the same position (say, u) in all terms up to corrections of order O A −1/3 .Changing integration variables from b 1 to w = u − b 1 and employing the short-hand r = B 1 − B 2 gives The exponentials in the first set of terms above come from the interference of Ψ 1 and Ψ 3 from Fig. 3, the exponentials in the second set of terms comes from the interference of Ψ 2 and Ψ 3 , and the exponentials in the last set of terms comes from the interference of Ψ 1 and Ψ 2 .The terms with 1 in brackets come from the squares The d 2 w integral is involved but straightforward, and some of the terms contain a logarithmic IR divergence reflecting the (infinite) total radiation produced by the bare valence quark.Keeping the terms which are logarithmic in the IR cutoff Λ and dropping the finite pieces, the result is Note that we have not specified the scale in numerator of the logarithm, since choices of that scale differ only by finite pieces.In the same approximations, the inelastic cross-section (39) becomes with S ⊥ A ∝ A 2/3 the transverse area of the heavy target nucleus.The IR logarithms cancel in the ratio, leaving the correlated part of C q q as C corr q q (B 1 , k We observe that the fully differential correlation function ) is suppressed by a factor of the full phase space 1 S ⊥ A ∆Y .A more meaningful correlation function is therefore one in which the center-of-mass degrees of freedom u and q + = k + 1 +k + 2 have been integrated out to cancel this factor.Before we do that, however, let us first integrate out the antiquark degrees of freedom B 2 , k + 2 to obtain the inclusive quark production cross-section which contributes to the uncorrelated part of the correlation function C q q: The transverse integral can be done analytically in terms of Meijer G functions, but it will not generate a factor of the area S ⊥ A because the Bessel functions K 2 1 (mr T ), K 2 0 (mr T ) limit the size of the splitting to be of order O (1/m).As a consequence, the inclusive quark production term is suppressed by a factor of the transverse area, Therefore, in the full correlation function (9b), the correlated part C corr q q is proportional to 1/S ⊥ A , while the uncorrelated part (σ q /σ inel )(σ q/σ inel ) is proportional to (1/S ⊥ A ) 2 .The uncorrelated part is thus suppressed by a factor of A −2/3 relative to the correlated part and can be neglected: C q q ≈ C corr q q .After integrating out the center-of-mass degrees of freedom u and q + for fixed r and α, the correlation function is The dq + integration of (70) trivially cancels the rapidity phase space factor ∆Y , and the d 2 u integration only couples to the impact parameter dependence of the saturation scale: Since this impact parameter dependence corresponds to the nuclear profile function Q 2 s (u) ∝ T A (u), the precise result of this averaging depends on the geometry of the target nucleus.
One can explore the effects of different nuclear geometries like a Woods-Saxon profile, but for our present purposes we will simply neglect the impact parameter dependence of Q s , effectively treating the target nucleus as having uniform transverse density.In this approximation, the averaging is trivial and cancels the area factor S ⊥ A .
Having cancelled the phase space factor 1 S ⊥ A ∆Y , the resulting correlation function is This correlation function contains three-dimensional information about the distribution of quarks and antiquarks generated through single pair production, but since the splitting  fraction α is not fixed in a given event, it may be desirable to integrate out α as well.Doing so gives the final result for the q q correlation function, At this accuracy, the double-pair production cross-section is suppressed and can be neglected, so that the baryon number correlation function (9d) is given entirely by the q q correlation functions: The expressions (76) and (77) are one of the primary results of this paper, and they illustrate what we hope to achieve in future work which will extend this result to include correlations from double-pair production.We note that, although much of the physical content of this correlation is due to the mass of the quarks, Eq. ( 76) does have a well-behaved massless limit, for which lim m→0 [m 2 K 2 1 (mr T )] = 1 r 2 T and lim m→0 [m 2 K 2 0 (mr T )] = 0.The functional dependence of (76) on m and Q s is illustrated in Fig. 9, and the correlation function for physical values of the parameters is shown in Fig. 10.In these plots, we take α s = 0.3 for our fixed-coupling calculation, and we choose the saturation scales to reflect gold ions at x = 10 −2 at RHIC and lead ions at x = 10 −3 at the LHC.For the former case, we take the value of Q s quoted in the IPsat model (see, for instance, [58]), and for the latter we extrapolate using the rough pocket formula Q 2 s ∝ ( A x ) 1/3 .As expected, the range of the single-pair q q correlation function (76) is controlled by the quark mass and not by the saturation scale.However, as seen in It is also interesting to examine the diagrammatic origins of the correlation function (76).
The terms containing the IR logarithms which generated (76) all arose from the last line of Eq. (67), which received contributions from interference of the time orderings Ψ 1 and Ψ 2 (the first and second diagrams of Fig. 4) as well as the square (the third diagram in Fig. 4).As such, the dynamical part of the correlation function (76) reflects the correlations generated by the interference of the q q pair scattering and the gluon scattering: the stronger the effects of multiple scattering (quantified by Q s ), the stronger the resulting correlations.These correlations are therefore genuine effects of multiple scattering which would would be absent or weak in minimum-bias pp collisions, but enhanced in pA collisions.They also dominate the correlations present over short distances in heavy-light ion collisions, although as discussed previously, at longer distances double-pair production mechanisms will begin to dominate.
We emphasize the crucial role of the time ordering Ψ 2 in generating the quark/antiquark correlation function (76).This time ordering, in which the radiated gluon first scatters in the target field and then pair produces in the final state, is responsible for both the exponential dipole term in Eq. ( 68) by its interference with the time ordering Ψ 1 and for the 1 by its contribution to was omitted in the Pauli blocking calculation of [27] because it was not expected to generate statistical quark/quark correlations.In our case, however, the Ψ 2 contribution is essential to obtaining the interaction-mediated quark/antiquark correlations (76), where statistical effects are absent.
Last, we note that the approximations used to arrive at our results are rather simple ones -the MV / GBW model for the interactions, the large-N c limit, and the back-of-theenvelope setting of Q s .They illustrate the physical picture of the correlations and provide concrete benchmarks, but these approximations can and should be improved in future work.

B. The Physical Picture
As a result of the previous analysis, we can identify three different regimes (see Fig. 12) in which the correlation functions C qq , C q q, C q q will be dominated by different mechanisms: Region I : Region I covers distances short enough that a single q q splitting can generate the cor- 12: Example diagrams contributing to C qq , C q q , and C q q in Regions I, II, and III.
relations.For C q q, the dominant mechanism is single-pair production, and the relatively simple result is given in Eqs. ( 37) and (76).For C qq and C q q, the full complexity of doublepair production contributes, including fermion entanglement and gluon entanglement.This maximally complex situation couples to high-order color multipoles, like sextupoles and octupoles, which are difficult to handle analytically, so it may be necessary to resort to numerical [59] or approximate methods like the large-N c limit [60][61][62].
Region II only exists for heavy quarks; it covers distances large enough that single-pair and fermion-entanglement contributions have died off, but small enough that perturbative mechanisms still generate the correlations.All correlations are dominated by double-pair production without fermion entanglement, although perturbative correlations are generated by gluon entanglement and the interactions.The interactions in this case, while simpler and fewer in number than when fermion entanglement is considered, still couple to high-order color multipoles and may require numerical simulations or approximate analytical methods such as large N c limit.
Region III covers the longest distances over which correlations exist.Over these nonperturbative scales, two independent pairs are produced without dynamic correlations from entanglement or the interactions.Despite this, nontrivial correlations still persist due to the geometric correlations of the pairs from being produced within the density profile of the light nucleus.
This overall physical picture is presented in Fig. 13 for the case of C q q; this cartoon illustrates the transition from the single-pair mechanism calculated in (76) for Region I to the double-pair mechanisms in Regions II and III.The physical picture presented in Fig. 13, the associated diagrams presented in Fig. 12, and the ingredients with which to calculate them -the wave functions in ( 41) and ( 43) and the Wilson lines in (31) -are the second main of this paper.With these components, constructing the double-pair correlation function is straightforward but lengthy, as the particular example (48) illustrates.We will defer a detailed analysis of the double-pair results for a dedicated paper in Part II of this study.

C. Conclusions
In this paper, we have constructed the spatial correlation functions among quarks and antiquarks produced by q q pair production at mid-rapidity in heavy-light ion collisions.
Depending on the choice of correlation function and length scale, the process may receive contributions from single-pair production (17) or double-pair production (19), with doublepair production opening up many possibilities for entanglement between the pairs.All of these cases are constructed by tracing the appropriate fermion spin and color flows over combinations of the fundamental building block (30), the dressed amplitude to produce a single q q pair.
Because of the length and complexity of the full analysis, we have chosen to divide this calculation into parts.This paper constitues Part I of the analysis, in which we have calculated the single-pair contribution to C q q (76) explicitly and outlined the length scales Region I C qq ∼ C q q ∼ C q q ∼ I Region II C q q ∼ C qq ∼ C q q ∼ II Region III C q q ∼ C qq ∼ C q q ∼ III over which various double-pair contributions occur.The primary difficulty for the doublepair case arises from the proliferation of the number and complexity of the Wilson line color traces, as illustrated in the specific example (48).Each double-pair topology admits 16 distinct cases for the interactions due to the various time orderings, and there are 3 distinct topologies associated with fermion entanglement and another 3 with no fermion entanglement.For this reason, we defer the presentation and analysis of these results for Part II of the analysis in a future publication.
The calculations presented here are limited to the quasi-classical approximation, in which the Wilson line interactions are rapidity independent.At high enough energies, small-x evolution will modify this, so the proper inclusion of evolution in the total collision energy s will be a priority among future extensions.We have focused on constructing the general form of the interactions in terms of Wilson line multipoles, which can then be evaluated in a variety of methods.For analytic results which can be understood intuitively on an eventaveraged basis, we can take advantage of the MV model, supplemented with the large-N c limit as needed.For a more detailed and realistic evaluation of our correlation functions, we can instead sample the color multipoles using Monte Carlo techniques [59]; this approach would be well-suited to studying event-by-event fluctuations.
Ultimately, the goal of this analysis is to characterize the profile of quark and antiquark correlations in the initial stages of heavy-ion collisions.A final outcome of this work should be the development of a numerical code which can initialize not only the energy density, but also other conserved charges like flavor and baryon number.Then, when coupled with state-of-the-art hydrodynamics techniques, we will be able to address novel questions about the role of these conserved charges throughout the evolution of heavy-ion collisions.One possible observable that may be sensitive to these intial-state correlations is the production of J/ψ and other quarkonium states, because the rates of quarkonium regeneration may well be affected by the presence of initial-state quark pairs in close proximity.The initial-state (anti)quarks produced with these correlations may also translate, after hydrodynamic evolution and hadronization, into observable correlations between baryons and anti-baryons [36].
The resulting correlations between same-sign or opposite-sign hadrons may also provide an important conventional background for the Chiral Magnetic Effect [28].Although much work still remains to be done to fully assess the potential impact of conserved charges in the initial state of heavy-ion collisions, we believe this analysis represents an important step toward their incorporation in the next generation of hydrodynamic codes.

Fig 9 ,
the saturation scale does control the strength of the correlations.The corresponding plot for the baryon number correlation function (77) is shown in Fig. 11.Here we have summed over up, down, strange, and charm quark flavors; note that the negative value of C BB (anticorrelation) reflects the conditional probability of finding a negative baryon number charge in the vicinity of an associated positive one (and vice versa).

FIG. 13 : 1 Λ
FIG.13: Cartoon illustrating the general expected form of the correlation function C q q.The singlepair production channel calculated in (76) dominates in Region I, with a typical range of 1/m.The double-pair channels dominating Regions II and III are drawn schematically here; compared to the single-pair correlations in Region I, these channels are suppressed in magnitude by a factor of α 2 s a 1/3 but persist over longer ranges.The interaction-and entanglement-mediated correlations which characterize Region II are cut off at O 1 Λ QCD , while geometric correlations persist in Region III up to the size R a of the light ion.
Evaluation of the correlation function (76) physical quark masses and with saturation scales chosen to reflect collisions with gold ions at RHIC and lead ions at the LHC.
] Baryon Number: u + d + s + c FIG. 11: Evaluation of the baryon number correlation function (77) for physical quark masses, summing over up, down, strange, and charm quark flavors.