Probing superfast quarks in nuclei through dijet production at the LHC

We investigate dijet production from proton-nucleus collisions at the Large Hadron Collider (LHC) as a means for observing superfast quarks in nuclei with Bjorken $x>1$. Kinematically, superfast quarks can be identified through directly measurable jet kinematics. Dynamically, their description requires understanding several elusive properties of nuclear QCD, such as nuclear forces at very short distances, as well as medium modification of parton distributions in nuclei. In the present work, we develop a model for nuclear parton distributions at large $x$ in which the nuclear dynamics at short distance scales are described by two- and three-nucleon short range correlations (SRCs). Nuclear modifications are accounted for using the color screening model, and an improved description of the EMC effect is reached by using a structure function parametrization that includes higher-twist contributions. We apply QCD evolution at the leading order to obtain nuclear parton distributions in the kinematic regime of the LHC, and based on the obtained distributions calculate the cross section for dijet production. We find not only that superfast quarks can be observed at the LHC, but also that they provide sensitivity to the practically unexplored three-nucleon SRCs in nuclei. Additionally, the LHC can extend our knowledge of the EMC effect to large $Q^2$ where higher-twist effects are negligible.


I. INTRODUCTION
The dynamics of quantum chromodynamics (QCD) in the nuclear medium is one of the most interesting areas of modern nuclear physics. Many aspects of this are currently being investigated, including the formation of collective quark-gluon states such as the quark-gluon plasma, the shadowing of the small-$x$ parton densities in nuclei, the hadronization of quarks and gluons within the nucleus, and the medium modification of partonic distribution functions in nuclei.
Another interesting aspect of nuclear QCD is the possibility for partons to carry a light cone momentum fraction higher than that of a free nucleon at rest. Deep inelastic scattering (DIS) from such a parton will correspond to Bjorken x > 1, and we will hereafter refer to partons with such a light cone momentum fraction as superfast partons.
Due to the short-range nature of strong interactions, detecting a superfast quark in a nucleus requires probing the nucleus at extremely short distance scales. The characteristic space-time distances in nuclei become shorter with an increase in $Q^2$ for fixed $x$ owing to a property of QCD evolution, namely that a probed parton at high $Q^2$ will have come from a parent quark with a higher light cone momentum fraction. Thus, the theoretical expectation is that superfast quarks at large $Q^2$ will allow one to probe unprecedentedly small space-time distances in nuclei [1,2].
Our current understanding of the dynamics of nuclei at short distances is extremely limited. Due to the short range nature of the strong interaction, one expects that short distances will be dominated by multi-nucleon short range correlations (SRCs) [3], which include non-nucleonic degrees of freedom (such as $\Delta\Delta$ and $NN^*$ contributions from the SRCs), followed by the transition from baryonic to quark-gluon degrees of freedom.
There has been considerable progress recently made in studies of two-nucleon SRCs in inclusive and semi-inclusive nuclear processes, which have been dominated by quasi-elastic scattering of a high-energy probe (either an electron or a proton) off of a nucleon in the SRC [4][5][6][7][8][9][10][11][12][13][14][15][16]. These studies unambiguously established the existence of two-nucleon SRCs and measured the probability of a nucleon existing in one for the given nuclei. Due to the quasi-elastic nature of the scattering processes, however, it is much more difficult to reach kinematics dominated by three-nucleon SRCs [7]. Moreover, probing the transition to quark and gluon degrees of freedom has so far remained elusive.
The possibility of probing superfast quarks in hard nuclear processes provides a new avenue in studies of SRCs. From the theoretical point of view, as was mentioned above, the QCD evolution of quark distributions gives an additional handle on the ability to probe exceedingly small space-time distances through an increase in $Q^2$ (at the scale of Ioffe time $1/m_N$). From the experimental point of view, deep inelastic processes do not restrict the phase space of the final states (as is the case for quasielastic processes), and hence one can use closure to express the cross section through the parton distribution function of the nucleus, which in turn can be expressed through the light cone density matrix of the nucleus. Measurement of the parton distribution at $x \geq 1$ allows one to explore the transition from baryonic to quark degrees of freedom. For $x \geq 1.5$, additionally, the parton distribution is very sensitive to the three-nucleon SRCs.
One of the conventional methods for probing superfast quarks in nuclei is deep inelastic scattering (DIS) from nuclei at Bjorken $x > 1$. Several attempts have been made over the years: measurements of quarks at $x > 1$ were undertaken at CERN using a muon beam [17] and at FNAL using a neutrino beam [18], with contradictory results.
Measurements using electron beams were taken for $x \sim 1$ at SLAC [19] for $Q^2 \leq 10$ GeV$^2$ and at Jefferson Lab for $Q^2 = 7$ GeV$^2$ [20]. However, this $x$ range gets a significant contribution from higher-twist quasielastic scattering up to fairly large $Q^2$ ($\sim 15$ GeV$^2$). For example, the nucleus/deuteron cross section ratio is reduced for $x = 1$ and $Q^2 = 10$ GeV$^2$ by a factor of 2 to 2.5 [4]. The only way to avoid this is to probe larger values of $Q^2 \geq 30$ GeV$^2$, for which the quasielastic contribution will be a small correction [2]. Such experiments are currently included in the physics program for the 12 GeV upgrade of Jefferson Lab [21], and the first experimental data will be available within the next few years [22].
In this work, we propose a new approach to probing superfast quarks using data from the recent proton-nucleus run at the LHC. This method is based on the standard method of studying dijet production in hadron-hadron collisions, which allows for the light cone momentum fractions of the initial partons to be given in terms of momenta of the dijets. This gives the possibility of selecting transverse jet momenta and pseudo-rapidities to isolate scattering off the superfast quarks within the nucleus. We develop a theoretical framework for calculating this reaction, which requires addressing several issues, such as modeling the high-momentum (short range) properties of the nuclear wave function, and calculating the medium modification of parton distributions within the nucleus and evolving this modification. Within the framework we develop, we calculate the absolute cross section for the reaction and study its sensitivity to two- and three-nucleon SRCs.
The article is organized as follows: In Sec. II we review the formalism of the dijet production reaction, including its kinematics (and the meaning of x > 1) and the cross section formula. The cross section for dijet production depends on the nuclear parton distribution function (PDF), which is discussed in Sec. III. Sec. III also discusses short range correlations and medium modifications, and how they factor into and affect the cross section. Sec. IV discusses the hard subprocesses that contribute to dijet production. In Sec. V, we present numerical estimates for the cross section. Conclusions and outlook are given in Sec. VI. The Appendix gives detailed derivations of the factorization formula and the SRC parts of the nuclear light cone momentum distributions.

II. BASIC FORMALISM
The reaction we consider is the production of a dijet from a proton-nucleus collision, where the kinematics of the two outgoing jets are fully determined. The reaction is treated at the leading order (LO) in perturbative QCD (pQCD), meaning that the jets are produced by a two-body parton-parton scattering process, in which one parton originates in the proton and the other in the nucleus. In our derivation, we restrict ourselves to nucleonic degrees of freedom in the nucleus, and thus consider the nuclear parton as originating from one of its bound nucleons. This scenario is presented in Fig. 1, where we also define the kinematic variables of Eq. (2).

A. Jet Kinematics
We consider a scenario with a proton moving in the $+z$ direction and a heavy nucleus, with charge number $Z$ and mass number $A$, moving in the $-z$ direction. We describe their four-momenta using light cone coordinates, namely
$$p^\mu = (p^+, p^-, \mathbf{p}_T),$$
where $p^\pm = E \pm p_z$ and $\mathbf{p}_T$ is the two-component transverse momentum. Using the on-mass shell condition and the fact that the energies of the proton and nucleus greatly exceed their masses, in the collider reference frame one has
$$p_p^\mu \approx (2E_0, 0, \mathbf{0}_T), \qquad p_A^\mu \approx (0, 2ZE_0, \mathbf{0}_T),$$
where $E_0$ is the beam energy per proton in the collider reference frame, and $s^{\rm avg}_{NN} = 4\frac{Z}{A}E_0^2$ is the square of the average center-of-mass energy per nucleon. As an example, a lead-proton collision with a beam energy of 4 TeV per proton would have an average center-of-mass energy per nucleon of $\sqrt{s^{\rm avg}_{NN}} \approx 5.02$ TeV. However, since the motion of the bound nucleon inside the nucleus will in general be variable, the actual center-of-mass energy per nucleon $\sqrt{s_{NN}}$ is not a fixed parameter, though it will be equal to $\sqrt{s^{\rm avg}_{NN}}$ in the limit where the nucleons do not interact and all move forward with equal momenta.
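The quoted value of about 5.02 TeV can be checked directly from $s^{\rm avg}_{NN} = 4\frac{Z}{A}E_0^2$. The following is a minimal sketch; the function name `sqrt_s_nn_avg` is hypothetical and chosen only for this illustration.

```python
import math


def sqrt_s_nn_avg(beam_energy_per_proton_tev: float, Z: int, A: int) -> float:
    """Average nucleon-nucleon c.m. energy in TeV for a p+A collision.

    Implements sqrt(s_NN^avg) = sqrt(4 (Z/A) E0^2) = 2 E0 sqrt(Z/A),
    where E0 is the beam energy per proton (per unit charge).
    """
    return 2.0 * beam_energy_per_proton_tev * math.sqrt(Z / A)


# Lead-proton run: E0 = 4 TeV, lead has Z = 82 and A = 208.
pb_sqrt_s = sqrt_s_nn_avg(4.0, Z=82, A=208)
print(f"sqrt(s_NN^avg) for p+Pb at E0 = 4 TeV: {pb_sqrt_s:.2f} TeV")
```

For a proton-proton collision ($Z = A = 1$) the same function reduces to the familiar $2E_0$.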
At leading order, the collision results in an interaction between two partons, one each from the proton and the nucleus. Their respective four-momenta are labeled $p_1$ and $p_2$. We use a collinear approximation, in which the initial partons are treated as having zero transverse momentum. Moreover, they are treated as massless and on-shell, so
$$p_1^\mu = (p_1^+, 0, \mathbf{0}_T), \qquad p_2^\mu = (0, p_2^-, \mathbf{0}_T).$$
The light cone momentum fractions are defined for each parton as
$$x_p = \frac{p_1^+}{p_p^+}, \tag{5}$$
$$x_A = A\,\frac{p_2^-}{p_A^-}. \tag{6}$$
Note that $x_A$ is scaled by a factor of $A$. The rationale behind this is that the parton from the nucleus is found within one of its nucleons, and in a limiting case where the nucleons are all non-interacting and carry equal momenta, their light cone momentum is equal to $p^-_{N,\rm avg} = p_A^-/A$, meaning $x_A \leq 1$ in this case. In reality, however, the bound nucleons do interact and it is possible for $p_2^- > p^-_{N,\rm avg}$. In this situation, the nuclear parton originates from a nucleon which has larger-than-average momentum ($p_N^- > p^-_{N,\rm avg}$). This indicates that finding an exceedingly large $x_A > 1$ will identify a bound nucleon with momentum significantly larger than average.
The parton momentum fractions cannot be directly measured. However, they can be related to the kinematic parameters of the jets. The jets from the proton and the nucleus are respectively ascribed four-momenta $p_3$ and $p_4$. At leading order, they come from the fragmentation of two partons (with the same momenta), which are treated as massless and on-shell. From energy-momentum conservation, it follows that
$$p_1 + p_2 = p_3 + p_4, \tag{7}$$
and due to the assumed collinear approximation, $\mathbf{p}_{3T} = -\mathbf{p}_{4T} \equiv \mathbf{p}_T$. Using this relation, and neglecting the masses of the produced jets, we obtain $p_3^+ p_3^- = p_4^+ p_4^- = p_T^2$. To proceed, we define the rapidity $\eta$ as
$$\eta = \frac{1}{2}\ln\frac{p^+}{p^-},$$
and use this, with the massless limit (in which $p^+ p^- = p_T^2$), to obtain $p^\pm = p_T e^{\pm\eta}$.
Applied to the jets of reaction (1), this results in $p_3^\pm = p_T e^{\pm\eta_3}$ and $p_4^\pm = p_T e^{\pm\eta_4}$. Using these relations, and energy-momentum conservation, viz. Eq. (7), the light cone momenta of the initial partons can be expressed through jet kinematics in the following form:
$$p_1^+ = p_T\left(e^{\eta_3} + e^{\eta_4}\right), \qquad p_2^- = p_T\left(e^{-\eta_3} + e^{-\eta_4}\right).$$
These relations can be used to express the momentum fractions $x_p$ and $x_A$ in terms of jet observables, namely
$$x_p = \sqrt{\frac{Z}{A}}\,\frac{p_T}{\sqrt{s^{\rm avg}_{NN}}}\left(e^{\eta_3} + e^{\eta_4}\right), \qquad x_A = \sqrt{\frac{A}{Z}}\,\frac{p_T}{\sqrt{s^{\rm avg}_{NN}}}\left(e^{-\eta_3} + e^{-\eta_4}\right). \tag{12}$$
The main question that concerns us is whether partons with $x_A > 1$ can be seen in lead-proton collisions at the LHC. Eq. (12) suggests that we should look for three conditions: large $p_T$, small (or highly negative) $\eta_3$, and small (or highly negative) $\eta_4$. The first two conditions are expected to lead to a drop in the cross section, however, so a compromise must be found between the conditions that lead to large $x_A$ and those that lead to an appreciable cross section. We anticipate that the best scenario is one with a forward jet from the nucleus (highly negative $\eta_4$) and a central jet from the proton (small $\eta_3$). A plot of the dependence of $x_A$ on $\eta_4$ is given in Fig. 2.
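The mapping from jet observables to momentum fractions can be sketched numerically. The snippet below is an illustration only, under the assumptions stated in the text: massless, collinear partons at leading order, a proton beam with $p_p^+ = 2E_0$, and a nuclear beam with $p_A^- = 2ZE_0$, so that $x_p = p_T(e^{\eta_3}+e^{\eta_4})/(2E_0)$ and $x_A = A\,p_T(e^{-\eta_3}+e^{-\eta_4})/(2ZE_0)$. The function name and the sample jet kinematics are hypothetical and are not taken from the analysis.

```python
import math


def momentum_fractions(p_t, eta3, eta4, e0, Z, A):
    """Return (x_p, x_A) for an LO dijet event in a p+A collision.

    p_t  : jet transverse momentum (same units as e0)
    eta3 : rapidity of jet 3
    eta4 : rapidity of jet 4
    e0   : beam energy per proton
    The proton moves along +z and the nucleus along -z.
    """
    sqrt_s_avg = 2.0 * e0 * math.sqrt(Z / A)  # average NN c.m. energy
    x_p = math.sqrt(Z / A) * p_t / sqrt_s_avg * (math.exp(eta3) + math.exp(eta4))
    x_a = math.sqrt(A / Z) * p_t / sqrt_s_avg * (math.exp(-eta3) + math.exp(-eta4))
    return x_p, x_a


# Illustrative kinematics (assumed): a central jet and a jet deep in the
# nucleus-going direction, for p+Pb with E0 = 4000 GeV.
x_p, x_a = momentum_fractions(p_t=100.0, eta3=0.0, eta4=-4.0, e0=4000.0, Z=82, A=208)
print(f"x_p = {x_p:.4f}, x_A = {x_a:.3f}")
```

A useful consistency check is that $x_p\,x_A\,s^{\rm avg}_{NN}$ reproduces the partonic invariant mass squared, $p_T^2\,[2 + 2\cosh(\eta_3 - \eta_4)]$.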
In addition to the momentum fractions, we wish to express the characteristic "hardness" scale Q 2 , to be used as both the factorization scale and renormalization scale, in terms of jet observables. It is natural to use Q 2 = p 2 T , since the Mandelstam variable −t for the parton-level process is

B. Dijet cross section
The factorization formula for the hadronic cross section can be presented in terms of the partonic cross section as follows: where f i/p (x p , Q 2 ) and f j/A (x A , Q 2 ) are parton distribution functions (PDFs) for the proton and nucleus, respectively. This is similar in form to the standard factorization formula for pp collisions [23,24], and in fact reduces to it in the free nucleon limit. A derivation of Eq. (13) can be found in Appendix A.
Since, at leading order, the momenta of the outgoing partons coincide with the jet momenta, we can integrate out the transverse momentum of one of the jets and express differentials through the rapidities of the outgoing jets. For this purpose, we present the differential form of Eq. (13) as where we have expressed four-momentum conservation through the light cone momenta.
Here, the indices i, j, k, and l indicate parton types, and M ij→kl is the invariant Feynman amplitude for the hard partonic scattering process.
Using the definitions of the light cone momentum fractions in Eqs. (5,6), we can express the delta functions for $p^\pm$ conservation as: Inserting this relation into Eq. (14), one can integrate out $p_{4T}$, $x_p$ and $x_A$, resulting in The $p_z$ elements can be rewritten as rapidities, since $d\eta = \frac{dp_z}{E}$. In addition, as 2( Since the square of the average center-of-mass energy per nucleon is the differential cross section can be written in the standard form:

III. NUCLEAR PDFS
The main theoretical issue to be worked out is the nuclear PDF $f_{j/A}(x_A, Q^2)$. Currently, there is a lack of strong experimental constraint on nuclear PDFs. Phenomenological parametrizations exist for several nuclei, and are based on experimental measurements of the nuclear structure function $F_2^{(A)}(x, Q^2)$ in deep inelastic scattering (DIS) experiments in a restricted range of $x$ and $Q^2$ (cf. e.g. Refs. [25][26][27][28][29][30]). Most treatments of the nuclear PDF parametrize the ratio between the nucleus and the nucleon, which only makes sense for treating $x_A$ up to 1. Ref. [26] instead relates the nuclear and nucleonic PDFs through a Mellin convolution, and parametrizes the function that is convoluted with the nucleonic PDF. This approach has the possibility of accounting for $x_A > 1$, but owing to a lack of experimental constraint, Ref. [26] did not treat this region.
In order to describe x A > 1 in the absence of robust empirical parametrizations of the nuclear PDF, we must account for the nuclear PDF in terms of the nucleonic PDF theoretically. To this end, we must look at the theoretical relationship between the nuclear and nucleonic parton distributions.
Theoretical and experimental studies of DIS reactions have demonstrated that, in a limit where only nucleonic degrees of freedom are accounted for in the nucleus, the nuclear structure function can be expressed as a convolution between the bound nucleons' structure functions and the nuclear light cone density matrix [1,31,32]. In the Bjorken limit (where the photon virtuality $Q^2$ and energy $q_0$ both go to infinity at a fixed $x = \frac{AQ^2}{2M_A q_0}$), the convolution formula takes an especially simple form: Here, $\rho_{N/A}(\alpha)$ is the light cone density matrix of a nucleon $N$ in the nucleus $A$, described in Sec. III A. The "eff" in the superscript of $F_2^{(N,\rm eff)}\left(\frac{x}{\alpha}, Q^2\right)$ signifies that this is an effective nucleonic structure function, which differs from the free nucleon structure function owing to modifications from the nuclear medium. When the factorization scale $\mu_F$ is equal to the characteristic hardness scale $Q$, we have Using this relation and Eq. (18) allows us to obtain a convolution relation between nuclear and bound nucleon PDFs in the form: This relation indicates that to construct nuclear PDFs one needs to address two theoretical issues: the light cone density matrix $\rho_{N/A}(\alpha)$, and possible medium modification effects on the bound nucleon PDFs.

A. Light cone density matrix and SRCs
We formally define the light cone density matrix in terms of the nuclear wave function as: where the (scaled) light cone momentum fraction $\alpha$ is given by and $p_T$ is the transverse momentum of the nucleon. The light cone density matrix is normalized to obey two sum rules, namely baryon number conservation and momentum conservation: For our calculations, we decompose the light cone density matrix as a sum of contributions from the nuclear mean field and short range correlations (SRCs). The decomposition takes the form: where $\rho^{(N)}_{MF}(\alpha, p_T)$ is the mean field part, and $\rho^{(N)}_j(\alpha, p_T)$ is the density of $j$-nucleon SRCs.

Mean field density matrix
The mean field part of the light cone density matrix describes how the nucleons in the nucleus would be distributed if they were acted upon only by the mean field generated by the bulk of the nucleus. The mean field density can be related to the wave function of the nucleus, which is calculated in the non-relativistic limit since the relevant momenta are smaller than the typical Fermi momentum of heavy nuclei. The crudest mean field model considers the heavy nucleus as a degenerate Fermi gas, and the typical Fermi momentum is around 250 MeV [33]. More sophisticated momentum distributions (calculated based on, for example, Hartree-Fock approximations) still fall off sharply after the Fermi momentum, so the leading order relativistic corrections are at most on the order of the mean field momentum distribution strength above $k_F$, which is at most a few percent in magnitude.
The mean field density is related to the non-relativistic momentum-space wave function using the sum rule of Eq. (23). Short range correlations are briefly neglected, so that the mean field density matrix obeys Eq. (23) by itself. The momentum-space wave function $\Psi^{(N)}_{MF}(p)$ (which carries an index of $(N)$ since the wave function will in general be different for protons and neutrons) is likewise normalized to unity, so we equate In the non-relativistic limit, we can write, in the nuclear center of mass frame, Consequently, we can identify One modification must be made when accounting for SRCs. It is not the mean field part of the light cone density matrix, but the light cone density matrix taken as a whole that is normalized to satisfy the sum rules of Eqs. (23,24). Therefore, $\rho^{(N)}_{MF}$ must be rescaled by a normalization factor $a_1^{(N)}(A)$, which subtracts off the probability that the nucleon is in a short range correlation. For instance, if a nucleon is in a short range correlation 25% of the time, then $a_1^{(N)}(A)$ will be 0.75. In general, however, $a_1^{(N)}$ will be different for protons and neutrons. For numerical estimates, we will use the momentum distributions calculated in Ref. [34] for the mean field.

Two-nucleon correlations
Around 25% of the time, a nucleon in a heavy nucleus is in a two-nucleon short range correlation [4]. A short range correlation occurs when two nucleons are separated by a short distance on the order of 1 fm and consequently have a relative momentum larger than k F . The analysis of recent experiments indicates [12] that in the momentum range from k F to ∼ 650 MeV, the nucleus is well-described by two-nucleon SRCs with negligible center of mass motion. Due to the short distance between the nucleons, the dynamics of the nucleons are primarily influenced by their mutual interaction rather than by the mean field. The most immediate consequence of two-nucleon SRCs is a large tail in the momentum distribution above the Fermi momentum, which is not reproduced by mean field models [14].
Within the last decade, there has been considerable experimental and theoretical effort put into studying two-nucleon SRCs and their properties. There are a fair number of triple-coincidence experiments demonstrating their existence [10,11,16] and demonstrating that two-nucleon SRCs primarily form between a proton and a neutron [9][10][11]. The fact that most two-nucleon SRCs are pn pairs has an important implication for momentum distributions (and thus for the 2N light cone density): for neutron rich nuclei, a given proton is more likely to be in a short range correlation than a given neutron [35][36][37]. In fact, it means that the two-nucleon density $\rho_2(\alpha, p_T)$ differs between protons and neutrons, and consequently (for the sum rules Eqs. (23,24) to be satisfied) so does the mean field normalization factor $a_1$. Another important aspect of two-nucleon SRCs is the universality of their behavior. The high-momentum tail that they introduce to the momentum distribution is universal in its form among nuclei [36,38]. Most significantly, because of the dominance of spin-one, isosinglet SRCs, the high-momentum tail behaves like a scaled version of the tail in the deuteron momentum distribution.
Based on the arguments discussed above, a model of the two-nucleon SRC contribution to the nuclear light cone density matrix should incorporate two main properties: first, the universality of the form of the high momentum tail, and second, that the SRC density matrix for the proton and the neutron should be inversely proportional to its respective relative fraction. For the first property we use the light cone approximation of Refs. [1,39], which uses the requirement of rotational invariance to relate the light cone deuteron wave function to the non-relativistic wave function through the effective pn relative momentum: For the second property, we use the model of Ref. [40], in which the high momentum distribution of nuclei is inversely proportional to the relative fraction of the nucleon type. The model additionally includes a scaling factor $a_2(A)$, which is extracted from SRC studies in inclusive $eA$ processes. Combining these two properties produces the following for the two-nucleon SRC density matrix: where the extra factor of $\alpha(2 - \alpha)$ is from the phase space of the struck and spectator nucleons. The details of the derivation of Eq. (29) can be found in Appendix B 1. Here, $\psi_d(k)$ is the relativistic, light-cone deuteron wave function, related to the non-relativistic wave function by and $\chi_N$ is the relative abundance of nucleons of type $N$ in the nucleus, i.e. $\chi_p = \frac{Z}{A}$ and $\chi_n = \frac{A - Z}{A}$. This relation is confirmed by variational Monte-Carlo calculations for light nuclei up to $A = 12$, and for medium to heavy nuclei it was confirmed using a correlated basis calculation of the nuclear momentum distributions in a non-relativistic approach [41]. In the present model we neglect the center of mass motion of the SRC. As a result, the two-nucleon SRC contribution to the nuclear light cone density matrix sets in as soon as $k > k_F$, which is signified by the step function $\Theta(k - k_F)$ in Eq. (29).
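The two kinematic ingredients of the two-nucleon SRC model can be sketched as follows. This is an illustration under assumptions: for the effective pn relative momentum we take the commonly used light cone relation $k^2 = \frac{m_N^2 + p_T^2}{\alpha(2-\alpha)} - m_N^2$, which may differ in detail from the expression used in this work; the nucleon mass and Fermi momentum values are approximate, and all function names are hypothetical.

```python
import math

M_NUCLEON = 0.939  # GeV, approximate average nucleon mass (assumed value)
K_FERMI = 0.250    # GeV, typical Fermi momentum of heavy nuclei quoted in the text


def pn_relative_momentum(alpha, p_t, m=M_NUCLEON):
    """Effective relative momentum k of a nucleon in a two-nucleon SRC.

    Assumed form: k^2 = (m^2 + p_T^2) / (alpha (2 - alpha)) - m^2,
    valid for 0 < alpha < 2 when center of mass motion is neglected.
    """
    if not 0.0 < alpha < 2.0:
        raise ValueError("alpha must lie strictly between 0 and 2")
    k_sq = (m * m + p_t * p_t) / (alpha * (2.0 - alpha)) - m * m
    return math.sqrt(max(k_sq, 0.0))  # guard against rounding below zero


def in_src_region(alpha, p_t, k_fermi=K_FERMI):
    """Step function Theta(k - k_F): True where the 2N SRC term is switched on."""
    return pn_relative_momentum(alpha, p_t) > k_fermi


def relative_fractions(Z, A):
    """Relative abundances (chi_p, chi_n) of protons and neutrons."""
    return Z / A, (A - Z) / A


chi_p, chi_n = relative_fractions(82, 208)
# In the model the SRC density is inversely proportional to chi_N, so the
# per-proton to per-neutron ratio is chi_n / chi_p.
print(f"chi_p = {chi_p:.3f}, chi_n = {chi_n:.3f}, ratio = {chi_n / chi_p:.3f}")
print(f"k(alpha=1.4, p_T=0) = {pn_relative_momentum(1.4, 0.0):.3f} GeV")
```

With these assumptions, $\alpha = 1$ and $p_T = 0$ give $k = 0$ (a nucleon at rest in the pair), and $k$ is symmetric under $\alpha \to 2 - \alpha$, as expected for a two-body system.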
The remaining factors in Eq. (29) need to be accounted for. $a_2(A)$ is a scaling factor that describes how big the high-momentum tail of the nucleus is relative to the high-momentum tail of the deuteron. It is determined experimentally by examining ratios of quasielastic inclusive cross sections of $A(e, e'X)$ and $^2\mathrm{H}(e, e'X)$ reactions: where a roughly flat plateau in the ratio of the quasielastic cross sections is known to occur [7,8,13,15]. The values of $a_2(A)$ for various nuclei are well-constrained by experiment [8], and it is equal to roughly 5.6 for heavy nuclei such as iron-56. In this work, we will also use $a_2(^{208}\mathrm{Pb}) = 5.6$, as no experimental data for this quantity exist for lead.
For numerical estimates, we use the Paris potential for parameterizing the non-relativistic deuteron wave function [42] in Eq. (30).

Three-nucleon correlations
When the light cone momentum fraction $\alpha > 2$, one expects the nuclear light cone density matrix to be dominated by three-nucleon short range correlations. As with two-nucleon SRCs, we consider a three-nucleon SRC to occur when the center of mass momentum is small, and when the struck nucleon has large ($> k_F$) momentum relative to the other two nucleons. Such a configuration can naturally generate a light cone momentum fraction exceeding 2.
There are two distinct mechanisms that could generate three-nucleon correlations. The first is through a sequence of two-nucleon interactions, and the second is through an irreducible three-nucleon interaction. The second contributes primarily to the large removal energy part of the nuclear spectral function, and since the light cone momentum distribution is an integral over removal energies, one expects it to be dominated by the first mechanism. Therefore, we develop a model of three-nucleon SRCs where the struck nucleon obtains its momentum from a sequence of two-nucleon interactions. This model is based on the collinear approximation of Ref. [3], where the three nucleons in the SRC are moving collinearly prior to the SRC generation mechanism.
In the present work, we develop the model further using the recently observed dominance of pn pairwise correlations. This allows us to express the three-nucleon SRC as occurring through a sequence of pn interactions, and thus the 3N SRC part of the nuclear light cone density matrix as a convolution of two 2N SRC light cone momentum distributions. A full derivation of the three-nucleon SRC density matrix can be found in Appendix B 2, with the final result: The condition for three-nucleon SRCs to appear is that the relative light cone momenta of the pairs should each satisfy the threshold condition for which short range two-nucleon interactions occur, namely they should both be above the Fermi momentum $k_F$. It is worth noting that $k_{12}$ and $k_{23}$ are both functions of the light cone momentum fraction $\alpha_3$ of a spectator nucleon, which is integrated over (cf. Appendix B 2 for details). Because of this, despite the jump discontinuity in the step function, the discontinuity is smeared out and $\rho_3(\alpha, p_T)$ is itself a smooth function. The form of $\rho_3(\alpha, p_T)$ is given in Fig. 3.
The factor {a 2 (A)} 2 appearing in Eq. (31) is a consequence of the fact that three-nucleon SRCs arise from a sequence of short-range pn interactions. The probability of having a three-nucleon correlation should be proportional to the square of the probability of having a two-nucleon correlation. We have taken them to be equal, i.e. for the proportionality factor to be 1. In principle, the proportionality factor may be less than 1, owing to restrictions such as selection rules. However, in the present work, we will assume the maximal reasonable coefficient, which is {a 2 (A)} 2 .
Eq. (31) may also be written in terms of a scaling factor $a_3(A)$, similar to $a_2(A)$, the scaling factor for two-nucleon SRCs. If it is assumed that three-nucleon SRCs are universal in their behavior, it would be possible to extract the parameter $a_3(A)$ in a similar way to $a_2(A)$ in the case of 2N SRCs. In particular, as is the case for $a_2(A)$, it should be possible to extract $a_3(A)$ through the ratio of quasielastic cross sections for $eA$ and $e\,^3\mathrm{He}$ processes at high $x > 2$, viz.
provided that another plateau is actually observed in this ratio. In principle, the observation of  work. Consequently, we have The experimental status of a three-nucleon scaling plateau is, however, ambiguous. While observation of an x > 2 plateau was reported in Ref. [8], a later experiment at similar kinematics and better resolution did not observe the plateau [13]. The discrepancy may be due to a resolution issue in the first experiment [43]. It is not expected that a plateau should be seen at the kinematics probed in these experiments, since the momentum transfer was too low to probe three-nucleon SRCs and thus to observe the scaling plateau in the ratio of Eq. (32) [44].
In lieu of experimental data for $a_3(A)$, we calculate $\rho_3(\alpha, p_T)$ in this work using Eq. (31), which requires knowing only the experimentally well-known quantity $a_2(A)$. Fig. 3 contains a parametrization of just the 3N part of the nuclear light cone density calculated using Eq. (31) with $a_2(A) = 5.6$. In Fig. 4, we present the nuclear light cone density matrix for lead. In Fig. 4a, we compare the light cone density using Eq. (25), both considering only the mean field, and considering two-nucleon and three-nucleon correlations in addition. Despite most of the three-nucleon light cone density being concentrated at $\alpha < 2$, it makes a negligible contribution on top of the mean field and 2N density until around $\alpha \gtrsim 1.7$. In Fig. 4b, we have compared our model for the nuclear light cone density to a model explicated in Chapter 5 of Ref. [3] (in particular, in its Eq. (5.11)), which we call the FS81 model. This model was constructed to fit $p + A \to p + X$ data with 400 GeV protons [45]. The model implicitly includes four-nucleon, five-nucleon, etc. contributions to the nuclear light cone density matrix, and accordingly is expected to exceed our model, as can be seen in Fig. 4b.

B. Medium modifications
If a nucleon within the nucleus were identical to a nucleon in free space, then with both the light cone density matrix described in Sec. III A and an empirical parametrization of the proton PDF, we could now use Eq. (20) to obtain the nuclear PDF and proceed to calculate the dijet cross section. However, the nuclear medium is strongly interacting, and each nucleon has a high probability (20-30%) of being in a short-range correlation where the nucleons themselves overlap. As a result, it should be expected that the parton distributions for nucleons immersed in the nuclear medium should be modified in some way.
Modification of nucleons within the nuclear medium was first observed by the European Muon Collaboration [46] in measurements of the ratio of cross sections for deeply inelastic muon scattering for nuclear and deuteron targets, i.e. in The small angle of the scattering that was measured allowed the cross sections to be related to the structure functions. It was originally expected that, except for smearing from Fermi motion at high $x$, the structure functions should just be the structure functions for free nucleons, and thus the ratio should be 1. However, there is a dip below 1 in the range $0.3 < x < 0.7$ that cannot be reproduced by nucleonic motion alone [47,48]. The effect is commonly referred to as the EMC effect. This dip in the ratio $R_{\rm EMC}$ is ubiquitous throughout nuclei, and grows roughly logarithmically with the nuclear mass number [49]. Additionally, it is proportional to the local density of the nucleus considered [50], suggesting that it is an effect of the nuclear medium. Since the structure function $F_2(x, Q^2)$ is directly related to the parton distribution functions, the EMC effect is concluded to be due to a modification of the PDF of a bound nucleon in the nuclear medium. Even though there is a consensus about this, there is a plethora of theoretical models based on different aspects of nuclear medium effects (for reviews on the EMC effect see [51,52]). Despite differences in the approaches taken by these models, one thing they have in common is that in order to satisfy locality, the size of the effect should be proportional to the virtuality of the bound nucleon. This indicates that more modification should occur in the high momentum part of the nuclear wave function.
Since our goal is to study the effects of SRCs on the reaction (1), and since SRCs dominate the high momentum part of the nuclear wave function, it will be necessary to account for the medium modifications that produce the EMC effect when studying this reaction. In particular, the medium modification effects will be prominent in the nuclear parton distribution f i/A (x, Q 2 ) present in Eq. (17). Numerical estimates of the EMC effect will be based on the color screening model [1,53], which satisfactorily describes the phenomenology of EMC effects observed in inclusive DIS reactions.
Before we proceed to describe medium modifications, we present estimates of the ratio given in Eq. (34) in the absence of medium modifications. The quantity $F_2(x, Q^2)$ is calculated using the convolution formula Eq. (18), and the formalism of Sec. III A is applied for computing the light cone density matrix. These estimates are presented in Fig. 5, where a significant effect from SRCs can be seen for $x > 0.6$. The discrepancy between theory and experiment becomes worse in this region with SRCs present, which requires that nucleons in SRCs be more strongly modified by the nuclear medium than nucleons in the mean field. This is consistent with our expectation that modification should increase with nucleon virtuality, since SRCs occur predominantly at especially large momenta.
FIG. 5: ... [49]; circles (red) are from [54].

Correctly defining x
Up until now, data from DIS experiments have been plotted against the Bjorken scaling variable $\tilde{x} = \frac{Q^2}{2 m_p \nu}$, where $m_p$ is the mass of the proton. However, the Bjorken scaling variable that enters into the dynamics of QCD, including the convolution formulas of Eqs. (18,20), is instead $x = \frac{A Q^2}{2 M_A \nu}$. Thus, it is as a function of $x$ rather than $\tilde{x}$ that the ratio of Eq. (34) should be presented in order to see dynamical QCD effects. By contrast, presenting data as a function of $\tilde{x}$ instead of $x$ artificially shifts the arguments of the nuclear and deuteron structure functions in Eq. (34) by different amounts, since $x = \frac{A m_p}{M_A}\tilde{x}$ and the factor $A m_p / M_A$ depends on the binding energy of the target. Since the deuteron is a loosely bound system with little binding energy, the arguments for the nuclear and deuteron structure functions are shifted by substantially different amounts when the nucleus has a high binding energy.
It has been observed [30,58] that presenting structure function ratios as a function of $\tilde{x}$ rather than $x$ introduces an artificial dip in the ratio for $x \gtrsim 0.5$. In Fig. 6, we present theoretical calculations for $R_{\rm EMC}(x)$ (as a function of the dynamical QCD scaling variable $x$) and $R_{\rm EMC}(\tilde{x})$ (as a function of the kinematical Bjorken scaling variable $\tilde{x}$) for both $^{56}$Fe and $^{208}$Pb. In these plots, the use of $\tilde{x}$ as a variable produces an artificial dip, which partially explains the EMC effect up to $x \sim 0.5$. However, there continues to be a discrepancy between theory and data, which requires medium modification to explain.
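The size of the shift between the two scaling variables can be estimated numerically. The sketch below assumes the relation $x = \tilde{x}\,A m_p / M_A$ that follows from the two definitions, and builds $M_A$ from nucleon masses and an approximate binding energy per nucleon. The function names and the binding-energy values are illustrative assumptions, not taken from the text.

```python
# Nucleon masses in MeV (rounded standard values).
M_P = 938.272
M_N = 939.565


def nuclear_mass(A, Z, binding_per_nucleon):
    """Nuclear mass in MeV: nucleon masses minus the total binding energy."""
    return Z * M_P + (A - Z) * M_N - A * binding_per_nucleon


def dynamical_x(x_tilde, A, Z, binding_per_nucleon):
    """Convert x_tilde = Q^2/(2 m_p nu) into x = A Q^2/(2 M_A nu)."""
    return x_tilde * A * M_P / nuclear_mass(A, Z, binding_per_nucleon)


# (A, Z, binding energy per nucleon in MeV); approximate, assumed values.
NUCLEI = {
    "d": (2, 1, 1.112),
    "Fe56": (56, 26, 8.79),
    "Pb208": (208, 82, 7.87),
}

if __name__ == "__main__":
    for name, (A, Z, eps) in NUCLEI.items():
        ratio = dynamical_x(1.0, A, Z, eps)
        print(f"{name}: x / x_tilde = {ratio:.5f}")
```

With these inputs the rescaling is below a tenth of a percent for the deuteron and close to one percent for the heavy nuclei. The shift is small, but the structure functions fall steeply at large $x$, so it is not negligible there.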

Color screening model of the EMC effect
Any nucleon can be expected to spend some of its time in a point-like configuration (PLC), in which its constituent quarks are compressed into a tiny volume compared to the average nucleon radius. A PLC is largely invisible to the color force, in analogy to neutrally charged atoms in a gas whose van der Waals forces become weaker if the atoms are compressed. The nucleon, after all, has an overall neutral color charge, so any color interaction between nucleons owes to higher moments (dipole, quadrupole, etc.), which decrease with distance between the color-charged constituents. Moreover, it can be shown by the renormalizability of QCD that meson exchange between nucleons, one of which is in a PLC, is suppressed [53].
Since nucleons in an average-sized configuration (ASC) and a PLC will interact differently, the probability that the nucleon can be found in either configuration should be modified by the immersion of a nucleon in the nuclear medium. In particular, PLCs are expected to be suppressed compared to ASCs since the bound nucleon will assume a configuration that maximizes the binding energy and brings the nucleus to a lower-energy ground state. The change in probability can be estimated using non-relativistic perturbation theory, as has been done in Refs. [1,53]. What is found is that the light cone density matrix should be modified by a factor $\delta_A(k^2, x)$, which depends on the nucleon momentum (or virtuality). In analogy with electric charge screening, this is called the color screening model of the EMC effect. We shall use it as an example of accounting for medium modifications when calculating dijet cross sections. For small $x \lesssim 0.45$, there should be little or no color screening, and the region $0.45 < x < 0.6$ is smoothly interpolated with a linear damping factor. Since the suppression factor depends on the total nucleon momentum rather than just the light cone momentum fraction $\alpha$, it is necessary to use the three-dimensional light cone density $\rho(\alpha, p_T)$ when applying the color screening model. Moreover, since the suppression of PLCs depends on inter-nucleon dynamics, it is expected that the parameters of $\delta_A(k^2, x)$ should vary not just with the nucleus considered, but also with whether the nucleons are moving in the mean field or are in an SRC. For a nucleon in the mean field of a heavy nucleus, we expect the excitation energy $\Delta E_A$ to be in the range 300 to 500 MeV, namely between the excitation energies of a $\Delta$ and an $N^*$ resonance. The best fit to data appears to be with the $N^*$ excitation energy $\Delta E_A \approx 500$ MeV.
However, for the deuteron, as well as for a nucleon in a 2N SRC, we expect the lowest excited state to be a $\Delta\Delta$ configuration, giving $\Delta E_d \approx 600$ MeV. We take $\epsilon_A$ to be the binding energy per nucleon of the nucleus under consideration when in the mean field, and estimate $\epsilon_A = \epsilon_d \approx 2.2$ MeV for a nucleon in a two-nucleon SRC. We expect the characteristic excitation energy for a nucleon in a 3N correlation to be around 600 MeV, since the 3N correlation is effected through a sequence of 2N correlations, and likewise we take $\epsilon_A = 2\epsilon_d \approx 4.4$ MeV for a nucleon in a three-nucleon SRC. As can be seen in Fig. 7, this results in a fairly good account of the EMC effect.
The color screening model is now applied to PDFs rather than structure functions. We assume that the suppression factor $\delta_A(k^2, x)$ that appears in the convolution formula for $F_2^{(A)}$ similarly appears in the formula for nuclear PDFs, Eq. (38). The form of $k^2$, as well as the values of the parameters in $\delta_A(k^2, x)$, depend on whether the nucleon is moving in the mean field or in an SRC, as before. There is one ambiguity in applying the color screening model to PDFs, however: the modification is established for $F_2^{(A)}$, which at leading order involves only quark and anti-quark distributions, so it does not directly fix how gluons are modified. At large $x$, which is the region where valence quarks predominate, gluons should originate as radiation from quarks, and should therefore inherit medium modifications from the quarks. For this reason, we apply Eq. (38) to gluons as well as quarks and anti-quarks. The result of applying color screening to the nuclear PDF is demonstrated in Fig. 8.
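To make the role of the parameters $\Delta E_A$ and $\epsilon_A$ concrete, the following is a minimal sketch of a color screening suppression factor. It assumes the form commonly quoted in the color screening literature, $\delta = 1/(1+z)^2$ with $z = (k^2/m + 2\epsilon_A)/\Delta E_A$, and a linear interpolation in $x$ between 0.45 and 0.6. Both the functional form and the function names are assumptions for illustration and may differ in detail from the expressions used in this work.

```python
M_NUCLEON = 0.938272  # nucleon mass in GeV


def delta_full(k2, delta_E, eps):
    """Assumed suppression factor 1/(1+z)^2 with z = (k^2/m + 2*eps)/delta_E.

    k2 is in GeV^2; delta_E and eps are in GeV.
    """
    z = (k2 / M_NUCLEON + 2.0 * eps) / delta_E
    return 1.0 / (1.0 + z) ** 2


def delta_A(k2, x, delta_E, eps, x_lo=0.45, x_hi=0.6):
    """Suppression factor with linear damping between x_lo and x_hi.

    Below x_lo there is no suppression; above x_hi the full factor applies.
    """
    full = delta_full(k2, delta_E, eps)
    if x <= x_lo:
        return 1.0
    if x >= x_hi:
        return full
    weight = (x - x_lo) / (x_hi - x_lo)
    return 1.0 + weight * (full - 1.0)


if __name__ == "__main__":
    # Mean-field nucleon: delta_E = 0.5 GeV; eps = 0.008 GeV is an assumed
    # typical binding energy per nucleon for a heavy nucleus.
    for k in (0.1, 0.3, 0.5):
        print(k, delta_A(k * k, 0.7, 0.5, 0.008))
```

In this form the suppression grows with $k^2$, so nucleons in SRCs, which carry large momenta, are modified more strongly than nucleons in the mean field.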

Evolution of medium modifications
The EMC effect has so far been observed in a restricted range of $Q^2$, around $Q^2 \sim 10$ GeV$^2$. One would expect that medium modification effects will be present at larger $Q^2$ values as well, such as at the $Q^2$ values characteristic of dijet production at the LHC, which can reach tens of thousands of GeV$^2$. Therefore, it is necessary to predict how the EMC effect will manifest at $Q^2$ well above the domain where the effect has been observed. Assuming that no higher twist effects are involved in the EMC effect, the modified nuclear PDFs at high and low $Q^2$ should be related by QCD evolution equations. For this reason, we evolve the nuclear PDFs obtained through medium modification at $Q^2 = 10$ GeV$^2$ using DGLAP evolution [60][61][62]. The standard DGLAP formula is modified to account for the possibility of $x > 1$ inherent in dealing with nuclei; in particular, we use Eq. (39). Here, $P_{ij}$ are the Altarelli-Parisi splitting functions, found at leading order in Ref. [62]. In our calculation we solved Eq. (39) numerically, using the nuclear PDFs obtained in the previous sections as input.
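For orientation, a leading-order DGLAP equation with the convolution integral extended to momentum fractions above unity can be sketched as follows. This is the standard form with the upper integration limit assumed to be raised from 1 to $A$. The normalization $\alpha_s/2\pi$ and the use of $\ln Q^2$ follow common convention and are assumptions that may differ from the convention adopted in Eq. (39).

```latex
\frac{\partial f_{i/A}(x,Q^2)}{\partial \ln Q^2}
  = \frac{\alpha_s(Q^2)}{2\pi}
    \sum_j \int_x^{A} \frac{dy}{y}\,
    P_{ij}\!\left(\frac{x}{y}\right) f_{j/A}(y,Q^2)
```

Because the splitting functions only move partons to lower momentum fractions, the region $x > 1$ is depleted by evolution and is never fed from below.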
The results of this evolution are shown in Fig. 9, which compares the effects of evolution on nuclear PDFs with and without medium modification accounted for. In both cases, the ratio of the $u + d$ quark distribution for the nucleus to the proton is used, since the effects of evolution are most easily seen in ratios, and since adding the $u$ and $d$ distributions removes peculiarities of isospin asymmetry.
FIG. 9: ... Unevolved lines are at $Q^2 = 10$ GeV$^2$, and evolved at $Q^2 = 10000$ GeV$^2$. The $Q^2 = 10$ GeV$^2$ PDFs are computed using the CT10 parametrization [59], as well as the convolution formula Eq. (20).

IV. HARD SUBPROCESSES
We have developed a model of nuclear PDFs that accounts for both short range correlations and effects from medium modification. In addition to a model for PDFs, however, we also need to calculate the squares of the invariant matrix elements $|M_{ij}|^2$ for the hard partonic subprocesses that contribute to Eq. (17). We consider two partons, labeled $i$ and $j$, to exist in the initial state of the reaction, one from each of the proton and the heavy nucleus participating in the reaction (1). At leading order, a dijet is associated with two partons in the final state, which we label $k$ and $l$. Each parton can in principle be a quark, an anti-quark, or a gluon, although which partonic subprocesses are possible is limited by quark flavor conservation. A list of the possible two-parton to two-parton subprocesses and their squared matrix elements at leading order is given in Table I [63]. At next-to-leading order, a dijet can be associated either with two or with three partons in the final state, provided two of the three partons have small differences in both their rapidities and their azimuthal angles. This is possible because jets are identified as sprays of hadrons that fall within a cone of radius $R$ in $\eta$-$\phi$ space, i.e. for which $\Delta\eta^2 + \Delta\phi^2 < R^2$. If $\Delta\eta$ and $\Delta\phi$ between two partons are both small, the sprays of hadrons produced by each will overlap and will be detected as a single jet [64]. In addition to the concern of overlapping sprays of hadrons, it is theoretically necessary at next-to-leading order to account for final states with three partons, as the next-to-leading order matrix elements for two-parton to two-parton processes contain infrared divergences that are only canceled when two-parton to three-parton processes are also accounted for [65].
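The cone condition $\Delta\eta^2 + \Delta\phi^2 < R^2$ can be illustrated with a short sketch. The function names and the cone radius used in the example are hypothetical; the only ingredient beyond the condition itself is that the azimuthal difference must be wrapped into $[0, \pi]$ before it is squared.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal separation wrapped into the interval [0, pi]."""
    dphi = abs(phi1 - phi2) % (2.0 * math.pi)
    return 2.0 * math.pi - dphi if dphi > math.pi else dphi


def within_cone(eta1, phi1, eta2, phi2, R):
    """True if two partons satisfy d_eta^2 + d_phi^2 < R^2."""
    d_eta = eta1 - eta2
    d_phi = delta_phi(phi1, phi2)
    return d_eta ** 2 + d_phi ** 2 < R ** 2


if __name__ == "__main__":
    R = 0.4  # illustrative cone radius, not taken from the text
    print(within_cone(0.0, 0.0, 0.1, 0.1, R))   # nearby partons
    print(within_cone(0.0, 3.1, 0.0, -3.1, R))  # nearby across phi = +-pi
    print(within_cone(0.0, 0.0, 1.0, 0.0, R))   # well separated in eta
```

Two partons that pass this test would be reconstructed as a single, generally massive, jet, which is the situation the leading-order treatment avoids.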
Since, at next-to-leading order, the relevant dijet processes include final states where two partons correspond to a single jet, jets will not in general be massless at this order. However, the kinematical formula Eq. (6) for determining x A in terms of jet kinematics requires that the two jets are associated with two on-shell, massless partons in the final state. This means we would like to consider two-parton to two-parton subprocesses only, which requires treating the problem at leading order.
The question is whether the leading order in pQCD is sufficient to calculate inclusive dijet production in the kinematical regime we are investigating. To test this, we perform a leading order calculation of the reaction (1) for the case of $A = 1$, i.e. for dijet production from proton-proton collisions, for which a fair amount of experimental data in the kinematical regime of interest exist. We compare the leading order calculation to experimental data from Ref. [66]. The data, however, are given in terms of a two-fold differential cross section $d^2\sigma/dM_{JJ}\,d|y^*|$. We will briefly describe the variables used in Ref. [66].
At next-to-leading order, a jet may be massive, which requires distinguishing between rapidity ($y$) and pseudo-rapidity ($\eta$). At leading order, with massless jets, $\eta = y$, so for our calculations we will henceforth denote rapidity as $\eta$. With jet rapidities $\eta_3$ and $\eta_4$, $\bar\eta$ is the rapidity of the dijet as a whole, and $\eta^*$ is the rapidity of an individual jet in the dijet center-of-mass frame, where
$$\bar\eta = \frac{\eta_3 + \eta_4}{2}, \qquad \eta^* = \frac{\eta_3 - \eta_4}{2}.$$
Note that the Jacobian for the transformation from $(\eta_3, \eta_4)$ to $(\bar\eta, \eta^*)$ gives us $d\eta_3\,d\eta_4 = 2\,d\bar\eta\,d\eta^*$. In addition to the rapidity in the center-of-mass frame $\eta^*$, Ref. [66] uses the dijet mass $M_{JJ}$, which is given by
$$M_{JJ}^2 = 2p_T^2\left(1 + \cosh(\eta_3 - \eta_4)\right),$$
or $M_{JJ} = 2p_T\cosh(\eta^*)$. We note that the overall Jacobian is given by
$$dp_T^2\,d\eta_3\,d\eta_4 = \frac{2p_T}{\cosh(\eta^*)}\,dM_{JJ}\,d\bar\eta\,d\eta^*,$$
and thus the two-fold differential cross section given in Ref. [66] can be written as
$$\frac{d^2\sigma}{dM_{JJ}\,d\eta^*} = \int d\bar\eta\,\frac{2p_T}{\cosh(\eta^*)}\,\frac{d^3\sigma}{d\eta_3\,d\eta_4\,dp_T^2}. \qquad (43)$$
The three-fold differential cross section is evaluated using Eq. (17), but with $f_A = f_p$ and $Z = A = 1$ to compare with proton-proton collision data. The data of Ref. [66] are given at $\sqrt{s_{NN}} = 7$ TeV, and are given in bins of $|\eta^*|$. Since bins are of $|\eta^*|$ rather than $\eta^*$, it is necessary to add the cross sections at $\eta^* = |\eta^*|$ and $\eta^* = -|\eta^*|$ together in the theoretical calculation. We have performed the calculation at leading order in $\alpha_s$, and present the calculation against experimental data for two $|\eta^*|$ bins in Fig. 10. As can be seen in the figure, the leading order is adequate for describing the data.
FIG. 10: (Color online) Comparison of leading-order (LO) calculation of the two-fold differential cross section using Eq. (43) to experimental data from Ref. [66]. Proton PDF used is CT10 [59]. Data shown: Exp. ($0 < |\eta^*| < 0.5$) and Exp. ($2 < |\eta^*| < 2.5$).
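The change of variables from $(p_T, \eta_3, \eta_4)$ to the dijet variables can be checked numerically. The sketch below assumes the half-sum and half-difference definitions of the dijet rapidity and the center-of-mass jet rapidity, together with massless, back-to-back jets of equal $p_T$. The function names are illustrative.

```python
import math


def dijet_variables(p_T, eta3, eta4):
    """Return (eta_bar, eta_star, M_JJ) for two massless jets of equal p_T."""
    eta_bar = 0.5 * (eta3 + eta4)    # rapidity of the dijet system
    eta_star = 0.5 * (eta3 - eta4)   # jet rapidity in the dijet c.m. frame
    m_jj = 2.0 * p_T * math.cosh(eta_star)
    return eta_bar, eta_star, m_jj


def jacobian_factor(p_T, eta_star):
    """Factor J in dp_T^2 d(eta3) d(eta4) = J dM_JJ d(eta_bar) d(eta_star)."""
    # dp_T^2 = 2 p_T dp_T, dp_T = dM_JJ / (2 cosh eta_star) at fixed eta_star,
    # and d(eta3) d(eta4) = 2 d(eta_bar) d(eta_star).
    return 2.0 * p_T / math.cosh(eta_star)


if __name__ == "__main__":
    p_T, eta3, eta4 = 100.0, 1.0, -1.0
    eta_bar, eta_star, m_jj = dijet_variables(p_T, eta3, eta4)
    # Cross-check against M_JJ^2 = 2 p_T^2 (1 + cosh(eta3 - eta4)).
    m_check = math.sqrt(2.0 * p_T ** 2 * (1.0 + math.cosh(eta3 - eta4)))
    print(eta_bar, eta_star, m_jj, m_check, jacobian_factor(p_T, eta_star))
```

The two expressions for $M_{JJ}$ agree because $1 + \cosh(2a) = 2\cosh^2(a)$.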

V. NUMERICAL ESTIMATES
In the previous sections, we have supplied all the ingredients necessary to calculate the three-fold differential cross section of Eq. (17) for the dijet production reaction (1). It is our intention to use these ingredients to determine the sensitivity of the dijet production reaction (1) to the presence of short range correlations in the nucleus, as well as to medium modification effects. Additionally, we will look at nuclear partons with x A > 1 in particular. Since x A can be related to measurable jet kinematics via Eq. (6), it is necessary to elaborate on the kinematics we will consider. Once the kinematics considered are explicated, we will present numerical estimates of the three-fold differential cross section of Eq. (17) at these kinematics, and additionally we will estimate the partially and fully integrated cross section to demonstrate that events corresponding to x A > 1 are frequent enough to be measurable at the LHC.

A. Kinematics considered
Since it is our goal to probe x A > 1, we consider kinematics that maximize x A . According to Eq. (6), three variables are related to x A . The jet rapidities η 3 and η 4 should be as "small" (as highly negative) as possible in order to maximize x A . Additionally, the transverse jet momentum p T should be as large as possible. On the other hand, non-central jets are more difficult to detect and identify, and the jet cross section for large p T is known to drop rapidly. It is necessary to find a balance between kinematics that will increase x A and those that will produce an appreciable yield.
Since central jets are easier to detect, and since we are interested in superfast partons in the nucleus, we require one jet to be central, namely $-2.5 < \eta_3 < 2.5$, and the other jet to be moving forward in the direction of the nucleus beam, namely $3 < -\eta_4 < 5$. We also consider a $p_T$ range from 40 GeV to 200 GeV.
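To see which corners of this kinematic window reach $x_A > 1$, one can use the standard leading-order relation between parton momentum fractions and jet kinematics for massless two-to-two scattering. The sketch below assumes rapidities defined in the nucleon-nucleon center-of-mass frame with the nucleus moving toward negative rapidity, and an illustrative $\sqrt{s_{NN}} = 5020$ GeV. Neither the frame convention nor the energy is taken from the text, and Eq. (6) may differ by such conventions.

```python
import math


def parton_fractions(p_T, eta3, eta4, sqrt_s_nn):
    """Leading-order momentum fractions (x_p, x_A) for massless 2 -> 2 kinematics.

    Assumes the proton moves toward positive rapidity and the nucleus toward
    negative rapidity; x_A is measured per nucleon and can exceed 1.
    """
    scale = p_T / sqrt_s_nn
    x_p = scale * (math.exp(eta3) + math.exp(eta4))
    x_A = scale * (math.exp(-eta3) + math.exp(-eta4))
    return x_p, x_A


if __name__ == "__main__":
    sqrt_s_nn = 5020.0  # GeV, assumed for illustration
    for p_T in (40.0, 100.0, 200.0):
        for eta4 in (-3.0, -4.0, -5.0):
            x_p, x_A = parton_fractions(p_T, 0.0, eta4, sqrt_s_nn)
            print(f"p_T={p_T:6.1f}  eta4={eta4:4.1f}  x_p={x_p:.4f}  x_A={x_A:.3f}")
```

Under these assumptions, a central jet paired with a jet near $\eta_4 = -4$ already gives $x_A$ slightly above 1 at $p_T = 100$ GeV, while the corresponding $x_p$ stays small.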
B. Three-fold differential cross section
Eq. (17) can be used to obtain numerical estimates of the three-fold differential cross section $d^3\sigma/d\eta_3\,d\eta_4\,dp_T^2$ for the dijet production reaction (1). Numerical estimates are presented in Fig. 11, where estimates with and without short range correlations are presented for comparison, and likewise estimates with and without medium modification are presented. As can be seen in Figs. 11a and 11b, the presence of SRCs increases the cross section for superfast ($x_A > 1$) quarks considerably. However, medium modification decreases the cross section, especially at large $x$ where highly-modified SRCs dominate the cross section, as can be seen in Figs. 11c and 11d.
In addition to comparisons of the effects of short range correlations and medium modifications within the model for the nuclear light cone density explicated in the present work, we compare in Fig. 12 the prediction for the three-fold differential cross section using our model to the prediction made using the FS81 model [3]. The FS81 model uses exponential decay to model the large $\alpha$ part of the nuclear light cone density, and likewise the nuclear PDF calculated using this model exhibits exponential decay for large $x_A$. It is for this reason the curve for the FS81 model appears as a straight line in the log plot of Fig. 12. Qualitatively, the model of this work can be distinguished experimentally from the FS81 model by determining whether the three-fold differential cross section of Eq. (17) falls exponentially or faster-than-exponentially.
(Figure caption) Three-fold differential cross section for p + $^{208}$Pb → dijet + X according to Eq. (17). $\eta_4$ is constrained to the range $(-5, -3)$, and $\eta_3 = 0$. Proton PDF used is CT10 [59].

C. Partially integrated cross section
We will continue to see a large sensitivity to the presence of SRCs if we integrate over the $\eta_3$ and $\eta_4$ ranges considered in order to find the one-fold differential cross section $d\sigma/dp_T$. However, simply integrating over these ranges will count both $x_A > 1$ and $x_A < 1$ events for a fixed $p_T$ together, which is potentially a problem if we want to look at $x_A > 1$ events in particular. Since it is possible to determine $x_A$ from jet kinematics via Eq. (6), we can simply define two one-fold differential cross sections, one of which only counts $x_A > 1$ events. Namely, we define
$$\frac{d\sigma}{dp_T} = \int_{-2.5}^{2.5} d\eta_3 \int_{-5}^{-3} d\eta_4\; 2p_T\,\frac{d^3\sigma}{d\eta_3\,d\eta_4\,dp_T^2}, \qquad (44)$$
$$\left.\frac{d\sigma}{dp_T}\right|_{x_A > 1} = \int_{-2.5}^{2.5} d\eta_3 \int_{-5}^{-3} d\eta_4\; 2p_T\,\frac{d^3\sigma}{d\eta_3\,d\eta_4\,dp_T^2}\,\Theta(x_A - 1). \qquad (45)$$
The differential cross section of Eq. (44) is simply the differential cross section for the full kinematical range considered, whereas the differential cross section in Eq. (45) throws out events for which $x_A < 1$. By comparing these quantities, it can be seen how often these kinematics correspond to superfast partons with $x_A > 1$.
(Figure caption fragment) Computed using CT10 for proton PDF [59].
In Fig. 13 we present estimates of the differential cross sections of Eqs. (44,45). Fig. 13a compares the differential cross section, in the absence of medium modifications, without and with short-range correlations accounted for. Most dijets with transverse momentum up to about 100 GeV can be attributed to nucleons in the mean field. In contrast, dijets with high $p_T \gtrsim 150$ GeV can be attributed predominantly to short range correlations. In addition, selecting only dijets with kinematics corresponding to an initial state nuclear parton with $x_A > 1$ produces a differential cross section that is at least an order of magnitude larger when SRCs are considered than for the mean field by itself. This confirms that $x_A > 1$ events can be attributed predominantly to the presence of SRCs in the nucleus, and also indicates that finding a large yield for $x_A > 1$ events can demonstrate the presence of SRCs.
In the absence of medium modification effects, the value of the cross section could also be used to probe the strength of the SRCs.
TABLE II (partially recovered):
p_T,min    All events                  x_A > 1 events only
           σ_unmod      σ_clrscn       σ_unmod      σ_clrscn
50 GeV     7.8 mb       6.9 mb         12 nb        5.7 nb
60 GeV     2 mb         1.7 mb         6. ...
On the other hand, since SRCs are highly modified in the nuclear medium, $x_A > 1$ events in particular are suppressed by the EMC effect, as can be seen in Fig. 13b. This means the value of the one-fold differential cross section for $x_A > 1$ dijet events is subject to two competing effects: it is increased by the presence of SRCs, but it is at the same time suppressed by medium modifications, which most strongly affect SRCs. It would be difficult to disentangle these competing effects, so it may not be possible to extract precise values for $a_2(A)$ and $a_3(A)$ using the differential cross section of Eq. (45).
In addition to one-fold differential cross sections, we also present estimates for the total integrated cross section in order to show that $x_A > 1$ events should produce a measurable yield. Since jet identification at low $p_T$ is difficult, we consider the total cross section integrated from a minimum $p_T$ value, and in particular consider several minimum values of $p_T$. In Table II we present numerical estimates for the total cross section with different minimum $p_T$ values as indicated in the table. Table II considers the integrated cross section both in the case that all events are counted, and in the case that only $x_A > 1$ events are counted. Since larger $p_T$ corresponds to a greater proportion of $x_A > 1$ events, we also include in the table estimates of the integrated cross section when $p_T$ is integrated from a minimum value of 150 GeV.
As can be seen in Table II, medium modifications suppress the x A > 1 and high-p T integrated cross section just as they suppress the one-fold differential cross section. This is consistent with the expectation that medium modifications affect SRCs especially strongly, since these kinematics are dominated by SRCs. However, despite the suppression by medium modification, the cross section remains large enough to be observable.
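Whether a cross section of a few nanobarns is observable depends on the integrated luminosity, since the expected number of events is the product of the two. The sketch below performs this arithmetic for the $x_A > 1$ cross section of 5.7 nb quoted in Table II for $p_{T,\rm min} = 50$ GeV with color screening. The luminosity of 35 nb$^{-1}$ and the function name are hypothetical inputs chosen for illustration, and detector acceptance and efficiency are not modeled beyond an optional overall factor.

```python
def expected_events(sigma_nb, luminosity_inv_nb, efficiency=1.0):
    """Expected event count N = sigma * integrated luminosity * efficiency."""
    return sigma_nb * luminosity_inv_nb * efficiency


if __name__ == "__main__":
    sigma_x_gt_1 = 5.7   # nb, from Table II (p_T,min = 50 GeV, color screening)
    luminosity = 35.0    # nb^-1, hypothetical value for illustration only
    print(expected_events(sigma_x_gt_1, luminosity))
```

With these inputs the estimate is of order a few hundred events before acceptance and efficiency losses.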

VI. CONCLUSIONS
In this work, we have shown that superfast partons inside a heavy nucleus with x > 1 are experimentally accessible at the LHC, and can be probed in dijet production reactions from lead-proton collisions. The value of x can be related to jet kinematics in the LHC rest frame in a straightforward way, so it is possible to select events that correspond to x > 1. Partons with x > 1 can predominantly be attributed to the presence of short range correlations between nucleons within the lead nucleus. Medium modifications are necessary to account for because they suppress the cross section appreciably, but the total x > 1 cross section remains on the order of a few nanobarns, making measurement feasible.
The observability of superfast partons in the nucleus owes itself to the presence of short range correlations. This fact tells us that it is inadequate to treat the nucleus as a collection of free nucleons, moving forward with equal light cone momenta. The internal structure of the nucleus, and the relative momenta of its nucleons, are necessary to account for. Additionally, accounting for the EMC effect using the color screening model resulted in an appreciable suppression of the dijet cross section at moderate values of x. Nuclear effects in general cannot be ignored in collisions involving heavy ions. This fact may be especially relevant to nucleus-nucleus collisions, in which jet quenching is used to study the existence and properties of quark-gluon plasma. The EMC effect by itself will suppress the cross section, while at large x (i.e. at high p T prior to quenching, and high jet rapidity) the presence of SRCs will increase the cross section. These effects may be necessary to account for in order to correctly predict the expected jet yields in the absence of jet quenching, and thus to correctly gauge the magnitude of quenching.
We expect the formalism to have applications beyond dijet production in proton-nucleus collisions. In particular, it will prove necessary to account for multi-nucleon short range correlations to correctly predict the high-x cross section in inelastic nuclear DIS experiments at high energy. The Electron Ion Collider [67] in particular will be able to carry out such experiments, where the formalism explicated in this work will prove necessary. In particular, the nuclear PDF at high Q 2 will need to be obtained through DGLAP evolution, since the phenomenological models that exist to account for nuclear modifications are constructed using low-Q 2 experimental data.

ACKNOWLEDGMENTS
We are thankful to Drs. Alberto Accardi, Wim Cosyn, Leonid Frankfurt, Shunzo Kumano, and Gerald Miller for many useful discussions. This work is supported by US Department of Energy grants under contracts DE-FG02-01ER-41172 and DE-FG02-93ER40771.

Appendix A: Derivation of factorization formula
The factorization formula is here derived using a Feynman diagram formalism. To be sure, QCD factorization is not rigorously proved here: we merely present a demonstration that, assuming the usual factorization theorem holds, Eq. (13) is correct for the nuclear PDF as we have defined it. This derivation is performed by treating hadrons as consisting of two parts: a parton that is probed, and the remainder of the hadron, which is a spectator to the probing reaction. The nucleus will be treated the same way: it consists of a nucleon $N$ that is probed, and a spectator consisting of the other $(A - 1)$ nucleons. Subscripts with the letter $s$ will be used to signify spectators to the reactions. For example, $p_{sp}$ is the four-momentum of the part of the proton left over when $i$ leaves it, which is a spectator to the parton-level interaction.
We make the assumption that the parton coming from the nucleus was found within a particular nucleon. The wave function of the nucleon itself may be modified by the nuclear medium, however.
For notational simplicity, we will use the symbol u to signify the part of a particle's wave function that transforms under a representation of the Lorentz group, even if it is not a spinor. For instance, u will be a spinor for fermions, but for vector bosons will be a polarization vector.
With this in mind, the hadron-level Feynman amplitude is given by [...], where [...] is the Feynman propagator of a particle of type $i$ with four-momentum $p$. For both types of particles, the completeness relation takes the form [...], so applying (approximate) completeness relations to each of the intermediate particles gives [...].
The terms have been grouped for easy identification. The parton-level Feynman amplitude can be read off easily. There are also two partonic wave functions, for partons $i$ and $j$, which can be identified using the definition [68] [...]. There is a light cone wave function for the nucleon $N$ within the nucleus as well. In this work we shall use the definition [...], a definition which ensures the normalization condition [...]. In light of these identities, the hadronic Feynman amplitude can be written [...].
Due to flavor and spin conservation, the spin and flavor of intermediate states can be identified using the spin and flavor of jets and spectators present in the final state. Consequently, the sums over these quantities are incoherent, and can be factored out of the square of the amplitude. Now taking the square, and then summing over final states and averaging over initial states, gives [...], making the sum-and-average of the squares [...], where spin indices have been suppressed since they are now averaged over. The hadronic and partonic fluxes are related as [...], and the phase space elements are related as [...]. Thus the overall hadronic cross section element is [...].
From here, we can introduce several identities that are consequences of momentum conservation. For instance, $x_N = x_A/\alpha$. In addition, $1 = x_p + x_{sp}$ implies $x_{sp}$ integration can be replaced by $x_p$ integration. Similarly, $x_{sN}$ integration can be replaced by integration over $dx_N = \frac{1}{\alpha}dx_A$, and integration over $\alpha_s = A - \alpha$ can be replaced by integration over $\alpha$, and all transverse momentum integrations can be transformed similarly. Thus we have [...]. With the convolution formula Eq. (20), this gives us the factorization formula Eq. (13).
Appendix B: Light cone densities for short range correlations
In this Appendix, we will derive the functional forms of the light cone densities for two-nucleon and three-nucleon short range correlations, as well as the functional forms of the light cone momenta that they employ as arguments. Appendix B 1 will focus on 2N correlations, while Appendix B 2 will focus on 3N correlations.

2N correlations
In this part of the Appendix, we will derive Eqs. (29,28), and additionally relate the light cone deuteron wave function ψ d (k) to the non-relativistic wave function in order to use a numerical parametrization.
To begin, it is known that two-nucleon SRCs dominate when the momenta are above the Fermi momentum $k_F$. It is unknown how exactly the light cone density matrix behaves in the transition region around $k_F$, so a cutoff $\Theta(k - k_F)$ is imposed: 2N correlations will contribute only at momenta above the Fermi momentum. Moreover, since the high-momentum tails of nuclei are known to scale as the high-momentum tail of the deuteron, we will specifically have [...], where $\rho_d(\alpha, p_T)$ is the light cone density matrix of the deuteron. The scaling constants $C^{(N)}(A)$ will be different for protons and neutrons (except in the case of symmetric nuclei), and in particular must satisfy [...], since a proton can only be in a 2N correlation by being paired with a neutron (within the approximation we use, which neglects pp and nn correlations), and vice-versa. These can additionally be related to the scaling constants $a_2(A)$ through the scaling relation [...], which tells us that [...]. Combining Eqs. (B2,B4) gives us [...], where $\chi_N$ is the relative fraction of nucleons of type $N$ in the nucleus, i.e. $\chi_p = \frac{Z}{A}$ and $\chi_n = \frac{A-Z}{A}$.
We have most of the factors of Eq. (29), and now need to derive the form of $\rho_d(\alpha, p_T)$. To do this, we will evaluate the Feynman diagram in Fig. 14, which describes the deuteron interacting with an external probe in the limit of zero momentum transfer, and apply the optical theorem to the result. The Feynman amplitude for this process is [...], with the factor of 2 owing to the possibility of scattering occurring from either nucleon.
Here, momentum arguments have been left out for the initial and final state spinors since these do not affect the calculation. From here, the amplitude is evaluated within the light cone formalism. For instance, we write out the integration element as $d^4p_2 = \frac{1}{2}\,dp_2^+\,dp_2^-\,d^2p_{2T}$.
We rewrite this, additionally, in terms of other variables. Since the light cone momentum $p_d^+ = p_1^+ + p_2^+$ of the deuteron is a constant, we can substitute $dp_2^+$ with $dp_1^+$ in the integration element. Similarly, we can substitute $d^2p_{2T}$ with $d^2p_{1T}$. Additionally, as $p_1^+ = \frac{\alpha}{2}p_d^+$, [...].
The next issue concerns the denominators of the propagators. In general, we have [...], where for an on-shell particle, since $p^2 = m^2$, [...]. The propagator for $p_2$ in particular can be rewritten [...]. Without yet committing to an interpretation of $k$ as a momentum, we can write [...].
Now, we wish to justify the interpretation of $k$ as a momentum. To this end, let us look at the light cone kinematics of the bound system. The deuteron itself is on its light cone energy shell, so $p_d^+ p_d^- - p_{dT}^2 = M_d^2$. However, since it is a bound system, its constituents are not on-shell. We have chosen for $p^-$ to contain the virtuality of the off-shell nucleon, so we can write [...], where $V$ is a term that represents the virtuality of the bound nucleons. Placing this expression for $p_d^-$ into the on-shell condition for the deuteron gives [...]. Consider now the deuteron in its rest frame, so that it consists of two nucleons with equal and opposite momenta $\pm k$. In the absence of binding energy, the rest and kinetic energy for each particle would be $E_k = \sqrt{m^2 + k^2}$, giving a total squared center of mass energy of $4(m^2 + k^2)$. The expression Eq. (B11) gives exactly this, with $V p_d^+$ as a correction owing to binding energy. Thus $k$ does correspond to a momentum after all, and Eq. (B9) is equivalent to Eq. (28) in the special case that $p_{dT} = 0$, which is relevant to our purpose since we are neglecting the transverse motion of the 2N correlation. Now, we combine Eqs. (B7,B8,B10) to obtain
$$M_{Xd} = 2\sum_{\lambda_1,\lambda_2}\int\frac{d\alpha\,d^2p_{1T}}{(2\pi)^3}\,\frac{1}{2(2-\alpha)\alpha^2}\,\frac{\bar u_{\sigma_1}\bar u_{\sigma_2}\,\Gamma_{pn}\,u_{\lambda_1}(p_1)\,u_{\lambda_2}(p_2)}{\frac{1}{2}\left(M_d^2 - 4(m^2 + k^2)\right)}. \qquad \text{(B12)}$$
To this, we apply the optical theorem by taking the imaginary part of Eq. (B12) on both sides.
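In particular, the optical theorem gives us
$$\mathrm{Im}(M_{Xd}) = s_{Xd}\,\sigma_{Xd} = 2 s_{Xd}\,\sigma_{XN} \qquad \text{(B13)}$$
and therefore
$$1 = \sum_{\lambda_1,\lambda_2}\int\frac{d\alpha\,d^2p_{1T}}{2(2\pi)^3}\,\frac{1}{2}\,\frac{1}{(2-\alpha)\alpha}\,\frac{\bar u_{\sigma_1}\bar u_{\sigma_2}\,\Gamma_{pn}\,u_{\lambda_1}(p_1)\,u_{\lambda_2}(p_2)}{\frac{1}{2}\left(M_d^2 - 4(m^2 + k^2)\right)}\,\frac{\bar u_{\lambda_1}(p_1)\,\bar u_{\lambda_2}(p_2)\,\Gamma_{pn}\,u_{\sigma_1}u_{\sigma_2}}{\frac{1}{2}\left(M_d^2 - 4(m^2 + k^2)\right)}.$$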
In its current form, Eq. (B15) allows us to identify $\rho_d(\alpha, p_{1T})$ in terms of the given quantities. It also allows us to identify the light cone wave function up to some arbitrary conventions. For instance, we can define the wave function so that a factor of $(2\pi)^{-3}$ must be present in the integration element for its normalization condition, or we may instead absorb this factor into the wave function itself. We will employ the following standard, also employed in Ref. [3]: [...], meaning that Eq. (B15) gives us [...]. If $\psi_d(k)$ without helicity indices is considered as the spin-averaged deuteron wave function, then this, with Eq. (B1) and the expressions for $C^{(N)}(A)$, gives us Eq. (29).
Now all that remains is to relate $\psi_d(k)$ to the non-relativistic wave function, so that we can employ a numerical parametrization. We will find the relationship in particular through the normalization condition Eq. (B17). First, we find how $\alpha$ and the longitudinal component $k_z$ of the momentum are related. We already identify $k_T = p_{1T} - \frac{\alpha}{2}p_{dT}$, so the Pythagorean condition $k^2 = k_z^2 + k_T^2$ gives us [...]. The last few lines employed the condition, when the square roots of $k_z^2$ and $E_k^2$ were taken, that $k_z$ is positive when $\alpha > 1$ and negative when $\alpha < 1$, corresponding to the convention that $\alpha > 1$ signifies a forward-going nucleon in the deuteron rest frame. Using Eq. (??), we can obtain the following relations: [...]. This means the normalization condition Eq. (B17) can be written [...], which, in light of the non-relativistic normalization condition $\int|\psi_{NR}(k)|^2\,d^3k = 1$, tells us [...]. Notice that the argument the non-relativistic wave function takes in this relation is the light cone momentum $k$.
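The mapping between the light cone fraction $\alpha$ and the relative momentum $k$ can be illustrated numerically. The sketch below assumes the standard two-body light cone relations for a deuteron with $p_{dT} = 0$, namely $\alpha = 1 + k_z/E_k$ and $E_k^2 = (m^2 + p_T^2)/(\alpha(2-\alpha))$. These are consistent with the invariant mass $4(m^2 + k^2)$ and with the sign convention for $k_z$ quoted above, but the function names are illustrative and the relations are not copied from the text.

```python
import math

M_NUCLEON = 0.938272  # nucleon mass in GeV


def k_from_alpha(alpha, p_T):
    """Return (k_z, |k|) for 0 < alpha < 2, assuming p_dT = 0 so that k_T = p_T."""
    e_k_sq = (M_NUCLEON ** 2 + p_T ** 2) / (alpha * (2.0 - alpha))
    e_k = math.sqrt(e_k_sq)
    k_z = (alpha - 1.0) * e_k  # positive for alpha > 1, negative for alpha < 1
    k = math.sqrt(e_k_sq - M_NUCLEON ** 2)
    return k_z, k


def alpha_from_k(k_z, k_T):
    """Inverse map: alpha = 1 + k_z / E_k with E_k = sqrt(m^2 + k_z^2 + k_T^2)."""
    e_k = math.sqrt(M_NUCLEON ** 2 + k_z ** 2 + k_T ** 2)
    return 1.0 + k_z / e_k


if __name__ == "__main__":
    alpha = alpha_from_k(0.3, 0.2)
    print(alpha, k_from_alpha(alpha, 0.2))
```

The round trip recovers the input $k_z$, and $|k|$ grows without bound as $\alpha$ approaches 0 or 2, which is why the large-$\alpha$ tail of the density probes the high-momentum part of the wave function.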

3N correlations
In this section of the Appendix, we derive the functional form of the three-nucleon SRC density, namely Eq. (31). In addition, we derive the form of the momenta k 12 and k 23 that enter into the wave function as arguments.
As discussed in Sec. III A 3, we consider the 3N correlation as arising from a sequence of two short-range two-body interactions. Since we consider only pn pairs to interact at short distance, the 3N cluster can itself be either a proton and two neutrons, or a neutron and two protons.
We will calculate the 3N light cone density matrix by considering a coherent scattering process in the limit of zero momentum transfer, just as we did for the 2N density in Appendix B 1. On the face of it, we appear to have several diagrams that can contribute, but in fact we only have one. To see this, suppose (without loss of generality) that the probed nucleon is a proton: then either the cluster is pnp, in which case the diagram for the process is depicted in Fig. 15, or the cluster is pnn, in which case the diagrams for the process are depicted in Fig. 16. In the latter case, both Fig. 16a and Fig. 16b represent means by which the scattering process can occur, and since they are related by the exchange of an identical fermion, their amplitudes are exactly opposite and cancel perfectly [69]. Thus, we only have to consider the diagram in Fig. 15.
denominator, giving us meaning the bound mass-squared of the 2N system is short by