The LPM effect in sequential bremsstrahlung

The splitting processes of bremsstrahlung and pair production in a medium are coherent over large distances in the very high energy limit, which leads to a suppression known as the Landau-Pomeranchuk-Migdal (LPM) effect. We analyze the case when the coherence lengths of two consecutive splitting processes overlap, which is important for understanding corrections to standard treatments of the LPM effect in QCD. Previous authors have analyzed this problem in the case of overlapping double bremsstrahlung where at least one of the bremsstrahlung gluons is soft. Here we show how to generalize to include the case where both splittings are hard. A number of techniques must be developed, and so in this paper we simplify by (i) restricting attention to a subset of the interference effects, which we call the"crossed"diagrams, and (ii) working in the large-$N_c$ limit. We first develop some general formulas that could in principle be implemented numerically (with substantial difficulty). To make more analytic progress, we then focus on the case of a thick, homogeneous medium and make the multiple scattering approximation (also known as the $\hat q$ or harmonic approximation) appropriate at high energy. We show that the differential rate $d\Gamma/dx\,dy$ for overlapping double bremsstrahlung of gluons with momentum fractions $x$ and $y$ can then be reduced to the calculation of a 1-dimensional integral, which we perform numerically. [Though this paper is unfortunately long, our introduction is enough for getting the gist of the method.]

At high enough energy, particles passing through matter (cosmic rays through the atmosphere, high-energy partons through a quark-gluon plasma, electrons through an electromagnetic calorimeter) lose energy primarily through splitting: hard bremsstrahlung or pair production which, when repeated, produces a shower of lower energy particles. Naively, one would calculate the rate Γ for each such splitting by (roughly speaking) computing the crosssection σ for splitting during a collision between the high energy particle and a particle in the medium, as in the bremsstrahlung process depicted in fig. 1, and then finding Γ ∼ nvσ, where n is the density of things to scatter from and v the relative velocity. The flaw in this argument, as known since the 1950s, is that the duration of this splitting process, called the formation time, grows with energy. At high enough energy, the formation time exceeds the mean free time between collisions with the medium, and then consecutive scatterings from the medium may no longer be treated as quantum mechanically independent for the purpose of computing the splitting rate. This is known as the Landau-Pomeranchuk-Migdal (LPM) effect [1,2], which dramatically suppresses the splitting rate at very high energy. In the language of Feynman diagrams, the LPM effect represents important interferences between splitting before and after a sequence of elastic collisions with the medium, such as shown in fig. 2a for QED and fig. 2b for QCD. The analysis of the LPM effect in QCD was pioneered in the 1990s by Baier et al. [3,4] and Zakharov [5], known collectively as BDMPS-Z.
A natural question that arises is whether consecutive splittings of the high energy particle, and not merely consecutive collisions with the medium, occur within the formation time.
That is, once we use the LPM or BDMPS-Z formalism to compute the rate for a single splitting, can we then treat consecutive splittings as independent and so simply use the single-splitting rate in a Monte Carlo to compute the development of the shower (which we need for answering detailed questions about energy loss)? Or is there instead a significant contribution from processes where the formation times associated with consecutive splittings overlap, as depicted in fig. 3? Formally, the probability of a splitting is parametrically of order α (α s in QCD) in one formation time, and so the formation time is one power of α smaller than the typical time between consecutive splittings, as depicted in fig. 4. This means that the contribution of overlapping formation times, as in fig. 3, is formally suppressed by one factor of α. However, in QCD applications, (i) this α s is at best moderately small at the scales relevant for the high-energy splitting process in actual relativistic heavy ion collisions, and (ii) this factor of α s is accompanied by a potentially large double logarithm [6][7][8]. It is therefore important to analyze the effect of overlapping formation times.
Note that in the last figure we have drawn only the high-energy particles. This will be our E x) E xE (1− FIG. 1: High-energy bremsstrahlung (blue) during a collision (black) with a particle from the medium. The curly line ending in a cross represents the electromagnetic or gluonic fields in the medium created by sources, such as by a nucleus or a passing thermal parton. convention throughout the rest of the paper: the collisions with the medium are implicit, and the high-energy particle lines should be understood as decorated with numerous interactions with the medium, as in fig. 2. Our goal in this paper is to develop formalism for a full computation of the LPM effect (to include the treatment of overlapping formation times) in the case of two consecutive splittings, such as fig. 3. We will then implement this formalism to compute results for double-splitting rates in QCD in certain simplifying limits, to include the multiple-scattering approximation (also known as theq or harmonic oscillator approximation), which is appropriate for typical events at high energy when the medium is much thicker than the mean free path for collisions with the medium.
Previous authors [6][7][8] have performed explicit calculations for QCD to leading-log order in the limit of y ≪ x ≪ 1, where x and y are the momentum fractions of two of the final gluons. [That is, if the initial parton energy is E, the three daughters after the two splittings have energies xE, yE, and (1−x−y)E.] These results have interesting and important consequences, which we will briefly mention later. Wu [8] has also developed formalism for studying the somewhat more general case of x, y ≪ 1 without assuming y ≪ x, but so far has only performed explicit calculations when y ≪ x. 1 In this paper, we study the problem of general x and y, without requiring that either be small, and we will carry out explicit calculations in that case. In the context of the multiple-scattering (harmonic oscillator) approximation, we will be able to go beyond leading logarithms and develop methods for explicitly computing the full result for double splitting.

A. What we compute
In this, our first paper on the case of general x and y, we will simplify the discussion by focusing on the case of the large N c limit of QCD. This will simplify treatment of the color dynamics of high-energy partons (a simplification that is not needed in the y ≪ x ≪ 1 limit considered by previous authors). 2 For the sake of a minor, further simplification, we will also focus on the case where the initial high-energy particle is a gluon rather than a quark (and so, in the large-N c limit, all the daughter particles after each splitting are gluons as well).
Though we will develop formal results that are more general, our explicit calculations in this paper will be specialized to the case of a thick, static, homogeneous medium. "Thick" means that the medium is wide compared to the typical formation length.
Finally, in this first paper, we will only compute a subset of the contributions to the double splitting rate, depicted in fig. 5. It is convenient to also draw these same interference terms with the amplitudes and conjugate amplitudes tied together, as in fig. 6. We will refer to these contributions as the "crossed" contributions, since they involve two crossed lines when drawn as in fig. 6. 3 An example of an interference contribution that we will not The subset of interference contributions to double splitting that we will evaluate in this paper: the "crossed" diagrams. To simplify the drawing, all particles, including bremsstrahlung gluons, are indicated by straight lines. The long-dashed and short-dashed lines are the daughters with momentum fractions x and y respectively. The naming of the diagrams indicates the time order in which emissions occur in the amplitude and conjugate amplitude. For instance, xȳyx means first (i) x emission in the amplitude, then (ii) y emission in the conjugate amplitude, then (iii) y emission in the amplitude, and then (iv) x emission in the conjugate amplitude. calculate in this paper, but plan to address in future work, is the un-crossed diagram of fig.  7. (The evaluation of this contribution involves some additional subtleties beyond what is required for the crossed contributions, to be briefly mentioned in our conclusion.) In this paper, we will also, for simplicity, focus on the rate for a high-energy particle to split into three on-shell daughters, which all persist after the double-splitting process. That is, we being bremsstrahlung gluons, then the diagrams explicitly shown would be 1/N c suppressed, but some of the "permutations" referred to in the figure, such as the one shown later in fig. 9, would not be.) will not yet consider loop corrections to single splitting, such as shown in fig. 8. Such loop corrections are necessary for using calculations of splitting rates to compute energy loss, as has been analyzed in refs. [6][7][8] for the case x, y ≪ 1.
The "+ permutations" line of figs. 5 and 6 contains more types of interference terms than one might at first realize. For example, it contains the diagram of fig. 9 corresponding to the interference of (i) separate emission of x and y with (ii) the emission of an x+y gluon that then splits into x and y. This diagram is equivalent to the x ↔ 1−x−y permutation of the xyȳx interference shown explicitly in fig. 5. We will correspondingly refer to fig. 9 as the zyȳz interference, where z ≡ 1−x−y.

B. Overview of method and what's new
Here, in broad outline, are the elements involved in the calculation. We start by turning the problem into a problem in two-dimensional non-Hermitian quantum mechanics, slightly generalizing techniques that have been used by other authors.

3-and 4-particle quantum mechanics
Start by considering an interference contribution to the rate for single splitting, shown in fig. 10. In the time interval t x < t < tx between splitting in the amplitude and conjugate amplitude, the number of high-energy particles in the diagram does not change. As we shall review later, at high energy it is possible to reduce the problem in this time interval to time evolution in two-dimensional quantum mechanics for three non-relativistic particles, two representing the two daughter particles in the amplitude in fig. 10 during this time interval and one representing the parent particle in the conjugate amplitude in fig. 10 during the same time interval. The two dimensions of the quantum mechanics problem are the plane transverse to an axis approximately aligned with the initial direction of the parent. The "non-relativistic" character of the transverse dynamics comes from the large-p z (and small-  angle) approximation to the kinetic energy ε p of a high-energy particle of mass M, The right-hand side looks like a two-dimensional non-relativistic kinetic energy p 2 ⊥ /2m with a "mass" m with magnitude |p z |, and (for our purposes) |p z | may be treated as constant in the above approximation. 4 As we'll discuss, the 3-particle quantum mechanics problem that describes fig. 10 has the funny feature that the "mass" m which describes the particle in the conjugate amplitude is taken to be negative, which arises from the sign difference between the evolution exp(−iHt) of amplitudes vs. the evolution exp(+iHt) of conjugate amplitudes. Because of overall longitudinal momentum conservation in the splitting process, the two positive and one negative "mass" in this quantum-mechanics problem satisfy the unusual property (from the perspective of non-relativistic quantum mechanics) that Another unusual feature of the effective quantum mechanics problem is that it has a non-Hermitian Hamiltonian. As the particles propagate, they interact with the background fields in the plasma. For a thermal medium (or other random medium), these background fields are random with well-defined correlations. When computing a rate in a random background, one should average the rate over that randomness (e.g. average over the thermal ensemble). After this average, the interactions of the particles with the background fields, and their correlations, may be replaced by a "potential energy" term in the quantum mechanics problem, as we will later review more concretely. 5 However, unlike a normal quantum mechanics problem, this potential energy is not real-valued. The effective quantum mechanics problem that reproduces the medium-averaged splitting rate (rather than the unaveraged splitting amplitude) has a non-Hermitian Hamiltonian. The non-Hermiticity of the Hamiltonian accounts for quantum decoherence over long times: it causes the interference contribution of fig. 10 to decay exponentially for time separations tx−t x large compared to the formation time. Now consider a particular interference contribution to the double splitting rate, such as the xyxȳ interference shown in fig. 11. Within each time interval between splittings in the figure, such as (a) t x < t < t y , (b) t y < t < tȳ, or (c) tȳ < t < tx, the number of highenergy partons does not change. Using the same methods outlined above, we can reduce the problem of medium-averaged time evolution during these intervals to two-dimensional non-Hermitian non-relativistic quantum mechanics for (a) three, (b) four, and (c) three particles respectively. For the 4-particle evolution for t y < t < tȳ, the "masses" of the quantum mechanics problem will satisfy m 1 + m 2 + m 3 + m 4 = 0, (1.3) in analogy to the 3-particle case (1.2). If we can solve the time evolution in the non-Hermitian quantum mechanics problems, we can glue the results together using the field-theory matrix elements for the splittings at t = t x , t y , tȳ, and tx in fig. 11 to compute the result that we want for that interference contribution. We may similarly compute all the other interference contributions and add them together. Fortunately, for the sake of making the calculation practical, there is a major simplification to the problem, which we now outline.

Simplification to 1-and 2-particle quantum mechanics
We will need to solve for two-dimensional quantum mechanical evolution in problems of the form where b i are the two-dimensional transverse positions of the particles, p i are the corresponding transverse momenta, and V is the (non-real) potential that incorporates the mediumaveraged effects of scattering from the medium, which may depend on t ≃ z (where z in this context refers to the longitudinal coordinate x 3 ). At high energies, the medium will be (statistically) translation invariant in the transverse plane over the small transverse distances relevant to the problem, and so the potential V will be similarly invariant. For generic masses m i , that invariance lets us factor out the analog of the trivial "center of mass" motion of the problem (which turns out not to be relevant) and so reduce the N-particle problem to an effective N−1 particle problem. However, as we shall discuss, something very special happens when i m i = 0, which will allow us to eliminate yet another two-dimensional degree of freedom and so reduce the N particle problem to an N−2 particle problem. So, 3-particle evolution can be reduced to a simple problem of 1-particle quantum mechanics, which is equivalent to simplifications originally used by BDMPS-Z to analyze the problem of the single-splitting rate via fig. 10 in the case of general x. In our case, it will allow us (at the expense of having to develop some formalism) to reduce the 4-particle two-dimensional quantum mechanics problem required for the t y < t < tȳ region of fig. 11 to a simpler 2-particle two-dimensional quantum mechanics problem.

Further specialization
In the multiple scattering (harmonic) approximation, appropriate at high energy both for thick media and for "typical" events in relatively thin media, 6 the potential V in (1.4) becomes a (non-Hermitian) harmonic oscillator potential. Then we will just need to (i) solve for time-evolution of 1-particle and 2-particle harmonic oscillator problems; (ii) glue the results together with splitting matrix elements; (iii) integrate over the four times t x , t y , tx, and tȳ associated with those splittings in fig. 11, for example; and (iv) integrate over all relevant intermediate states (i.e. transverse positions) of the high-energy particles at the intermediate times t y and tȳ in fig. 11. That sounds like a lot of complicated integration. However, for the case of a thick, homogeneous medium, we will see that it is possible to do all of these integrations in closed form, except for a single time integral over the duration 6 Here, thin and thick are defined relative to the typical formation length in an infinite medium, which for QCD is ∼ ω/q, where ω is the smallest energy of the daughters. There is often a great deal of confusion about the applicability of the harmonic oscillator approximation to thin media because interest is often focused on calculations of the medium effect on the average energy loss ∆E , which in the thinmedia case can be dominated by rare events with relatively large ∆E. However, the harmonic oscillator approximation still describes typical events in the thin-media case, provided the thickness of the medium is large compared to the mean free path for collisions. See the notes and references given for (2.22) in appendix A. ∆t of the 4-particle evolution (i.e. over ∆t ≡ tȳ − t y for fig. 11). The final ∆t integral will be performed numerically.

Other aspects worth noting before we begin
Before we launch into the main body of the paper, we will first note some other aspects of our problem (double splitting for general x and y) that did not arise in previous work (specific to small x and y). The first is the necessity for a careful treatment of the helicity of high-energy particles between splittings. In the calculation of the single-splitting rate of fig. 10, it is possible to express the x dependence of the splitting matrix elements at t x and tx in terms of spin-averaged Dokshitzer-Gribov-Lipatov-Altarelli-Parisi (DGLAP) splitting functions. As an example, the differential splitting rate in the case of a thick, homogeneous medium is [10] dΓ where P (x) is the relevant spin-averaged DGLAP splitting function and Ω 0 is the (complex) frequency associated with the non-Hermitian harmonic oscillator in the multiple scattering approximation. For g → gg splitting for example, Ω 0 turns out to be (1.5b) (q R , known as the jet-quenching parameter, is the average squared transverse momentum per unit length that is transferred from the medium to a high-energy particle with color representation R.) When we study double splitting, such as in figs. 5 and 6, the helicities associated with the various splittings are tied together in a non-trivial way that cannot be packaged into spin-averaged DGLAP splitting functions for general values of x and y. One of our tasks will be to distinguish different intermediate-state helicity amplitudes in our treatment of the splitting matrix elements. The other non-trivial feature that we want to mention is that individual interference contributions, such as each of those in fig. 5, will turn out to have an unphysical logarithmic UV divergence arising from coincidence of three of the four splitting times in the diagram. 7 Since the diagrams for the amplitude are tree level (and similarly for the conjugate amplitude), there are no loops involved in the amplitude, and so there should be no actual UV divergences in the final result for the differential double splitting rate dΓ/dx dy once we sum over all contributions. We will see that the short-time UV issue is resolved by being careful about operator ordering and iǫ prescriptions.

C. What we find
Since we are only computing the "crossed" subset of contributions to the double-splitting rate ( fig. 6) and not yet all contributions, the results of this paper lie in the development of calculational methods rather than a final, total answer for the differential double-splitting rate dΓ/dx dy. That said, it is still interesting to ask if the contribution of this subset to the result for dΓ/dx dy has any interesting qualitative features. First, we will check that for y ≪ x ≪ 1, we find a contribution with y dependence of the form dΓ dx dy crossed ∝ · · · + α 2 s ln y When integrated over y, this gives a double log correction ∝ α 2 s log 2 . We will see (with some caveats explained later) that this result matches recent y ≪ x results by Blaizot and Mehtar-Tani [6] and Iancu [7] and Wu [8], after accounting for the fact that we have not included virtual corrections like fig. 8. The (ln y)/y contribution to (1.6) arises from interference diagrams of the form xyȳx, xȳyx, zyȳz and zȳyz, where the zyȳz interference was depicted in fig. 9. As discussed in refs. [6][7][8], this correction is due to small-x physics in the collisions of the high-energy particles with the medium, and it may be absorbed into a running of the parameterq that characterizes the strength of interactions of the high-energy particles with the medium in the multiple-scattering approximation [6,11,12].
However, in the limit of small y and fixed x, we will find an even more divergent contribution from crossed diagrams that scales with y as It is quite possible that this infrared divergence arises because our calculation only contains the crossed subset of diagrams rather than all diagrams, which we plan to investigate in future work.

D. Outline
At the end of this introduction, we will briefly comment on the relevant scale of the α s that controls a perturbative treatment in the number of overlapping hard splittings. In section II, we review formalism and results for computing single splitting processes, as in fig. 10, which will set up some of the language and formalism we later need for double splitting. Section III turns to how to use symmetry to reduce the problem of N particle quantum evolution in interference diagrams to an effective quantum mechanics problem involving only N−2 particles. We apply that technique to the double splitting problem in section IV, which also discusses the rules of diagrammatic perturbation theory in this language, how the large-N c limit simplifies the discussion of color in this paper, and the relationship of splitting vertex matrix elements to helicity-dependent DGLAP structure functions. In that section, we go as far as we can without either attempting complicated numerics or introducing additional approximations. To go further, we turn to the multiple scattering (harmonic oscillator) and thick-media approximations in section V, where, as a first example, we reduce the calculation of the xyȳx interference in fig. 11 to the form of a single integral over ∆t ≡ tȳ − t y . We also show that this integral is divergent as ∆t → 0 and isolate the divergence. In section VI, we show how all the other "crossed" interference diagrams of fig. 6 can be related to the xyȳx result, and we show that the small-∆t divergences cancel among these diagrams. We then discuss in section VII that the canceling divergences nevertheless leave behind a finite piece associated with the residue of a pole at ∆t = 0. Getting this right requires a discussion of the appropriate iǫ prescriptions associated with ∆t → 0. Section VIII gives a summary of our final results in terms of a single convergent ∆t integral which can be performed numerically. In section IX, we discuss our results in the limit where at least one of the radiated gluons becomes soft and show agreement between the pieces of our results that match up with pieces of previous authors' soft-gluon results. Finally, section X offers our slightly redundant but extremely short conclusion. A number of details are left for appendices.
E. Note on the scale of α s The size of α s depends on momentum scale, and there are two parametrically different scales relevant to the size of couplings in splitting processes such as fig. 2. One is the strength of the interaction of high-energy particles with the medium (the couplings to the black, vertical gluon lines in fig. 2). The smallest momentum transfers that are important are of order the inverse Debye screening length, m D , which is a characteristic of the medium. In theoretical studies, one may or may not want to consider α s (m D ) as small, but it is certainly not very small in quark-gluon plasmas created in experiment. The other important scale is the one associated with the high-energy splitting vertex in fig. 2. The relevant scale there is the typical transverse momentum Q ⊥ between the daughters, which depends on the energy. (This is the scale of the momenta of each daughter in the daughters' center-of-momentum frame.) In the simple case of hard bremsstrahlung in a thick medium, the LPM effect causes Q ⊥ to grow slowly with energy as 8 Q ⊥ ∼ (qE) 1/4 . If E is large enough, the α s (Q ⊥ ) characterizing the splitting vertex may therefore be small, regardless of whether α s (m D ) is small. This approximation-that the amplitude for each high-energy splitting vertex is small-is one that we will use implicitly throughout this paper. The factors of α s in this paper will all implicitly be factors of α s (Q ⊥ ) unless explicitly noted otherwise.

II. REVIEW OF SINGLE SPLITTING
We will set the stage by summarizing, in our own language, the starting point for the basic formalism (used in one form or another since BDMPS-Z) describing the LPM effect in QCD for single splitting as in fig. 2. Everything in this section is equivalent to results already known in the combined literature on this problem, but it will be useful for fixing our own conventions and in particular fixing the language we will use when we move on to discuss double splitting. 9 8 The relative Q ⊥ of the transverse momentum between the daughters is of order the total momentum transfer from the medium during the formation time t form , and so Q 2 ⊥ ∼qt form . For hard (x ∼ 1 ∼ 1 − x) bremsstrahlung in a thick medium, the LPM effect gives t form ∼ ω/q ∼ E/q, and so Q ⊥ ∼ (qE) 1/4 . 9 Our formulation of the problem owes the most to Zakharov [5], together with elements from refs. [2,13,14].
The notation we use is closest to ref. [14], whose appendix includes a translation table to the notation of several other works. Other noteworthy formalisms for studying LPM in QCD include BDMPS [3,4], but also, in various limits or with various extensions, GLV [15], ASW [16], and higher twist [17]. See ref. [18] for comparative discussion and further references. In the QED case, k x = k f since the photon does not directly interact with the medium.
A. Relating bremsstrahlung to 3-particle non-Hermitian quantum mechanics Let's first discuss the case of QED, as in fig. 12 with the dashed line representing a bremsstrahlung photon. Let δH be the part of the Hamiltonian that contains the splitting vertices for the high-energy particles. Then, working to leading order in δH, the differential bremsstrahlung probability dI/dx for producing a photon with momentum fraction x is 10 where the p is the transverse momentum of the charged particle and k is the transverse momentum of the bremsstrahlung photon, labeled as in fig. 12. The high-energy particles propagate through gauge fields sourced by the medium, and the double angle brackets denote statistical averaging over those gauge fields. The factors in the first parenthesis give the amplitude in fig. 12, which consists of a splitting matrix element p x k f |−i δH|p i at the first time t x , followed by the propagation p f k f , tx|p x k f , t x in the background field from t x to tx. In QED, the bremsstrahlung photon (k) is just a spectator during this propagation. The factors in the second parenthesis give the conjugate amplitude in fig. 12, which has propagation first, followed by splitting at the later time tx. The overall 2 Re{· · · } combines (i) the interference term of fig. 12, which is the case where emission happens first in the 10 In this paper, we do not consider bremsstrahlung associated with whatever earlier hard process created our initial high energy parton in the medium in the first place. We consider only late enough times (t ≫ 1/τ el where τ el is the mean free path for collisions with the medium) that the high-energy partons are already approximately on-shell. (By the uncertainty principle, a particle having collisions roughly every time interval τ el will be off-shell in energy by ∆E ∼ 1/τ el , which corresponds to p µ p µ ∼ E ∆E ∼ E/τ el , which is what we mean here by "approximately on-shell.") For a recent interesting discussion of the interplay between early vacuum bremsstrahlung and late medium-induced bremsstrahlung, see Kurkela and Wiedemann [19].
amplitude, with (ii) its complex conjugate, which is the case where emission happens first in the conjugate amplitude. The sum in (2.1) is over final state polarizations of the daughters. The symbol V ⊥ represents the (infinite) 2-volume (area) of the transverse plane, upon which final results will not depend.
In this paper, we will delegate additional discussion of some formulas to appendix A. In the case at hand, see appendix A if wondering why one need not include time evolution of the final state |p f k f after tx in (2.1), nor time evolution of the initial state |p i before t x . We also comment there on the prefactor E/2πV ⊥ in (2.1).
It is useful to rewrite the conjugate amplitude to express (2.1) as Time-evolution in the amplitude is given by the factor where H (2) is the Hamiltonian for propagating two high-energy particles through the medium: the charged particle and the photon. (Terms δH in the full Hamiltonian which change the number of high-energy particles are not part of H (2) .) The time evolution in the conjugate amplitude is analogously in the factor where H (1) is the Hamiltonian for propagating one high-energy particle through the medium. Since the interactions with the medium appear only in H (1) and H (2) , the medium-average in (2.2) can be restricted to the pieces This is just time evolution of an initial |p x k f p i | starting from time t x . The object |p x k f p i | lives in the spaceH where H e is the Hilbert space of states of a high-energy electron, and H e,γ is the Hilbert space of states with both a high-energy electron and high-energy photon. Instead of thinking of (2.6) as a product of Hilbert spaces, it is convenient to formally think of a single Hilbert space of three particles: one electron, one photon, and one conjugated electron. 11 Correspondingly, we rewrite (2.5) in the form of a 3-particle evolution where H (1+2) = H (2) − H (1) . Here H (2) operates on the two particles associated with the amplitude and H (1) on the one associated with the conjugate amplitude. It will be convenient to choose the convention that a p i above that represents the momentum of a particle in the conjugate amplitude has been negated. So, for instance, (2.5) may be recast in the form (2.7) with the identification (2.8) (Our convention is that the "first" of the three particles is the one from the conjugate amplitude.) With this sign convention for the p i , momentum conservation p i = p x + k f in fig. 12 means that The medium average in (2.7) only affects the evolution operator (which depends on the background fields), and so we may rewrite (2.7) as where H is going to be the Hamiltonian of our effective 3-particle quantum mechanics problem. Note from this definition that H need not be Hermitian, even though H (1+2) is.
B. Brief review of the form of H As we'll very briefly review, the form of H turns out to be (1.4), with non-relativistic "masses" given in terms of the initial parton energy E by We have used a similar minus sign convention for defining longitudinal momentum fractions x i as we did above for defining transverse momenta p i : momentum fractions x i appearing in the conjugate amplitude are negated. In our case here, and so m 1 is negative while m 2 and m 3 are positive. Note that, with this sign convention, which is equivalent to the mass relation (1.2). The kinetic terms in (2.11) can be understood by taking the limiting case where there is no medium, so that the high-energy particles are free. Then where ε p is the energy of a single free particle. Using the large-p z expansion (1.1) in (2.14) yields the kinetic term of (2.11a) plus a term for the potential. In this paper, we will assume for simplicity that the energy is large enough that effects of parton masses M i (including effective parton masses in the medium) are negligible. 12 The dominant effect of interactions with the medium is to add an imaginary part −iΓ H to H related to the differential rate of scattering from the medium. For the case of a weakly-coupled, QED plasma, this turns out to be where e is the charge of the charged high-energy particle, b 3 corresponds to the photon, Γ el ≡ e 2Γ el is the rate of elastic scattering from the medium, and d 2 Γ el /d 2 b ⊥ is the differential rate with respect to impact parameter. In terms of transverse momentum transfer q ⊥ , this is which will vary with time if the medium properties along the path of the high-energy particles vary with time. The second term in (2.16) corresponds to background field correlations between amplitude and conjugate amplitude in fig. 2a. The first term corresponds to selfenergies of the charged particle lines arising from correlations between the amplitude and itself, or between the conjugate amplitude and itself. 13 The relative sign in (2.16) arises because the second term corresponds, in the language of H, to the interaction of a charged particle (charge e) and a conjugated charged particle (charge −e).
To generalize to strongly-coupled plasmas (but weakly coupled splitting matrix elements δH for high-energy particles), it has been argued that the potential V may instead be related to the value of real-time Wilson loops which contain two long, parallel, light-like lines separated by b = b 2 − b 1 [21]. In this paper, we will not commit to weakly or strongly coupled plasmas. We will keep V general in the first part of our analysis, before later specializing to the multiple scattering approximation.

C. Generalization to QCD
The case of QCD is similar, but the bremsstrahlung gluon also interacts with the medium, and so correlations also arise between the bremsstrahlung gluon (b 3 ) and the other two particles (b 1 , b 2 ). For a weakly-coupled plasma, the corresponding potential in (2.11a) is where T a i are the color generators that operate on the color space of particle i, and T i · T j ≡ T a i T a j . The differential rate of elastic scattering from the medium for a particle with color representation R is normalized here (in weak coupling) as dΓ R,el = g 2 C R dΓ el . The first term in (2.18) is the sum of the self-energies of the three particles. The T i · T j terms involve correlations between particle i and j, with charges gT i and gT j respectively. This means that the T 2 · T 1 and T 3 · T 1 terms correspond to correlations between interactions in the amplitude and conjugate amplitude, according to our convention that particle 1 is the particle in the conjugate amplitude. The T 2 · T 3 involves correlations between background interactions of the two daughter particles in the amplitude. 16 Color conservation implies that Note that the potential V depends only on differences b i −b j , which is a consequence of transverse translation invariance. It is worth mentioning in passing a simplification in the case of small x: the masses m 1 and m 2 in (2.11) then have large magnitudes compared to m 3 . As a result, there is little (transverse) motion of particles 1 and 2, and so b 1 and b 2 stay close together compared to b 3 . Then particle 2 and conjugate particle 1 can be thought of as forming a tiny, fixed color dipole which interacts with particle 3 (e.g. aqq or gg dipole interacting with a softer gluon). This is one way that the 3-particle problem can be reduced to a 1-particle problem (of the small-x particle). However, we will be considering generic x and will not make such approximations in this work. 14 The g 2 here in section II C characterizes interactions with the medium, and so the relevant scale for this running coupling can be as low as the inverse Debye screening length. (We promised earlier to alert readers to those few cases where the strong coupling in this paper was not evaluated at a scale Q ⊥ that grows with energy E.) 15 Note that (gT 1 , gT 2 , gT 3 ) → (−e, +e, 0) turns the QCD result (2.18) into the QED result (2.16). Here, gT 3 → 0 represents the charge of the photon. 16 The effects of the imaginary part of this correlation dominate over those of the real part. 17 For example, (2.19)

D. The (optional) harmonic oscillator approximation
There is no reason that one cannot work hard to numerically solve the quantum mechanics problem with the full potential V [22,23]. However, at high energy (for a medium thick compared to the mean free path for elastic scattering) there is an oft-used simplification: it is difficult to deflect high-energy particles, and so (given that the particles start out together), the values of b i −b j will be small during the formation time. One might therefore expect that one can replace the potential V above by a quadratic approximation around b i −b j = 0. Formally, the small-b limit of (2.20) gives whereq R , the average squared transverse momentum transferred from the medium per unit length, isq In the case of a strongly-interacting plasma, the form of V may be more complicated than the weakly-interacting form of (2.20), but we give an argument in Appendix A that (2.21) holds for any quadratic approximation to the small b i −b j limit.
See appendix A for caveats and further discussion of the definition (2.22) ofq R .
E. Reduction from 3-particle to 1-particle quantum mechanics In our application, p 1 + p 2 + p 3 = 0, and so there are seemingly only two independent momenta in the Hamiltonian H (2.11a) in the case of interest. However, we get a further simplification from the 3-dimensional rotation invariance of our problem. Specifically, we could choose our z axis to be in a slightly different direction, while still retaining the large-p z approximations that we have made. In the context of those approximations, the answer for the splitting rate should accordingly be invariant under transformations of the form where ξ is an arbitrary vector in the transverse plane. The transformation (2.23) corresponds to a 3-dimensional rotation by θ = e z × ξ once one linearizes the rotation in |θ|. (Here, e z is the unit vector in the z direction.) Writing p iz as m i ≡ x i E, the above transformation is (In the language of two-dimensional quantum mechanics, this is Galilean boost invariance, with ξ playing the role of the relative velocity of the frames. 18 ) Because the rate we are computing is independent of the rotations implemented by (2.24), one expects that the calculation only depends on invariant combinations of the p i , such as give a more precise argument later.) Because p 1 + p 2 + p 3 = 0 and x 1 + x 2 + x 3 = 0 in our problem, there is only one independent such combination: And, indeed, the 3-particle kinetic energy term in H (2.11a) can be rewritten in terms of P in the form of a 1-particle kinetic energy: The most natural choice for a variable conjugate to Given the string of equalities in (2.25), one might expect that in our problem there is correspondingly a single independent transverse position variable B with This is indeed the case, but we will hold off on giving a more precise argument until section III, where we will simultaneously work out the generalizations that we need for the double splitting calculation (where we will additionally reduce 4-particle quantum mechanics to 2-particle quantum mechanics). Note that we could use x 1 + x 2 + x 3 = 0 to simplify x 1 + x 2 to −x 3 and so on in (2.29), but the combinations in (2.29) turn out to be the ones that generalize nicely to the case of more than 3 particles. Let's now slightly change notation for the translation-invariant 3-particle potential in H by writing (2.30) Listing three b i −b j combinations on the right-hand side is redundant, since only two are independent, but the redundancy makes manifest the permutation symmetries. Taking (2.29) at face value for the moment, the 3-particle Hamiltonian (2.11a) then reduces to the 1-particle Hamiltonian As an example, the potential (2.20a) for weakly-coupled plasmas would give (now using which is equivalent (with different notation and slight generalization) to the formalism originally used by BDMPS-Z. As another example (not restricted to weak coupling), the general harmonic oscillator approximation of (2.21) gives with In QCD, bremsstrahlung gluons can interact directly with the medium, and so the intermediate and final momenta k x and k f need not be the same in fig. 12. The single splitting probability (2.2) for QED correspondingly generalizes to The discussion in this section applies equally well to q → qg, g → gg, and g → qq, though in this paper we will eventually focus solely on gluons.) We will later explicitly discuss the details of reformulating this formula in terms of P while getting all the factors right. For now, we jump ahead to review the general expression for the final result, which is where P 1→3 (x) is the usual (spin-averaged) DGLAP splitting function. In this formula, Px, tx|P x , t x is the propagator of the 1-particle quantum mechanics problem of (2.31). The two δH matrix elements in (2.34) are responsible for the two factors of P in (2.35) and for the factor of αP 1→3 (x). One may alternatively work in B space instead of P space, in which case the above formula is [5] (2.36) We will find it convenient to work in B space when later deriving our results for double splitting. In the case of a thick and homogeneous medium, (2.36) reduces to for the differential rate of single splitting.
In order to make use of this formula, one could calculate numerical results for the propagator [22,23]. With the harmonic oscillator approximation, however, one can be more analytic. For example, for a thick, homogeneous medium, (2.38) Once one takes care of a small technical issue, using this propagator in (2.37) gives the long-known result (1.5a).
The technical issue has to do with appropriately handling a ∆t → 0 divergence in the integration in (2.37), which we will defer to section VII A where a thorough discussion will teach us how to handle more substantial issues in the calculation of double splitting.
We now give a general argument about reducing N particle quantum mechanics to effectively N−2 particles in the context of splitting rate calculations. First, we will identify a special sub-space of the Hilbert space that the N-particle Hamiltonian H of (1.4) leaves invariant, in the special case that Then we will show that this is the relevant sub-space for our problem. As already noted, transverse translation invariance means that total transverse momentum i p i is conserved, so that we may focus in particular on the subspace relevant to our problem. We will show that one may simultaneously impose the constraint that and that this also characterizes the relevant sector for our problem. From the form (1. which vanishes in the center-of-mass frame (3.3). In ordinary quantum mechanics, the relation (3.5) just says that total mass times the time derivative of center-of-mass position equals total momentum. If the total momentum vanishes, as in However, this does not normally imply that the operator i m i b i is constant when in the center-of-mass frame because FIG. 13: The notation used in (3.10) for the state of the system just before and just after the first splitting in fig. 12.
does not normally vanish, and so we may not simultaneously choose i m i b i constant and i p i = 0. In our case (3.1), however, (3.6) does vanish, and so both (3.3) and (3.4) may be imposed simultaneously. Time evolution with H will keep us in this sector if we start there.
Do we start there? For the sake of concreteness, begin by considering the case of single splitting, as in fig. 12 and eq. (2.34). First consider the state of the system before any splittings. We start with |p i p i |, which in our language we would call a "2-particle" initial state Because of the symmetry of the problem, we may formally average over the symmetry transformations (2.24) without changing the final answer. This replaces |p i p i | by something proportional to in the language ofH (1) ⊗ H (1) . From the left-hand side of (3.9), note that we start in the sector i p i = p 1 + p 2 = 0; from the right-hand side, we simultaneously start in the sector fig. 12 means that we will have x 1 b 1 + x 2 b 2 + x 3 b 3 = 0 just after the splitting. At t = t x , the conjugate particle (particle 1) is just a spectator, and so (b 1 , fig. 13 for reference.) As to the remaining particles: the splitting vertex is local, and so Putting this together gives and so the condition of i x i b i = 0 is preserved by splitting. This argument works for any splitting with any number of particles, and so the condition is preserved by all the splittings in the double-splitting diagrams of figs. 5 and 6 as well. For the 3-particle case, the constraint Applying these constraints to the 3-particle Hamiltonian H of (2.11a) allows us to reduce H to the effective 1-particle Hamiltonian (2.31), as described previously.

B. Reduction from 3-dimensional point of view
The constraint on i x i b i may also be understood in 3-dimensional language as a consequence of invariance under rotations that change the direction of the z axis by a tiny amount, consistent with our large-p z approximations. The conserved angular momentum is 20 where the superscript "(3)" indicates 3-vectors and S is the spin contribution to the angular momentum from the helicities of the particles. (In terms of the angular momentum J in the amplitude andJ in the conjugate amplitude, J = J −J .) In the high-energy, nearly collinear limit, (i) the z positions of the particles are all ≃ t, and (ii) S will be in the ±z direction. Then the transverse piece J ⊥ of (3.12) is given by Since rotational invariance is a property of the full Hamiltonian (including the δH for splitting), this gives conservation of We now turn to writing expressions for contributions to double splitting, focusing first on the xyȳx interference of fig. 11. Before we start, we should clarify something about the number of particles involved at various times in this diagram.

A. N =5 particle evolution unnecessary
The right side of fig. 11 is drawn in a way that indicates only three particles (two in the amplitude and one in the conjugate amplitude) are involved in the final time interval, tȳ < t < tx. However, based on the left-hand side of the figure, one might think there are five particles in that time region: all three final daughters (x, y, and 1−x−y) in the amplitude, and two particles (a final daughter y and an intermediate daughter/parent 1−y) in the conjugate amplitude. That is, the terms corresponding to the evolution during the time period tȳ < t < tx would be of the form 14: Labeling in (4.1) for momenta at beginning and end of time interval tȳ < t < tx.
with momenta labeled as in fig. 14 and where we have also included the integration over the final-state momentum κ f of the y daughter. We can write this in 5-particle notation as 2) The 5-particle problem could be reduced to an effective 3-particle problem using the constraints of the last section. Now recall that in the interval between splittings, the high-energy particles do not directly interact with each other but instead simply propagate in the background fields of the medium, which are later averaged with · · · . So, when computing p f k f κ f , tx|pkκ, tȳ in (4.1) for a particular background field configuration (before averaging), the three particles in the amplitude evolve independently with the 1-particle Hamiltonian H (1) appropriate to that particle. That is, and similarly for the two particles in the conjugate amplitude. We may therefore rewrite (4.1) as Inside the double angle brackets (before averaging), the 1-particle evolution of the y daughter is unitary, and so we may use the sum over the final states of the y daughter to simplify the above expression to What is left is what we would call (medium averaged) 3-particle evolution (the x and 1−x−y daughters in the amplitude, and the intermediate 1−y daughter/parent in the conjugated amplitude), which can be reduced to effective 1-particle evolution using the method of the previous section. The lesson is that the sum over final states of a final-state daughter may be performed as soon as it has been emitted from both the amplitude and conjugate amplitude, and one need not worry about its interactions after that. 21

B. First expression for xyȳx interference
As mentioned before, the diagrams in fig. 6 may therefore be treated as 3-particle evolution followed by 4-particle evolution followed by 3-particle evolution. After reducing each N particle problem to an (N−2) particle problem, we may reinterpret each diagram as effectively 1-particle evolution followed by 2-particle evolution followed by 1-particle evolution. We will work in b-space with the variables of which N−2 are independent for N particle evolution, once we impose the constraint For now, we focus attention on the xyȳx interference depicted by the fig. 11 and the first diagrams in figs. 5 and 6. We will see later that the other interference diagrams can be related to the xyȳx calculation, and so most of our task will be the evaluation of xyȳx. For the N=4 particle evolution, we label the x i as shown in fig. 15, so that (4.7) (Our numbering convention is not quite arbitrary: avoiding having x and y cyclically sequential in this list will turn out to be convenient for discussing the large-N c limit.) For the sake of avoiding confusion, we find it convenient to put hats atop the x i when specifically referring to the x i that characterize the N=4 evolution of xyȳx, and we will often use these variables in other time intervals as well. For example, the three longitudinal momentum fractions associated with the initial 3-particle evolution for t x < t < t y are and those associated with the final 3-particle evolution for tȳ < t < tx are For the sake of later relating results for xyȳx to other interference contributions, it will be useful to keep many things in terms of thex i instead of writing formulas directly in terms of x and y.
In order to help emphasize which variables apply to 4-particle evolution and which to 3-particle evolution, we will refer to the B ij of (4.6) by the symbol C ij in the 4-particle case. For the splitting matrix elements at the intermediate time t y , it will turn out to be convenient to choose our two variables describing the 4-particle (effectively 2-particle) system of the background fields we may ignore correlations of interactions just before the splitting time tȳ with interactions just after tȳ, within one correlation length of the plasma. This sort of assumption is implicit in many aspects of this work and standard treatments of the LPM effect, and its justification is that the effects of the background over a single correlation length about t = tȳ should be parametrically small compared to the effects over the entire formation time t form ∝ E 1/2 . This is similar to the justification of footnote 5 for using an instantaneous potential V in H, and the same sort of argument is implicit for our treatment of all the splittings in this paper. to be (i) the C ij involving the two spectators to the splitting, and (ii) the C ij involving the two particles involved in the splitting. The same will be true at the other intermediate time tȳ. So, referring to fig. 15, the convenient 4-particle variables will turn out to be (C 41 , C 23 ) at t y but (C 34 , C 12 ) at tȳ. With these conventions, the xyȳx interference contribution to double splitting has the form where all intermediate transverse momenta are implicitly integrated over-i.e. there is integration over Bx, Bȳ, Cȳ 34 , Cȳ 12 , C y 41 , C y 23 , B y , and B x above. The superscripts are just labels for the different intermediate states (and are written as superscripts rather than subscripts for the sake of compactness of names like Cȳ 34 ). The initial N=2 (effectively N−2=0) state is represented by | and will be discussed later, along with the final state |. The specific particles involved in each δH splitting above should be inferred from the xyȳx diagram in fig. 15; we have written δH above to indicate splittings that are in the conjugate amplitude. For comparison, the single splitting result, corresponding to fig. 10 and its conjugate, has the form in this notation. We shall discuss the δH matrix elements in sections IV D and IV E. They have the form for parent i to split into daughters j and k. Here the two-dimensional vector T i→jk has the form where T color is the color factor associated with the vertex [e.g. (T a A ) bc = −if abc for gluon splitting g → gg with color indices a → bc]. P is related to square roots of spin-dependent DGLAP splitting functions and depends on the x i and the helicities of the parent and daughter particles in i → jk (which also determine the helicity of the vector P). More details will be given later.
The corresponding matrix elements for δH linking the 3-particle to 4-particle sector will be reducible to the previous case (4.13). For example, if the daughters are particles 2 and 3, we will find that Diagrammatic rules which summarize all the splitting matrix elements needed are given in fig. 16. Let's now apply these rules to the xyȳx interference depicted in fig. 15 for the case where x and y are bremsstrahlung gluons. In the high-energy limit, the interactions of high-energy particles with soft background fields preserve the helicities of the high-energy particles, and so helicities of individual high-energy particles are preserved by the propagators in fig. 15. The color charges of individual particles are not preserved, and so propagators like Cȳ 34 , Cȳ 12 , tȳ|C y 41 , C y 23 , t y in (4.10) generically depend on color. As we'll discuss, the large-N c limit that we will take in this paper simplifies the color dynamics so that we may ignore this complication. Applying the rules of fig. 16 to fig. 15, naively contracting color indices, and using the δ-functions to perform as many integrals as possible, then yields (see Appendix A for more detail) Cȳ 34 , Cȳ 12 , tȳ|C y 41 , C y 23 , t y where d R is the dimension of color representation R. For the case considered in this paper of an initial gluon, the color factor is When discussing the single splitting rate in section II, we were able to determine the dot products T i · T j of color charges in the effective potential (2.18) in terms of fixed Casimirs and may refer, in different contexts, to ± the 3-particle B, or one of the 4-particle C uv , or to some mixture. However, note that B ji = B kj = B ik in the top rule, which can be used to always write expressions in terms of 3-particle B and/or 4-particle C ij 's. The blue arrows on the particle line indicates color flow of color representation R. (In the case of R=A, appropriate to g → gg splitting, the direction of the color flow does not matter.) b l , a l , x l , and h l indicate the transverse position, color index, longitudinal momentum, and helicity of each particle. The black arrows give the convention for the flow of x l and h l in the statement of the rule, and these values should be negated if they are instead defined by flow in the opposite direction. In the bottom rule, color and helicity indices and their contractions are not explicitly shown for the spectators because they are trivially contracted. Conservation of longitudinal momentum means x i + x j + x k = 0 (top) and additionally x m = x r and x n = x s (bottom).
C i , and so we avoided having to deal with color dynamics. This simplification is no longer possible, in general, when analyzing the N=4 particle evolution that appears in doublescattering: In this work, we sidestep this complication by focusing on the large-N c limit.
Because of the kinematics of our problem (high energy particles interacting with the plasma via momentum transfers that are soft compared to E), it has been convenient in our analysis to time-order our interference diagrams, as in fig. 6. To discuss the large-N c limit, however, we would also like to draw our diagrams as planar diagrams. In the case that all particles are gluons, the diagrams of fig. 6 can be drawn as time-ordered planar diagrams by drawing them on a cylinder (since a cylinder can be mapped into a plane), as shown in fig. 17a for the xyȳx interference. That is, the general N-particle potential in H among our adjoint-charge particles reduces to where V F is the large-N c potential (in the language of H) between a fundamental charge and anti-fundamental charge particle, and b 1 , b 2 , ... b N label the high-energy particles cyclically as one goes around the cylinder (e.g. from bottom to top in fig. 17a). In the harmonic oscillator approximation, this becomes which (in the large-N c limit) is the same as Note that for N=3 this agrees with the all-gluon case of the more general formula (2.21), which did not depend on large N c .
D. Reduction of states from N to N −2 particles We now turn to an explicit construction of the N−2 particle states such as |C 34 , C 12 appearing earlier. The utility of an explicit construction is that it will allow us to find the correct normalization factors associated with δH matrix elements, like the |x m + x n | in the bottom-right of fig. 16.
For i x i = 0, we may project any b-space state onto the subspace with i x i b i = 0 and i p i = 0 by (i) imposing i x i b i = 0 in b-space, and (ii) averaging over all b-space translations in order to project onto i p i = 0. Treating all N particles symmetrically, this procedure defines N is an x i -independent normalization factor described in appendix B. In the appendix, we show that this particle-symmetric definition of the projected states normalizes those states as where we have used our convention of referring to the 4-particle variables as C ij andx i . One may use any permutation of indices on the right-hand sides above. Note that (x 3 +x 4 ) 2 = (x 1 +x 2 ) 2 in the N=4 case and x 2 1 = x 2 2 in the N=2 case. The notation |C 34 , C 12 that we used earlier is related to the above symmetrically-defined states by so that the N=4 states are more conveniently normalized as The N=3 state |B used earlier is simply The normalization (4.22c) of | may seem a bit strange, but | is what we want for the final state. At the end, after the emission of both the x and y bremsstrahlung gluons, we want to sum over all possible final states of the emitter, which in our N-particle notation is By a shift of integration variable, this is N −1 | with | defined by (4.21) as noting that can be made about the initial state if we first use the symmetry of the problem to average over small changes in the directions of the initial momentum p i as in (3.7-3.9). Details of how the normalization factor N cancels in the final result are given in appendix B.
In the next sub-section, we will discuss in detail the form of the δH matrix elements for individual particles. For splitting i → jk in the amplitude, this has the form 22 We then need to project this result from N particle language to N−2 particle language. In appendix B, we show that (4.29) implies, in our constrained subspace, the results (4.13) and (4.15) quoted previously for the matrix elements B|δH| and C 41 , C 23 |δH|B .

E. Splitting matrix elements
The matrix elements for nearly-collinear bremsstrahlung (or pair creation) is textbook material, used, for instance, in discussions of DGLAP evolution. In our case, we will need individual helicity-dependent matrix elements (e.g from ref. [24]) and not just helicity-averaged matrix elements. We also need a few conversions compared to the standard literature. Our states |p have been normalized with non-relativistic normalization we have cast our problem into the language of nonrelativistic quantum mechanics. Results in the literature are instead for states with relativistic normalization 23 and so the matrix elements differ by factors of √ 2E i = 2|x i |E. Secondly, results in the literature are typically specialized to a choice of z axis that is exactly aligned with the direction of the initial particle: p i = 0 for p i → p j p k . We will want to generalize to slightly different directions of the z axis and so will express results in terms of the invariant P jk ≡ x k p j − x j p k instead of, say, p k . Finally, results in the literature are generally expressed with the convention that E is the energy of the parent particle. In our problem, the parent of the second splitting is a daughter of the first splitting, and so we will write parent energies more generally as |x i |E instead of E. Doing this will also make manifest in our formulas the symmetry between the three particles (parent and daughters) participating in a single splitting process.
We are going to just present here the formulas for the matrix elements. Readers interested in details about the connection to standard formulas in the literature should see appendix C. The matrix elements are given by (4.29) with 24 Note the that P jk = P ki = P ij if we were to pick the convention that all x l flow away from the vertex, so that x i is negative, and were to similarly negate p i . 23 In this paper, we have generally swept the longitudinal momentum conservation factors 2π δ(p iz − p ′ iz ) under the rug. See appendix A on (4.29-4.31) for a more detailed discussion. 24 See appendix A for a note on dimensional analysis. and where the P i→jk are helicity-dependent DGLAP splitting functions and e (±) ≡ e x ± ie y (4.32) are ± helicity basis vectors in the transverse plane with sign determined by the conservation of the z-component J z of angular momentum in the nearly-collinear limit. This is easiest to state if we adopt the convention that each helicity flows away from the vertex. So, instead of thinking of the helicities in a splitting as, for example, + → ++, we will think of them as (h i , h j , h k ) = (−, +, +). With this convention, (4.31) is, more precisely, For g → gg with x i → x j x k , the DGLAP splitting functions (for |x j |, |x k | > 0) are The remaining cases are obtained by negating all helicities and the subscript of e (±) . Note that the P's do not depend on the signs of the x l . Note also that in the soft bremsstrahlung limit |x k | ≪ |x i |, helicity is conserved among the harder particles (i.e. + → ++ and + → +− are unsuppressed but + → −+ is suppressed) and |P +→++ | ≃ |P +→+− |. These simplifications account for the fact that a careful treatment of intermediate helicities was not generally necessary in earlier work [6][7][8] focused on the soft bremsstrahlung limit. (See Appendix F 2 for a caveat.)

F. Helicity sums
It is now time to do the helicity sum in our expression (4.16) for the xyȳx interference. The only things that depend on helicities are the P factors. These factors will be the same for all of the diagrams in the first row of figs. 5 and 6, and in this case it will be more convenient to write in terms of x and y explicitly rather than in terms of the 4-particlex i 's. Specifically, we need hx,hy,hz,h,h where P n are the Cartesian components of P. If one prefers, this can be written equivalently as hx,hy,hz h which (reading right to left) shows more clearly how the successive splittings tie together in both the amplitude and conjugate amplitude. Because of parity/reflection symmetry in the transverse plane, the rate (4.16) is independent of the initial helicity h i , and so we may replace (4.37) by its helicity average over h i . Then, because of rotation invariance of the transverse plane, the above helicity sum (averaged over h i ) must have the form α(x, y) δn n δm m + β(x, y) δnmδ nm + γ(x, y) δn m δ nm (4.38) for some functions α, β, and γ. Using (4.35) in (4.37), we find  . (4.40) Note that α, β, and γ are all symmetric under x ↔ y. This is as far as we can go without either doing numerics or making additional approximations, to which we now turn.

V. xyȳx IN HARMONIC AND THICK-MEDIA APPROXIMATIONS
We can make further analytic progress by (i) restricting attention to the harmonic approximation, and (ii) assuming that the medium is sufficiently homogeneous thatq does not vary significantly over distances of order the formation length. In this section, we will continue to focus for now on the xyȳx interference. We will tackle the N=3 particle ends of the evolution in fig. 15 and integrate over the first and last splitting times, t x and tx. Then we tackle the N=4 particle piece of the evolution by solving the harmonic oscillator problem for the propagator in the large-N c limit, and then we do the B integrals remaining in (4.40). This will leave the result in the form of a single 1-dimensional integral over ∆t ≡ tȳ − t y .
A. Integrating over first and last times t x and tx During the 3-particle phases of the evolution of the xyȳx interference of fig. 15, our H is given by (2.33) and (2.27). For the initial 3-particle evolution corresponding to t x < t < t y , we'll write this as and (when all particles are gluons) For the final 3-particle evolution corresponding to tȳ < t < tx, and Throughout this paper, one should understand √ ±i to refer to the root e ±iπ/4 .) The propagator (2.38) for a harmonic oscillator is The factor corresponding to the initial 3-particle evolution in (4.40) is of the form Integrating over the initial time t ′ < t above gives The term with t ′ = t has a divergent exponent, corresponding to infinitely oscillatory behavior in B. Naively, one might expect to be able to drop it, since an infinitely oscillatory function vanishes if later integrated against a smooth function. We will drop it here but will later have to return and be more careful about what happens when times become coincident.
[In particular, if a third time in the double splitting problem becomes coincident with t and t ′ above, then (5.7) is not integrated against a smooth function of B.] In our problem, Ω has a negative imaginary part, so that cot(Ω∞) = i, and so the t ′ = −∞ term in (5.7) leaves us with For the sake of evaluating other interference diagrams in the future, it will be useful to generalize to include cases where Ω may have a positive imaginary part. See appendix A for an argument that the more general result may be cast in the convenient form A similar analysis, relevant to the final 3-particle evolution, gives (with the same caveats) Using (5.9) on the expression (4.40) for the rate gives Cȳ 34 , Cȳ 12 , ∆t|C y 41 , C y 23 , 0 where ∆t ≡ tȳ − t y , and we have used time translation invariance (in our thick-media approximation) to convert the expression for the probability I into an expression for the rate Γ. B. 4-particle normal modes and frequencies in large N c We now turn to the N=4 particle propagator Cȳ 34 , Cȳ 12 , ∆t|C y 41 , C y 23 , 0 .

Reduced Lagrangian
Consider the 4-particle Lagrangian (5.13b) We could now solve this 2-particle problem. However, it will be convenient to first change to one of the sets of variables that we actually use in our rate expression (5.10). We will trade the b 14 and b 34 used above for C 12 ≡ b 12 /(x 1 + x 2 ) and C 34 ≡ b 34 /(x 3 + x 4 ). Using 25 For readers wondering how we could get from the 4-particle description to the 2-particle description without ever explicitly imposing the condition p 1 +p 2 +p 3 +p 4 = 0 for our subspace: Whenever we can consistently project to i m i b i = 0 for all times (as demonstrated by the earlier arguments of section III A), then i p i = 0 follows automatically from (3.5).
Then (5.13) becomes A nice feature of these variables is that the kinetic term is diagonal, unlike in (5.13). Note that, for the x i of (4.7) relevant to the xyȳx interference, M above is positive definite. In harmonic approximation, (5.15) will reduce to the form with K depending on specific details that will be given in a moment. The squares Ω 2 j of the normal mode frequencies will be given by the eigenvalues of M −1/2 K M −1/2 . The corresponding normal modes (C j 34 , C j 12 ) will be orthogonal with respect to M, and we will normalize them as We will now turn to specifics by deriving normal mode results in the large-N c approximation when all particles are gluons. But we will then package our results so that we can continue to simplify our rate expression (5.10) in a way that is independent of those details, so that our results can easily be adapted at a future date to other situations.

Specifics for large-N c gluons
Combining the large-N c harmonic potential of (4.20) for gluons with (5.14) gives . (5.20) in (5.17). Note that, since this is a harmonic oscillator problem, the two transverse dimensions decouple, and so this may be treated as a problem in one transverse dimension rather than two. The result for the normal mode frequencies Ω ± should have the same symmetry with respect to exchanging particle labels as (5.19) has, and (mostly for aesthetic reasons) we use i x i = 0 to algebraically manipulate the result into a form where this is manifest. We find A more compact (but slightly less manifestly symmetric) expression is With normalization (5.18), the normal modes (C + 34 , C + 12 ) and (C − 34 , C − 12 ) are 26 For the x i of (4.7) relevant to the xyȳx interference, the x 1 x 3 /2N ± E under the square root in (5.24) is positive.

General form
Whatever the specific formulas for the normal modes are in a given problem of interest, we can write any (C 34 , C 12 ) as a superposition of normal modes with superposition coefficients A + and A − . We will write this in matrix form as The right-hand sides of (5.24) are in fact symmetric under 12, 34 → 21, 43 (which takes N ± → −N ∓ ), but we have not found an appealing way (i.e. other than brute force symmetrization) to write them that makes this symmetry manifest. [Also, one might think that 12, 34 → 21, 43 should take (C ± 34 , C ± 12 ) → (C ± 43 , C ± 21 ) = (−C ± 34 , −C ± 12 ) instead of to (+C ± 34 , +C ± 12 ), but the overall sign of a normal mode is an arbitrary normalization choice, reflected by the ambiguity of the square root in (5.24).] (5.28) We use the subscriptȳ because (C 34 , C 12 ) are the variables we want to use at tȳ in (5.10). Because of the normalization (5.18) of our modes, the Lagrangian (5.17) in terms of A ± is simply The corresponding propagator is At t = t y in (5.10), the variables we want to use are (C 41 , C 23 ) instead of (C 34 , C 12 ). However, it is easy to convert: (5.14) and and so The changes of variables (5.27) and (5.32) give 27 Cȳ 34 , Cȳ 12 , ∆t|C y 41 , C y 23 , 0 = Aȳ + , Aȳ − , ∆t|A y + , A y − , 0 | det aȳ|| det a y | . (which we will find useful later) and where We have used (4.7) to rewrite |x 1 x 2 x 3 x 4 | above as −x 1 x 2 x 3 x 4 for reasons that are described in appendix E 1 b, having to do with relating the result for the xyȳx interference to other interference contributions.

C. Doing the B integrals
We will now assemble our results and then carry out all remaining integrals except for the ∆t integral.
We want to use the propagator (5.39) in the expression (5.10) for the rate. It is convenient to first combine all of the exponential factors (before taking derivatives and setting various C ij to zero) and rewrite them as y Ω csc(Ω ∆t) a −1 y . With this notation, the rate (5.10) becomes The B integrals may be performed using , (5.44c) , (5.44d) × (βY y Yȳ + αY yȳ Y yȳ )I 0 + (α + β + 2γ)Z yȳ I 1 The ∆t integration in the xyȳx result (5.45) has both a linear and a log UV divergence associated with ∆t → 0. Specifically, if we expand the integrand in powers of ∆t (see appendix D 1 for a little more detail), we find something of the form We have not bothered to explicitly show the coefficient "stuff" of 1/(∆t) 2 because that divergence is easy to dispense with. Here is one argument that is easy to make now. The coefficient "stuff" turns out not to depend on any of the frequencies Ω and so does not depend onq. It is therefore the contribution to ( instead of [dΓ/dx dy] xyȳx . This simply subtracts out the 1/(∆t) 2 piece of (5.46). (We will give an alternative argument later that the 1/(∆t) 2 divergences cancel without relying on a priori knowledge that the total vacuum contribution must vanish.) In contrast, the 1/∆t divergences in (5.46) do depend on the medium, but in a very specific way: they are the sum of terms which depend on the medium (i.e. depend onq) either (i) only via Ω i or (ii) only via Ω f . In the first case, regarding the Ω i /∆t terms, this means that only the contribution from the evolution before t y in fig. 15 depends on the medium, and so (regarding the same terms) the 3-and 4-particle evolution after t y is equivalent to vacuum evolution. One expects the medium to be irrelevant over very short times, and indeed we show explicitly in appendix D 2 that the Ω i /∆t divergence arises specifically from the limit where t y , tȳ, and tx approach each other simultaneously in fig. 15. Similarly, the Ω f /∆t divergence arises when t x , t y , and tȳ approach each other simultaneously. These two situations are depicted in fig. 18 for future reference. 29 The vacuum rate is zero because we are analyzing the thick medium limit, which can be thought of as an infinite medium. If we had instead restricted splitting time integrals in (2.1) and (4.10) to start at some initial time 0, then there would be vacuum radiation associated with the initial appearance of the particle, related to the vacuum radiation associated with the hard process that created our initial particle (except lacking an ultraviolet cut-off). The usual analysis technique in this case would be to subtract off the vacuum result in order to isolate the part of dI/dx (or in our case dI/dx dy) due to medium effects.
18: A depiction of the (i) Ω i /∆t and (ii) Ω f /∆t divergence of the xyȳx interference as arising from the simultaneous approach of (i) t y , tȳ, tx versus (ii) t x , t y , tȳ. These divergences correspond to the short-time approximation of vacuum evolution in the shaded regions.
We will later see that the 1/∆t divergences cancel among different interference contributions, and so we will be able to carry out the ∆t integral numerically if we first combine results for different interference contributions before integrating. However, as will be discussed later, we will nonetheless have to be careful to account for an additional contribution associated with the pole at ∆t = 0.

VI. OTHER CROSSED INTERFERENCE CONTRIBUTIONS
We now turn to relating the other crossed interference diagrams in fig. 6 to the result just derived for the xyȳx interference. Details are given in appendix E, and here we will instead loosely motivate and then describe the results.

A. xȳyx interference
The second diagram of figs. 5 and 6 describes what we call the xȳyx interference contribution, which we depict with time labels in fig. 19. One difference with the xyȳx case analyzed previously is that the x i during the period of 4-particle evolution are different. In contrast to (4.7), we have numbered (for the sake of large-N c ) in the cyclic order one would get if drawing the diagram on a cylinder, analogous to the xyȳx case of fig. 17. Note that there are now two negative x i because two of the high-energy particles are in the conjugate amplitude during the 4-particle evolution (tȳ < t < t y ). We have chosen to indicate the 4-particle x i for xȳyx using primes, as in (6.1), and we will reserve the hats for the xyȳx case, as in (4.7). The two are related by The numbers depict our labeling of the 4-particle x i in (6.1).
In this notation, one of the changes in going from xyȳx to xȳyx can then be summarized aŝ x i → x ′ i . Another difference between xȳyx and xyȳx can be seen by considering the splittings that occur at the start (tȳ) and end (t y ) of the 4-particle evolution. At t=tȳ in fig. 19, particles 3 and 4 are spectators to the splitting, and so the natural variables for describing the system (following our discussion in section IV B) are (Cȳ 34 , Cȳ 12 ). Similarly, the natural variables at the end of the 4-particle evolution, t=t y , are (C y 41 , C y 23 ). Looking only at the variable names, this seems like the same choices as for the xyȳx analysis in (4.10). The difference is that (Cȳ 34 , Cȳ 12 ) are the variables at the start of the 4-particle evolution of xȳyx but the end of the 4-particle evolution of xyȳx. As a result, (Cȳ 34 , Cȳ 12 ) will be tied by the splitting matrix element to the initial 3-particle evolution (t x < t < tȳ) for xȳyx but was tied to the final 3-particle evolution (tȳ < t < tx) for xyȳx. The treatment of the initial and final 3-particle evolution was fairly symmetric in section V, and the effect of the 3-particle evolutions in the final result (5.45) appeared only through the parameters (M i , Ω i ) and (M f , Ω f ). So one might expect that one can account for this difference between xyȳx and xȳyx simply by interchanging those parameters: (M i , Ω i ) ↔ (M f , Ω f ). In fact, this change is automatic underx i → x ′ i if one chooses to write 3-particle M and Ω in terms of the 4-particle x i , as in the middle expressions in (5.2) and (5.4): and similarly M f → M i and Ω i ↔ Ω f . However, there is one additional element to the relation between xyȳx and xȳyx, which are the contractions of the helicity amplitudes associated with each splitting matrix to make the combination (4.38), α(x, y) δn n δm m + β(x, y) δnmδ nm + γ(x, y) δn m δ nm , (6.4) discussed for xyȳx back in section IV F, where the indices (m, n,m,n) were associated with the splitting vertices as in fig. 20a. Holding x and y fixed in α, β, and γ, the rules that we have discussed so far for relating xyȳx to xȳyx, which swap the initial and final 3particle evolutions, would correspond to (6.4) contracted with vertices for xȳyx as in fig.  20b. However, as discussed earlier and exemplified by (4.37), the helicity sums should be the same for all of the diagrams in the first line of figs. 5 and 6. Reading from (4.37), the vertices should be labeled as in fig. 20c instead of fig. 20b. That is, we need to swap m ↔n. From (6.4), this is equivalent to instead swapping α ↔ β. Readers who find the above arguments a little too impressionistic should refer to appendix E. In particular, the discussion above completely sweeps under the rug a subtlety concerning the overall sign of the xȳyx contribution, which is related to why we presciently replaced |x 1 x 2 x 3 x 4 | by −x 1 x 2 x 3 x 4 back in (5.39). (Though these are the same for xyȳx, they differ by a sign for xȳyx.) In summary, the steps necessary to change the result (5.45) for xyȳx to a result for xȳyx are i (to include the formulas for Ω ± , a y , and aȳ); The interpretation of the 4-particle evolution time ∆t in the resulting formula is now ∆t = t y − tȳ rather than ∆t = tȳ − t y , so that ∆t is still positive.
Applying these rules on the 1/∆t divergence of (5.46), the corresponding divergence for xȳyx is the ∆t integral of .
(6.5) Using (6.2), this can be rewritten as (6.6) Note that the (α+β) term will cancel when (5.46) and (6.6) are added together. To see the complete cancellation of all 1/∆t divergences, we must go on to analyze other interference contributions.

B. xȳxy interference
The xȳxy interference is our name for the third diagram in figs. 5 and 6. We will label the particles during the 4-particle evolution as in fig. 21 and designate the 4-particle x i with tildes: This is just a permutation of the x ′ i in (6.1), It is a convenient permutation of the labels because, as seen from comparing figs. 19 and 21, it preserves the labels of which particles are involved in the splitting at the beginning and end of the 4-particle evolution: particles 3 and 4 are the spectators at the beginning (t=tȳ), and 1 and 3 are the spectators at the end (t=t y for xȳyx and t=tx for xȳxy). So the structure of the two diagrams is the same, and the first rule of converting the results for xȳyx is simply x ′ i →x i , or equivalently 1 ↔ 2 and 3 ↔ 4. Note that if we also apply this rule to the formulas for M i and M f in terms of x ′ i , it gives the appropriate initial and final 3-particle masses M associated with fig. 21: Here,M is different from our previous M's because the particles in the final 3-particle evolution of xȳxy are different than those for xyȳx and xȳyx. Unlike M i and M f , thisM f is negative. Similarly, the transformation gives us the appropriatẽ Unlike the Ω i and Ω f discussed previously,Ω f is proportional to √ +i rather than √ −i and so has a positive rather than negative imaginary part. This is the reason that we earlier made the (at that time unnecessary) generalization from (5.8) to (5.9) in our discussion of integrating out the first and last splitting times.
We can now express the transformation directly from xyȳx to xȳxy in one step by rewriting (6.8) in terms of thex i : Now remember that, by itself, the transformation fromx i to x ′ i gave us fig. 20b instead of what we needed for xȳyx, fig. 20c. If we now make the further transformation 12 ↔ 34 that takes us from x ′ i tox i , fig. 20b will become fig. 22b for xȳxy. 30 But what we need, as read from (4.37), is fig. 22c. This requires mnn → nnm, which, from (6.4), is equivalent to αβγ → γαβ.
In summary, the steps necessary to change the result (5.45) for xyȳx to a result for xȳxy are •x i →x i (to include the formulas for Ω ± , a y , and aȳ);
The interpretation of the 4-particle evolution time ∆t in the resulting formula is now ∆t = tx − tȳ > 0. Applying these rules on the 1/∆t divergence of (5.46), the corresponding divergence for xȳyx is the ∆t integral of .
(6.13) Using (6.12), this can be rewritten as Note that all of the Ω i terms now cancel between (5.46), (6.6), and (6.14)-that is, between the three diagrams shown in the first line of figs. 5 and 6.

VII. 1/∆t DIVERGENCES AND iǫ PRESCRIPTIONS
We have now seen a subset of 1/∆t divergences cancel in the first line of fig. 6. To get the rest of the cancellations just requires a slightly bigger subset of diagrams: All of the 30 Again, see Appendix E is this be less than clear. 1/∆t divergences cancel between the six diagrams shown in fig. 23. The second line of fig.  23 is just the conjugate of the first line permuted by x ↔ y. One may check that adding together the 1/∆t terms of (5.46), (6.6), (6.14) and their conjugates with x ↔ y gives zero. The complete set of crossed diagrams depicted by fig. 6 can be written as the sum of fig.  23 and its conjugate, plus the remaining possible permutations of the three final daughters (x, y, 1−x−y). So 1/∆t divergences will cancel for the complete set of crossed diagrams as well. That means that, if we combine results for the crossed diagrams before doing the ∆t integral, and we subtract the corresponding vacuum results as in (5.47) and (5.48), the ∆t integral will be finite and can be performed numerically. Before we summarize formulas for the final result, however, we need to take care of a subtlety with the contribution from ∆t=0 to which we have previously alluded. And before we do that, it will be enlightening to first review some technical details of the calculation of single (as opposed to double) splitting.

A. iǫ prescription in single splitting
Return to the formula (2.37) for the single splitting rate. Using the propagator (2.38), this formula becomes where ∆t ≡ tx − t x . The integrand blows up like 1/(∆t) 2 as ∆t → 0. A simple way to avoid this problem altogether is to subtract out the vanishing rate of splitting in vacuum, analogous to (5.47): This integral is convergent and gives the result Re(iΩ 0 ), (7.3) which in turn gives (1.5a). However, it will be useful to see how to get the result without making the vacuum subtraction. We'll see how to deal with the 1/(∆t) 2 singularity in (7.1) directly, but the argument will perhaps look more familiar if we first transform the ∆t integral from 0 to ∞ to an integral from −∞ to +∞.

An integral from −∞ to +∞
For the sake of interpretation of ∆t, it will be useful to consider how we would have arrived at (7.1) if we had computed the xx interference diagram of fig. 10 and its conjugatē xx separately, as depicted in fig. 24. The 3-particle Ω given by (2.33b), gives (1.5b), The results for these two diagrams, after following the same steps as in section II, are Adding them together reproduces (7.1), as it should. Now let's take ∆t → −∆t in (7.6b) to write the sum of (7.6) as where now ∆t ≡ tx − t x in both cases. If the integrand in (7.7) were an analytic function of ∆t, we could now say that the problem of the singularity at ∆t = 0 is just a problem of understanding how one should integrate around the pole. The integrand is not analytic because of the various high-energy approximations that have been made, but the divergent piece at ∆t = 0 is analytic. We can rewrite (7.7) as the sum of a convergent integral [which is equivalent to (7. 2)] and a potentially divergent integral It doesn't really matter how we integrate around the double pole in (7.9) because we will get zero however we do it. However, getting straight the prescription will help deal with the 1/∆t divergences that arise in the double splitting case. So we now discuss the iǫ prescription that should be used for ∆t.

The iǫ prescription
Consider the δH matrix elements that appeared in the formula (2.2) for single splitting, Using completeness relations, (7.10) can be rewritten as This is a Wightman correlator: the δH for the splitting at tx is always ordered to the left of the δH for the splitting at t x , regardless of the time order of tx and t x . The iǫ prescription for the small-∆t divergence of a Wightman correlator is See, for example, the argument in Wilson [25] with regard to the singularity behavior of A(x) B(y) in the operator product expansion. (We provide our own summary of the argument in appendix A.) In our context, we can summarize this rule as follows: Regard times in the conjugate amplitude as displaced by −iǫ with respect to times in the amplitude.

Other ways to use the prescription
In our analysis of double splitting, we have always defined our ∆t to be positive. In single splitting, the analog is to use the 0 < ∆t < ∞ integrals of (7.1) or (7.6) instead of the −∞ < ∆t < ∞ integral of (7.7). It is useful to notice that the iǫ prescription works just as well in this form. In the case of (7.1), ∆t ≡ tx − t x . Since t x is in the amplitude and tx in the conjugate amplitude, (7.1) becomes The divergent piece is proportional to Equivalently (and most like what we will do for double splitting) we can get the same result from (7.6) by realizing that ∆t ≡ tx − t x → ∆t − iǫ in the xx piece (7.6a) but ∆t ≡ t x − tx → ∆t + iǫ in thexx piece (7.6b), so that the divergent piece of the sum is proportional to

B. Consequences for double splitting
Just like in the single splitting case, the vacuum 1/(∆t) 2 divergences in double splitting will be canceled by adding diagrams to their conjugates, or can be discarded by simply subtracting the (vanishing) vacuum result. However, the sub-leading 1/∆t divergences give something non-trivial.

A warm-up example
As a simple, warm-up example from calculus, imagine the integral for some complex frequency Ω and some bound R. We may replace ∆t → −∆t for ∆t < 0 to rewrite the integral as which is a 1/∆t analogy to the 1/(∆t) 2 integrals we had in (7.15). Now combine the two integrands in (7.17). If we ignored the important iǫ prescriptions in (7.17), we would naively conclude that the 1/∆t terms cancel each other, leading to the incorrect answer I = 0. One can equivalently write (7.17) as −Ω ∆t , (7.18) where the contours C 1 and C 2 are shown in fig. 25. The integrals indeed cancel except for the contributions from the tiny region around ∆t = 0, where the contours differ. The integrals around each quarter-circle give ±iπ/2 times the residue at the origin, which brings (7.17) to the same result I = iπΩ as (7.16).
A quick mnemonic for all of this is to use the standard relation 1 ∆t ∓ iǫ = P. P. 1 ∆t ± iπ δ(∆t), (7.19a) where "P. P." indicates the principal part prescription, and with the understanding that integrates over only half of the δ function. The principal part pieces then cancel between the two terms in (7.17), and the δ function pieces account for the answer I = iπΩ. (In this context, the principal part prescription does not really do anything since we are not integrating over negative ∆t.)

Double splitting
Let us now apply (7.19) to the 1/∆t divergences found previously for double splitting. For the xyȳx result, this divergence was given in (5.46), where ∆t ≡ tȳ − t y . Since times in the conjugate amplitude should get a −iǫ, we take ∆t → ∆t − iǫ in (5.46). After then applying (7.19), we will have a 1/∆t piece and a +iπδ(∆t) piece. The 1/∆t pieces will cancel between diagrams, as already noted in the discussion of fig. 23. The +iπδ(∆t) piece will give an additional contribution to the xyȳx interference that we need to retain, given by . (7.20) For xȳyx, we have a similar situation, but ∆t in (6.6) is ∆t ≡ t y − tȳ → ∆t + iǫ, and so the iπδ(∆t) in (7.19) has the opposite sign. As a result, the corresponding contribution arising from (6.6) is The xȳxy contribution of fig. 21 is a little more subtle because there ∆t ≡ tx − tȳ does not pick up a ±iǫ prescription from what we have discussed so far. Here, we will indicate how one might guess the answer, and we leave a more precise argument to appendix D 3. Analogous to the earlier discussion of fig. 18 for xyȳx, the Ω i /∆t andΩ f /∆t divergences of (6.14) arise from the situations depicted in fig. 26. Looking at the figure for theΩ f /∆t divergence, remember that, in the derivations of our expressions (section V A in particular), we have already integrated the first time t x all the way up to the second time (in this case tȳ). And so one might guess that the tx − tȳ will inherit the iǫ prescription of tx − t x , which is tx − t x − iǫ. Our guess is that ∆t ≡ tx − tȳ → ∆t − iǫ will give the correct result for thẽ Ω f /∆t divergence.
In contrast, for the Ω i /∆t divergence in fig. 26, t x is not close to tȳ, but instead t y is close to tx. One might guess that tx − tȳ then inherits the iǫ prescription of t y − tȳ, which is t y − tȳ + iǫ. So our guess is that ∆t → ∆t + iǫ will give the correct result for the Ω i /∆t divergence of xȳxy. These guesses are correct (see appendix D 3) and put together with (6.14) and (7.19) give The sum of the Ω i terms of (7.20-7.22) is (where we've finally used the fact that sgn M i = +1 since we will no longer be considering variable substitutions that will change the sign). If we add (7.20-7.22) to their conjugates with x ↔ y, as in the subset of diagrams of fig. 23, then we get (where we have now used the fact that sgnM f = −1). So, even though the 1/(∆t) terms naively cancel between the diagrams of fig. 23, the pole contributions arising from the ±iǫ's do not.
Recalling that the (α, β, γ) of (4.39) are symmetric under x ↔ y, (7.24) can be written more explicitly as where where the three terms correspond to xyȳx, xȳyx, and xȳxy respectively, and α(x, y), β(x, y), and γ(x, y) are given by (4.39). C is defined to have the vacuum result subtracted, so write D, defined as the integrand for xyȳx, is given by (5.45) as with the I n defined by (5.44); the (X, Y, Z)'s defined by (5.42); the 4-particle normal modes and frequencies used in their definition given by (5.21-5.25), (5.28), (5.33), and (5.40); and the 3-particle M's and Ω's in these formulas defined specifically in terms of the 4-particle x i (the arguments of D above) as and , The pole contribution in (8.2) is given by (7.25). Theq → 0 limit for the vacuum piece in (8.4) corresponds to taking all Ω's to zero. The vacuum piece in (8.4) could be worked out algebraically, but it is simpler to just use the same numerical code as for (8.5) but with the replacements Ω i → 0, Ω f → 0, Ω cot(Ω ∆t) → (∆t) −1 , Ω csc(Ω ∆t) → (∆t) −1 (8.8) [and so also Ω ± csc(Ω ± ∆t) → (∆t) −1 in the prefactor of (8.5)]. It is possible to scale out the factors ofq A and E from the above expressions by replacing ∆t by the dimensionless variable ∆t ≡ (q A /E) 1/2 ∆t. For numerics, it is convenient to work in units whereq A =1 and E=1, which then gives the result for the rate dΓ/dx dy in units of (q A /E) 1/2 .

IX. BEHAVIOR OF RESULT
Though this paper calculates only a subset of the diagrams contributing to dΓ/dx dy, it is interesting to examine the behavior of the results in the soft bremsstrahlung limit.
A. Comparison to earlier results for y ≪ x ≪ 1 As mentioned before, refs. [6][7][8] have studied the effect of double splitting on energy loss when one of two soft bremsstrahlung gluons becomes extremely soft compared to the other (y ≪ x ≪ 1). They find that the double-splitting correction to energy loss is enhanced by a double logarithm that, at leading logarithmic order, can be absorbed into the single splitting result (1.5) for energy loss by a redefinition ofq. This energy-dependent correction δq toq is seemingly universal and was computed yet earlier by Liou, Mueller, and Wu [11], with result In the current context of a thick medium, the relevant distance scale L is the formation time associated with emission of the x gluon: [The fact that this is a parametric statement (∼) rather than an exact equality does not matter at the level of identifying the leading logarithmic result.] The l 0 above characterizes the small distance scale, characteristic of the medium, at which the multiple scattering (harmonic oscillator) approximation breaks down. A discussion of resumming the leading logarithmic terms at all orders may be found in refs. [6,8,11,12]. The results are very interesting because they can change the power law dependence of how energy loss depends on energy. For instance, the stopping distance of a high-energy parton due to repeated single-splitting processes scales with energy as E 1/2 , but Blaizot and Mehtar-Tani [6] have discussed how resumming the double logarithm above can modify the scaling to E These earlier results provide a useful check on our own results: Do they agree if we take the y ≪ x ≪ 1 limit? We will focus on a comparison with Blaizot and Mehtar-Tani [6] and Wu [8]. Comparison is slightly complicated by the fact that in this paper we have only calculated certain contributions to dΓ/dx dy but have not calculated virtual corrections like fig. 8 that are needed for the energy loss calculation. On the flip side, the processes studied by refs. [6,8] do not include all possible time orderings of the splittings (presumably because only certain orderings are relevant to the double logarithm). So, we will need to compare a subset of our diagrams to a subset of theirs. In our terminology, this subset is xyȳx + xȳyx + zyȳz + zȳyz (plus their conjugates). For reasons explained in Appendix F 3, the xyȳx + xȳyx contributions are difficult to disentangle from the virtual processes also included in the analysis of refs. [6,8], and so we shall content ourselves with checking the contribution from zyȳz + zȳyz, which are just the xyȳx and xȳyx diagrams analyzed earlier with the substitution x → z ≡ 1−x−y. An alternative depiction of zyȳz was given earlier in fig. 9. (In Wu [8], our zyȳz and zȳyz are respectively the second diagram of his (30) and the first diagram of his (32). 31 In Blaizot and Mehtar-Tani [6], they are called A 3 and C 1 .) The work of Wu [8] is closest in technique to our own. In appendix F, we briefly sketch the relationship between Wu's expressions and the y x ≪ 1 limit of our formal expressions.
Wu then goes on to extract the double log behavior (getting the same result as others) to which we now turn. The contributions from zyȳz + zȳyz that give rise to double log corrections turn out to be y x which is equivalent to the regions found to give the double logarithms in refs. [6,8] (see appendix F for details). When we expand our own final results (from section VIII) in this range, also taking y ≪ x ≪ 1, we find 2 Re dΓ dx dy zyȳz+zȳyz =   (2 + 0 + 0) 1 + ln (9.4) Here, we have distinguished which contributions arise from interaction of the y gluon with the medium, using notation that we will explain in a moment. Specifically, the evolution of the y gluon for zyȳz +zȳyz is only relevant during the 4-particle part of the evolution, during which non-zeroq only enters the calculation via non-zero values of Ω ± . In the y ≪ x ≪ 1 limit, the relevant values of Ω ± (5.21) approach for zyȳz and for zȳyz. The "(12 − 3 + 1)" in (9.4) means that the contributions that do not depend at all on Ω ± (and so correspond to no interactions of the y gauge boson with the medium) give "12" there; contributions that are proportional to |Ω x | 2 give −3; and contributions that are proportional to |Ω y | 2 give +1. Similarly for the "(2+0+0)." The 1/(∆t) 2 and 1/(∆t) pieces of (9.4) have nothing to do with the small-∆t divergences discussed back in sections V D and VII. Those divergences and their cancellations arise in the limit of arbitrarily small ∆t rather than the ∆t range (9.3) being considered here. (Specifically, the small-∆t cancellations in [zyȳz + zȳyz + zȳzy] + [z ↔ y] * require ∆t ≪ y/x yE/q.) The double log effects which are found in the energy loss calculations of refs. [6,8] arise from single interactions of the soft y gluons with the medium, and so, when comparing similar diagrams, correspond to just the pieces of (9.4). Now integrate over y and ∆t, and compare to the single splitting result (1.5) for dΓ/dx in the limit x ≪ 1, which is The result is that the double-splitting contribution considered above can be absorbed (up to higher order corrections) into the single splitting result (9.8) byq A →q A + δq A with (see Appendix A) [where now we show only the contributions arising from Ω x and Ω y in the notation "(−3+1)"]. Even after adding in xyȳx + xȳyx (and conjugate) contributions, this will not exactly match (9.1) because in this paper we have not included the virtual corrections relevant to energy loss. However, we have checked that the contribution to δq A from just the zyȳz + zȳyz (and conjugate) diagrams in refs. [6,8] would have led to the same result for the double log pieces, i.e.
with δq A /q A on the right-hand side representing the total δq A /q A ≃ (α s /2π) ln 2 (L/l 0 ) of (9.1).

B. Small y behavior of full result
Our calculation contains other diagrams and effects besides the ones considered in the double log works of [6,8]. Numerically taking the y ≪ x ≪ 1 limit of our full result for the crossed diagram contributions to dΓ/dx dy, we find that the result behaves as 32 dΓ dx dy crossed ∝ 1 x 3 y 3/2 , (9.11) as shown, for example, by the numerical results of fig. 27. In combination with the x −3 , the crossed contribution to dΓ/dx dy is very divergent in the infrared. If we cut off x and y at some lower value x min , (9.11) implies that the crossed contribution to the total rate Γ for double splitting into non-virtual daughters scales as x −5/2 min . In contrast, the total rate for single splitting diverges as a relatively mild x −1/2 min . However, we have only computed a subset of diagrams ( fig. 6), and infrared divergences in gauge theories are notorious for requiring a complete set of diagrams to mitigate. We will not be surprised if this divergence is softened in a more complete calculation, which remains to be done.
Our crossed interference diagrams do not sum to be the square of something and so, unlike total dΓ/dx dy, the result for [dΓ/dx dy] crossed need not be positive. As an example, the values shown in fig. 27 are all negative, though [dΓ/dx dy] crossed is positive in other cases such as (x, y) = (0.3, 0.6). 33 32 See Appendix A for a discussion of how the pole terms (7.25) make a partial but incomplete cancellation of the leading y −3/2 behavior of (9.11). 33 As a numerical check for anyone who ever desires to implement our formulas of section VIII, our value of

X. CONCLUSION
This paper is long. Congratulations if you read (or even skimmed) this far! A complete calculation of dΓ/dx dy outside of the x ≪ 1 and/or double log approximation will require a little further work. But the analysis in this paper of all of the crossed diagrams provides most of the techniques that will be necessary for the full analysis, and we look forward to filling in the rest.
The reason that we have saved un-crossed diagrams such as fig. 7 for later work is that their 1/∆t divergences do not cancel as simply as fig. 23. One must instead take great care in separating the overlapping formation time corrections for double bremsstrahlung from the naive calculation of two independent, consecutive single-bremsstrahlung processes. But that is a story for another day.
with the normalization of the photon state because we wanted to emphasize that k f = k x . If concerned about the details of all factors of V ⊥ , refer to the more general formula (2.34) instead. If you want to see how engineering dimensions match up, then also see (4.29-4.31) and the discussion of them in this appendix.] The factor of E/2π in (2.1) comes from rewriting the longitudinal phase space factor dk z /2π in as dk z /2π = (E/2π) dx and dividing both sides by dx. Evolution of the final state |p f k f after tx can be ignored by unitarity of time evolution (before taking the statistical average · · · ) and the completeness relation p f ,k f |p f k f p f k f | = 1 within the two-particle sector. See section IV A for a more detailed argument in a different but closely related context. (The same hidden assumption discussed in footnote 21 applies to the argument here.) A similar argument can be made for time evolution of the initial state before t x by invoking rotational invariance. Rotational invariance of the rate we are computing means that we can average the initial p i over small rotations, which in the context of our large-p z approximation is discussed in detail in appendix B 2, with the result that one may replace |p i p i | by something proportional to p i |p i p i |. The same unitarity and completeness arguments used for the final state for t > tx then imply that we also need not follow the evolution of the initial state for t < t x in (2.1) and (2.2).

Eq. (2.21):
Here is a more general argument for (2.21). In general, any translationinvariant harmonic approximation for 3 particles can be put into the form with some coefficients A ij . In the special case of b 1 = b 2 , this gives However, in this case we have color charge T 3 at b 3 and total color charge T 1 + T 2 = −T 3 at b 1 = b 2 . This is effectively a two-particle problem with separation b = b 3 − b 1 , and the small-b limit of that problem is precisely what is used in proposals to non-perturbatively define (using real-time Wilson loops [21]) the parameterq for color representation T 3 , which means that Combined with (A3), this gives a constraint on the values of the coefficients A ij . Permuting the particle labels on this argument gives three constraints on the three unknown coefficients A ij , which determines (A2) to be (2.21).
Eq. (2.22): In weak coupling,q defined by (2.22) is UV logarithmically divergent, ∝ α 2 s n d 2 q ⊥ /q 2 ⊥ where n is the density of partons in the medium. In the harmonic approximation,q then has a mild logarithmic dependence on b, and the harmonic approximation is effectively a leading-log approximation (to which sub-leading-log corrections may be systematically computed 34 ). However, if the coupling α s is replaced by running α s (q ⊥ ),q defined by (2.22) is convergent. This form ofq is only relevant if the total momentum transfer Q ⊥ during a formation time is large enough that the integral (2.22) is not significantly modified if restricted to q ⊥ Q ⊥ . This happens when α s (Q ⊥ ) ≪ α s (m D ), where m D is the inverse screening distance in the medium. For thick media, Q ⊥ ∼ (qE) 1/4 for hard bremsstrahlung, and note that our assumption throughout this paper is that α s (Q ⊥ ) can be treated as small. Finally,qL describes the average Q 2 ⊥ picked up from the medium over a distance L and characterizes a Gaussian probability distribution for Q ⊥ after a large number of collisions. However, power-law tails of the probability distribution for Q ⊥ , representing rare collisions, can dominate certain types of calculations. 35 Eq. (4.16): In this formula, we make use of i x i = 0 to occasionally rewrite, for example, −x 3 −x 4 asx 1 +x 2 . Other than that, the expression follows from the rules of fig. 16 together with the overall factor of (E/2π) 2 of (4.10), which comes from rewriting dk z /2π = (E/2π) dx and dκ z /2π = (E/2π) dy, similar to the single factor of E/2π in (2.1). The one thing to be careful about in applying the rules is the signs of the B ij . For instance, consider the rule associated with the vertex at t = t x in fig. 15. Using the top rule in fig. 16, the vertex is associated with a factor of ∇δ (2) (B ji ) according to the labeling of that rule, but we may just as well write ∇δ (2) (B ik ) using the identity B ji = B kj = B ik noted in the caption. Comparing fig. 16 to fig. 15, this ∇δ (2) (B ik ) translates to ∇δ (2) (B 14 ) in the index labeling used in the latter figure. In (4.16), however, we have used C y 41 at the start t = t y of the 4-particle evolution. As a result of this choice, it was then convenient to define our B for the initial 3-particle evolution t x < t < t y as B ≡ B 41 instead of B ≡ B 14 . So our factor of ∇δ (2) . In contrast, the B defined for the final 3-particle evolution tȳ < t < tx is B ≡ B 34 , which is the same B that comes from applying the top rule of fig. 16 to the splitting at t = tx. So the corresponding factor for that splitting is +∇δ (2) (Bx). In a similar vein, the rules of fig. 16 give factors of ∇δ (2) (Cȳ 21 ) = −∇δ (2) (Cȳ 12 ) at t = tȳ and ∇δ (2) (Cȳ 23 ) at t = t y . We then integrate all the δ-functions (integrating by parts as necessary) to get (4.16).
Eqs. (4.29-4.31): We offer here a quick note on the scaling dimensions of our formulas. Since we use non-relativistic normalization p|p ′ = (2π) 2 δ (2) (p − p ′ ), then |p i has dimensions mass −1 and p j , p k | has dimension mass −2 , and so one would expect p j , p k |δH|p i to have dimension mass −2 . The right-hand side of (4.29) has an implicit (2π) 2 δ (2) (p i − p j − p k ). Including this factor gives the right-hand side dimensions of mass −5/2 . The discrepancy has to do with the normalization of the states of the longitudinal momentum. Here are two ways to handle the normalization. Method 1: If we normalize states so that p, x|p ′ , x ′ = (2π) 2 δ (2) (p − p ′ ) δ x,x ′ , with a Kronecker δ for the x's, then there is really a factor of V −1/2 z on the right-hand side of (4.29), where V z is the (infinite) length of the z-direction. (This V z should not be confused with the finite physical width of the medium in cases where the medium is finite.) This fixes up the dimensions of (4.29). With this normalization convention for states of longitudinal momentum, the sum over the longitudinal momentum of the bremsstrahlung particle in single splitting, for example, becomes V z dp z /2π instead of dp z /2π, and so the single-splitting formula (2.1) should have included an overall factor of V z . This V z then cancels the two factors of V −1/2 z that we get from the two δH matrix elements. Similar cancellations occur in the double-splitting case. We have not bothered to include any of these canceling factors of V z in our formulas. Method 2: Alternatively, one could normalize states as p, x|p ′ , x ′ = (2π) 3 δ (2) (p − p ′ ) δ(p z − p ′ z ) and avoid ever introducing V z . This normalization would change the dimensional analysis to make the two sides of (4.29) agree dimensionally, without any additional factor of V −1/2 z , if we take there to be an implicit (2π) 3 δ (2) (p i −p j −p k ) δ(p iz −p jz −p kz ) on the right-hand side.
Eq. (5.9): If Ω has a negative imaginary part, we get cot(Ω∞) = i and (5.8). If Ω had a positive imaginary part, we would instead get cot(Ω∞) = −i and so (5.8) with exp(− 1 2 MΩB 2 ) replaced by exp(+ 1 2 MΩB 2 ). We could therefore write the exponential in the general case as exp 1 2 MΩB 2 sgn(Im Ω) . However, there is a correlation between the sign of M and the sign of Im Ω that makes for a more compact expression. First note that if the 3-particle effective Hamiltonian H of (2.33) makes any sense for our problem, then it should have a negative imaginary part, corresponding to exponential decay rather than exponential growth. This means it should have M Im Ω 2 0 < 0, and so, also using (2.33b), sgn Im Ω = sgn Im Ω 2 = − sgn M.
This allows us to rewrite our general-case exponential above in the form of (5.9a), as promised.
Readers may wonder whether the condition that M Im Ω 2 0 < 0 could ever be violated. Using x 1 + x 2 + x 3 = 0 for 3-particle evolution, (2.27), and (2.33b), Requiring the right-hand side to be negative for all allowed values of x i satisfying x 1 + x 2 + x 3 = 0 is equivalent to requiring that the largest value of √q i be smaller than the sum of the two others. 36 So, ifq 3 is the largest, the condition would be In the weakly-coupled case, this would be √ C 3 < √ C 1 + √ C 2 , which can be rewritten as where T i are the color generators associated with each particle, acting on a 3-particle color singlet. The inequality (A8) follows from the triangle inequality, given that the action of T 1 and T 2 on the 3-particle color singlet are not simply proportional to each other. [We are not sure how to argue mathematically that (A7) must also hold outside of the weak-coupling limit, but a violation would mean that there is some instability in the problem.] 36 One (perhaps inelegant) way to derive this condition is to plug x 3 = −(x 1 + x 2 ) into (A6) and then rewrite the condition in terms of the ratio r ≡ x 1 /x 2 , which givesq 2 r 2 + (q 1 +q 2 −q 3 )r +q 1 > 0, which we require for all values of r. Requiring the minimum with respect to r to be positive gives a condition that can be written in the formq 2 1 +q 2 2 +q 2 3 < 2(q 1q2 +q 2q3 +q 3q1 ). By investigating what happens if the "<" is replaced by an equal sign and then solving for any one of theq i , this condition can then be translated into the one quoted in the main text above.
Eq. (7.12): Consider the short-time (t → 0) divergence of any matrix element ω 1 |A(t) B(0)|ω 2 , where the |ω i are energy eigenstates with energies ω i . Insert a complete set of states to write the matrix element as E ω 1 |A(t)|E E|B(0)|ω 2 . The short-time divergence will come from intermediate states with arbitrarily high E, and in particular E ≫ ω 1 , ω 2 . Fourier transforming from t to ω gives ω = E − ω 1 , and so we see that the short-term divergence is associated with purely positive frequencies ω. Now return to the original correlator ω 1 |A(t) B(0)|ω 2 in t space, and consider the Fourier transform integral dt e iωt ω 1 |A(t) B(0)|ω 2 taking it to ω space. For ω < 0, the contour may be closed in the lower-half complex t plane. In order for a short-time divergence of the integrand at t = 0 such as t −n not to contribute when ω < 0, the prescription for that divergence should then be (t − iǫ) −n , which is the prescription of (7.12).
Eq. (9.9): Following ref. [6], for example, the last equality of (9.9) is obtained by doing the y integral first after recasting the range (9.3) as The result is with L defined parametrically by (9.2) in our case. Then take the parametric range of the ∆t integration to be ℓ 0 to L.
Eq. (9.11): In both the cases of A(x, y) and A(1−x−y, y) in (8.1), there are some interesting cancellations between the pole and the integral pieces in (8.2). Individually, both of these pieces scale with y as O y 1/2 (α+β) for small y (and fixed x). When the two are added together as in (8.2), there is a cancellation among the leading contribution from α and β to give There is no similar cancellation for the terms involving γ. In the small y limit, (α, β, γ) scale with y as In consequence, both (A11) and (A12) scale as y −3/2 , and so the cancellation of the leading α+β term does not completely cancel the leading y −3/2 behavior.
Appendix B: Details on reduction of states from N to N −2 particles

Normalization of projected states
We first turn to the normalization factors, such as N in the definition (4.21) of |{B ij } , necessary to define our projection from N particle states to effective N−2 particle states. One of our constraints is that i p i = 0, and in our discussion we will encounter corresponding p-space δ-functions evaluated at zero argument: where V ⊥ is the 2-volume (area) of the transverse spatial directions. Final results will not depend on V ⊥ , which is taken to be infinite. Our other constraint is that i x i b i = 0, and it will be convenient to give a name V ⊥ to the analogous b-space δ-function evaluated at zero argument: Final results will not depend on V , which is also infinite. The projection P of an N-particle state |b onto the subspace with i p i = 0 and (Here, δ x i b 0 ,0 is a Kronecker δ rather than Dirac δ-function and gives 1, rather than V ⊥ , when x i b i = 0.) One may verify that P 2 | b = P| b . Because P| b above depends only on the differences b i − b j of transverse positions, it is a state that can be characterized as depending only on the values of B ij ≡ (b i − b j )/(x i +x j ). Except for normalization convention, it is the state |{B ij } that we defined in (4.21).
The normalization of P|b is i . These last constraints, together with i x i = 0, imply that and so the N−1 δ-functions left in the final expression in (B4) are not independent. (B4) is therefore equivalent to N−2 independent δ-functions times a factor of δ (2) (0). In order to relate this factor to the normalization factors we have defined, we must take care how we write things. Ignore (B5) for a moment and rewrite the i=2 δ-function in (B4) as In the presence of the other δ-functions, this can be replaced by Given the constraint (B5), this is (B9) In order to get rid of the nuisance factors of V ⊥ andṼ ⊥ , we find it useful to define so that Combining (B3) and (B10) determines the normalization factor N of (4.21) in the main text to be Eq. (B11) with N=2 and N=3 directly gives (4.22c) and (4.22b) for the normalizations in those cases. For N=4, it gives One may then use the transformations (5.14) to compute the Jacobian for trading variables (B 31 , B 41 ) in the δ-functions for (C 34 , C 12 ), remembering that we typically rename the B ij as C ij in the N=4 case. The result of this change of variables is (4.22a).

Cancellation of normalization factors
As discussed surrounding eqs. (4.26-4.28), the final state in the N-particle language of Now consider the initial state in this language. We have not yet picked a precise direction for the z-axis in our calculation-much of the formalism has been designed precisely to avoid that. For simplicity of presentation in the present argument, pick it for now to be exactly in the direction of the initial parton at the initial time. Then the system starts as |p i =0 p i =0|, which in the language ofH ⊗ H is Here, the factor of V −1 ⊥ is the initial state normalization factor, just as described for (2.1) in appendix A.
Because the projection onto the subspace of i p i = 0 and i x i b i = 0 is preserved by time evolution and splitting, and because the final state |end lies in this subspace, no harm is done if we also project the initial state onto this subspace. That is, end| · · · |start = end|P · · · |start = end| · · · P|start , and so we may replace Using (B3) and (B2), this is For the initial particle, (x 1 , x 2 ) = (−1, 1), and the above evaluates to where the last equality uses (B12). Combining (B14) and (B19), end| · · · |start = | · · · | . (B20) All the nuisance factors of V ⊥ and V ⊥ have canceled, as promised.
b ,x 3 3 b ,x FIG. 28: The notation used in (a) appendix B 3 and (b) appendix B 4 to label different particle states immediately before and after a splitting. (a) is similar to fig. 13, but it was convenient to introduce slightly more general notation here. The use of primes ( ′ ) here is completely unrelated to their use in (6.1).

3.
B|δH| matrix elements Here we work out the formula for the matrix element B|δH| . For simplicity and definiteness, we will focus on the case where the initial particle in the amplitude splits into two daughters, using the labeling conventions of fig. 28a. The starting point is the δH matrix element in the amplitude, written more conventionally in terms of the individual particles in the Hilbert space H (rather thanH ⊗ H), given by (4.29). In p-space, this is Now turn toH⊗H notation and our effective N −2 particle description. Use the definition (B10) and (B3) of the projected states to write (where we have suppressed writing the Kronecker δ-functions-the initial and final positions are to be understood as both satisfying the constraint i x i b i = 0). Using the amplitude matrix element (B22), this is (B24) At this point, the simplest way to proceed that will most easily generalize to other δH matrix elements is to note that (B24) can be rewritten, using (B3), in terms of the N=2 state normalization as where the subscripts (x ′ 1 , x ′ 2 ) on the matrix element show the two x i values that are to be used for both the bra and the ket. 37 Now we can use the N=2 version of the normalization formula (B9) to get The action of the combination is the same 3-particle B we defined in (2.29). Also, we can absorb the overall factor of ( This is eq. (4.13) of the main text.

4.
C 41 , C 23 |δH|B matrix elements The analysis proceeds similarly for δH matrix elements between N=3 and N=4 particle states, where we label particles as in fig. 28b. The analog of (B25) is (B28) Noting that b 4 is the third b i in the bra in (B28), the N=3 version of the normalization formula (B9) then gives (B29) The rest follows as before. Noting that x ′ 3 + x ′ 1 = x 4 + x 1 in the notation of fig. 28b, we can write the result as Using an appropriate permutation of the definition (4.23), we have which is equivalent to (4.15). The difference between this and the previous result (B27) for B|δH| is a factor of associated with the spectators. This is equivalent to the diagrammatic rule that we presented in the main text at the bottom of fig. 16.
37 Readers may reasonably wonder why these x i values instead of some other. The reason has to do with maintaining the consistency of our normalization convention (B2) for δ( x i b i ) for both 3-particle and 2particle states. In the above expression, b 2 = b 3 and (from longitudinal momentum conservation) x ′ 1 = x 1 and x ′ 2 = x 2 + x 3 , which means that the 3-particle expression x 1 b 1 + x 2 b 2 + x 3 b 3 for the final state is identical, in this context, to the 2-particle version Appendix D: More details on 1/∆t divergences

∆t→0 limit of xyȳx
Here, we will give some more details about how to take the small ∆t limit of xyȳx that gives rise to (5.46). For small ∆t, Ω cot(Ω∆t) and Ω csc(Ω∆t) both become (∆t) −1 + O(∆t). Then (5.42) becomes Except for the MΩ terms (coming from the evolution before and after the ∆t interval), this is the result one would get in vacuum (Ω ± = 0). Now use the simple formulas (5.16), (5.33), (5.35), which do not depend on details of the normal mode solutions, to get Using the definitions (5.2a) and (5.4a) of M i and M f , the combination X y Xȳ − X 2 yȳ that appears in the integrals (5.44) can be written in the form The fact that Y y and Yȳ vanish at the order of interest means that (5.45) simplifies to Using (D2) and the integrals (5.44), expanding in ∆t, and using x i = 0 to simplify, gives eq. (5.46) of the main text.

Triple time coincidence associated with 1/∆t
Here we justify more explicitly the claim in section V D that the Ω f /∆t divergent terms in (5.46) for xyȳx arise from the first three splitting times t x , t y , and tȳ simultaneously approaching each other.
In deriving results for xyȳx, we integrated the first splitting time t x from −∞ up to t y in (5.8) to get In order to determine whether the region where t x is close to t y is important, let us instead integrate only from t y − ∆τ to t y for some ∆τ : Comparing (D5) with (D6a), we see that the only effect that the change of integration region has on our results is to replace in (5.42) and therefore in the divergent pieces of (5.46). This change has no effect at all on the Ω f /∆t term in (5.46), and so the Ω f /∆t term arises from t x no more than ∆τ away from t y . This conclusion breaks down only if we make ∆τ so small that Ω eff i becomes so large that the small-∆t expansion made in (D2) breaks down. That happens when ∆τ ∆t. So the Ω f /∆t term arises from the case where t x , t y , and tȳ all lie within O(∆t) of each other.
A similar argument holds for the Ω i /∆t divergence, with the replacement In this section, we will more precisely justify the application of the iǫ prescription in section VII B 2, especially as regards the xȳxy calculation. As discussed above and in the main text, the 1/∆t divergences arise when three times become coincident, e.g. (t y , tȳ, tx). In order to sort things out, it will be helpful to know the explicit dependence on those three times as they approach each other. Unfortunately, we have already integrated over one of those times in our derivation of results like (5.46), (6.6), and (6.14). So let us step back and undo that integration. We could go back and rederive all our results (including doing all the B and C ij integrals) without integrating over the last (or first) time, but happily there is a trick that will allow us to extract the time dependence we want from the integrated results that we already have.

a. xyȳx
We start by discussing xyȳx, since this was the exemplar to which we related all other results. We will focus on the case where the last three times approach each other, giving rise to the Ω i /∆t divergence. In (D7), we saw how to integrate the last time tx over the interval (tȳ, tȳ + ∆τ ) instead of the interval (tȳ, ∞). If we take that result and then differentiate with respect to ∆τ , it is equivalent to never having integrated over tx in the first place. Using (D7b), this operation is equivalent to with the understanding that ∆τ should be re-interpreted as tx−tȳ after all of these operations have been completed. We are interested in what happens when the last three times approach each other, which is the case where both ∆τ and ∆t are small. So we may use the small ∆τ limit of (D7b), giving Now carry out this procedure on (5.45) to recover the tx integrand. Then expand the integrand for simultaneously small ∆t and small ∆τ , treating them as the same order. We find the result 38 dΓ dx dy xyȳx − dΓ dx dy where ∆t ≡ tȳ − t y , ∆τ ≡ tx − tȳ, and D ≡ −x 2x4 ∆τ + (x 1 +x 4 )(x 3 +x 4 ) ∆t.
One may check that doing the ∆τ integral indeed recovers the Ω i /∆t divergence in (5.46). The details of (D10a) will be mostly irrelevant to the present discussion: All we need to take away is that the only singularities of the integrand are associated with inverse powers of the denominator D defined by (D10b). D is proportional to the leading behavior of (X y Xȳ − X 2 yȳ )(∆t) 2 ∆τ [with Ω f → −i sgn(M f )/∆τ ]. And so one could have gotten what we will need for discussing iǫ prescriptions simply by looking at the structure of X y Xȳ − X 2 yȳ rather than deriving (D10a) in detail. 38 Note that here [dΓ/dx dy] vac , which is theq → 0 limit of dΓ/dx dy, does not correspond to all Ω's → 0 because Ω f has been replaced by −i sgn(M f )/∆t, which does not vanish asq → 0. b. Permutations, and the Ω i /∆t divergence of xȳxy We may write (D10) more explicitly in terms of the last three times by writing (D10b) as and rewriting the integration d(∆t) d(∆τ ) as dt y dtȳ dtx divided by a factor of total time (and restricted to the appropriate ordering of the three times). In this form, the result turns out to describe the Ω i /∆t divergences of not only xyȳx but also of xȳyx and xȳxy as well.
As a check, one may integrate the result over the last time in the xȳyx and xȳxy cases (tx and t y ) appropriately and obtain (6.6) and (6.14) with ∆t ≡ t y − tȳ and ∆t ≡ tȳ − t y respectively. 39 Now let's consider iǫ prescriptions in the denominator D of (D11). The times tȳ and tx in the conjugate amplitude should have small negative imaginary parts compared to the time t y in the amplitude, as explained in section VII. For completeness, however, we should also consider the iǫ prescriptions of tȳ and tx relative to each other. Two operators associated with times t 1 and t 2 both in the amplitude should be time-ordered, and so Two operators with timest 1 andt 2 in the conjugate amplitude should be anti-time-ordered, corresponding to the conjugate of (D12), t 2 −t 1 →t 2 −t 1 + iǫ sgn(t 2 −t 1 ).
A quick way to remember all of the iǫ prescriptions is to think of a Schwinger-Keldysh contour that runs from t = −∞ to t = +∞ for times in the amplitude and then turns around and runs back again from t = +∞ to t = −∞ for the times in the conjugate amplitude. The rule is that any time t 2 that appears further along this contour than another time t 1 should have a more negative (infinitesimal) imaginary part. So, for instance, we can implement all of the relative imaginary parts for the relevant times in xȳxy (for which tȳ < tx < t y ) by (tȳ, tx, t y ) → tȳ − i(ǫ 1 + ǫ 2 ), tx − iǫ 2 , t y . (D14) Using the explicit values (4.7) of thex i , (D11) can be put in the form for which the prescription (D14) gives The important point here is that we could get the same prescription by instead writing which is ∆t → ∆t+iǫ for the ∆t ≡ tx −tȳ relevant to xȳxy. This agrees with the prescription guessed in section VII B 2 of the main text for the Ω i /∆t divergences of xȳxy. [In this case, it happens to also be the same as the prescription one would get from simply applying (D13) in isolation to ∆t ≡ tx − tȳ. However, that is an accident, as we will see in a moment.] One may similarly check the iǫ prescriptions used for all of the other Ω i /∆t terms in section VII B 2.
It is convenient to reorganize (D19) as from which is it particularly easy to read off that (D20) gives This is the same prescription as just taking ∆t → ∆t − iǫ in (D19), which matches the guess that we made back in section VII B 2 of the main text. [Note that this is not the prescription we would have gotten by applying (D13) in isolation to ∆t ≡ tx − tȳ.] One may similarly check all of the other 1/∆t divergences of section VII B 2.

a. The basic discussion
For xȳyx, the analog of the xyȳx formula (4.16) is • As a result, the normal mode degrees of freedom A ± defined by (5.26), which were implicitly assumed to be real, are indeed real.
In contrast, when the 4-particle x i are the x ′ i (6.1) appropriate for xȳyx, the situation changes: • λ + defined above is still positive but λ − is negative, so that Ω + ∝ √ −i but Ω − ∝ √ +i.
• x 1 x 3 /2N + E is positive but x 1 x 3 /2N − E is negative, so that the normal mode (C + 34 , C + 12 ) has been normalized to be real by (5.24), but the normal mode (C − 34 , C − 12 ) has been normalized to be imaginary.
• As a result, the definition (5.26) makes A − imaginary instead of real.
Imaginary values for (C − 34 , C − 12 ) confuse the analysis that follows in section V B 3, especially the normalization of the relation (5.34). The easiest way to clear this up is to keep to real normalizations for the normal modes: and This means that the Lagrangian (5.29) for A ± is then normalized as Now the A new ± are both real, but the Lagrangian for A new − has an unusual overall sign. To understand the effect of the sign, it is useful to look at the corresponding Hamiltonian In time evolution exp(−iHt), negating a Hamiltonian has the same effect as negating time t. So the A − part of the time evolution (5.30) changes from to its conjugate The one change here that matters is the appearance of an overall minus in the prefactor, which we will address in a moment. The other change is to the sign of the argument of the exponential, which we now show is inconsequential. Combined with (E4), its effect is to change expressions of the form where, based on (5.28), (5.33) and (E4), Using the fact that Ω is a diagonal matrix (5.40), the new expressions (E11) are the same as the old (E10). The one important change is the overall minus sign in the prefactor of (E9), but we have already accounted for it in the main text when we decided in the formula (5.39) for the N=4 propagator to rewrite the overall factor |x 1 x 2 x 3 x 4 | as −x 1 x 2 x 3 x 4 . For xyȳx, for which the x i arex i (4.7), this had no effect because −x 1x2x3x4 = |x 1x2x3x4 |. (E13) For xȳyx, however, the x i are x ′ i (6.1), and so using −x 1 x 2 x 3 x 4 instead of |x 1 x 2 x 3 x 4 | introduces precisely the additional overall sign that we need to account for in the prefactor of (E9).
Another difference appears to be in the treatment of the triple gluon vertex in Wu (30). We find multiple terms, associated with the different helicity amplitudes in our section IV E and their combination into (4.38). The derivative associated with that vertex (our ∇ C y 23 ) is also more complicated than Wu's ∇ B 2 . Finally, the two 3-particle evolution frequencies in (F5) should not both be ω 1 when x ∼ y. Altogether, our result for y x ≪ 1 replaces (F5) by dI dx dy zyȳz ≃ α 2 s N c C A E 2 2ω 2 1 ω 2 2 (ω 1 +ω 2 ) 2 δn n δm m ω 1 − δnmδ nm ω 1 +ω 2 + δn m δ nm ω 2 z 1 <z 2 <z 3 <z 4 Note that, in momentum space, the combination ω 1 ∇ B 2 − ω 2 ∇ x 2 above is (proportional to) simply one of our rotationally-invariant combinations P ij ≡ x j p i − x i p j of the transverse momenta associated with that vertex.

Problem with comparing the double log contribution of xyȳx + xȳyx
There is a subtle problem comparing (i) our results for the xyȳx + xȳyx contribution to the double log to (ii) corresponding pieces of refs. [6,8]. (These contributions correspond respectively to the second diagram in Wu (31) [8] and the third diagram in Wu (32), and they correspond to B 3 and C 2 of Blaizot and Mehtar-Tani [6].) A simple indication of the problem can be found by comparing (i) our potential (5.19) applied to the x, y ≪ 1 limit of xyȳx or xȳyx to (ii) the potential of Wu (17). Consider xyȳx, where our notational convention (4.7) was that b 2 is the position of the y gluon and b 4 is the position of the x gluon. In the x, y ≪ 1 limit, the two harder particles stay extremely close together (relative to other transverse distance scales), corresponding to b 1 → b 3 . Adopting Wu's convention of using x for the separation of the x parton from the hard partons (our b 41 ≃ b 43 in this case), and B for the y parton from the hard partons (our b 21 ≃ b 23 here), the corresponding limit of (5.19) is This does not agree with Wu's The reason for the discrepancy is that Wu assumes that the two hardest particles form a color dipole with total color in the adjoint representation A. When those particles are gluons, however, they could in principle form a dipole in any color state contained in A⊗A, which contains many more possibilities than A itself (especially in the large N c limit). This situation is depicted in fig. 29a (which is drawn by distorting fig. 11 into a style similar to Wu's diagrams to depict relative transverse separations). In this picture, the initial singlet dipole first emits a gluon and must become an adjoint dipole to balance color. The emission of the second gluon in principle allows the dipole to become anything in A⊗A. However, all of these possibilities other than A itself will be suppressed by the small dipole size once we also add in virtual processes where the y boson connects to the other particle in the dipole, as in fig. 29b. Since in this paper we are just computing dΓ/dx dy (for fixed x and y) and not energy loss, we have not included virtual processes, and so the other color possibilities are not suppressed. The color dynamics of our calculation is therefore different than Wu's, which makes it difficult to directly compare partial results. (The same holds for comparison to Blaizot and Mehtar-Tani.) For zyȳz, however, all is well, as depicted in fig. 30: In the x, y ≪ 1 limit, the calculation of [dΓ/dx dy] zyȳz is also restricted to an adjoint color dipole at intermediate times. On a related note, following the numbering convention of (F3), b 4 → b 1 for x, y ≪ 1, and our x and y gluons correspond to b 3 and b 2 respectively. Wu's x is then our b 34 ≃ b 31 , and his B is our b 21 ≃ b 24 . In this limit, our potential (5.19) then indeed agrees with Wu's potential (F9).