QCD Coherence and the Top Quark Asymmetry

Coherent QCD radiation in the hadroproduction of top quark pairs leads to a forward--backward asymmetry that grows more negative with increasing transverse momentum of the pair. This feature is present in Monte Carlo event generators with coherent parton showering, even though the production process is treated at leading order and has no intrinsic asymmetry before showering. In addition, depending on the treatment of recoils, showering can produce a positive contribution to the inclusive asymmetry. We explain the origin of these features, compare them in fixed-order calculations and the Herwig++, Pythia and Sherpa event generators, and discuss their implications.


Introduction
The observation of a substantial forward-backward asymmetry in the production of top quark pairs at the Tevatron [1][2][3][4][5] has prompted renewed theoretical study of Standard Model predictions for this quantity: see Refs. [6][7][8][9][10][11][12][13][14][15] and references therein. Quantities of particular interest are the distributions of the asymmetry with respect to some observable, generally denoted by $O$:
$$A_{FB}(O) = \frac{d\sigma(\Delta y > 0)/dO - d\sigma(\Delta y < 0)/dO}{d\sigma(\Delta y > 0)/dO + d\sigma(\Delta y < 0)/dO}, \qquad (1.1)$$
where the rapidity difference is defined as $\Delta y = y_t - y_{\bar t}$. Examples of possible observables $O$ are the invariant mass $m_{t\bar t}$ and transverse momentum $p_{T,t\bar t}$ of the pair. For the former, QCD predicts an asymmetry that increases with $m_{t\bar t}$ [6,9]. For the latter, the prediction changes sign as $p_{T,t\bar t}$ increases, since the NLO loop contribution is positive at $p_{T,t\bar t} = 0$ while the real-emission contribution at $p_{T,t\bar t} > 0$ is negative. After matching to parton showers using the MC@NLO prescription [16,17], the predicted cross-over is around $p_{T,t\bar t} \approx 25$ GeV: see Fig. 1.
A surprising fact, also shown in Fig. 1, is that a leading-order parton shower event generator such as PYTHIA, with appropriate settings, displays a qualitatively similar $p_T$-dependent asymmetry, even though the LO production processes have no asymmetry. As we shall see, the same is true of the HERWIG++ and SHERPA event generators, although their quantitative predictions differ.
The explanation is a nice illustration of the QCD coherence of parton showering. It is true that the inclusive asymmetry built into the LO generators is zero. However, in the hard process $q\bar q \to t\bar t$ the colour flows from the incoming quark to the top quark and from the antiquark to the antitop lead to a more violent acceleration of colour, and consequently more QCD radiation, when the top is produced backwards in the $q\bar q$ frame than when it goes forwards, as illustrated in Fig. 2. The additional radiation when the top goes backwards pushes the recoiling pair to higher transverse momentum.
Correspondingly, events with forward-moving tops are left at lower transverse momentum, leading to the behaviour seen in Fig. 1. The effect vanishes at threshold and becomes more and more marked as the invariant mass of the pair increases, due to the increasing amount and scale of QCD radiation.
Event generators with coherent parton showers, implemented through dipole showering in SHERPA and angular ordering in HERWIG++, take account of these effects. (PYTHIA uses a hybrid between the two.) A full NLO treatment, included in MC@NLO but not in the stand-alone generators, adds a finite positive virtual contribution. Nonetheless, as we shall demonstrate in Section 4, even LO shower models can also generate a net inclusive asymmetry A FB , if the shower kinematics allow for migration between positive and negative ∆y regions. In the following section we examine in more detail the approximations made in event generators, in comparison to the fixed-order perturbative treatment. Then in Section 3 we explain in general terms how they can produce a positive inclusive asymmetry while only containing the LO production process. In Section 4 we present results from the HERWIG++, PYTHIA and SHERPA generators for the inclusive asymmetry and various differential asymmetry distributions. In Section 5 we summarize our findings and comment on their implications.
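As a concrete illustration of a differential asymmetry, the sketch below bins events in an observable and forms, in each bin, the difference of the weights with $\Delta y > 0$ and $\Delta y < 0$ divided by their sum. It is only a minimal sketch: the function name, the event format (tuples of rapidity difference, observable value and weight) and the binning convention are assumptions made here for illustration, not part of any generator or analysis code used in this paper.

```python
from bisect import bisect_right


def binned_asymmetry(events, edges):
    """Return the forward-backward asymmetry in each bin of an observable.

    events: iterable of (delta_y, observable, weight) tuples (hypothetical format).
    edges:  ascending bin edges; bins are half-open, [low, high).
    A bin with zero total weight yields None.
    """
    nbins = len(edges) - 1
    fwd = [0.0] * nbins  # summed weight with delta_y > 0
    bwd = [0.0] * nbins  # summed weight with delta_y < 0
    for delta_y, obs, weight in events:
        if obs < edges[0] or obs >= edges[-1] or delta_y == 0.0:
            continue  # outside the histogram range, or no defined direction
        i = bisect_right(edges, obs) - 1
        if delta_y > 0:
            fwd[i] += weight
        else:
            bwd[i] += weight
    result = []
    for f, b in zip(fwd, bwd):
        total = f + b
        result.append((f - b) / total if total else None)
    return result


# Toy events: (delta_y, pair transverse momentum in GeV, weight)
toy_events = [(0.5, 10.0, 1.0), (-0.3, 12.0, 1.0),
              (-1.0, 40.0, 1.0), (-0.2, 45.0, 1.0), (0.4, 48.0, 1.0)]
toy_result = binned_asymmetry(toy_events, [0.0, 25.0, 50.0])
```

In this toy input the low bin is balanced and the high bin has more backward than forward weight, so the second entry of `toy_result` is negative.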

Comparison with fixed order
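To establish notation we first consider the lowest-order process
$$q(p_1) + \bar q(p_2) \to Q(p_3) + \bar Q(p_4), \qquad (2.1)$$
for which the leading-order spin-averaged matrix element squared is
$$\overline{|M|^2} = g^4\,\frac{C_F}{N}\left(\frac{\bar t^2 + \bar u^2}{\bar s^2} + \frac{2m^2}{\bar s}\right), \qquad (2.2)$$
where $m$ is the heavy quark mass and
$$\bar s = 2\,p_1\cdot p_2, \quad \bar t = -2\,p_1\cdot p_3, \quad \bar u = -2\,p_1\cdot p_4. \qquad (2.3)$$
The corresponding differential cross section, Eq. (2.4), is used for the primary hard subprocess in the event generators. Being symmetric under $\bar t \leftrightarrow \bar u$, it clearly does not exhibit any forward-backward asymmetry. Thus for an asymmetry to be produced by a leading-order generator, some parton showering must occur.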

One gluon emission
The leading-order shower contribution is the one-gluon emission process,
$$q(p_1) + \bar q(p_2) \to Q(p_3) + \bar Q(p_4) + g(k). \qquad (2.5)$$
For the asymmetry we require the difference between this and the process
$$q(p_1) + \bar q(p_2) \to \bar Q(p_3) + Q(p_4) + g(k). \qquad (2.6)$$
The difference between the spin-averaged matrix elements squared [19][20][21] is given by Eq. (2.7), where the kinematic invariants ($s_1$, $s_2$, etc.) are defined in Eq. (2.8) and $W_{ij}$ is the dipole radiation function of Eq. (2.9). The asymmetries for the processes $gq \to Q\bar Q q$ and $g\bar q \to Q\bar Q\bar q$ are obtained from the same expression after crossing. They are very small and will be neglected in the following. We see from Eq. (2.7) that the asymmetry vanishes for $N = 2$. This must be the case to all orders, because the fundamental representation of SU(2) is pseudoreal and so in that case $q\bar q \to Q\bar Q X$ is the same as $q\bar q \to \bar Q Q X$.
In the event generators the gluon radiation is represented as coherent emission from the external lines of the Born process and so the new colour factor $(N^2 - 4)/N$ is approximated by $2C_F = (N^2 - 1)/N$. For $N = 3$ this replaces 5/3 by 8/3, so we expect them to overestimate the asymmetry at non-zero $p_T \equiv p_{T,Q\bar Q}$ by around 60% in lowest order. They also neglect the second term in the square bracket in Eq. (2.7), which is less singular at small $k$, and approximate the remaining terms by the Born term times dipole-like factors. Thus they effectively treat the asymmetry in the soft gluon limit. They treat the gluon radiation more accurately in the collinear regions, but those regions are not dominant in the asymmetry.
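The size of the colour-factor overestimate, and the vanishing of the asymmetry for $N = 2$, can be checked with exact rational arithmetic. The following is a minimal sketch; the function names are chosen here for illustration only.

```python
from fractions import Fraction


def asymmetry_colour_factor(n):
    """Colour factor (N^2 - 4)/N of the fixed-order asymmetry."""
    return Fraction(n * n - 4, n)


def shower_colour_factor(n):
    """Colour factor 2 C_F = (N^2 - 1)/N used by the coherent showers."""
    return Fraction(n * n - 1, n)


# Ratio for QCD (N = 3): how much the showers overestimate the asymmetry.
overestimate = shower_colour_factor(3) / asymmetry_colour_factor(3)

# For N = 2 the fixed-order colour factor, and hence the asymmetry, vanishes.
su2_factor = asymmetry_colour_factor(2)
```

Here `overestimate` equals 8/5, i.e. the 60% excess quoted in the text, while `su2_factor` is exactly zero.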

Soft gluon limit
As explained above, we expect the event generators to reproduce the soft gluon limit of the asymmetry, apart from the colour factor mentioned earlier. In the limit of small $k$ we have $s_1, s_2 \to \bar s$ etc., and Eq. (2.7) takes the simple form of Eq. (2.10). In the soft gluon limit the differential cross section for emission of a gluon with energy $\omega$ involves an integral over $\Omega$, the solid angle for soft gluon emission, for which we can use the fact that [22] $\int W_{ij}\,d\Omega$ is $4\pi/\omega^2$ times a function of $v_{ij}$, the relative velocity of $i$ and $j$ (2.13).
Now we define the asymmetry cross section as in Eq. (2.14). The radiation functions $F_{ij}$ appearing in Eq. (2.14) have collinear divergences along the beam directions, which cancel in the full expression. Regulating them with a small light quark mass $\mu$ and taking the limit $\mu \to 0$ gives Eq. (2.16).
In lowest order the $Q\bar Q$ pair recoils against the emitted gluon, so the transverse momentum of the pair is $p_T = -k_T$, where $dk_T/k_T = d\omega/\omega$. We can also express Eq. (2.16) in terms of the $q\bar q \to Q\bar Q$ scattering angle $\bar\theta$, using the relation between the Born invariants and $\bar\theta$, in which $\beta = \sqrt{1 - 4m^2/\bar s}$ is the heavy quark c.m. velocity in the Born process. Finally, we can integrate the resulting expression to find the recoil distribution of the asymmetry in the soft limit. It is convenient to normalize this to the $q\bar q \to Q\bar Q$ Born cross section, obtained by integrating Eq. (2.4). The result can then be written in general in terms of a coupling- and colour-stripped asymmetry function $F(\beta, p_T)$, which in the soft limit, $p_T \ll p_T^{\max}$, is given by Eq. (2.21).
Thus we expect the asymmetry in the event generators to become more negative with increasing top pair invariant mass $\sqrt{\bar s}$, growing linearly with c.m. velocity $\beta$ near threshold. The function $F(\beta, 0)$ is shown in Fig. 3. It tends to an asymptotic value of $-8\ln 2 - 1 = -6.545$ as $\beta \to 1$. We note that the four-term power series expansion Eq. (2.21) gives a good approximation over a wide range of $\beta$.

Beyond the soft approximation
The coupling- and colour-stripped asymmetry function $F(\beta, p_T)$ remains negative away from the soft region but decreases in magnitude as $p_T$ increases, vanishing at the phase-space boundary, as shown in Fig. 4. Here, as before, $\beta$ is defined as $\sqrt{1 - 4m^2/s_1}$, i.e. in terms of the overall centre-of-mass energy squared $s_1 = \bar s$, rather than the invariant mass squared of the $Q\bar Q$ pair, $s_2$.
To go from this function to the $p_T$-dependence of the asymmetry, $A_{FB}(p_T)$, as defined in Eq. (1.1), we have to include the strong coupling and colour factors and normalize to the differential cross section at the same $p_T$. A full leading-order calculation of the asymmetry using MCFM [24] (which requires computing $t\bar t$ production at NLO) yields the results shown in Fig. 5. For this calculation, the next-to-leading order parton distribution functions of Ref. [25] were used, with scale equal to the top mass ($m_t = 172.5$ GeV), but the results for the asymmetry are rather insensitive to these choices.
The real-emission asymmetry cross section in the numerator of Eq. (1.1) diverges as p T → 0, due to the soft divergence discussed in the previous section. However, the denominator, shown on the left in Fig. 5, diverges faster, as it has initial-state collinear singularities that cancel in the numerator, so that A FB (p T ) is driven towards zero as p T → 0.
In the fully inclusive asymmetry, the divergence of the real emission contribution is cancelled by the singular virtual correction at $p_T = 0$, leaving a result, Eq. (2.22), governed by a function $f(\beta)$ that is finite. If we limit the integration to $p_T < q_T \ll m$, we therefore find Eq. (2.23), where $F(\beta, 0)$ is given by Eq. (2.21) and $G(\beta, q_T/m)$ is regular at $q_T = 0$. Since $F(\beta, 0) \sim -4\beta$ is negative, a cut on $p_T < q_T$ adds a positive contribution to the inclusive asymmetry, which grows logarithmically as $q_T$ is reduced. This effect can be quite significant. For example, in top pair production at a pair invariant mass of 450 GeV ($m_t = 172$ GeV, $\beta = 0.645$), a cut on $p_T < 20$ GeV gives a contribution of $3.33\,\alpha_S \approx 35\%$ from the logarithmic term. In a combined NLO plus parton shower treatment such as MC@NLO, the positive singular virtual contribution at $p_T = 0$ is spread out over a finite region of $p_T$, the so-called Sudakov region. This leads to a cross-over in the asymmetry from positive values at low $p_T$ to negative values at higher $p_T$, as depicted in Fig. 1. In MC@NLO, the finite part of the virtual contribution, absent from the event generators, is also included and can affect the position of the cross-over.
Monte Carlo event generators with QCD coherence will not have the correct form for the function G in Eq. (2.23), due to the inexact treatment of hard, non-collinear emissions, but they should reproduce the logarithmic term, apart from the overestimate of the colour factor by 60% mentioned earlier. They will also display the spreading of the positive asymmetry over the Sudakov region at low p T .
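The numbers quoted in this section can be cross-checked with a few lines of arithmetic. The sketch below evaluates the Born velocity $\beta = \sqrt{1 - 4m^2/\bar s}$ for the quoted example, the asymptotic constant $-8\ln 2 - 1$, and the value of $\alpha_S$ that would be implied if "3.33 $\alpha_S$ ≈ 35%" were read as an exact equality. That last reading is an assumption made here for illustration; the function and variable names are likewise hypothetical.

```python
import math


def heavy_quark_beta(mass, sqrt_s):
    """c.m. velocity beta = sqrt(1 - 4 m^2 / s) of the heavy quark in the Born process."""
    if sqrt_s < 2.0 * mass:
        raise ValueError("below the pair-production threshold")
    return math.sqrt(1.0 - 4.0 * mass**2 / sqrt_s**2)


# Example quoted in the text: m_t = 172 GeV at a pair invariant mass of 450 GeV.
beta_450 = heavy_quark_beta(172.0, 450.0)

# Asymptotic value of F(beta, 0) as beta -> 1 (natural logarithm).
f_asymptotic = -8.0 * math.log(2.0) - 1.0

# Coupling implied by treating "3.33 alpha_S = 0.35" as exact (assumption).
implied_alpha_s = 0.35 / 3.33

print(round(beta_450, 3), round(f_asymptotic, 3), round(implied_alpha_s, 3))
```

Rounded to three decimals, the first two values reproduce the 0.645 and −6.545 given in the text, and the implied coupling is close to 0.105.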

Generation of an inclusive asymmetry
Despite the coherence effect elaborated upon in the previous section, the naive expectation is that the asymmetry should still sum to zero when integrated over all of phase space. This expectation is based on the simple fact that showers are unitary (meaning real-radiation corrections cancel exactly against virtual-Sudakov ones), so even though they can move things around in phase space, they do not generate any corrections to total cross sections. At the most inclusive level, this is reflected in the fact that the total integrated $t\bar t$ cross section is the same before and after showering.
However, the asymmetry is defined in terms of two separate cross sections, one computed for $\Delta y > 0$ and the other for $\Delta y < 0$. If the shower kinematics allow any migration between these two regions, then unitarity no longer guarantees complete cancellation in each of the regions separately, leading to the possible generation of a net inclusive asymmetry. Formally, we can write the cross section difference that generates the integrated asymmetry as in Eq. (3.1), where the first line represents events that start (at the matrix-element level, before showering) with a positive value of $\Delta y$ and the second line represents events that start with a negative one. The terms in parentheses represent the action of the parton shower. The probability for no branchings to occur is represented by the Sudakov factor, $\Delta$, with subscript $\pm$ reflecting that the probability to radiate can be different between an event with positive $\Delta y$ and one with negative $\Delta y$. Indeed, as shown in the preceding section, events with positive $\Delta y$ have less phase space for emission and so are less likely to radiate. Therefore, in general, we have
$$\Delta_+ > \Delta_-. \qquad (3.2)$$
This, however, is not by itself enough to generate an inclusive asymmetry. The second terms in the square brackets in Eq. (3.1) represent those events that do experience one or more branchings. For these events, the final top momenta, and hence possibly their final rapidity difference, will depend on whether and how the top momenta are modified by the branchings. In the present context, we do not care about the details of how this occurs, merely about whether it is at all possible for an event with positive $\Delta y$ at the Born level to migrate to negative $\Delta y$ after showering, and vice versa. This is represented by the probabilities $P_{+-}$ and $P_{-+}$ in Eq. (3.1). If the shower model preserves the rapidity ordering of the tops, then
$$P_{+-} = P_{-+} = 0, \qquad (3.3)$$
and so the integrated asymmetry remains zero, despite the two Sudakov factors being different.
If, on the other hand, the shower model sometimes changes the relative rapidity ordering of the tops, for instance as a consequence of longitudinal recoil effects (as will be studied in more detail in the next section), then a total inclusive asymmetry can be generated. In the context of unitarity, this can be interpreted as due to the fact that unitarity involves an integral over the entire phase space, and hence the exact cancellation that occurs in the total inclusive cross section is here broken by the splitting-up of the real-radiation phase space into two regions that enter with different signs in the asymmetry. Using the unitarity of the shower, Eq. (3.1) can be rewritten in the form of Eq. (3.5). Because $1 > \Delta_+ > \Delta_-$, we expect the second term on the right-hand side of Eq. (3.5) to dominate, giving a positive inclusive asymmetry, unless there is a compensating excess of $P_{+-}$ over $P_{-+}$. However, on rather general grounds one would not expect such an excess, because there is less radiation when $\Delta y > 0$ and hence a smaller probability of recoil effects changing the sign of $\Delta y$. Indeed we shall see below that the treatment of recoils in shower generators normally leads to $P_{-+} > P_{+-}$, enhancing the positive inclusive asymmetry due to the unequal Sudakov factors. Considering Eq. (3.5) from the viewpoint of perturbation theory, we observe that the factors of $(1 - \Delta_\pm)$ in the integrands are $O(\alpha_S^1)$, while $P_{\pm\mp}$ are $O(\alpha_S^0)$, being the conditional probabilities that gluon emission will switch the sign of $\Delta y$, given that at least one emission has occurred. Thus the recoil effect in showering generates an approximate inclusive asymmetry that starts at $O(\alpha_S)$, like the full perturbative calculation. The factors of $(1 - \Delta_\pm)$ provide information about the virtual contribution and the probabilities $P_{\pm\mp}$ specify what fraction remains after real-virtual cancellation.
Since these probabilities depend on the strategy for treating recoils in the shower, getting the best agreement with the full asymmetry at O(α S ) could be a good way to optimize this strategy.
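The bookkeeping behind this argument can be illustrated with a toy model. The sketch below assumes a forward-backward symmetric Born sample and replaces the phase-space-dependent Sudakov factors and migration probabilities by single constant numbers. This is a deliberate simplification made here for illustration and is not the integrand of Eq. (3.5); all names are hypothetical.

```python
def toy_inclusive_asymmetry(sud_plus, sud_minus, flip_plus_to_minus, flip_minus_to_plus):
    """Inclusive asymmetry after showering in a constant-probability toy model.

    sud_plus, sud_minus:   no-emission probabilities for Born events with
                           delta_y > 0 and delta_y < 0, respectively.
    flip_plus_to_minus:    probability that a radiating forward event ends backward.
    flip_minus_to_plus:    probability that a radiating backward event ends forward.
    """
    fwd_born = bwd_born = 0.5  # symmetric leading-order sample (assumption)
    # Only events that radiate (probability 1 - Sudakov) can change sign.
    moved_out_of_fwd = fwd_born * (1.0 - sud_plus) * flip_plus_to_minus
    moved_out_of_bwd = bwd_born * (1.0 - sud_minus) * flip_minus_to_plus
    fwd = fwd_born - moved_out_of_fwd + moved_out_of_bwd
    bwd = bwd_born - moved_out_of_bwd + moved_out_of_fwd
    return (fwd - bwd) / (fwd + bwd)


# No migration: unequal Sudakov factors alone give no inclusive asymmetry.
no_migration = toy_inclusive_asymmetry(0.6, 0.5, 0.0, 0.0)

# Equal flip probabilities: the Sudakov imbalance already gives a positive value.
equal_flips = toy_inclusive_asymmetry(0.6, 0.5, 0.1, 0.1)

# Favoured backward-to-forward migration enhances the positive asymmetry.
favoured = toy_inclusive_asymmetry(0.6, 0.5, 0.05, 0.1)
```

In this toy model the result reduces to $(1 - \Delta_-)P_{-+} - (1 - \Delta_+)P_{+-}$, so with the numbers above the three cases give 0, 0.01 and 0.03 respectively.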

Comparison between parton-shower models
In this section, we study the asymmetries produced by the following general-purpose event generators: HERWIG++ [28] (using angular-ordered parton showers [29]), PYTHIA 6 [30] (using both its $Q^2$- and $p_\perp$-ordered parton-shower models [31,32], represented by tunes D6T and Perugia 0, respectively), PYTHIA 8 [33] (using $p_\perp$-ordered parton showers [34]), and SHERPA [35] (using $p_\perp$-ordered dipole showers [36]). Of these, HERWIG++ and SHERPA have QCD coherence built in and PYTHIA 6 has options with varying amounts of coherence, while the first ISR (initial-state radiation) emission is not subjected to coherence constraints in this version of PYTHIA 8. For both PYTHIA 6 and SHERPA, we include some additional illustrations of specific shower model variations.
A custom-made RIVET [37] analysis was used to process the events of all generators, ensuring uniformity of the analysis. At least 1 to 4 million events were generated for each model. All the generators include the leading-order $q\bar q \to t\bar t$ and $gg \to t\bar t$ production processes, which are showered with default settings (see footnote 5), unless otherwise specified. Note that we do not study the effects of matrix-element plus parton-shower matching in this paper.

Inclusive asymmetry
The inclusive asymmetry and the asymmetries with an invariant mass or transverse-momentum cut produced by each model are given in Tab. 1. Of these, SHERPA's CSSHOWER produces the largest inclusive asymmetry. We interpret this as a consequence of its initial-final dipole kinematics [36,40,41]: part of the longitudinal momentum of the first emitted gluon has to come from the recoiling top quark, changing its rapidity and allowing $\Delta y$ to change sign. Later on, we illustrate and discuss recoil effects in somewhat more detail in a small CSSHOWER case study, see Section 4.2.1 and Appendix A.
In HERWIG++, coherence is implemented by angular-ordered parton branching rather than dipole showering. Parton showers associated with each incoming or outgoing hard parton are generated independently in angular regions defined by the colour structure of the hard subprocess. The showers are then combined according to a kinematic reconstruction algorithm [28] that again reflects the colour structure of the subprocess. In the case of $q\bar q \to t\bar t$, there are two initial-final colour-connected systems, $qt$ and $\bar q\bar t$, as illustrated in Fig. 2. The resulting treatment of recoils is similar to that in dipole showering of these systems, with a particular prescription for sharing recoil momentum within each. The separate recoils of the top quark and antiquark again imply that $\Delta y$ may change sign.
Footnote 5: The choice of PDF set only gives small effects ($\lesssim 10\%$) on the asymmetry, mostly via the relative fraction of gluon-initiated vs. quark-initiated $t\bar t$ production. For completeness, HERWIG++ uses the MRSTMCal PDF set (i.e. the LO fit from the MRST2002 family) [38], PYTHIA 6 with Perugia 0 uses CTEQ5L [39], and PYTHIA 6 with D6T, PYTHIA 8, and SHERPA all use CTEQ6L1 PDFs [23]. There is also a slight dependence on the choice of renormalization scale, see Appendix A.
In contrast, the approach to initial-final dipoles in PYTHIA, for both $p_\perp$- and $Q^2$-ordered showers [31,32], is to use the other incoming parton for momentum conservation, rather than the recoiling top. The relative rapidity ordering of the top and the antitop is normally preserved by this strategy, resulting in very little net asymmetry being generated.
Overall, the asymmetries in HERWIG++ and SHERPA are comparable to the LO perturbative results shown in Tab. 1, suggesting that their similar recoil strategies are not far from optimal.

Asymmetry as a function of top quark observables
We now show differential spectra $d\sigma/dO$ for four key observables and their related forward-backward asymmetry distributions $A_{FB}(O)$, as defined in Eq. (1.1). The observables presented here are the azimuthal angle $\Delta\phi$ between the transverse momenta of the top and antitop quarks, the $|\Delta y|$ distribution itself, and the transverse-momentum and invariant mass distributions of the $t\bar t$ pair. We take a subset of the shower versions listed in Tab. 1 (neglecting the PYTHIA 6 default and SHERPA 1.3.1 versions) and compare their predictions with each other.
In Fig. 6 we show the $|\Delta y|$ and $\Delta\phi$ distributions. The $d\sigma/d|\Delta y|$ predictions are very similar in shape; they differ in normalization because the total inclusive cross sections are evaluated at LO and at different scales in the various event generators. The asymmetry rises with increasing absolute rapidity difference between the top quarks. Large-$|\Delta y|$ configurations emerge more easily in scatterings at small angle to the beam (or the $q\bar q$ axis). This produces a positive asymmetry since the associated initial-final $qt$ and $\bar q\bar t$ dipoles tend to emit softer gluons, at a higher rate, in the forward direction. The mechanism is also explained more fully in Section 4.2.1. All PYTHIA predictions increase rather mildly, while those of HERWIG++ and SHERPA show a steeper (approximately linear) slope, which is slightly flatter than, but qualitatively comparable with, the recent MCFM results given for the acceptance-corrected case, see Ref. [15].
The $\Delta\phi$ variable is a typical example of an observable separating the hard-emission domain from the Sudakov region, here located around large $\Delta\phi \approx \pi$. We depict the $d\sigma/d\Delta\phi$ spectra in Fig. 6, which reflect the different levels of hardness produced by the different generators, with PYTHIA 8 giving the highest levels. Because of the strong correlation with the $p_{T,t\bar t}$ observable, the related asymmetry functions of $\Delta\phi$ display a qualitatively similar behaviour. As can be seen from the plot to the bottom right in Fig. 6, the more violent radiation emerging from the colour dipoles spanned with backward-moving top quarks leads to a negative asymmetry over a wide range of angles, except for very large angles where the asymmetry turns positive as a result of the Sudakov effect. While for most predictions the cross-over occurs at $\Delta\phi \approx 3$, two results deviate considerably from the fairly constant behaviour of the asymmetry ($A_{FB} \sim -0.1$) for $\Delta\phi < 2$. The PYTHIA 6 tune D6T and PYTHIA 8 mark the two opposite ends of the low-$\Delta\phi$ asymmetry spectrum (with values of −70% and +20%, respectively). For the former, soft colour coherence effects are overestimated whereas, for the latter, coherence effects have not been implemented; in particular the initial-final $qt$ and $\bar q\bar t$ dipoles are not yet treated as such.
… applies over the entire range of the pair mass. We can use the following equation to better understand this behaviour and the pair mass dependence of $A_{FB}$:
$$m_{t\bar t}^2 = 2m^2 + 2\,E_{T,t}\,E_{T,\bar t}\cosh\Delta y - 2\,p_{T,t}\,p_{T,\bar t}\cos\Delta\phi, \qquad (4.2)$$
where $E_T^2 = m^2 + p_T^2$ and $\Delta\phi$ is the azimuthal angle between $p_{T,t}$ and $p_{T,\bar t}$. It is sufficient to focus on the $\cosh\Delta y$ and $\cos\Delta\phi$ dependence of $m_{t\bar t}^2$. The $\cosh\Delta y$ term is forward-backward symmetric and the squared mass increases with larger absolute rapidity differences. The $\cos\Delta\phi$ term in Eq. (4.2) may however reduce $m_{t\bar t}^2$, but in the hard region only. Consequently, the $\cosh\Delta y$ dependence of $A_{FB}$ directly translates into a similar mass dependence.
Neglecting this for a moment, we also notice that for given $\Delta y$, it is cheaper to shift the masses to larger values. This is because of the enhancement of soft emissions ($\Delta\phi \approx \pi$) causing an overall plus sign in the $\cos\Delta\phi$ term. The imbalance in the soft-emission rate generated by soft colour coherence between the forward and backward region thus produces harder mass spectra in the forward region. Taken together with the $|\Delta y|$ dependence of the asymmetry, we conclude that the $\cos\Delta\phi$ term induces an additional growth of the asymmetry with increasing pair mass and a small suppression in the low mass region. In cases where the $|\Delta y|$ dependence of the asymmetry is almost zero, the same mechanism may generate slightly negative asymmetries at low mass.
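The decomposition of the pair mass into a $\cosh\Delta y$ and a $\cos\Delta\phi$ term is the standard kinematic identity $m_{t\bar t}^2 = 2m^2 + 2E_{T,t}E_{T,\bar t}\cosh\Delta y - 2p_{T,t}p_{T,\bar t}\cos\Delta\phi$ for two particles of equal mass $m$. As a sanity check, the sketch below compares it with the invariant mass computed directly from four-vectors; the helper names and the numerical inputs are illustrative assumptions only.

```python
import math


def four_vector(mass, pt, rapidity, phi):
    """(E, px, py, pz) of a particle from its mass, transverse momentum, rapidity and azimuth."""
    et = math.sqrt(mass**2 + pt**2)  # transverse energy, E_T^2 = m^2 + p_T^2
    return (et * math.cosh(rapidity), pt * math.cos(phi),
            pt * math.sin(phi), et * math.sinh(rapidity))


def pair_mass_sq_from_vectors(a, b):
    """Invariant mass squared of the sum of two four-vectors."""
    e, px, py, pz = (a[i] + b[i] for i in range(4))
    return e**2 - px**2 - py**2 - pz**2


def pair_mass_sq_from_formula(mass, pt1, pt2, delta_y, delta_phi):
    """Same quantity from the cosh(delta_y) / cos(delta_phi) decomposition."""
    et1 = math.sqrt(mass**2 + pt1**2)
    et2 = math.sqrt(mass**2 + pt2**2)
    return (2.0 * mass**2 + 2.0 * et1 * et2 * math.cosh(delta_y)
            - 2.0 * pt1 * pt2 * math.cos(delta_phi))


# Illustrative kinematics (GeV): top and antitop with different p_T and rapidity.
m_top = 172.5
top = four_vector(m_top, 60.0, 0.8, 0.3)
antitop = four_vector(m_top, 45.0, -0.4, 0.3 + 2.9)

direct = pair_mass_sq_from_vectors(top, antitop)
formula = pair_mass_sq_from_formula(m_top, 60.0, 45.0, 0.8 - (-0.4), 2.9)
```

The two results agree to floating-point precision. For fixed transverse momenta, the formula also shows that back-to-back tops ($\Delta\phi = \pi$) give the largest mass at a given $\Delta y$.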
Looking at the different generator results displayed in Fig. 7, the dσ/dm tt spectra can be seen to differ less than the p T,tt spectra. Again, PYTHIA 8 gives the hardest distributions. As expected, we find that the tt mass dependence of the asymmetry is determined predominantly by the asymmetric behaviour present in |∆y|. The pattern shown on the upper right of Fig. 6 repeats itself here. We also observe that small negative asymmetries are possible for low pair masses, e.g. as shown for the PYTHIA 6 tune D6T.
Finally, on the lower right of Fig. 7 one finds the results obtained by the different generators for the asymmetry plotted as a function of p T,tt . We already discussed the characteristics of A FB (p T,tt ) throughout preceding sections; thus, recalling the discussion of the ∆φ case, the p T,tt results are as expected and their interpretation is straightforward. Exhibiting the asymmetry in terms of p T,tt naturally allows for a better discrimination in the hard region. We thus observe that HERWIG++, PYTHIA 6 P0 and SHERPA differ in their description of the high-p T tail of the asymmetry, even though they sufficiently agree in the low-∆φ region. SHERPA predicts a slope change around 50 GeV indicating a possible return in A FB (p T,tt ) to zero for large p T , as seen in LO (Figs. 4 and 5). In contrast, HERWIG++ and PYTHIA 6 P0 maintain their trend towards more negative asymmetries. At the other end of the p T spectrum, the rise of the asymmetry towards lower p T is not shown by PYTHIA 6 D6T and PYTHIA 8, which is compatible with the findings for ∆φ ≈ π in Fig. 6.
We have studied more observables than we are able to present here, so for a more comprehensive comparison we refer the interested reader to the corresponding "Top quark (MC)" web-pages available under mcplots.cern.ch [42]. Among other things one can find, for example, the distributions and forward-backward asymmetries of the transverse momentum of the top quark, p T,t , or the rapidity y tt of the tt pair; in addition many observables are shown separately for the high and low pair mass or transverse-momentum region obtained by cutting on m tt at 450 GeV or p T,tt at 50 GeV, respectively.
Summary. The HERWIG++ and SHERPA predictions agree fairly well with each other, and compare quite well, on a qualitative level, with the $A_{FB}(m_{t\bar t})$ and $A_{FB}(|\Delta y|)$ results given in Ref. [9]. Both models incorporate soft colour coherence on a compatible level, which consequently may be interpreted as the source of the agreement; the differences lie in details such as the treatment of recoils, the shower variables and the form of the splitting functions used. These differences cause deviations in the high-$p_T$ asymmetry spectra. The dependence on recoil effects in SHERPA is studied in more detail in Section 4.2.1 below. In PYTHIA, soft colour coherence is accounted for on a more approximate level. Although the P0 tune of PYTHIA 6 is similar to HERWIG++ in the hard-emission domain, it differs in the description of the Sudakov region, yielding a milder mass and $|\Delta y|$ dependence of the asymmetry. The PYTHIA 6 D6T tune and PYTHIA 8 exhibit larger differences. The dependence on the shower modelling in PYTHIA 6 is studied in more detail in Section 4.2.2 below.

Dependence on recoil effects: SHERPA's CSSHOWER
We have argued that recoil effects play an important role in producing the asymmetries generated in coherent parton or dipole showering. To illuminate the mechanism further, we have conducted a small case study based on results obtained with SHERPA's CSSHOWER. Some of the details have been postponed to Appendix A in order to maintain the flow of the main part.
Asymmetry enhancing longitudinal recoil effects. We have identified the unequal Sudakov form factors in forward and backward top production as a major source of asymmetry. The Sudakov imbalance (∆ + > ∆ − ) emerges as a result of soft colour coherence. In addition, based on Eq. (3.5) we have seen that any net migration of the type P −+ > P +− ≥ 0, from the backward to the forward ∆y phase space, leads to a positive inclusive asymmetry and an amplification of the Sudakov or coherence effect. We now want to trace the origin of the migration process.
In a simple dipole picture where a gluon emission stretches (further opens the initial angle of) the starting initial-final $qt$ or $\bar q\bar t$ dipole, one can easily account for $\Delta y = \Delta\tilde y + \epsilon$ on average. Here we denote the top quark rapidity difference before the emission (or at the LO generation level) by $\Delta\tilde y$, and $\epsilon > 0$ expresses a small positive shift. Using a simple generation cut on $\Delta\tilde y$, we can test this hypothesis by counting and analyzing the events that end up in the backward/forward region after exclusively showering $t\bar t$ events produced at LO under the constraint $\pm\Delta\tilde y > 0$. Fig. 8 shows the main results of this migration test; more details are compiled in Appendix A. We have plotted the outcomes of $t\bar t$ production at leading, fixed order (labelled "before shower") and of several CSSHOWER runs (all labelled "after ...") grouped according to unconstrained, $\Delta\tilde y > 0$ ("fwd") and $\Delta\tilde y < 0$ ("bwd") LO phase-space generation: we distinguish between completely showered runs and runs where showering was terminated after just one emission. For the latter, we only show the results obtained with the restricted LO phase space. Focusing on $\Delta y$ distributions, we make a number of observations, broadly confirming the physics of the simple dipole picture suggested above:
• showering generates a positive asymmetry, which grows with larger $|\Delta y|$ (compare "before" and "after shower" results).
• migrations are small and happen locally, yet across the entire $\Delta y$ phase space; the − → + direction and, therefore, − → + cross-overs are favoured: (1) the "bwd" generated results extend into the $\Delta y > 0$ domain, filling the deficit close to the transition region left by "fwd" generated events that now populate larger $\Delta y$, and (2) the migration processes in the opposite direction are largely suppressed.
• the largest effect already originates from the first emission, the gluon emission of the initial-final $qt$ and $\bar q\bar t$ dipoles (compare dashed with solid lines). This does not necessarily mean that the corrections from multiple emissions are negligible; they easily give 10-20% effects, as can be seen in both Figs. 8 and 9.
Figure 9: The forward-backward asymmetries versus $|\Delta y|$ (upper panel) and $p_{T,t\bar t}$ (lower panel) for the two different recoil strategies available for the CSSHOWER implementation in SHERPA. Results gained by the corresponding one(two)-emission(s) showers are depicted as well. Each subfigure is supplemented by a ratio plot using the default CSSHOWER prediction for reference.
This pattern carries over to the high-p T region, p T,tt > 50 GeV, except for the fact that the total asymmetry now turns negative, cf. the lower panel of Fig. 8. The migration is more severe, but cannot overcome the overall negative trend, caused by the more violently radiating "bwd" generated initial dipole configurations: the radiation imbalance predominates over the migration effect.
Comparison of recoil strategies. SHERPA's CSSHOWER implementation provides two recoil schemes: the default one, see Refs. [40,43], and the original CS scheme as advocated in Refs. [36,44,45]. They differ mainly in their treatment of the transverse recoils. The distribution of the longitudinal recoil momenta is effectively the same in both schemes, as documented in the top panel of Fig. 9 where we show $A_{FB}$ as a function of $|\Delta y|$ (exhibiting all of the characteristics described earlier; see also footnote 9). We see again that the bulk of the asymmetry is already produced by the one(two)-emission(s) showers, and similarly for $A_{FB}(p_{T,t\bar t})$ displayed in the lower panel of Fig. 9. The $p_T$-dependent asymmetry function is the prototype of distributions discriminating clearly between the two recoil strategies. The original CS scheme is more like that in the Catani-Seymour NLO calculational scheme [44]: the (transverse) recoil from a gluon emitted off a $qt$ configuration (and likewise off a $\bar q\bar t$ one) is compensated by the top quark, regardless of its role in the emission process (emitter or spectator). The prediction given by the original scheme therefore remains flat at about −15% for high $p_{T,t\bar t}$ while the default prediction levels off close to zero from below. In the default scheme, the gluon recoil is instead divided over the entire set of final-state partons. This requires an additional transverse boost combined with a rotation that in turn washes out the radiation imbalance between the forward and backward regions for very large $p_{T,t\bar t}$.

Dependence on shower model: PYTHIA
For processes like tt production, which do not contain any QCD jets at the Born level, both PYTHIA 6 and PYTHIA 8 use so-called "power showers" [46] to populate the ttj phase space. Since the LL splitting kernels generally represent an overestimate in the region of very hard jets, a factor that suppresses such emissions has been introduced in the p ⊥ -ordered showers in both PYTHIA 6 and PYTHIA 8, similar in spirit to a matrix-element correction [47] but with a much simpler analytical structure. In PYTHIA 8, the suppression factor is derived from universal t-channel arguments [34] and does not depend on the colour structure of the event; it therefore does not contribute to the generation of any tt asymmetry. In PYTHIA 6, the suppression factor is [48]

\[ P_{\mathrm{accept}} = \min\left(1,\; P_{67}\,\frac{s_D}{4\,p_{T\,\mathrm{evol}}^{2}}\right) , \tag{4.3} \]

where P 67 corresponds to the parameter PARP(67) in the code, p T evol is the evolution scale for the branching, and s D is the invariant mass squared of the radiating parton with its colour partner, with all momenta crossed into the final state (i.e. it is ŝ for annihilation-type colour flows and −t for an initial-final connection). This is motivated partly from studies of similar factors in the context of "smooth ordering", introduced in [49]. Finally, in the Q 2 -ordered shower model in PYTHIA 6 [30], a veto is placed on the emission angle, which depends explicitly on the direction of the colour partner.

Footnote 9: The A FB (m tt ) are also broadly unaffected by the scheme change; similarly the A …
To illustrate the effect of these choices on the asymmetry, we show the dependence of the asymmetry on p T,tt for five PYTHIA 6 tunes in Fig. 10.
The Q 2 -ordered D6T tune includes the explicit angular cut on the emission angle mentioned above, which a priori should produce an effect qualitatively similar to that of the angular-ordered showers in HERWIG++. However, in PYTHIA's Q 2 -ordered shower, the effect is amplified, as follows. When an initially massless ISR parton evolves to become a massive jet, its virtuality is generated by reducing its momentum while keeping its energy unchanged. The momentum of whatever the ISR parton is recoiling against (here the tt system) is then also reduced to ensure momentum conservation. Thus, to get a tt pair with a certain p T , the Q 2 -ordered shower must first radiate a massless ISR parton with an initially much larger value of p T , which is then reduced when that parton acquires a virtuality. The fact that this compresses the p T spectrum is well-known from the Drell-Yan case (cf. e.g. [42]) and can be counteracted by choosing a low renormalization scale for α S (as has been done in D6T, which uses µ R = 0.45 p T ), thus bringing the p T spectrum itself back into rough agreement with other models, as illustrated in the left-hand panel of Fig. 10. The asymmetry spectrum, however, remains compressed, as shown in the right-hand panel.
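The compression of the p T spectrum described above follows from simple kinematics: at fixed energy E, a parton that acquires a virtuality (mass) m has momentum |p| = sqrt(E² − m²) < E. The following sketch illustrates this under the additional simplifying assumption, made here only for illustration, that the direction of the parton is unchanged, so that its p T shrinks by the same factor as |p|. It does not reproduce PYTHIA's actual kinematics reconstruction, and the function names are hypothetical.

```python
import math


def momentum_at_fixed_energy(energy, virtuality):
    """|p| of a parton of given energy once it carries a timelike mass `virtuality`."""
    if virtuality > energy:
        raise ValueError("virtuality cannot exceed the energy")
    return math.sqrt(energy**2 - virtuality**2)


def compressed_pt(pt_massless, energy, virtuality):
    """pT after the parton becomes massive, ASSUMING its direction is unchanged."""
    return pt_massless * momentum_at_fixed_energy(energy, virtuality) / energy


# Illustrative numbers (GeV): a 100 GeV parton emitted with pT = 50 GeV
# that later acquires a virtuality of 60 GeV.
p_after = momentum_at_fixed_energy(100.0, 60.0)
pt_after = compressed_pt(50.0, 100.0, 60.0)
print(p_after, pt_after)
```

In this toy picture the recoiling tt system ends up with the reduced p T , so a given final p T,tt requires a harder initial massless emission.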
Among the four p ⊥ -ordered tunes, the Perugia 0 (P0), Perugia HARD (PHARD), and Perugia SOFT (PSOFT) ones [48] use the suppression factor defined by Eq. (4.3), while Z1 does not apply any suppression. The central Perugia 0 (P0) tune uses P 67 = 1. This generates an asymmetry that begins to turn on at roughly p T = m t /2. The HARD and SOFT variations use P 67 = 4 and P 67 = 1/4, respectively, which shifts the turn-on point accordingly. Neither the HARD nor the Z1 tune exhibits any significant asymmetry in the region plotted here, similarly to the case in PYTHIA 8.
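The dependence of the turn-on point on P 67 can be made explicit with a short sketch. It assumes that the PYTHIA 6 suppression factor has the form P accept = min(1, P 67 s D / (4 p T evol ²)), so that suppression sets in above p T evol = sqrt(P 67 s D )/2. Taking s D = m t ² as a representative value for an initial-final connection, and m t = 173 GeV, are illustrative assumptions; the function names are hypothetical and are not part of the PYTHIA code.

```python
import math

M_TOP = 173.0  # GeV; illustrative top mass (assumption)


def p_accept(p67, s_dipole, pt_evol):
    """Acceptance probability, assuming the form min(1, P67 * s_D / (4 pT_evol^2))."""
    return min(1.0, p67 * s_dipole / (4.0 * pt_evol**2))


def suppression_onset(p67, s_dipole):
    """pT_evol above which p_accept drops below one."""
    return 0.5 * math.sqrt(p67 * s_dipole)


# PARP(67) values quoted in the text for the three Perugia variations.
for tune, p67 in [("Perugia SOFT", 0.25), ("Perugia 0", 1.0), ("Perugia HARD", 4.0)]:
    onset = suppression_onset(p67, M_TOP**2)
    print(f"{tune}: onset at pT_evol = {onset:.1f} GeV = {onset / M_TOP:.2f} m_t")
```

With these assumptions the onset is m t /4, m t /2 and m t for the SOFT, central and HARD settings, consistent with the turn-on near m t /2 quoted for Perugia 0. Since s D depends on the colour partner, emissions at the same p T evol can be suppressed differently for different colour flows.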

Summary and implications
The studies presented above arose from the initially surprising observation that Monte Carlo event generators can produce non-zero forward-backward asymmetries in top pair production, even when treating the relevant subprocess qq → tt at leading order, which has no such asymmetry. Our studies show that these asymmetries arise from valid physics built into generators with coherent parton or dipole showering. While not quantitatively correct in every detail, the coherent showering approximation captures essential features of the physics not hitherto well understood, which may serve as a guide to the contributions of higher orders.
The generated asymmetries are of two kinds. First, in the differential cross section at non-zero transverse momentum of the top pair, a negative asymmetry results from the extra QCD radiation emitted when the top quark is produced backwards in the rest frame of the pair. This effect is manifest in the qq → ttg matrix element and is present in the generators in the soft approximation, with a colour coefficient that is exact in the large-N limit but 60% too large at N = 3.
In fixed-order perturbation theory, the asymmetry at non-zero p T of the pair tends to zero from below as p T → 0. However, precisely at p T = 0 there are singular virtual contributions that lead to a positive overall inclusive asymmetry which grows with increasing invariant mass of the pair. The event generators perform an approximate all-order resummation of perturbation theory, which smears out the singular contributions at p T = 0 over a finite Sudakov region, and so the asymmetry changes sign at some point and becomes positive at small p T . The precise switching point is sensitive to finite terms and higher-order corrections not included in the generators, but the change of sign is a striking general prediction that should be investigated experimentally. It is also worth noting that, because of this switch, a bias towards low p T , or against extra jet production, in the method used to reconstruct the tops could lead to a significant upward shift in the measured inclusive asymmetry.
The other type of generated asymmetry is an overall inclusive one, positive and growing in value with increasing invariant mass of the pair. In fixed-order perturbation theory, such an asymmetry appears at order α S relative to the Born process and is due to a positive asymmetry in the virtual correction, which dominates over the negative contribution of real emission discussed above. The event generators implicitly contain virtual corrections, in the form of the Sudakov factors that drive the showers and produce p T smearing. These factors are a reflection of unitarity, which implies that showering cannot change the inclusive cross section from the value established by the primary subprocess. One might therefore think that the inclusive asymmetry also could not change from zero when the primary process is symmetric. However, that is not the case, because the asymmetry is not an inclusive cross section and so is not protected by unitarity.
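A toy bookkeeping example makes the last point concrete. It is only a sketch with invented numbers, not the paper's Eq. (3.5): a symmetric LO sample is showered with a larger emission probability (one minus the Sudakov factor) for backward tops than for forward tops, and a fixed fraction of the events with an emission is moved to the opposite hemisphere by recoil. All names and values are hypothetical.

```python
def shower_toy(emit_fwd, emit_bwd, migrate):
    """Toy shower acting on a symmetric LO sample (half forward, half backward).

    emit_fwd, emit_bwd : emission probabilities for LO forward/backward tops
    migrate            : fraction of emitting events whose top changes hemisphere
    """
    lo_fwd = lo_bwd = 0.5
    # Events without emission stay in their hemisphere (pT of the pair ~ 0).
    noem_fwd = lo_fwd * (1.0 - emit_fwd)
    noem_bwd = lo_bwd * (1.0 - emit_bwd)
    # Events with emission: a fraction `migrate` crosses to the other hemisphere.
    em_fwd = lo_fwd * emit_fwd * (1.0 - migrate) + lo_bwd * emit_bwd * migrate
    em_bwd = lo_bwd * emit_bwd * (1.0 - migrate) + lo_fwd * emit_fwd * migrate

    def afb(fwd, bwd):
        return (fwd - bwd) / (fwd + bwd)

    return {
        "total": noem_fwd + noem_bwd + em_fwd + em_bwd,
        "afb_no_emission": afb(noem_fwd, noem_bwd),
        "afb_with_emission": afb(em_fwd, em_bwd),
        "afb_inclusive": afb(noem_fwd + em_fwd, noem_bwd + em_bwd),
    }


toy = shower_toy(emit_fwd=0.3, emit_bwd=0.4, migrate=0.1)
toy_no_migration = shower_toy(emit_fwd=0.3, emit_bwd=0.4, migrate=0.0)
print(toy)
print(toy_no_migration)
```

In both cases the total stays at one, as unitarity requires. The asymmetry is positive without emission and negative with emission, while the inclusive asymmetry is non-zero only when the migration fraction is non-zero.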
In fact, the same effect that generates a negative asymmetry at non-zero p T , namely the extra radiation in backward top production, tends to produce a positive inclusive asymmetry. As expressed by Eq. (3.5), it arises from the difference between the Sudakov factors for forward and backward top production and the migration of recoiling top quarks between hemispheres. Thus it is a combination of real and virtual effects, which is of relative order α S , because the difference of Sudakov factors is of that order while the forward and backward migration probabilities are pure numbers, modulo higher-order corrections, with magnitudes that depend on how the event generator treats recoils. The fact that recoil strategies based on colour flow produce inclusive asymmetries, similar to the full fixed-order one, suggests that the asymmetry can be regarded as arising in this way. Such a viewpoint could serve as a guide towards a more correct treatment of recoils, and conversely as an indication of the possible effects of higher orders beyond the range of explicit calculations.

We believe that these findings have important implications for the interpretation of the experimental data. At the very least, one needs to be aware that the available event generators can produce significant asymmetries where none were previously expected. Monte Carlo estimates of corrections to asymmetries could be affected by this, particularly corrections to "parton level". Moreover, these corrections will likely be model-dependent, as documented by the detailed parton-shower comparison presented here. The results depend on the way colour coherence is implemented in the various codes, which we have summarized at the end of Section 4.2.
On a more theoretical level, the fact that these asymmetries are due to recoils points to the importance of recoil effects, which are often neglected in estimates of higher orders based on soft gluon resummation.
There are clearly many directions in which the studies presented here could be extended. The effects of asymmetries produced or enhanced by parton showering in generators that match to an NLO calculation, such as MC@NLO and POWHEG, need to be assessed, as do those in schemes that match to LO multi-parton matrix elements. Analogous effects will also be present in the tt charge asymmetry, currently being investigated at the LHC, and in other processes where the colour flow of the primary process affects parton showering.

Figure 11: (Left) SHERPA CSSHOWER predictions for the p T,tt distribution using different LO generation and shower modes. Dashed lines correspond to results taken from one-emission ("1 em") showers, while solid lines depict those after complete showering. Blue and green lines show the outcomes under constrained LO tt phase-space generation, to the forward ("fwd", ∆ỹ > 0) and backward region, respectively. (Right) SHERPA CSSHOWER results for A FB (p T,tt ) obtained under scale choice variations (m ⊥,t -like, i.e. default versus m t ) combined with simultaneous variation of µ R and µ F , which was also applied to the parton showering.

A. Appendix: Additional SHERPA CSSHOWER studies
Migration tests. In the left part of Fig. 11 the radiation imbalance between "fwd" and "bwd" initial dipole configurations, seen in Fig. 8, is clearly documented by the respective predictions for the p T distribution of the tt pair. The LO configurations emerging from the "fwd" (∆ỹ > 0) phase space generate a more steeply falling p T spectrum with respect to that produced by the "bwd" (∆ỹ < 0) generated dipoles. The turn-over already appears at p T,tt ∼ 10 GeV, slightly below the turn-over found in A FB (p T,tt ), cf. Figs. 9 (bottom) and 11 (right). The upward shift is a result of the migration in ∆y.
Another way to express the results of Fig. 8 uses the after-showering forward (∆y > 0) cross section fractions, which we define as

\[ r^{(\mathrm{cut})}_{\mathrm{fwd}} = \frac{\sigma^{(\mathrm{cut})}_{\Delta y>0}}{\sigma^{(\mathrm{cut})}} . \]

Tab. 2 shows them for different before-showering tt rapidity-difference regions, listed in ascending order of r fwd . This summarizes, on a more quantitative level, all earlier findings: considerably larger migration in the − → + direction than in the opposite one (rows 1, 2 versus 4, 5); a factor ∼ 3 increase in migration in both directions for harder emissions (p T,tt > 50 GeV); and milder migration in the low-p T region than in the high-p T region. The increasing p T -veto efficiencies reflect once more the radiation imbalance between forward and backward phase spaces. Focusing on the near-transition regions (rows 2 and 4), we observe an enhanced migration activity with respect to that found for both hemispheres (rows 1 and 5). This nicely confirms the locality assumption, namely that showering shifts ∆y only slightly away from ∆ỹ.
Table 2 (caption): Forward cross section fractions r (cut) fwd = σ (cut) ∆y>0 / σ (cut) . The LO cross section σ is 4.94 pb, dropping to 0.955 pb under the influence of the rapidity-difference generation cuts 0 ≤ ±∆ỹ ≤ 0.4 applied to LO tt hadroproduction. For the low-p T region, p T,tt < 50 GeV, the cut efficiencies are stated explicitly.
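As a minimal sketch of how a forward fraction of this kind could be evaluated from an event sample, the routine below selects events by their before-showering ∆ỹ and returns the weighted fraction that ends up with ∆y > 0 after showering. The event format (∆ỹ, ∆y, weight), the function names and the toy numbers are illustrative assumptions. If every selected event has ∆y ≠ 0, the corresponding asymmetry follows as A FB = 2 r fwd − 1.

```python
def forward_fraction(events, dy_min, dy_max):
    """Weighted fraction of selected events with delta_y > 0 after showering.

    events         : iterable of (dy_before, dy_after, weight) tuples (assumed format)
    dy_min, dy_max : before-showering generation cut on the rapidity difference
    """
    selected = [(after, w) for before, after, w in events
                if dy_min <= before <= dy_max]
    total = sum(w for _, w in selected)
    forward = sum(w for after, w in selected if after > 0)
    return forward / total


def asymmetry_from_fraction(r_fwd):
    """A_FB implied by a forward fraction, assuming no events at delta_y = 0."""
    return 2.0 * r_fwd - 1.0


# Toy sample (invented numbers) generated in the backward region -0.4 <= dy <= 0;
# one of the four events migrates to the forward hemisphere.
toy_sample = [(-0.30, -0.20, 1.0), (-0.10, 0.05, 1.0),
              (-0.20, -0.25, 1.0), (-0.35, -0.30, 1.0)]
r_toy = forward_fraction(toy_sample, -0.4, 0.0)
print(r_toy, asymmetry_from_fraction(r_toy))
```

Applying an additional cut on p T,tt before computing the fraction would give the low- and high-p T variants discussed in the text.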
Scale variations. We have checked the scale dependence of the SHERPA CSSHOWER predictions, which we illustrate for A FB (p T,tt ) in the right panel of Fig. 11. The impact of using the default, m ⊥,t -like scale choice (m 2 ⊥,t = m 2 t + p 2 T,t ) versus a fixed m t scale, and varying the renormalization and factorization scales simultaneously in each case by factors of 2, is seen to be of the order of ±20% at intermediate p T,tt . This order of magnitude is consistent with that expected for a leading-order quantity, from variation of the scales and PDFs in the matrix element and parton showers, and is generally less than the variation due to different ways of treating recoils, or assessing the p T,tt distributions themselves.
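The two central scale choices and their variation can be written down in a few lines. This is a sketch only: it assumes that "factors of 2" means multiplying the central scale by 1/2 and 2, with µ R and µ F kept equal, and the function names and numerical inputs are illustrative rather than SHERPA settings.

```python
import math


def transverse_mass(m_top, pt_top):
    """m_T of the top quark, from m_T^2 = m_t^2 + pT_t^2."""
    return math.sqrt(m_top**2 + pt_top**2)


def simultaneous_variations(central_scale, factors=(0.5, 1.0, 2.0)):
    """(mu_R, mu_F) pairs varied together around a central scale (assumed factors)."""
    return [(f * central_scale, f * central_scale) for f in factors]


m_top = 173.0   # GeV; illustrative value
pt_top = 100.0  # GeV; illustrative value

dynamic_scales = simultaneous_variations(transverse_mass(m_top, pt_top))
fixed_scales = simultaneous_variations(m_top)
print(dynamic_scales)
print(fixed_scales)
```

The dynamic choice grows with the top-quark p T while the fixed choice does not, which is why the two can differ most at large transverse momentum.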