Streamlining resummed QCD calculations using Monte Carlo integration

Some of the most arduous and error-prone aspects of precision resummed calculations are related to the partonic hard process, having nothing to do with the resummation. In particular, interfacing to parton-distribution functions, combining various channels, and performing the phase space integration can be limiting factors in completing calculations. Conveniently, however, most of these tasks are already automated in many Monte Carlo programs, such as MadGraph, Alpgen or Sherpa. In this paper, we show how such programs can be used to produce distributions of partonic kinematics with associated color structures representing the hard factor in a resummed distribution. These distributions can then be used to weight convolutions of jet, soft and beam functions producing a complete resummed calculation. In fact, only around 1000 unweighted events are necessary to produce precise distributions. A number of examples and checks are provided, including $e^+e^-$ two- and four-jet event shapes, $n$-jettiness and jet-mass related observables at hadron colliders. Attached code can be used to modify MadGraph to export the relevant leading-order hard functions and color structures for arbitrary processes.


Introduction
One of the main goals of collider experiments like the Large Hadron Collider (LHC) is to detect or bound phenomena beyond the Standard Model. Of course, to find deviations from the Standard Model, we must understand the predictions of the Standard Model itself. As the LHC analyses exhaust the limits of what can be learned from clean events with leptons and photons, the need for precise predictions of more sophisticated observables involving jets and hadronic event and jet shapes grows. The usual approach in processes too complicated for analytic predictions is to use Monte Carlo (MC) simulations. However, these MC simulations are generally limited to leading-log resummation. Soft Collinear Effective Theory (SCET) [1][2][3][4] provides a systematically improvable framework for resumming the soft and collinear logarithms in QCD. It has been used to compute cross sections for observables to next-to-leading logarithmic order (NLL), NNLL and even NNNLL. SCET has been applied to a variety of jet-related observables in different processes at colliders, from thrust [5,6] to jet mass [7,8] or the Higgs cross section with a jet veto [9,10].
Resummed computations have been performed for processes with 2 jets (as in e + e − colliders), 3 jets (or 2 incoming beams and 1 outgoing jet) [7,8,11], or 4 jets in special cases [12][13][14]. For processes with many jets, the phase space can become prohibitively difficult to integrate. For instance, in [12], the authors compute a distribution for a 4-parton observable, but only at a single phase space point. Many interesting observables have 4, 5, or even more jet or beam directions. For example, jet mass in dijet events has been measured [15] but not yet computed. Another example is 2-subjettiness [16], useful for telling boosted W or Higgs bosons from QCD background. The signal distribution has been computed at NNNLL level [17], but the background, which involves pp → X + 2 collinear jets, has eluded calculation so far.
There are basically three steps in computing a resummed distribution. First, one needs a factorization formula indicating the relevant objects (usually hard, jet, soft and beam functions) which dominate the distribution in a certain threshold limit. Second, to get to a particular accuracy, one needs the fixed-order computation of these objects and the relevant anomalous dimensions. These first two steps comprise the intellectual meat of the prediction. The third step is to put it all together and get numbers out. Unfortunately, this last step can take years. Anyone who has worked on precision calculations knows the pain of getting factors of two right, hammering out the small coding errors, and cross-checking against other calculations. Although the past decade has seen the complete automation of the numerical step for fixed-order calculations with public codes, resummed calculations are still often done with Mathematica on a user's personal computer. In this paper, we show that much of the tedium of the numerical step can be automated in resummed calculations as well. We propose a way to combine analytic resummation with a numerical computation of the leading-order hard function and phase space integrals using existing MC generators.
Distributions of observables in SCET are given by the product of a hard function with a convolution of jet, soft and beam functions, integrated over phase space. We can write this as dσ dO = channels colors I,J Where Φ is the phase space, H is the hard function, and F is the combination of jet, soft and beam functions. IJ indexes the matrix structure of the objects in color space. The basic idea is that the phase space integral dΦ and the hard function (which at leading order is just the tree-level squared matrix element for partonic scattering) are computed numerically with a MC generator. The rest is computed analytically. Since MC generators sample phase space proportionally to squared matrixelements, roughly speaking we simply make a MC sample, weight each point Φ by F IJ (Φ, O), and sum them to get the final differential cross section dσ dO (O). For hadronic collisions, parton-distribution functions are included in the H IJ , since they are handled efficiently by the MC. Then a correction factor is added to change the scale at which the PDF is evaluated or to turn the PDF into a beam function.
In addition to computing the difficult integrals, this method is nicely extensible and automatable. The resummed function F IJ (Φ, O) is universal in the sense that it does not depend on the partonlevel process, only the number and type of the jets. This allows the whole computation to be mostly automated. After the anomalous dimensions have been calculated, F IJ (Φ, O) can be calculated once and then the distribution can be generated automatically for any partonic process. For example, the calculation of jet mass in dijet events or dijet-plus-photon events involves the same function F IJ (Φ, O).
The method we propose is distinct from previous approaches to semi-automated resummation. The Caesar framework performs resummation of hadronic event shapes at NLL in an entirely automated fashion [18,19]. Geneva [20][21][22] attempts to produce fully differential distributions which reproduce certain observables at a given order. The connection between resummation in event generators and SCET has been explored in [23,24]. Recently, distributions for processes with a jet veto have been computed by interfacing with MadGraph at NLO [25]. Some differences with our work are that Ref. [25] exploited MadGraph to compute the NLO hard function, allowing for NNLL resummation, and that Ref. [25] only considered observables for which there was no color evolution. With color singlet final states Ref. [25] did not have to modify MadGraph to provide the color structures necessary to the evolution. In addition to showing that MadGraph can be used to provide color structures, we also demonstrate averaging distributions for each hard phase point (like dσ dT4 ) rather than using only single values of an observable at each point (like the W + W − transverse momentum).
To demonstrate the power of the method, in this paper we provide the first calculation of 2jettiness in pp → 2j + γ at NLL. We also explore various features and checks of related distributions such as thrust and 4-jettiness in e + e − collisions. All the results in this paper at at NLL. Going beyond NLL is straightforward using this method, and discussed in the conclusion.

Overview
The focus of this paper is the resummation of jet-based observables to higher order than that provided by current parton-shower based Monte Carlo event generators. The observables O for which resummation is useful typically have singular distributions at fixed order in perturbation theory. For example, in perturbation theory, the distribution of jet mass blows up for small mass. The singularities appear as large logarithms arise in the cumulative cross section typically of the form α n s ln m O, with m ≤ 2n. Formally, Monte Carlo event generators get only the leading singularities, with m = 2n correct. We are interested in extending the resummation of logarithms for such observables to higher order in a systematically improvable way using the framework of SCET.
Calculations in SCET proceed through the separation of a full calculation for a given process by way of a factorization theorem into contributions from sectors that are separately calculated and then assembled to provide a final result. A general framework for n-jet events schematically takes the form [26] dσ Here, σ 0 is the total leading-order cross section, dΦ is the differential phase space for a final state of N hard partons along with any other uncolored final-state particles X, H IJ is the hard function, encoding the Wilson coefficients providing information about the short-distance scattering process. J i and B a are jet and beam functions which describe the collinear evolution of final and initial state partons respectively, and S IJ is the soft function describing the crosstalk of soft QCD radiation between these collinear sectors. I and J are color-structure indices. All the terms involving beam functions in this and subsequent equations will be absent in the case of leptonic initial states. The differential distribution of a given observable is produced by integrating the differential cross section against a measurement function. In order to effect the resummation of large logarithms, it is necessary to evolve each of the pieces in Eq. (2.1) from their natural scales to a common scale using the renormalization evolution equation of SCET. Thus the resummed distribution for the relevant observables takes the following form.
where the U i (µ, ν) are evolution kernels encoding the resummation of large logarithms. (For a recent review with more details, see Ref. [27].) For the observables of interest, everything in this formula after H IJ (Φ, µ h ) can be computed analytically. The important point for this paper is that the hard function H IJ (Φ, µ h ) and hard-parton phase-space integrals dΦ are exactly what many existing MC codes are designed to calculate. We can thereby combine MC calculations with analytic resummation to efficiently compute the differential distribution. Compressing the notation even more results in the heuristic formula in Eq.
The procedure we propose is then 1. Use MadGraph to produce events with a probability proportional to HdΦ.
2. Compute the function F IJ (Φ, O) analytically as a function of the phase space point Φ (it is a matrix in color-structure space, indexed by IJ). At each point The result is the prediction for the resummed distribution. The function F IJ (Φ, O) is both analytically computable and general in the sense that it depends only on the number and type (quark vs gluon) of the jets. Thus, for instance, it needs to be computed once for the 4-jettiness observable, and then can be applied to 4-jettiness in e + e − → 4j, e + e − → γ +4j, e + e − → Z + 4j and any other 4-jet process, even when the phase space is very different. With suitable universal modifications, crossing 2 partons to the initial state allows the same results to be used to compute pp observables with 2 jets in the final state.
Normally, MCs produce only distributions of momenta. To perform the resummation, we also need to know the color structure of the hard colored particles. This information is already computed by the MCs, so extracting it is just a matter of modifying the code to include more information in the Les Houches Event Format (LHEF) file [28,29]. We do this by adding a comment to the event. For example an e + e − → udūd event might look like this: The middle six lines give the particle momenta, which we might draw as . The colordecomposition block at the end of the event specifies the entries of the hard function and the color structure, respectively, discussed in more detail in Sec. 3.
For each event, we then use the momenta to determine the directions of the Wilson lines in the soft function, and the kinematic variables on which the RGE evolution kernel for the hard function depends. With the directions fixed, we can then compute the differential distribution of the observable at that phase space point.
For an observable like thrust in e + e − , the distributions for each phase space point are the same (since each point has back-to-back quarks). More generally, the averaging is nontrivial and important.

The hard function numerically from MadGraph
In order to compute resummed observables the hard and soft functions need to be evolved using the RG evolution equations. In a given channel, if multiple color structures are allowed the evolution of the Wilson coefficients will mix these structures as a function of scale. In certain cases, such as electroweak production with a jet veto [25], only a single color structure contributes at Born level, and evolution can be effected by simply reweighting the tree-level matrix element evaluated by a MC generator.
In contrast, for hadronic collisions with jets in the final state the color structures are nontrivial. Even for a channel like qq → qq , described by a single Feynman diagram at tree level, multiple (in this case two) color structures mix during RG evolution. All the necessary ingredients for computing the function F IJ (Φ, O) are often available in the literature. As an example, for all fully-hadronic 2 → 2 processes these were presented in Ref. [13]. Unfortunately, the most natural color basis for presenting these results and carrying out the resummation is often not the basis in which MC generators carry out their calculations. Moreover, even if this were not the case, the details of the calculation are intentionally hidden from the end user. MadGraph by default returns only events from a sampling of the process phase space, and by the time these events are generated, the internal code itself only has access to the spin-and color-averaged squared matrix element, so that all color information is obscured.
We have implemented a modification of the MadGraph evaluation of matrix elements that allows us to retain this information. MadGraph evaluates squared matrix elements by picking a point in phase space, and evaluating all wavefunctions, propagators, and vertices for the given kinematics and every choice of color structure. This matrix element is then contracted with an appropriate color matrix, yielding a single number. The contributions from all helicity configurations are then added together. Schematically, for phase space point Φ, this corresponds to where I, J index the color structures of the process, and we write the matrix encoding the color contraction as S tree IJ since, up to a rescaling, it in fact corresponds to the tree-level soft function for some choice of basis in color space. For our purposes, the shortcoming of this procedure is that all color information is traced over before spin states are summed over, so that there is no way to extract the value of the hard function for each color structure.
The solution to this is to simply change the order of summation, so that MadGraph sums over helicity states before contracting the colors. This requires the introduction of an auxiliary matrix, The squared matrix element is then simply while H IJ tree (and S tree IJ ), passed along with the rest of the event information, contains the necessary information to implement the resummation and then contract the color structures in order to implement the necessary event reweighting. Sometimes, passing the full matrix H IJ tree can be unnecessary. If only a single helicity structure contributes to a process, H IJ tree can be trivially decomposed as C I † C J . This can sometimes be done even when multiple helicity combinations are non-zero, due to various identities making helicities of a given color ordering linearly dependent at tree-level [30][31][32][33][34]. In these cases C also acts as the Wilson coefficient matching the SCET operators onto full QCD at leading order in O(α s ). Since this is not possible for all processes, however, a fully general code should not implement such a decomposition. After modifying MadGraph's matrix element calculation procedure as described above, as events are generated H IJ tree and S tree IJ are also appended to the normal data stored for each event in the generated LHEF files. This information is then readily accessible in order to implement the resummation.
Beyond extracting the color information from MadGraph, it is necessary to understand the basis in which this information is presented, so that the values of the right operators are evolved by RG evolution at every phase space point. In this section, the structures needed for the observables we compute in this paper are described. Details of how MadGraph organizes the color information for arbitrary processes are provided in App. B.
Four-quark matrix elements are computed in MadGraph decomposed in the color dipole basis. For concreteness, we discuss the qq qq operator, for which MadGraph uses the basis Because the tree level process is proportional toqT a qq T a q = − 1 The basis in which the SCET resummation formulas we use in F IJ (Φ, O) have been computed is different from the basis above. For example, for the qq qq channel, Ref. [13] uses the singlet/octet basis: This is related to the MadGraph basis by Fierz identities, which are straightforward work out. Since we have thus far restricted ourselves to processes with at most four colored states in the matrix elements at a time, we have computed these transformations explicitly, but they can be easily automated for higher-multiplicity states. For this channel, for instance, we see that The fixed-order fully-averaged matrix element must be the same in every basis, and so we must transform H IJ tree and S tree IJ with opposite senses in order to maintain this property. Additionally, this provides a sanity check since then S tree IJ must then correspond to the tree-level soft function of the SCET calculations we use as inputs. The hard function can then be transformed to the preferred basis for resummation by acting on the matrix output provided by MadGraph as H F = R H MG R † , while the color matrix provided by MadGraph transforms into the soft function in the same basis as It is worth noting that this basis transformation is non-unitary, so we must be careful about when to use its inverse and when to use its adjoint.
The situation with qqgg operators is simpler, since both MadGraph and our formulas use the same basis, so no rotation is necessary. However, an additional operator appears in the RG evolution through mixing with operators generated at tree-level. We thus need to embed the hard function returned by MadGraph as a submatrix into one also including the operator (3.11) in its basis for the resummation. So in practice we take a matrix from MadGraph, computed from the decomposition of the matrix element into its preferred color basis, convert it to the SCET basis using a rotation matrix which is computed from the color basis information for the given process provided by MadGraph, and possibly supplement the tree-level basis with additional operators involved in the resummation. We also keep track of the color matrix MadGraph generates associated with the hard function entries to weight individual phase space points properly.
A note on the normalization of the MC sum is in order. MadGraph produced events Φ i with weights w i (which might be the same if we are using unweighted events, or different if we are using weighted events). It produces them such that they approximate the integral over phase space for any function f : using Eq. (3.3). The SCET calculation (Eq. (2.2)) needs us to calculate: Thus, according to Eq. (3.12), the function f which we need to evaluate at each phase space point (and then sum according to the weights) is:

Everything else analytically with SCET
We have discussed how to use MadGraph to produce the function H IJ (Φ) in Eq. (1.1) numerically. This is to be combined with analytical calculations of the function F IJ (Φ, O) using SCET.

e + e − observables
To begin, consider the observable thrust in e + e − collisions. There, the phase space Φ is trivial (all tree-level quark pairs are back-to-back with half the center-of-mass energy), and there is only one color structure so F IJ (Φ, O) = F (τ ). For thrust, we have an analytical closed form expression for F (τ ) [6]: See [6] for the definitions and perturbative expression for S(ν, µ), A(ν, µ) and η i as well as precise expressions for the fixed order jet-and soft-functionsj ands T . 1 Taking a step backwards, this F (τ ) is calculated as the product of evolution kernels and a convolution of fixed-order jet and soft functions. Let us write it as Here the U i (ν, µ) are RG evolution kernels. For example, the hard evolution kernel is In Eq. (4.2), J(µ, τ ) is a jet function and S T (µ, τ ) is the thrust soft function, both of which are included a fixed order. The factor F c in Eq. (4.2) is a correction factor, compensating for the parts of the hard function that MadGraph does not compute. One contribution to F c comes from the fact that MadGraph is tree-level, but for NLL resummation, we need the large-logarithmic parts of the 1-loop hard function. The hard function for thrust can be written as The tree-level hard function, H tree is computed by MadGraph. The rest has to be included in F c . For thrust, H tree = 1 and More generally, for NLL, only the 1-loop logarithms are included in H(Q, µ h ) and in F c . Thrust in e + e − events has H tree = 1. For other process, the hard function at tree-level is nontrivial. For example, for 3-jet event shapes (such as 3-jettiness) based on e + e − → qqg, the hard function is the tree-level differential cross-section: Rather than using the analytic form for this hard function, we use MadGraph to sample phase space according to it. At NLL, the 1-loop corrections to H are proportional to H tree , so the correction factor H Htree = 1 + α s (· · · ). For more complicated e + e − processes, closed form expressions for the cross sections are neither available nor necessary. For a generic process, the leading order cross sections will be proportional to α n s (µ h ), where n is the number of factors of g s in the tree-level Feynman diagrams. Because n is fixed, we do not have to vary µ h within MadGraph. Instead, we can run MadGraph with a fixed scale µ R and compensate for this scale in F c . Thus, for e + e − observables, we take This first factor in F c is how we include hard scale variation in our examples. Alternatively, one could run MadGraph separately at µ R , 2µ R and 1 2 µ R , but varying the scales through F c is easier since MadGraph only has to be run once. Note that this correction factor is in addition to the evolution kernel U (µ h , µ) which is always included as part of F (Φ).

Hadron collisions
For hadron collisions, there can be two additional features. First, in MadGraph, the incoming parton momenta are not fixed, but chosen according to parton distribution functions. Second, factorizable observables in hadron collisions often involve beam functions. We will now explain how to handle these two new features.
The inclusion of parton distributions in the hard function is straightforward. Although the incoming parton momenta are not the same for each event, we can still use the hadron collision events from MadGraph exactly like e + e − collision events. The only difference is that if we want to change the scale from the factorization scale µ R chosen by MadGraph, we must multiply and divide by PDFs. So the correction factor becomes Note that we know x 1 and x 2 from the hard kinematics on an event-by-event basis, so this factor is just a number at each phase space point (not a convolution). Also, note that although it is standard to choose the renormalization and factorization scales equal in MadGraph, we can separate them afterwards for the analytical part of the calculation.
For certain observables, such as boson spectra at high p T or threshold hadronic event shapes, there is no collinear radiation in the direction of the beams by construction. Alternative observables, like 2-jettiness, do not require a threshold expansion and control the collinear radiation in the beam direction through beam functions. Heuristically, with beam functions, This seems dangerous, since beam functions mix up information about the PDFs (which MadGraph was handling before) and jet-related information (which SCET was handling before). However beam functions are a simple convolution of these two pieces of information, so they can still be separated. As explained in [35], beam functions can be written as convolutions of PDFs and perturbatively calculable functions I ij : Because of this, the PDFs must be evaluated at scales ξ different from the scales x 1 and x 2 where MadGraph evaluates them. This is a complication, but not an obstruction. We simply calculate the beam functions numerically, then divide by the PDFs.
To be concrete, for n-jettiness, which involves two beam functions and n jets, we have with a correction factor Note that for each event in MadGraph, the energy fractions x 1 and x 2 are included in the event record.
These same x i are used in the correction factor, to remove the PDF, and in the beam functions.

Applications
Next we discuss some applications. We begin with the e + e − event shapes thrust and 4-jettiness, then look at pp observables. An obvious leading question is what do we compare to, and how do we know we are doing the calculation correctly? Comparing to data might be great, if there were appropriate data, and obviously that is the ultimate goal. However, data is complicated by many experimental issues and contributions from backgrounds and the underlying event, which we do not include. Instead, we can compare to Monte Carlo event generators like Pythia, which are known to agree pretty well with data in most cases. We can also compare to a full analytical calculation (without the MadGraph matching step) of an observable if that calculation is available.
An advantage of this framework, besides its efficient exploitation of the matrix element computation and sampling, is that one can easily look at the same observable computed using different phase space points. In an inclusive calculation, these points are integrated over, but using our method, we can explore how the shape of the observable changes point-to-point. For example, when all the partons have roughly the same energy, we expect resummation to work well. However, when there are very different scales involved at the parton level, there may be additional large logarithms which we do not resum, but which are resummed at the leading-log level by Pythia. We will discuss these issues as we go through some examples.
In our comparisons, we have switched off Pythia's hadronization and multiple-parton interaction models. We have also set the value of α s in the parton shower to the same value we use in our calculations, and run the coupling at two-loops, as in our analytic calculation. With these features off, for e + e − → 2 jet events, Pythia is reduced to a final-state, timelike-branching parton shower. It is performing essentially the same calculation as SCET and we expect Pythia to correspond to some particular scale choice of SCET. We can look at what this scale is and compare to the scale where the scale uncertainty is smallest in the SCET calculation.
When the processes are complex enough to involve multiple color structures, we expect to find bigger differences between Pythia and SCET, even in the leptonic collisions. For example, in pp → 2j + γ, in the all-quark channel qq → qq + γ, there are two color structures (the first incoming quark can be color-connected to the first outgoing or to the second outgoing). In MadGraph/Pythia, one of the color connections is chosen at random by MadGraph proportional to the relative cross sections. That color connection is then sent to Pythia for showering. In MadGraph/SCET, the entire vector of amplitudes is kept (H IJ (Φ) is a matrix, made up of the outer product of those vectors), and contracted with the SCET function F IJ (Φ, O)) . If the RG running of the hard and soft functions for the different color structures had no cross terms, i.e., H IJ (Φ) were diagonal, then we would expect that Pythia's random sampling and shower (up to a rescaling of the coupling strength [36]) to be equivalent to SCET's averaging even at NLL. However, when the resummation includes off-diagonal terms, then choosing one of the structures according to the diagonal entries loses some information. Thus we expect that in these more complex channels, MadGraph/SCET will be sensitive to the off-diagonal running that MadGraph/Pythia ignores. As this effect first appears at NLL, we do not expect it to be numerically large. It would certainly be interesting to find evidence for this effect in theoretical calculations, and ultimately in data.
One additional complication is that for pp collisions Pythia does not treat the final-state timelikebranching and initial-state spacelike-branching parton showers simultaneously. Pythia implements a parton shower for the initial and final states separately while imposing phase space restrictions to minimize overlap and dead zones in the final multi-body phase space. This is an imperfect proce-  Figure 1: Computation of thrust in e + e − collisions at center of mass energy of 500 GeV. The histogram is from MadGraph matched to Pythia, with hadronization off; the red curve is a purely analytic calculation [17]. The blue is from MadGraph matched to SCET, as discussed here. Not surprisingly, the central value of the blue band agrees exactly the red curve, as the MadGraph part is trivial.
dure, but the imperfections are partially compensated for in hadronized events through the tuning of hadronization parameters. Because of the greater sensitivity to tuning, we correspondingly expect the agreement between MadGraph/SCET and Pythia at the parton level to be worse for hadron collisions than for e + e − collisions.

e + e − → 2 jets
The MadGraph matching step is trivial for e + e − → 2j event shapes like thrust. For these cases, the soft function does not depend on the phase space point, and thus F IJ (Φ, O) does not depend on Φ. This means that the MadGraph sum is trivial, and we should reproduce the analytic thrust results exactly.
For scale choices, we use the natural scales with Q = 500 GeV the center of mass energy. One could use more sophisticated profile functions for scale choices [37], but we use the natural scales for simplicity. In Fig. 1 we show the output of our general framework for thrust. We generate e + e − →qq events with MadGraph. The Pythia histogram is generated by then showering those events through Pythia. For the central value curve, we use the same events and compute thrust for each event with SCET. Varying the scales by a factor of 2 produces the band. The red curve is the purely analytical calculation [5,6] at NLL (without using the MadGraph events). It agrees exactly with the central value of the blue band. Note that the MadGraph/Pythia histogram is slightly off the analytic result. Some discrepancy is not surprising, as we are just running Pythia out of the box and do not try to match scales or optimize the tuning.

e + e − → 4 jets
Next, we consider 4-jettiness in e + e − events. This process is much more complicated than thrust, since it involves multiple color structures, multiple channels and the phase space is too complicated to integrate analytically. Indeed, this is the first NLL resummation of 4-jettiness.
To define 4-jettiness [35,38], we need to partition phase space into 4 sectors J i and assign a lightlike direction n µ i to each sector. Then, 4-jettiness is While the original definition defined the n i with a minimization formula, the partitioning and assignment of n i can be done in different ways that are equivalent at leading power. Since we are interfacing to MadGraph, we take the n i as the parton directions on an event-by-event basis. Note that, unlike thrust, T 4 is conventionally defined to have dimensions of energy. For e + e − → 4 jets, there are two partonic channels relevant to the order we work: e + e − → qqgg and e + e − → qqqq. Thus we need two different functions F IJ (Φ, T 4 ), which we call F IJ qqqq (Φ, T 4 ) and F IJ qqgg (Φ, T 4 ). Details of these functions are given in App. A. This process also has non-trivial color structure; the qqqq channel has two color structures and the qqgg channel has three. Thus F IJ qqqq (Φ, T 4 ) is a 2 × 2 matrix, and F IJ qqgg (Φ, T 4 ) a 3 × 3 matrix. For scale choices, we take where E i are the energy of the partons and x is an adjustment factor. These are natural in the sense that the soft scale only depends on the observable, each jet scale only depends on the energy of the relevant jet, and the hard scale satisfies µ h µ s =μ 2 J , withμ j = µ 1 j µ 2 j µ 3 j µ 4 j the geometric mean of the jet scales. We have included a numerical adjustment factor x which should be of order 1. For example, reducing to thrust where µ h is the center of mass energy, x = 2. Varying x is equivalent to varying the hard scale keeping µ s µ h =μ 2 J (the anti-correlated scale variation from [6]). Because our method involves summing up distributions at each phase space point, we can compare the distributions from MadGraph/SCET to MadGraph/Pythia at each phase space point individually. To do so we take a single MadGraph event and shower it several thousand times with Pythia. Recall that Pythia showers the event assuming a particular color structure, chosen randomly once and for all at the start. So for purposes of comparing to Pythia in these single phase-space-point plots, we ignore the color structure of the hard function in the SCET calculation and instead use the same definite color structure as Pythia.
We consider two phase space points. First, a highly symmetric event: Of course, taking a tetrahedron structure rather than a square would be more symmetric, but having two back-to-back partons facilitations the comparison with T 2 in pp collisions, where the same phase space point can be used. Second, we consider a angled event with outgoing partons separated by 30 degrees The center-of-mass energy for this collision is E CM = 3200 GeV.
For these two phase space points, where all partons have E i = 800 GeV, the scales are we will show plots for different x, or equivalently different choices of µ h . We can make qualitative observations about the SCET calculation at a given phase space point in two ways: using scale variations, and comparing to Pythia. Comparing to Pythia is dangerous, of course, as SCET/MadGraph and Pythia are not expected to agree, and Pythia with hadronization and underlying event turned off is IR unsafe and tuning-dependent. On the other hand, since Pythia resums all logarithms at the LL level, while SCET resums some large logarithms (those of T 4 to NLL) and others not even to LL (e.g. logs of ratios of hard kinematic invariants, like ln s t ), the comparison at different phase space points can illustrate the relevant tradeoffs. Fig. 2 shows the results for the symmetric phase space point with e + e − → uūuū at the parton level (top) and e + e − → uūgg (bottom). We see that the scale variation is minimized for µ h ∼ 800 GeV (x ∼ 1). There is good agreement between Pythia and SCET for x 1 as well. We conclude that for symmetric phase points, which have only one scale, the resummation of T 4 using SCET is as effective as the resummation of thrust (T 2 ). This is not surprising, yet reassuring.
For the symmetric phase space point there is really only one scale in the hard function, and so we expect the resummation to miss no large logs. Therefore it is not shocking that there is a scale at which SCET and Pythia agree; Pythia's shower is equivalent to some particular SCET scale choice (around 600 GeV). However, there is no reason to expect this scale choice to be the same as the natural SCET scale choice (the one where the scale variations are minimized). This difference will be more pronounced in pp scattering later (Fig. 5)  SCET error bands are minimized closer to 800 GeV or higher, where the disagreement with Pythia is slightly worse. For the angled phase space point, the scales are all roughly the same, but there are some (small) logarithms which can be relevant. For example, p1·p3 p1·p2 = 3.8. Because of the lack of symmetry, we will consider two different color configurations:  9)) color-connected. The bottom row has the back-to-back partons, p 1 and p 3 , color connected. As discussed in greater detail at the end of Sec. 5.3, the agreement between events showered with Pythia and analytic distributions at individual phase space points is worse than in the case of Fig. 2 due to the presence of additional logarithms that the analytic expression does not resum. However, these additional logarithms cancel out in the averaging over phase space.
Finally, we show the complete result, integrated over phase space, summed over channels in Fig. 4.

pp → 2 jets + X
Now we change processes, form e + e − → 4j to pp → 2j + γ. This example will demonstrate the flexibility of our calculations, since most of the work from the previous process can be reused here,  Top row has the nearby momenta quarks color-connected, and bottom row has back-to-back partons color connected. since the two processes e + e − → 4j and pp → 2j + γ have the same partonic channels. It will also demonstrate the changes needed to apply the calculations in a pp collider with PDFs. A powerful feature of the MadGraph/SCET framework is that it neatly separates the observabledependent pieces from the partonic-channel dependent pieces and the phase-space dependent pieces. From the viewpoint of the jet and soft functions, the only difference between a e + e − → 4j event and a pp → 2j + γ event is a few factors of iπ from the crossing (see e.g. [13], Eq. (38)), and that the  latter should use beam functions for the two incoming partons instead of jet functions. For pp → 2j + γ, we will again look at n-jettiness type observables. The factorization theorem for 2-jettiness is formally differential in the two jet masses T 1 and T 2 and the beam jettinesses T a and T b . 2-jettiness is the sum of these Thus, 2-jettiness in pp collisions is a crossing of 4-jettiness in e + e − . Fig. 5 shows the T 2 distribution starting with the symmetric partonic configuration in Eq. (5.8). We show in the different rows different partonic channels (uū → uū, uū → gg, ug → ug and gg → uu). gg → gg can be studied similarly, but we do not consider it in this paper (there is no 4 gluon channel in e + e − → 4j or in pp → 2j + X with X = W ± , Z, γ). We can see that for T 2 , the agreement with Pythia is not as good as for T 4 in e + e − events. This is to be expected, because of the way initial state radiation is treated in the two approaches. Pythia combines initial and final state parton showers in an intricate way that avoids double counting; SCET handles initial-state radiation with beam functions and final state radiation with jet functions. Since these methods are different, we expect worse agreement than when only final-state radiation is present. Now that the overall agreement is less perfect, it becomes clear that the scale choice of Pythia is not the same as the natural scale choice of SCET. Take for example the bottom row (gg → uū).
Here we see that there is a SCET scale choice, around 400 GeV, that matches Pythia (as expected, since the tunes in Pythia get it to mostly NLL accuracy). On the other hand, the scale choice that minimizes the SCET error bands is around 800-1600 GeV. Each of these criteria can be achieved separately, but there is no reason that the Pythia scale choices should be the same as the SCET natural scale choices. This story is the same in each of the four rows. For uu → uu and ug → ug, the scale choice that matches Pythia are 200 GeV and 300 GeV respectively; although they are lower than 400 GeV and thus not shown, they match as well as the others. The analogous plots for the angled event with momenta in Eq. (5.9) are shown for the two uū → uū color connections in Fig. 6. We see that these angled events, in particular the top row where the color connection is associated with the forward scattering, produce uniformly worse agreement with Pythia than the symmetric events.
After computing the cross section differential in the phase space, we can average the distributions over phase-space points to obtain the full cross section. In Fig. 7, we show the prediction for T 2 in pp → 2j + γ events using 50,000 phase-space points generated at √ s = 8 TeV. The jets are constrained to have p j T > 300 GeV and |η j | < 5, with the equivalent cuts for photons at p γ T > 500 GeV and |η γ | < 2.5. To avoid collinear singularities and additional large logarithms, relatively stringent isolations cuts of ∆R jj = ∆R γj > 0.8 are imposed. The grey band represents the MadGraph/SCET NLL calculation and the blue histogram the MadGraph/Pythia result. Note that agreement in the full cross section is actually better than for the individual phase-space points, as expected for an inclusive observable.
Rather than T 2 , we can also look at masses of the two outgoing jets placing a cut on the mass of the incoming jets. A similar version of 1-jettiness was first studied in Ref. [8]. To be precise, recall that T a and T b are the jettiness of the two incoming jets, and T 1 and T 2 are those for the outgoing jets. Then, we consider , thus it is closely related to jet mass in dijet events. To make this change, we compute This integral is easily done analytically. Thus using MadGraph/SCET we compute the cross section for τ cut 2 with minimal changes from T 2 . Results for τ cut 2 are shown in Fig. 8. As observed in Ref. [8] (see Fig. 15), Pythia does not match the SCET distribution for τ cut 2 nearly as well as it did for T 2 . This is to be expected: while the T 2 distribution only depends on T 2 , the τ cut 2 distribution depends on both T cut and τ cut 2 , so there can be large logs of their ratio. With τ cut 2 , either T cut or τ cut 2 can be used as the soft scale and choose µ s = T cut τ cut 2 . This choice resums some large logarithms, but not others. Consistent with this explanation, we note that in the region where τ cut 2 ≈ T cut , MadGraph/SCET and MadGraph/Pythia match. We also show, in Fig. 9, the effect on the τ cut 2 distribution of changing T cut .

Convergence of the phase space sum
All the plots throughout this paper sum over many thousands of MadGraph events (10,000 to 50,000) to ensure sufficient convergence. Such a large number of events can be computationally expensive, so one might wonder how many are actually needed.  In Fig. 10 we show the T 2 distribution summing n phase-space points with n = 1, 10, 100 or 1000. In all four panels, the case with n = 50, 000 is displayed with uncertainty bands in grey. We see convincingly that, at least for 2-jettiness with our cuts, about 1000 phase-space points is enough given the scale uncertainties inherent to NLL resummation. Needless to say, using 1000 rather than 50,000 events greatly speeds up the calculation.

Summary and conclusions
Calculating a resummed cross section for a process like pp → 2j + W can be an arduous endeavor. For anything more complicated, such as pp → W +3j, performing the calculation completely by hand, i.e., with Mathematica, is essentially impossible. Fortunately, the most numerically intensive and errorprone part of such an endeavor is largely automatable. This paper discusses a method of combining numerical calculations of the phase space integrals and tree-level hard functions using MadGraph, with semi-analytic calculations of the jet, soft and beam functions using SCET and Mathematica. This method brings the calculation of many resummed cross sections for the LHC within reach. The method discussed in this paper uses MadGraph to generate a set of events. The finalstate partons in these events indicate the directions and energies of the jets, and the initial-state partons indicate the starting point for the evolution of beam functions (or simply PDFs for threshold resummation). We call the MadGraph output H IJ (Φ); it is equivalent to the tree-level hard function in SCET. For a given event, we can then compute the jet and soft functions, evolve the beam functions, and correct the hard function from tree level to the appropriate order (for NLL resummation, this  . We first tested this method on the thrust distribution at e + e − collisions, where analytical results from SCET are reproduced exactly. Next, we examined the 4-jettiness event shape T 4 in e + e − → 4 jets.
Here we compared to Pythia, finding agreement within errors. We also explored how the distribution changes for different phase space points. As expected for very symmetric phase space points, where the partons all have the same energy and no angles are small, the agreement with Pythia is very good. For less symmetric phase space points, the agreement is worse. This can be understood because there are logarithms of ratios of kinematic quantities which Pythia resums at the LL level, but which SCET does not resum at all. When the phase space is integrated over, the agreement with Pythia is very good, comparable to the agreement in the symmetric phase space points.
Next, we applied the method to the calculation of 2-jettiness T 2 in pp → 2j + γ events. The same calculation can be used for pp → 2j + W or pp → 2j + Z. We found in this case again good agreement with Pythia, but somewhat worse agreement than for e + e − events. Again, this can be understood because the treatment of initial state radiation, using beam functions or the parton shower, are more dissimilar than the treatment of final-state radiation with the two methods. We also looked at the sum of the two jet masses in pp → 2j + γ with a cut T cut on the beam thrust. This can be thought of as a generalization of the jet mass calculation of [8] for 1-jet events to the dijet case.
Besides being straightforward to implement, one of the main features of this method is that its components are readily recyclable. After calculating F IJ (Φ, O) and integrating against H IJ (Φ) for some process (e.g., T 2 in pp → 2j + γ), one can then calculate the same observable in the same process with different cuts, or even a different hard process (e.g., 2-jettiness in pp → 2j + W ), by simply generating a new MC sample. One can even make larger changes (changing the observable, or performing crossings on the incoming/outgoing particles), with relative ease, provided the jet, soft, and beam functions necessary remain the same. A new calculation only needs to be done when a new channel opens up (e.g., going from pp → 2j + W to pp → 2j opens up the gg → gg channel) or switching to an observable sensitive to a jet shape (e.g., going from observables sensitive to jet mass to ones sensitive to jet broadening).
The MadGraph/SCET method we describe is equivalent to a calculation in SCET, but with an efficient numerical computation of the hard function. Because the method is just SCET it is systematically improvable in the usual manner. The anomalous dimensions of the various components can be computed at any logarithmic order, and the fixed-order pieces computed to the corresponding order in α s . To go to NNLL resummation would require in addition the finite parts of the 1-loop hard, jet and soft functions. The jet functions can usually be computed to 1-loop without too much effort, and the soft functions can be computed to 1-loop numerically [39]. For the hard function, one can interface to an NLO matrix-element calculator, such as using MadGraph at NLO [40]. In fact, to compute the NNLL distribution, one does not have to worry about matching the NLO calculation to parton showers: one simply needs the dimensionally regulated virtual contribution (the NLO Wilson coefficient). Indeed an NNLL calculation has already been performed [25] exploiting MadGraph at NLO for the hard function, albeit for an inclusive color-singlet final state. Thus the technology is already available to compute nearly any factorizable observable at NNLL using MadGraph for the computationally intensive hard-function calculation.

Acknowledgments
MF thanks the Center for Future High Energy Physics, Bejing, China for its hospitality while this work was being completed. The authors were supported in part by the US Department of Energy, under grant DE-SC0013607. The work of MF was also supported in part by the National Science Foundation, under grant PHY-1258729.
A Appendix: Details of the SCET calculation Throughout this appendix, color-structure indices I, J on H IJ , S IJ and F IJ are left implicit.

A.1 Analytic piece
In this appendix we describe the modifications necessary to reproduce our results from those available in the literature and give explicit expressions for the functions F IJ (Φ, O) for the observables we compute.
Taking the hard-function anomalous dimensions for 2 → 2 QCD processes from Ref. [13], the primary modification is the redefinition of the kinematic variables in order to make them applicable for arbitrary kinematics. Since we are deviating from 4-particle kinematics, we must go back to their Eq. (69) and rederive the particular form for each channel. Moreover, the Mandelstam t is no longer well defined, since s 13 = s 24 . But by a fortuitous property of the 4-parton channels, rederiving M IJ from scratch (from Eq. (69) of [13]) indicates that we can reuse their 4 The above replacements have only one caveat for the qqqq and qqgg channels we consider. The anomalous dimensions of [13] contain a term i C i L(−t). In the qqqq channel this is unchanged (modulo the replacement of t). In the qqgg channel, however, this changes slightly: We write this more concisely as i C i L(−t) → i C i L(−tf i ), where f q = 1 for the qqqq channel and for the qqgg channel. The anomalous dimension of the jet function we get from Ref. [35], while the anomalous dimension of the soft function can be extracted from the condition that scale dependence should cancel orderby-oder between the hard, jet, and soft contributions, µ dσ/dµ = 0. For the fixed-order soft and jet functions, we take advantage of the fact that the anomalous dimensions of each are known to NNLL, and extract the desired fixed-order piece needed for NLL precision by expanding the anomalous dimensions as where f is soft or jet function and µ f is the natural scale, respectively, µ s or µ j . This will be ambiguous up to a constant, i.e., µ f -independent, piece, but the ambiguity occurs at higher order than that at which we are working. The fixed-order functions can be combined with the running to get the Laplace transforms of the full jet, beam, and soft functions. 2 In the above expression, the tilde represents Laplace transform, the index i indicates a parton which might be a quark or a gluon, C i is either C F (if i is a quark) or C A (if it's a gluon), Γ is the cusp anomalous dimension (defined without a factor of C F or C A ), S and A are defined from Γ as in [13], m S = µ s e γ E , and m i J = µ 2 j Ei e γ E . t n is the Mandelstam variable t applied to the directions n i only; t n ≡ n 13 n 24 , f i is the correction factor defined in Eq. (A.2), and M IJ is the hard anomalous dimension matrix taken from [13].
Note thatS tree , M IJ , and M † IJ are all matrices in color space which may not be simultaneously diagonalizable, so keeping track of their order is important. We now present two concrete examples of the functions F a (Φ, O) used in our calculations.
A.2 F for 4-jettiness at e + e − Putting together the separate pieces in App. A.1 and taking inverse Laplace transforms, the function F (Φ, T 4 ), fully differential in the separate T i contributions is Here, G is defined through and after performing the inverse Laplace transform, G depends on To get F itself from d 4 F/dT i we integrate against a δ function defining our observable as a function of the T i . Since Here σ i represents the ith permutation of (n−1) gluons, and runs over all (n−1)! possible permutations. Due to the symmetries of the resulting amplitudes, every ordering of gluons will give the exact same squared matrix element when summed over all permutations, but as it helps clarify the other cases, we specify MadGraph's convention here. The ordering in generated recursively, with the first term having all gluons in order, and all subsequent terms generated considering all permutations of the last m generators T a before incrementing the (n − m)-th one by one and repeating until the 2nd position has been occupied by all (n − 1) gluons in the permutation. For example, with 4 gluons, the order of permutations generated is σ = {234, 243, 324, 342, 423, 432}. In the RG evolution involved in resummation, these operators will mix with multi-trace terms, and therefore the basis provided by any tree-level MC generator will have to be enlarged.
For fermion-only operators, MadGraph takes the dipole basis of all possible fermion color-factor contractions into bilinears, q iq i . Crucially, the ordering of pairs in the color matrix can be of importance here. The convention MadGraph adopts is to have the first entry order the quarks in the bilinears as they are defined in the process. The antiquarks are then contracted in the order they appear, and shuffled around in the same order as the gluons in the all-gluon case. As another illustrative example, the basis for d 1 u 2 → u 3 d 4ū5 u 6 is Finally, mixed quark-gluon processes are significantly more complicated in generality. They involve a combination of permutations over the orderings of both the quark and anti-quark labels and partitions of adjoint operators over the quark bilinears. This is done with a combination of overcomplete enumerations of operators and vetoes of operators that have already been enumerated. At the same time, most mixed quark-gluon channels are of such high multiplicity that it is difficult to imagine them being phenomenologically relevant in the near future. We therefore content ourselves with listing the bases for all matrix elements with up to 6 partons, which will be sufficient for all processes up to pp → 4j(+X). The processes qqgg, qqggg, and qqgggg are handled exactly the same as the all-gluon case, except that the position of the first gluon's adjoint generator is not held fixed, the permutations generated being among all the gluons. For processes with multiple quark bilinears, MadGraph tries to keep the adjoint generators as far back as possible at every step. This means that after shuffling through the antiquarks, the basis must shuffle through the quarks as well to generate every permutation. For u 1ū2 u 3ū4 g 5 , this yields B i = {(u 1ū2 )(u 3 T aū 4 )A aµ 5 , (u 1ū4 )(u 3 T aū 2 )A aµ 5 , (u 3ū2 )(u 1 T aū 4 )A aµ 5 , (u 3ū4 )(u 1 T aū 2 )A aµ 5 }. (B.3) Adding another gluon for u 1ū2 u 3ū4 g 5 g 6 expands this basis to completing the listing of all bases for processes with up to 6 partons.