Soft evolution of multi-jet final states

We present a new framework for computing resummed and matched distributions in processes with many hard QCD jets. The intricate color structure of soft gluon emission at large angles renders resummed calculations highly non-trivial in this case. We automate all ingredients necessary for the color evolution of the soft function at next-to-leading-logarithmic accuracy, namely the selection of the color bases and the projections of color operators and Born amplitudes onto those bases. Explicit results for all QCD processes with up to $2\to 5$ partons are given. We also devise a new tree-level matching scheme for resummed calculations which exploits a quasi-local subtraction based on the Catani-Seymour dipole formalism. We implement both resummation and matching in the Sherpa event generator. As a proof of concept, we compute the resummed and matched transverse-thrust distribution for hadronic collisions.


Introduction
Jets play a central role in the physics program of the CERN Large Hadron Collider (LHC). A typical minimum value for the jet transverse momentum considered by LHC analyses is of the order of 20 GeV, which is more than two orders of magnitude smaller than the center-of-mass energy, resulting in a huge phase space for jet production. Events with a high jet multiplicity are therefore copiously produced at the LHC [1].
Moreover, typical signatures of new-physics models include cascade decays of new heavy states producing relatively hard quarks and gluons, which seed hard jets. Accurate theoretical estimates of the related QCD multi-jet backgrounds are therefore essential. This has triggered intense activity in the QCD community, resulting in more and more accurate calculations of cross sections and differential distributions for multi-jet final states.
Leading order (LO) perturbative QCD calculations for multi-jet processes can automatically be performed for large multiplicities [2,3]. Next-to-leading order (NLO) corrections have also reached a high level of automation [4], and fully differential multi-jet cross sections are now available for pure QCD processes and electroweak (W ± , Z and Higgs) boson production in association with up to five jets [5].
Monte Carlo parton showers [6], which describe the all-order evolution of QCD partons fully exclusively, have been extended beyond the strict collinear limit [7] and even beyond the 1/N C approximation [8]. They can be merged with LO predictions for multi-jet events [9] and have been matched to NLO calculations [10,11] for over a decade. More recently, methods for combining next-to-leading order matched predictions of varying jet multiplicity have been devised [12], as well as matching methods at next-to-next-to leading order (NNLO) accuracy [13]. Dedicated Monte Carlo programs aimed at better describing jet production in the high-energy limit have also been developed [14].
Thus, the past years have brought substantial theoretical progress in multi-jet physics, both from the viewpoint of fixed-order calculations and parton showers, as well as the matching and merging of the two approaches. Another important aspect of QCD phenomenology is the all-order resummation of particular classes of observables or processes, beyond the leading-logarithmic (LL) accuracy, which is typical for parton showers. Event shapes in electron-positron, electron-proton and hadronhadron collisions have been studied for a long time (see for instance [15,16] and references therein) and a general framework for resumming event shapes at next-toleading logarithmic (NLL) accuracy was developed in Refs. [17]. Very high logarithmic accuracy (N 3 LL) was achieved using Soft Collinear Effective Theory (SCET) for particular event shapes in e + e − collisions [18,19]. Inter-jet radiation and in particular its response to the presence of a jet veto has also received a lot of attention both from the theoretical [20,21,22] and experimental [23] communities, primarily in the context of Higgs-boson studies [24]. All-order analytical calculations have been performed recently for an increasing number of jet-substructure observables, including jet masses [25], other jet shapes [26], sub-jet multiplicity [27] and grooming algorithms [28]. Recently, there has also been substantial progress towards achieving NNLL accuracy in threshold resummation for dijet production [29].
However, to our knowledge, all phenomenological studies that used all-order resummed results have been restricted to cases with four or less hard colored partons, i.e. 2 → 2 QCD scattering in hadron-hadron collisions [30,31,32] 1 . The reason for this deficiency in comparison to the enormous progress in fixed-order calculations is purely technical. While logarithmic terms associated to collinear emissions have a simple color structure, i.e. the Casimir operator of the jet under consideration, the color structure of soft-gluon emissions at large angles is more complex and, in particular, has a non-trivial matrix structure for n ≥ 4 partons. Nevertheless, resummed calculations can in principle be written for an arbitrary number of hard colored legs, using, for instance, the formalism of Refs. [30,33,34,35]. In order to perform an actual calculation, one then needs to define a suitable color basis for each partonic subprocess, and consequently find the matrix representation of all color insertions. The dimensionality of color bases rapidly increases with the number of legs. Algorithms to define them have been discussed in the literature, e.g. [36,37,38,39]. However, when making use of a non-orthogonal basis, the efficient inversion of the matrix representing the color metric can pose a severe problem. In addition, the underlying Born matrix elements for the hard process must be decomposed in the chosen basis. One would clearly like to automate all these steps.
The main purpose of this study is to overcome these technical difficulties and provide a tool to perform soft-gluon resummation at NLL accuracy for processes with, in principle, arbitrarily many hard legs. In practice we have considered all contributions for up to 2 → 5 processes. As detailed in Sec. 2, we achieve this by writing the resummed exponent in a suitable color basis and by decomposing the Born amplitudes using modified color-dressed recursive relations [40], as implemented in the Comix matrix-element generator [3], that is part of the Sherpa framework [41].
In this paper, we also address the issue of matching the resummation to fixedorder calculations. In Sec. 3 we develop an automated LO matching scheme which makes use of modified dipole subtraction [34]. It circumvents the explicit expansion of the resummation formulae to a large extent and provides a quasi-local cancellation of the logarithmic contributions. We use the resummation of the transverse thrust in hadronic collisions as a first example to study the performance of our method. We finally summarize our work and indicate future directions in Sec. 4.

The soft function and its anomalous dimension
The main aim of this work is to define and implement NLL resummation for processes with an arbitrary number of hard partons. Despite the computational difficulties arising from the non-trivial color structure in soft-gluon radiation, one can formally write all-order resummed expressions in terms of abstract color operators [30,33,34,35], which are then valid for an arbitrary number of hard legs.
The quantity we are interested in is the NLL "soft function" 2 [30,31] Note that the soft function defined here and used throughout this paper does not contain any collinear logarithms. This is in contrast to alternative definitions also common in the literature. In Eq. (2.1), |m 0 denotes a vector in color space representing the Born amplitude, such that the color-summed squared matrix element is 2) The first sum runs over all possible colored dipoles, with Q ij the respective invariant mass, i.e.
The second sum in Eq. (2.2) is over the Coulomb (or Glauber) contributions between final-final (FF) and initial-initial (II) parton pairs. In order to make contact with the existing literature, we can evaluate Eq. (2.2) for the special case of 2 → 2 scattering of massless partons. In this case, Q 12 = Q 34 = √ s, Q 13 = Q 24 = √ −t and Q 14 = Q 23 = √ −u, and the soft anomalous dimension becomes (see e.g. [17]) where we have employed color conservation, i.e.
introduced the compact notation and dropped all contributions from abelian phases because they do not contribute to any cross sections.
Aiming for an automated evaluation of Eq. (2.1) for arbitrary processes, there are essentially three problems which need to be addressed: • the color-basis definition and computation of the metric, • the computation of the color operators T i · T j , in the considered basis, • the decomposition of the amplitude |m 0 in the considered basis.
The construction and implementation of an algorithm addressing all three items represents the core of this paper. This problem is closely related to the color decomposition of QCD amplitudes [42], which is typically written in the form (2.7) Here, M 0 is the full amplitude for a set of external particles 1 . . . n with color assignments α 1 . . . α n . The C (i) are color coefficients, and the m (i) 0 are color-ordered partial amplitudes. The index i labels the color orderings contributing to the color assignment. While the number of orderings and the related color coefficients change with the color basis [43,44], the partial amplitudes are unique, gauge-invariant objects depending only on the particle momenta. They are given by sums of planar diagrams computed in the large-N C limit [45]. One may consider Eq. (2.7) the projection of the Born amplitude onto a given color-basis element, M 0 (α) = c α |m 0 . This will be discussed in more detail in the following.

Non-orthogonal color bases
We first define our notation for color bases. As we are going to work with bases which are not necessarily orthogonal (for a discussion about this topic see also Refs. [46,47,48]), we start by defining basis vectors |c α and introduce the (non-diagonal) color metric, and its inverse Note that c αγ c γβ = δ β α by construction. The basis vectors |c α span a complete (possibly over-complete) set of elements which we leave undetermined for the moment. We will adopt the convention of referring to c αβ as the inverse metric.
Let us consider a general tensor H βγ expressed in the non-orthogonal c-basis. Color invariants are computed by contracting with the metric, and we define in particular the color trace as Tr(cH) = c αβ H αβ . Indices between the c-basis and its dual are raised and lowered with the metric. Tensors transforming with mixed indices are interpreted as H β α ≡ H αγ c γβ .
The soft function from Eq. (2.1) written in matrix notation reads where c αβ H αβ = m 0 |m 0 now represents the color-summed Born matrix element squared. The matrix G is the exponential of the soft anomalous dimension matrix, which due to the non-orthogonal nature of the c-basis, takes the form A significant amount of recent work has focused on improving the basis construction, with certain advantages and disadvantages for each approach. In [37], a complete trace basis was discussed which followed from combining the connected fundamental representation color tensors appearing in the tree-level hard matrix element with the disconnected color structure required by soft-gluon exchange. The construction of this basis for an arbitrary process was automated in [38]. In [39] a general orthonormal basis was constructed, which was shown to be minimal in elements for a given process.
In this work we follow a different approach. Instead of constructing new optimized color bases, we rely on existing ones and circumvent the problem of overcompleteness in an automated fashion by extending the dimensionality of color space. This method, and the alternative approach of dimensional reduction, will be discussed in more detail in Sec. 2.2. In order to select the color bases to start with, we use the following guiding principles: 1. Minimal partial-amplitude count: The components M(α) = c α |m 0 should depend on as few partial amplitudes as possible.
2. Physical color states: The basis vectors should represent physical color states. This disqualifies bases containing singlet gluons, for example.
3. Minimality of the basis: Although we will also use over-complete bases, we require the dimension of the basis to be as low as possible.
The trace basis [42] for processes with quarks and the adjoint basis [43] for processes with only gluons satisfy our guiding principles, and we choose to implement them. However, subtleties arise because these bases can be over-complete. In this respect, we note that exponentiation via Eq. (2.10) requires the computation of the inverse color metric c αβ . Thus, c αβ must be non-singular for general N C . At N C = 3 it may be singular if the corresponding metric c αβ contains representations with weight proportional to N C − 3. In this case the inversion may be computed with N C = 3 + colors. More on this issue appears in Sec. 2.2.
(2.11) The sum runs over all (n − 2)! permutations of the particle labels 2 . . . n − 1, which represent the gluons. Decompositions for processes with multiple quark lines are qualitatively similar and can be found in the literature. In the general case, the decomposition includes disconnected quark lines, arising from soft gluon exchange, and disconnected gluon lines, which appear for processes with 2 or more gluons. Similar terms appear at higher loops in fixed-order calculations.
An important simplification is that any basis with the same number of qq pairs (taking flavor labels as all incoming) and gluons is the same, modulo crossings. This suggests that for a given set of particle flavors, the resummation may be carried out for a fixed flavor ordering. In practice we implement this by always computing Γ in the same flavor arrangement {q,q, g} so that the first sum in Eq. (2.2) is always in order.
We keep track of the map to the physical process by labeling incoming and outgoing for the purpose of assigning the Coulomb phase. This means that the matrices T i · T j only need to be computed once for all processes involving the same number of quarks and gluons.

Purely gluonic processes
There are multiple options of dealing with purely gluonic processes. The first and oldest of them is the trace basis, described in Ref. [42]. The color decomposition of tree-level amplitudes reads M 0 (1, a 1 ; . . . ; n, a n ) = σ∈P (n−1) Tr(T a 1 T aσ 2 . . . T aσ n ) m 0 (1, σ 2 , . . . , σ n ) . (2.12) The sum runs over all (n − 1)! permutations of the particle labels 2 . . . n. In the context of resummation, we must add the non-vanishing color disconnected components containing multiple gluon traces, which also appear at higher loops in fixed-order calculations. A subtlety arises due to the reflection symmetry of the partial amplitudes, m 0 (1, 2, 3, . . . , n) = m 0 (1, n, . . . , 3, 2), which holds for the corresponding soft gluon evolved amplitudes as well. The basis elements corresponding to permutation 123 . . . n and n . . . 321 can be combined due to this symmetry, so that the number of connected basis elements for general N C is reduced by a factor two.

Elimination of N C = 3 pathologies
Although advantageous from many points of view, both the trace and adjoint bases for high-multiplicity processes turn out to be over-complete. As a consequence, the matrices representing the corresponding color metric, defined as in Eq. (2.8), have null eigenvalues at N C = 3. However, the fact that the inverse metric at N C = 3 is often singular is an artefact of calculating c αβ and Γ separately, since maintaining the full N C dependence the resulting S-function is always finite. Keeping the N C dependence explicit becomes computationally impractical for large multiplicity. Here we outline two strategies to overcome these limitations.

Dimensional Reduction
The simplest solution is to reduce the size of the color basis, in particular, if we bear in mind the freedom to reparameterize the basis elements with no tree-level Born contribution. These components only enter S through contractions with the inverse metric and can therefore be reshuffled for convenience.
More precisely, for a basis with m Born proportional and n − m non-Born elements {c 0 , · · · , c m−1 , c m , c m+1 , · · · , c n−1 , c n }, we examine the situation where there is a single zero eigenvalue at N C = 3 in the color metric. In other words, the basis decomposes into n − 1 non-vanishing irreducible representations. A simple procedure for reducing the color space then corresponds to the new basis {c 0 , · · · , c m−1 , c m + c n , c m+1 + c n , · · · , c n−1 + c n }, where we normalise new elements accordingly.
While for simpler processes this procedure is straight-forward (see section A.2), for the general case it is hard to automate, and therefore we choose a different approach.

Numerical Inversion with N C = 3 +
We adopt a solution which avoids adjusting the dimensionality of the basis, and therefore requires no a priori group theory knowledge on the color decomposition of a given process. This is the simplest solution practically, though there is clearly an efficiency loss due to carrying through non-contributing color directions. Figure 1: Sketch of color basis vectors and their corresponding projections of Born matrix elements for qq → q q scattering.
We state the necessary claims here while proofs may be found in App. A.1. First, we note that the metric is always invertible for N C = 3 + with > 0. We can separate the singular from the regular part of the inverse color metric as 14) The singular part of the inverse metric is in the null-space of all color products evaluated at N C = 3c which guarantees that We find that the error introduced in the resummation is O( ) which may be taken sufficiently (arbitrarily) small in practice (theory).

Computation of the hard matrix
A key ingredient for the computation of the soft function Eq. (2.9) is the hard matrix, which is formed by projections of the Born amplitudes onto color basis vectors, H αβ = m 0 |c α c β |m 0 . Consider, for instance, the trivial case of qq → q q scattering. The Born matrix element factorizes into a purely kinematical part, which stems from the s-channel diagram squared, and color coefficients defining the actual matrix structure. This is shown in Fig. 1. However, in any non-trivial case, multiple diagrams appear, which contribute differently to the different matrix elements, such that the hard matrix has a non-trivial dependence on the Born kinematics. In particular, same-flavor quark processes like qq → qq scattering have partial amplitudes where both s-and the t-channel diagrams contribute because of the 1/N C suppressed term in the Fierz identity. This is sketched in Fig. 2. Automating the computation of H αβ requires an algorithm that allows us to easily access these partial amplitudes.
We solve this problem with the help of Comix [3], a matrix-element generator that computes multi-parton amplitudes using color-dressed recursive relations [40].
Comix is part of the Sherpa framework [41]. As Comix allows us to define a color configuration in the large-N C limit, it is trivial to obtain color-ordered partial amplitudes. However, these are not necessarily sufficient to compute the entries of the hard matrix directly, as is the case in qq → qq. The missing terms are added by including the required 1/N C contributions at the vertex level. This amounts to changing the color-dressed Feynman rules such that U (1) gluons couple to quark lines also in the large-N C limit, and evaluating the corresponding 1/N C term in the Fierz identity with N C = 3. We note that it is possible to add the relevant one-loop partial amplitudes and extend this algorithm beyond the tree level. This will provide the hard matrix one order higher, which is needed in order to achieve higher logarithmic accuracy in the resummation.

Validation against multi-parton matrix elements
In order to check the construction of the color metric for the employed bases and the correctness of the corresponding decomposition of the hard matrix for multi-parton amplitudes, we compare our results against exact real-emission matrix elements considering soft but non-collinear kinematics for the emitted gluon. Starting from an n-parton state with momenta p 1 , . . . , p n we assume the emitted gluon to carry additional momentum p s , with |p s | = k s . We choose a particular kinematic configuration, where the final-state momenta resemble a circle in the transverse plane, i.e., . The momenta p 1 to p n can then be used directly to evaluate the n-parton amplitude. For the computation of the (n + 1)-parton process we assume the recoil of the emitted soft-gluon to be absorbed by the dipole spanned by partons 3 and 4. The momenta of partons 3 and 4 that enter the (n + 1)-parton amplitude are then given by We define with R s the inverse ratio between an n + 1-parton matrix element squared and its "sum-over-dipoles" approximation .
The QCD coupling α S is assumed fixed here. Factorization of QCD matrix elements implies that in the limit of soft-gluon kinematics, i.e., This result is in fact independent of the underlying Born kinematics, and in particular independent of the angle φ H through which we rotate our hard-parton configuration, Eq. (2.17). Depending on φ H , the value of R s for finite λ s may be larger or smaller than one. Taking the limit in (2.20), we provide a strong consistency check on the elements of Γ and H n for the n-parton process as well as c and H n+1 for the n + 1 parton configuration. This applies to elements which have a non-vanishing hard contribution.
In order to expose this property of the full matrix element, we sample over φ H in discrete steps assuring that the momentum p s does not get collinear to any other parton. This is sufficiently satisfied by requiring that φ H is not an integer multiple of φ s . In practice we take φ s = π/7, and sample φ H = N π/10 over N = 0, . . . , 9.  In Fig. 3 we display the results of our checks for soft-gluon emission off 4-(left), 5-parton (middle) and 6-parton (right) amplitudes. The last case provides a nontrivial check also on the 7-parton color metric and hard matrix, entering through the denominator of Eq. (2.20). For completeness we collect in App. A.3 the properties of the color bases used for the various processes. By rotating the respective Born kinematics on the circle in the transverse plane, we can verify that the individual coefficients of each dipole are exactly matched in the full matrix element as λ s → 0. For a specific phase-space point these coefficients could be individually very small thus not providing a sufficient test. The results show a strong dependence on the underlying kinematic configuration in addition to the considered parton flavors. However, for sufficiently small λ s in all cases R s approaches unity, proving correctness of the ingredients for the soft function S.

Soft evolution of multi-parton squared amplitudes
Having proven correctness of our color-metric evaluation and the corresponding hardmatrix decomposition, we shall now study the full soft function S(ξ) given in Eq. (2.9) for multi-parton processes. In particular we probe the dependence on the evolution variable ξ and compare to the limiting case of N C → ∞, that closely resembles the approximation used in parton-shower simulations. While the N C = ∞ anomalous dimension is computed explicitly in the trace basis, it amounts to only a non-vanishing contribution to T i · T j for basis elements which have partons i and j color adjacent.
We begin by computing S(ξ) for several multiplicities at benchmark kinematics, that lie on a circle in the transverse plane at z = 0. For the 2 → n processes we parameterize the momenta as where again E n = 2E/(n − 2) and φ nm = π(2m − 3)/(n − 2). The soft function S depends on the kinematics merely through ratios of momentum invariants (cusp angles), such that when considering fixed α S the direct dependence on E n vanishes.   In Fig. 4 and Fig. 5 we present results for the ξ-dependence of the soft function for various parton channels, both for N C = 3 (solid curves) and for the limit N C → ∞ (dashed curves). Depicted is the variation of ln S(ξ) with ξ, where we scaled each curve such that it intersects with the ordinate at one. We observe a non-trivial ξ dependence for all processes when considering the full-color treatment. For the given phase-space configurations the full result shows a stronger variation with ξ than the     large-N C estimate. This originates from taking into account all off-diagonal elements in the soft anomalous dimension. In particular for processes involving gluons the limit N C → ∞ poorly approximates the full result.
To also check kinematic configurations with particles at non-zero rapidity, we considered the above kinematics but rotated by an angle π/2 about the y-axis. This results in momenta that span a circle in the y − z plane at x = 0. The corresponding results can be found in Fig. 6. While the results for N C → ∞ are necessarily monotonically increasing functions of ξ this is not true for finite N C , due to non-vanishing interference effects of different color flows. Accordingly, the large-N C approximation can in general also result in an overestimate of the soft function. To properly account for the highly non-trivial dependence on the parton kinematics and the evolution variable the soft function needs to be evaluated with its full color dependence, i.e. N C = 3.

Towards phenomenology
In the first part of this paper, we have presented a new method to deal with the soft evolution of processes with many colored legs that provides a high degree of automation. Moreover, we have realized an implementation of this method that uses color-partial amplitudes extracted from the matrix-element generator Comix and evolves them according to the soft anomalous dimension Eq. (2.2), thus obtaining an efficient way of evaluating the soft function given in Eq. (2.1). The aim of this second part is to create a framework in which the soft function S in Eq. (2.1) can be used for phenomenological studies. Let us generically call v an observable that measures the "distance" from the lowest-order kinematics. In the context of jet studies, v can be thought of as an observable describing internal jet properties, e.g. masses, angularities, energy correlation functions) or as an observable measuring the radiation outside the leading jets, e.g. event shapes or inter-jet radiation. When v is small, logarithms L = ln 1 v are large and resummation becomes a more efficient organization of the perturbative expansion than fixed-order perturbation theory. Furthermore, we have to consistently match the two approaches to obtain reliable predictions for the entire range of the observable v: The first term in the expression above is computed to some logarithmic accuracy, typically next-to-leading log (NLL) but not infrequently to NNLL, while the second one is computed at a given order in the strong coupling (state of the art is typically NLO). The last term represents the expansion of the resummed distribution to NLO and avoids double counting. The last two terms are affected by large logarithms and are in fact separately divergent in the limit v → 0. However, their combination yields a finite remainder, called the matching term. Although conceptually trivial, computing the matching term is often numerically inefficient because it involves the separate evaluation of fixed-order contribution and expanded resummation in regions of phase space corresponding to soft and/or collinear emissions. It would be preferable to generate the finite remainder directly.

Resummed distributions
Resummed calculations are usually performed for the so-called cumulative distribution, i.e. the integral of the differential distribution up to a certain value v of the observable under consideration: where dB indicates that the expression above is fully differential in the Born kinematics. Here we focus our attention on the NLL approximation of ln Σ, i.e. we consider the functions g in Eq. (3.2), while dropping the non-logarithmic term in square brackets. The inclusion of this constant contribution is necessary in order to achieve what often is referred to as NLL accuracy. We note that to this logarithmic accuracy such contribution, although flavor-sensitive, can be averaged over the different color flows. Furthermore, we note that this constant term can be extracted from NLO calculations as implemented, for instance, with the Powheg method [11], which has been automated in the Sherpa framework in Ref. [50] 3 . For our discussion, we follow the formalism developed in the context of the program Caesar [17], which allows one to resum global event shapes in a semi-automated way. With a couple of generalizations, the Caesar framework is sufficient for our purposes. Furthermore, we will also briefly discuss some differences in the structure of the resummation that arise when dealing with non-global observables [51] at the end of this section.
We consider processes which at Born level feature n hard massless partons (legs) and m color singlets (e.g. photons, Higgs or electroweak bosons) and we denote the set of Born momenta with {p}. Following Refs. [17] we consider positive-definite observables V that measure the difference in the energy-momentum flow of an event with respect to the Born configuration, where V ({p}) = 0. For a single emission with momentum k, which is soft and collinear to leg l, the observable V is parametrized as follows 4 where k (l) t , η (l) and φ (l) denote transverse momentum, rapidity and azimuth of the emission, all measured with respect to parton l. Q is the hard scale of the process which we set equal to the partonic centre of mass energy, i.e. Q 2 = s. It is then possible to write the resummed exponent in Eq. (3.2) in terms of the coefficients a, b l , d l and g l (φ) that specify the behavior of the observable in the presence of a soft and collinear emission.
In particular, while the LL function g (δ) 1 is diagonal in color, one of the contributions that enter the NLL function g (δ,B) 2 is precisely the soft function, which, as discussed in Sec. 2, has a matrix structure in color space, with complexity that increases with the number of hard partons. Explicit formulae are collected in App. B. The results of Sec. 2 provide an automated way of computing these contributions, thus extending the applicability of the Caesar framework to processes with an (in principle) arbitrary large number of hard partonic legs.
We conclude this discussion with a few remarks on non-global observables [51]. Non-global logarithms arise for those observables that have sharp geometrical boundaries in phase space. They originate in wide-angle soft gluons that lie outside the region where the observable is measured, re-emitting softer radiation back into that region. The Caesar framework presented above is not sufficient to deal with this case and new ingredients need to be introduced. Most noticeably, the NLL function g (δ,B) 2 receives a new contribution coming from correlated gluon emission 5 . Because of their soft and large-angle nature, non-global logarithms have a complicated color structure. However, for phenomenological purposes, their resummation can be performed in the large N C limit [51,53,54], thus trivializing the color structure again. Recent studies suggest a way of performing this resummation at finite N C [55]. We believe that the methodology for performing all-order calculations with many hard legs can also prove useful in the application of those methods to LHC phenomenology. However, we leave this investigation for future work. Finally, we point out that while the color structure of the soft anomalous dimension Γ for non-global observables is formally the same as in Eq. (2.2), the coefficients of the T i · T j are observabledependent, because of non-trivial limits for the azimuth and rapidity integrals. 5 We should mention that the particular choice of the algorithm used to define jets can influence the resummation structure at the level of g . This discussion refers to a jet algorithm, like for instance anti-k t [52], which in the soft limit behaves as a rigid cone.

Automated matching
Expanding the resummed prediction in the Caesar formalism to O (α S ) one obtains (see App. B for details) d dL with α S = α S (µ 2 R ) and L = ln (1/v). Our aim is to compute the first two terms fully differentially in the one-emission phase space, similar to how collinear subtraction terms are obtained in fixed-order NLO calculations. In their unintegrated form, these two terms have an explicit dependence on the light-cone momentum fraction which is given by the collinear splitting functions. We can therefore rely on an existing implementation of the Catani-Seymour dipole-subtraction method in Sherpa [56] and replace the kinematics dependent part of the Catani-Seymour dipole insertion operators by the DGLAP splitting functions. This will be described in detail below. The remaining terms in Eq. (3.4) are generated by using the color insertion operators only and multiplying with the analytic expressions for logd l − b l ln(2E l /Q) and the relevant logarithms in the soft anomalous dimension, cf. Eq. (2.2).
The factorization of the one-emission phase space is derived in Ref. [34] in terms of variables that represent scaled invariant masses and light-cone momentum fractions. The ordinary DGLAP splitting functions are modified such that double counted soft-singular regions are avoided by partial fractioning the soft-eikonal contribution. In order to use the resulting algorithm for the matching, we ignore the partial fractioning and implement collinear splitting functions instead, while at the same time keeping the full color dependence of the dipole-insertion operators. We restrict the phase space for the double-logarithmic term in soft-enhanced splittings to the region z a > v (for terms singular as z → 0). This corresponds to the requirement that the gluon rapidity in the rest frame of the radiating dipole be predominantly positive, which generates the correct L-dependence in Eq. (3.4). We define v and z in terms of the variables in Ref. [34] as In this context, x B is the Bjørken-x of the Born process, and the terms involving x B are included to obtain the correct integration range as compared to the resummation. The splitting functions themselves are defined in terms of the original light-cone variables, and the single-logarithmic term is scaled to the full phase space analytically. Eventually, the result is rescaled by 2/(a + b l ) and by the ratio of parton densities, . At this point we induce a single-logarithmic dependence on the observable in the matching term, which is compensated by explicit collinear counterterms which are added to the resummed prediction. The effect can be seen in Figs. 7 and 8, where we test our method for all the dipole configurations using various processes and different types of observables. The numerical accuracy achieved allows us to obtain precise predictions for the matching contribution down to very small values of the observable v.  4). We plot the result as a function of ln v, using two observable types of different behavior with respect to the Caesar coefficients a and b l (a = b l = 1 for thrust variables, on the left, while a = 2, b l = 0 for jet rates, on the right. In both cases d l = g l (φ) = 1). The leading double logarithm appears as a straight line, while the sub-leading single logarithms appear as a constant offset. The collinear massfactorization counterterms (the last term in the square bracket of Eq. (3.4)) are shown in magenta, and the leading-order matching terms are displayed in blue. The sum of all the above is given in black. This sum is to be compared to a direct leading-order calculation, which is shown in black dashed. The difference between the two predictions should be of purely statistical nature, which is verified in the bottom panel of each plot by testing the relative size of the deviation, normalized to the Monte-Carlo uncertainty.

A proof of concept: transverse thrust
In order to demonstrate the completeness of our framework, we compute the resummed and matched distribution for a specific observable. We concentrate on the hadron-collider variant of the thrust observable, i.e. transverse thrust T ⊥ . This global event-shape observable is defined as where the sum runs over all final-state particles, with p ⊥i the particle's momentum transverse to the beam direction, and p ⊥i = | p ⊥i |. The maximal T ⊥ is found by varying the transverse unit vector n ⊥ . Transverse thrust has been studied by the Tevatron experiments [57,58] and, more recently, also by the ATLAS [59] and CMS [60] collaborations. Perturbative calculations for this distribution exist at NLO [61] and also at the resummed level in the Caesar framework [17]. In particular, the event shape that vanishes at Born level is τ ⊥ = 1 − T ⊥ . Details of the resummation for a generic global event shapes are given in App. B. Because the underlying Born processes is a 2 → 2 QCD scattering, the color structure is non-trivial, hence we are able to put at work our construction of the soft function. In Figs. 9 and 10, we plot the transversethrust distribution for pp collisions at 8 TeV. We apply asymmetric cuts on the two leading jets, i.e. p ⊥1 > 100 GeV, p ⊥2 > 80 GeV and we set µ In particular, the plot in Fig. 9 is analogous to the ones already shown in Sec. 3.2 and it provides yet another check of our matching procedure: the sum of the explicit expansion of the resummation (red), the collinear counterterm (magenta) and the LO matching term (blue) is plotted in solid black and it has to be compared to the LO calculation (dotted black). The bottom panel shows that the difference between the two is zero, within the Monte-Carlo uncertainty.
Finally, in Fig. 10 we plot the resummed and matched distribution for transverse thrust (black curve). For comparison, we also show the resummation on its own (red curve). To obtain these results the corresponding soft function and matching to the real-emission matrix elements are computed in a fully exclusive and automated way at run-time. The methods we have developed allow us to calculate such NLL resummed and matched distributions for a wide range of observables, also for highmultiplicity processes, in a largely automated way.

Conclusions and Outlook
Multi-jet physics is central in the physics program of the LHC. In this paper, we have overcome the two main technical difficulties that prevented NLL resummed calculations to be performed in processes with high jet multiplicity.
The first issue was related to the color structure of soft emissions at wide angle, i.e. away from the jets, the complexity of which rapidly increases with the number of hard jets. We have solved this problem by constructing and implementing a framework in which the NLL soft function is computed in an highly automated way. The algorithm constructs an appropriate color basis for the partonic process at hand, and evaluates color operators and the decomposition of Born amplitudes in this basis. It makes use of the matrix-element generator Comix to access the color-ordered partial amplitudes that are needed for the evaluation of the soft function. Using this framework, we have obtained and validated results for the soft function for all QCD processes with up to five hard jets in the final state, i.e. 2 → 5 QCD amplitudes, and we have studied the validity of the widely used large N C approximation. We have found that the impact of finite-N C corrections is significant, especially for processes with many gluons.
We have tackled the second problem of matching resummed predictions to fixedorder calculations. In the traditional way of addressing this problem, one matches the resummed distribution of a given observable v to the one obtained at fixed-order (typically NLO). The main drawback of this approach is that the fixed-order result and the expanded resummed result have to be computed independently in extreme regions of phase space, i.e. at very small v, where numerical cancelation is hard to achieve. We improved this situation by introducing a quasi-local matching scheme at leading order, which generates the finite remainder directly. As a proof of concept, we computed within our framework the NLL transverse-thrust distribution matched to LO. Although in this study we have mainly concentrated on global event shapes, our framework can be easily extended to the case of non-global observables.
We see this rather technical paper as the first necessary step in a rich program aimed at the phenomenological applications of resummed perturbation theory in multi-jet physics. Moreover, because we implement resummed calculations in the Sherpa framework, we have the possibility of making precise comparisons between analytic resummation and Monte-Carlo parton showers. This will provide insights on the benefits and limitations of both approaches and perhaps even indicate ways to improve the formal accuracy of the parton shower.

A. Over-complete color bases
Color bases constructed from irreducible QCD representations do not meet our requirement 1 of the list in Sec. 2.1. Although we do not automate the construction of orthonormal bases this for work, we do employ their generic properties in several arguments throughout this section. For a complete approach to their construction see [39]. We define an orthogonal basis element e α so that where λ α is the weight of the representation. For a given physical process, the dimensionality of the orthogonal e-basis may differ from the c-basis, in which case the indices in Eq. (2.8) versus Eq. (A.1) also differ. Starting with the metric e αβ , we define the (possibly non-square) transformation to the c-basis via As both c αβ and c αβ are symmetric, their (independent) eigenvectors correspond to the row elements of R providing a straight-forward construction.
A.1 General proofs for the N C = 3 + expansion Lemma 1 Assuming N C = 3 + , with > 0 and 1, we can cleanly separate the finite part of the inverse metric from the divergent one, i.e.
where c αβ R andc αβ are regular at N C = 3.

Proof
For general N C , the metric can be brought to diagonal form where the λ α are polynomial in N C corresponding to the weights of irreducible representations, which in the limit → 0 are either O(1) or O( ). In the latter case, let us parameterize such eigenvalues as λ 0 = κ where κ is a constant 6 . The inverse of the orthogonal metric is a matrix with diagonal entries 1/λ α . We construct the tensorc αβ by rotating only the 1/λ 0 components back to the c-basis. Defining α 0 as the indices running over the vanishing weights we havẽ The case λ α | N C =3 = O( 2 ) is in principle possible and it would lead to an O( 1 2 ) term in Eq. (A.3). However, this situation has not been encountered for any of the color-flow bases considered in this study.

Lemma 2
All color products T i · T j belong to the null space of the singular part of the inverse metric, i.e.c αβ (T i · T j ) βγ = 0 α γ . (A.5)

Corollary
An interesting, and computationally advantageous, consequence of the above Lemma is that in order to obtain we only have to evaluate the color metric and its inverse with N C = 3 + , while computing all color producs T i · T j at N C = 3. Therefore, the inversion of the metric at N C = 3 + (with small) provides a valid alternative to dimensional reduction for computing the soft function 7 .

Proof
Rotatingc αβ to the e-basis we define an element so that for every 0 eigenvalue of c αβ there is a corresponding element e α 0 | in the orthogonal basis which satisfies Since T i · T j |e β is a coefficient times an element of the orthogonal basis, we conclude that (T i · T j ) α 0 β = 0. Repeating the argument and noting that c αβ is symmetric, we conclude that the corresponding rows and columns of (T i · T j ) are zero. In the e-basis we clearly have which gives the desired result.
In order to demonstrate the corollary we evaluate all color products at N C = 3+ and we then write the soft anomalous dimension Eq. (2.2) at small as (A.10) 7 A difficulty arises in our method for the case of 6 gluon soft evolution in the trace basis. The problem is linked to the fact that 9 of the vanishing N C = 3 eigenvalues are negative for 3 < N C 3.32. Therefore, an inversion algorithm for N C = 3+ dependent on positive definiteness of the (symmetric) metric is incompatible. However, this problem is avoided by choosing the fbasis, which is positive definite for all processes considered thus far, or inverting using a (much slower) more generic algorithm. A similar problem arises in the standard basis for qq → qqggg.
The first term contributing to the soft function S which involves the inverse metric comes from expanding the exponential to second order Using (A.5) on all the color products that enter the definition of Γ, one finds no finite terms originating from the interference of the 1/ pole of the inverse metric and the O( ) contribution to the anomalous dimensions. Furthermore, this holds for higher terms in the expansion of the exponential. Therefore, all the color products necessary to construct the soft anomalous dimension can be safely computed at N C = 3, while it is still necessary to compute the metric and its inverse at N C = 3 + Finally, we note that using N C = 3 + to invert the metric involves a large cancellation among the entries of c αβ . However, the convergence is better than expected since the coefficients of 1/ are roughly proportional to the number of corresponding 0-representations, which is smaller than the total number of irreducible representations for a given process.

A.2 A concrete example: gg → gg
We list here several different manifestations of the 4-gluon basis as specific examples for our more general discussion in the text.

Trace basis
Let us consider the trace basis for this process: where t a i are the color generators in the fundamental representation and a trace over their fundamental-representation indices is implicit.

Dimensionally reduced trace basis
The basis in Eq. (A.12) is over-complete and consequently the color metric is not invertible for N C = 3. However, if we consider a reduced basis, obtained by taking (A.14) where K d = 1/ 2N 2 C (N 2 C − 1), the metric is then invertible for N C = 3, the connected components remain synced with the hard matrix, and the resulting S(N C = 3) is unchanged.

Adjoint basis
We can now write the 5-dimensional f -basis We can make connection to the trace basis by repeated application of the fundamental Lie algebra to see that (A. 16) Inversion with N C = 3 + We consider here the trace basis for N C = 3 + . We note that there are no additional null eigenvalues for N C = 3. We then take the → 0 limit and expand the matrix representing the color metric in terms of its regular and singular pieces, as in where K 3×3 (a) is a 3 × 3 matrix with each element equal to a. The matrix in Eq. (A.17) is precisely the trace-basis form for the inverse eigenvalue λ 0 ∼ 1/ of the N C = 3 0-representation in the orthogonal basis. One can verify that the matrix (T i · T j ) αβ at N C = 3 for all i and j is in the null space ofc αβ . This a concrete manifestation of the behavior expected from the discussion in App. A.1.

A.3 Bases properties for multi-parton processes
In Tables 1 and 2 Table 2: Summary of basis properties for all 6-and 7-parton processes. Pure gluon processes are listed in the adjoint f -basis.

B. The Caesar framework
Caesar [17] is a computer program that allows one to perform the resummation of a large class of observables, namely global event shapes, to NLL accuracy. In this appendix we recap, without re-deriving them, the expressions of the leading and next-to-leading function g where λ = α S β 0 L, α S = α S (µ 2 R ) and β 0 is the one-loop coefficient of the QCD βfunction, β(α S ) = −α S (α S β 0 + α 2 S β 1 + . . . ), with The result in Eq. (B.1) consists of a sum over all the hard partons and the dependence on the color is trivial and only enters through the Casimir of each leg l, (C F for a quark leg, C A for a gluon leg). Note also that a 1 = a 2 = · · · = a n = a > 0.
The result for the NLL function g (δ,B) 2 has a richer structure: a+b l ) q (l) (x l , µ 2 F ) + ln F ∂ L Lg 3) The first term in the square brackets in Eq. (B.3) contains the two-loop contributions to the DGLAP splitting function in the soft limit and to the QCD β-function, as well as the dependence on the renormalization scale µ R : The second term in the square brackets instead captures hard collinear emissions to a quark leg (B q = − 3 4 ) or to a gluon leg (B g = −πβ 0 ); we have introduced while E l is the energy of leg l. We note that the contribution in this round brackets is actually frame-independent. We move then to the second line of Eq. (B.3) and the first term we encounter is the one that depends on the PDFs (µ F is the factorization scale). This contribution comes about because we veto emissions collinear to the incoming legs which would contribute to the event shape more than a quantity v. There is then a term (F) describing the effect of multiple emissions. The calculation of this term is highly non-trivial for generic observables and indeed this is one of the central aspects of the analysis of Refs. [17]. However, at NLL, multiple emissions have a color structure identical to g   Finally, in the last line we encounter the contributions due to soft radiation at large angle, which we can, for convenience, divide into a diagonal contribution and one with a non-trivial matrix structure. Thus, all the terms but the last one in the Caesar master formula Eq. (B.3) are diagonal in color and therefore apply to processes with an arbitrary number of hard legs. The results of Sec. 2 provide an automated way of computing the only contribution at NLL with a non-trivial color structure, namely the soft function S, which captures the effect of soft gluon emitted at wide angles.