The first calculation of fractional jets

In collider physics, jet algorithms are a ubiquitous tool for clustering particles into discrete jet objects. Event shapes offer an alternative way to characterize jets, and one can define a jet multiplicity event shape, which can take on fractional values, using the framework of “jets without jets”. In this paper, we perform the first analytic studies of fractional jet multiplicity Ñjet in the context of e+e− collisions. We use fixed-order QCD to understand the Ñjet cross section at order αs2, and we introduce a candidate factorization theorem to capture certain higher-order effects. The resulting distributions have a hybrid jet algorithm/event shape behavior which agrees with parton shower Monte Carlo generators. The Ñjet observable does not satisfy ordinary soft-collinear factorization, and the Ñjet cross section exhibits a number of unique features, including the absence of collinear logarithms and the presence of soft logarithms that are purely non-global. Additionally, we find novel divergences connected to the energy sharing between emissions, which are reminiscent of rapidity divergences encountered in other applications. Given these interesting properties of fractional jet multiplicity, we advocate for future measurements and calculations of Ñjet at hadron colliders like the LHC.


JHEP05(2015)008 1 Introduction
For almost forty years, we have known that high energy particle collisions can produce jets [1,2]. The term "jets" has two related but different meanings: "jet formation" is the physical process by which quarks and gluons shower and fragment into collimated sprays of hadrons [3], while "jet algorithms" are an analysis technique used to cluster those hadrons into proxies for the underlying quarks and gluons [4]. Jet algorithms are a powerful way to categorize and organize collision events [5,6], but event shapes (and jet shapes) offer a more sensitive probe of jet formation itself [7]. Indeed, though the observation of three-jet structure in e + e − collisions strongly hinted at the existence of gluons [8,9], an unambiguous discovery at PETRA [10][11][12][13] required the use of event shapes like thrust [14] and oblateness [11].
Recently, the distinction between jet algorithms and event shapes was blurred through the "jets without jets" framework, in which standard jet-based observables are mapped into jet-like event shapes [15]. These observables incorporate a transverse momentum threshold p T cut and a jet radius R just like traditional jet algorithms, but they behave more like event shapes because they involve an inclusive sum over particles in an event and do not uniquely assign hadrons to jet objects. A particularly interesting jets-without-jets observable is jet multiplicity, where p T i,R is the transverse momentum contained in a cone of radius R around particle i. By design, this observable is highly correlated with the standard jet count (for the same p T cut and R values). Crucially, N jet can yield fractional values, offering a new probe of jet formation.
In this paper, we initiate the analytic study of fractional jet multiplicity. For simplicity, we treat the case of e + e − → jets, though we briefly mention how to adapt our calculational techniques to collisions at the Large Hadron Collider (LHC). With two or three partons in the final state (e + e − → qq or qqg), N jet always yields an integer value. Fractional jets only start appearing with four or more partons (e.g. e + e − → qqgg), so our analytic calculations will start at O α 2 s . This is a general feature of fractional jets: non-integer values only appear with three or more particles in a given jet region. 1 Our main technical results will be obtained using fixed-order perturbative QCD, though we will also discuss connections to factorization properties in soft-collinear effective theory (SCET) [16][17][18][19][20].
We will focus on the behavior of N jet in the vicinity of dijet configurations, though we do present some results for ∆ 3− = 3 − N jet as well. This near-integer behavior exhibits a number of curious analytic features, which are already visible at O α 2 s . • Hybrid jet algorithm/event shape behavior. Jet algorithms have a finite cross section at (integer) jet multiplicities whereas event shapes typically have form factors that JHEP05(2015)008 suppress the cross section at singular values. The N jet distribution has both types of behavior. Even though the Born-level process e + e − → qq gives the integer value N jet = 2, one might naively expect the corresponding spike at ∆ 2± = 0 to be completely smeared out by multiple emissions that generate finite values of ∆ 2± . Instead, the spike at ∆ 2± = 0 is robust, as there is a finite region of the many-body phase space that still gives rise to integer values of N jet . At the same time, the appearance of logarithms in ∆ 2± at every perturbative order generates a shoulder at finite values of ∆ 2± , which is suppressed as ∆ 2± → 0. Thus, the cross section at exact-integer values of N jet behave like a jet algorithm while near-integer values behave like an event shape (see section 2.1).
• Cancellation of single-and double-soft divergences. The first non-trivial contributions to non-integer behavior of N jet arise from configurations where three partons are within a mutual radius of 2R but not within R (see section 2.2). For ∆ 2+ , this threeparton phase space has singularities when one or two of the partons goes soft. These divergences arise because the observable receives parametrically equivalent contributions from the single-and double-soft regions, which are not regulated in dimensional regularization. Interestingly, these divergences are structurally similar to rapidity divergences, and we will introduce the analog of rapidity regularization [21,22] to see that the single-soft and double-soft divergences do indeed cancel (see section 3). We note that soft emissions contribute in a non-linear way, and therefore N jet is a non-additive observable.
• No collinear logarithms. Typical event shapes have singularities in both the soft and collinear limit, giving rise to both soft and collinear logarithms. The resulting double-logarithmic structure appears as Sudakov form factors in the cross section. By contrast, collinear emissions do not generate logarithms of N jet , and only soft logarithms appear in the N jet distribution (see section 2.2). 2 Thus, the suppression in the ∆ 2± → 0 limit is only single-logarithmic.
• Non-global yet local structure. The ∆ 2± cross section does not satisfy ordinary softcollinear factorization, because the coefficients of the soft logarithms depend on the collinear structure of the jets (despite the absence of collinear logarithms). Furthermore, the soft logarithms in N jet are purely non-global [26,27], in the sense that they arise from soft emissions in a restricted region of phase space (see section 2.3). 3 These features would seem to preclude any standard factorization theorem, especially given the non-additive nature of N jet . In order to change the value of N jet , however, soft emissions must lie within 2R from the hard core of the jet. Thus, color coherence ensures that the dependence on ∆ 2± , albeit non-global, is local to each jet region. We will present a candidate factorization theorem that exploits this local structure (see section 5).

JHEP05(2015)008
In addition to analytic studies, we will test N jet using high-statistics Monte Carlo samples from Pythia 8 [28] and Herwig++ [29]. Within theoretical uncertainties, the Monte Carlo results confirm our analytic understanding. Given its many features and potential applications, N jet would be very interesting to measure at the LHC. As mentioned above, N jet is a purely non-global observable, with the near-integer behavior determined only by soft and not by collinear divergences. To our knowledge, it is the only jet or event shape observable with this behavior. As such, it is a unique probe of soft physics, and measurements of N jet can be used to test color coherence, underlying event models, and pileup mitigation strategies. Furthermore, N jet is useful basis to compare parton shower predictions for jet substructure, and we present an initial comparison in this work. For new physics searches involving high-multiplicity final states, fractional N jet values interpolate between different jet multiplicities, obviating the need for exclusive jet bins. This interpolation also makes for an interesting version of the classic "staircase" plots for vector boson plus N jet production [30][31][32][33][34][35][36][37][38]. Finally, for the growing field of matrix element/parton shower matching/merging [39][40][41][42][43][44][45][46][47][48][49][50], N jet has a continuous distribution unlike standard jet algorithms and a huge dynamic range compared to standard event shapes, so N jet can be used to test whether matching/merging procedures achieve a smooth interpolation, even in the soft regime. We note, of course, that theoretical calculations of N jet for hadronic collisions must contend with additional effects such as the underlying event and pileup contamination. Moreover, it will be non-trivial to include nonperturbative effects, power-suppressed terms, and higher-order perturbative effects such as the resummation of non-global logarithms [26,27,[51][52][53][54]. Although a detailed study of N jet for the LHC is beyond the scope of this paper, we briefly discuss some of these issues and potential solutions in section 6.
The rest of this paper is organized as follows. In section 2, we review the basic physics behind N jet and explain the kinematic regimes that give rise to fractional jets. In section 3, we discuss the structure of rapidity-like divergences and how they appear in the N jet calculation. In section 4, we perform fixed-order calculations of ∆ 2± at O α 2 s , using both the full e + e − → 4 parton matrix element as well as a 1 → 3 splitting function approximation. We then present a candidate factorization theorem for ∆ 2± in section 5, which includes a renormalization-group-independent "collinear function". In section 6, we briefly discuss how to extend our results to the LHC. We compare our analytic calculations to Pythia 8 and Herwig++ in section 7, and we conclude in section 8. The appendices contain further calculational results and details.

Aspects of fractional jets
Since we will be looking at electron-positron collisions, it is more natural to work with a variant of N jet based on energies and angles (instead of transverse momenta and azimuthrapidity distances):

JHEP05(2015)008
where E i is the energy of particle i, and θ ij is the opening angle between particles i and j.
For particles whose angular separation is always larger than R, E i /E i,R reduces to 1, and N jet simply counts the number of hard particles above the energy threshold E cut . Because N jet is an infrared/collinear (IRC) safe observable, N jet always takes on integer values in the extreme soft and collinear limit. We will use the notation to indicate the near-integer behavior, with ∆ n+ (∆ n− ) parameterizing the N jet behavior just above (below) n. Our calculations will focus on ∆ 2± and ∆ 3− in e + e − collisions.

Hybrid jet algorithm/event shape behavior
The N jet distribution is peaked at integer values, with substantial support in the nearinteger regime. These different parts of the N jet distribution can be qualitatively studied with Monte Carlo generators. 4 We generate events for e + e − → hadrons at a center-ofmass energy of Q = 500 GeV in Pythia 8.183 [28] and Herwig++ 2.7.1 [29], including showering and hadronization. In figure 1a, we plot the distribution of N jet across a wide range of values, showing the circus-tent-like peak and fall-off behavior of the cross section. 5 In figure 1b, we focus on the distribution in the range 2 N jet 3, which is the region we aim to quantitatively describe in this paper. In figure 2, we plot the N jet distribution in a "triptych" form that shows in more detail the near-integer behavior in ∆ 2− , ∆ 2+ , and ∆ 3− , in particular the logarithmic enhancement as ∆ n± → 0.
As mentioned in the introduction, the cross section at exact integer values has a different behavior than at near-integer values, a feature related to the hybrid jet algorithm/event shape nature of the observable. Like a jet algorithm, the cross section at integer values N jet = n has a non-zero value. Like an event shape, the non-integer cross section is suppressed by all-orders emissions as N jet → n. The behavior can be seen in figure 3, where we plot the distribution in the very near vicinity of N jet = 2.
The reason why integer values N jet = n have a finite cross section is that they receive contributions from regions of phase space with non-zero measure. This can be seen easily at O(α s ) for e + e − → qqg, where the entire cross section lies at N jet = 2 or 3. More generally, any collection of particles within a mutual radius of R will give an integer contribution to 4 The original Njet variable in eq. (1.1) is available through the JetsWithoutJets package, an add-on to FastJet 3 [55] contained in the FastJet contrib library (http://fastjet.hepforge.org/contrib/). The variant in eq. (2.1) is available from the authors upon request. 5 The peak at Njet = 1 in Pythia 8 seems to arise from events with τ leptons produced in hadron decays, where the resulting neutrinos carry away a substantial fraction of the jet momentum. The same feature is not visible in Herwig++, nor is it visible when hadronization effects are turned off in Pythia 8.   figure 1, now plotted in "triptych" form to show the near-integer behavior in ∆ 2− , ∆ 2+ , and ∆ 3− . Note that the ∆ 2− and ∆ 3− axes run backwards. The ∆ 2− and ∆ 2+ distributions interface at N jet = 2 (where ∆ 2± = 0), indicated by the left-hand dashed vertical line. Since we do not plot ∆ 2± all the way to 0, we put a gap over the region around N jet = 2. The ∆ 2+ and ∆ 3− distributions are connected at N jet = 2.5 (where ∆ 2+ = ∆ 3− = 0.5), indicated by the right-hand dashed vertical line.  Note the scale on the x-axis and the fact that the y-axis is logarithmic on the left plot but linear on the right one. At N jet = 2, there is a spike in the distribution from the jet algorithm behavior of the fractional jet multiplicity. At non-integer values, the continuous distribution is more similar to event shapes. We show the near-integer values in two different ranges, and for the closer range (right plot) one can see bumps in the Monte Carlo distributions from multiple emission and hadronization effects.

JHEP05(2015)008
N jet . 6 This means that the differential distribution at integer values has the form In contrast, the distribution for near-integer values ∆ n± > 0 is dominated by logarithms of ∆ n± . We will later show that the logarithms of ∆ n± arise from soft emissions, and the most important terms scale single logarithmically as (α s ln ∆ n± ) k . These logarithms combine at all orders to suppress the cross section as ∆ n± → 0, leading to the disjoint behavior at integer N jet . Note that single-logarithmic suppression is not as strong as the more familiar double-logarithmic suppression, so there are no Sudakov peaks in the ∆ n± distributions.

Soft and collinear limits of fractional jets
To understand the leading near-integer behavior of N jet , we can focus on the soft and collinear limits. As discussed further in section 5, the all-orders structure of the cross section does not satisfy standard soft-collinear factorization, but a soft-collinear analysis is still fruitful at fixed order. The essential physics appears already in the three-parton phase space, where the near-integer behavior is dominated by soft physics. We will focus on a JHEP05(2015)008 single jet region here; to describe e + e − → dijets, we sum over the contributions to N jet from both jet regions (see eq. (5.2)).
We call a group of particles merged if their contribution to N jet is connected, such that removing a subset of particles affects the contribution to N jet from other particles. For a single isolated particle, its contribution to N jet is 1. For a merged pair of particles with separation less than R, one still obtains an integer value of N jet : Because a single soft/collinear emission does not contribute to the value of N jet , it is not linear in soft and collinear momenta in the singular region of phase space, and hence it is a non-additive observable. Now consider three merged partons. As shown in figure 4, there are various different phase space configurations that lead to different values of N jet . To achieve non-integer values, one needs a phase space configuration with 6) or permutations of the parton labels. If all E i > E cut , then the contribution to the jet multiplicity is In terms of energy fractions

JHEP05(2015)008
which removes the overall energy scale of the jet as a degree of freedom, the observable takes the value For this merged triplet, near-integer behavior is obtained when one or two of the partons goes soft, as shown in figure 4. In these soft limits, the observable may depend on a single soft momentum or a product of soft momenta. For example, if particle 1 is soft, Or, if particles 1 and 2 are both soft, The first case is typical of event shape observables, as the near-integer behavior is linear in the soft particle's energy. The second case, however, is unusual -it goes as the product of soft momenta, another demonstration of the non-additive nature of N jet . This feature creates a novel structure in the perturbative series, reminiscent of rapidity divergences in SCET II [21,56]. We discuss this further in section 3, using a toy observable ∆ = z 1 z 2 that exhibits the same analytic features. With more emissions, near-integer behavior requires the soft limit of one or more particles. 7 As mentioned above, an arbitrary collection of energetic particles will yield an integer N jet if the particles can be grouped into merged clusters where each particle in the cluster is within R of all other particles in the cluster (such that each cluster has N jet = 1). Non-integer values are obtained when this is not satisfied (as in the merged triplet example above), though generically the resulting values are far from integers. Near-integer values are obtained by starting from an integer N jet configuration and then adding any number of soft particles which are not within the mutual radius R of the cluster. The contribution of these soft particles to N jet will be suppressed by 12) and the deviation of N jet from integer values is similarly suppressed by this ratio as long as E soft,R > E cut . In this way, the near-integer behavior is determined by soft emissions in the vicinity of hard clusters, and soft emissions will generate logarithms of ∆ n± . By contrast, collinear splittings cannot generate near-integer behavior of N jet and hence do not generate logarithms of ∆ n± . For collinear splittings of angle R c , the only effect on N jet comes from particles who are within R of either of the two daughters or the parent, but not within R of all three. For small R c , this is a power-suppressed region of phase space and not logarithmically enhanced. Said another way, the effect on N jet from a small-angle splitting is not smooth, as it either preserves the value of N jet or discretely changes it by 7 We are neglecting special configurations of energetic particles where the value of Njet happens to be near-integer, since those regions of phase space are power suppressed.

JHEP05(2015)008
including or excluding particles from the various E i,R terms; this behavior cannot generate logarithms of ∆ n± . So unlike standard jet shapes (like jet mass) which depend on both the softness and collinearity of a splitting, the near-integer N jet value depends only on the softness of emissions.

Non-global yet local structure
We have argued that soft emissions contribute to near-integer values of N jet , but this is only the case if the soft emissions lie within a restricted phase space of size 2R around the jet. Wide-angle soft radiation does not contribute to N jet , since those emissions yield E i,R < E cut . 8 This angular restriction on soft radiation produces non-global logarithms of ∆ n± , which are logarithms that arise from emissions in a restricted angular region of phase space [26,27].
At leading order, non-global logarithms are often associated with correlated soft emissions from non-Abelian matrix elements and are therefore proportional to C F C A . For the case of fractional jet multiplicity, however, the measurement itself introduces a correlation between different emissions, and this effect appears for both non-Abelian and Abelian matrix elements. As we discuss in detail in section 4.1, the allowed phase space for a soft emission to change N jet depends on the phase space of other soft emissions and on the phase space of the hard partons, similarly to what happens for clustering logarithms [57][58][59][60]. In this regard, all soft logarithms of N jet , including those proportional to C 2 F , can be considered non-global.
Moreover, the fact that the allowed phase space for soft emissions to change N jet depends on the phase space of the hard partons means that one cannot consider how N jet depends on soft emissions without also considering collinear emissions, and vice versa. Because the contributions to N jet from soft and collinear emissions are inextricably linked, we will show in section 5.2 that N jet does not obey standard soft-collinear factorization [19,[61][62][63]. As mentioned above, collinear emissions by themselves do not generate logarithms of N jet , but they do alter the allowed phase space for soft radiation within a jet region. Thus, collinear emissions will modify the coefficients of soft logarithms of N jet , which is a sign of collinear-soft non-factorization.
While the ∆ n± dependence is non-global, it is also local to each jet. Emissions affecting N jet are restricted to an angular region of 2R around each jet direction. Additional emissions away from all jets can also create their own jets, changing N jet by an integer or near-integer amount, but an emission far from all jets cannot change N jet by a small amount. Because of color coherence, this implies that the contribution to N jet from a given jet is, to leading power, independent of all other jets in the event. 9 The N jet distribution is therefore a sum of contributions from jet regions that are to a good approximation independently determined. We will formalize this "local" factorization structure in section 5.4, and see how it could simplify LHC calculations in section 6.

JHEP05(2015)008
1 soft, 2 collinear sc Figure 5. The single-and double-soft modes depicted in the z 1 -z 2 plane for the simplified observable ∆ = z 1 z 2 (similar to ∆ 2+ ). We assign the power counting ∆ ∼ λ 2 . In the case with only one soft particle, the soft energy scales as λ 2 , whereas in the case with two soft particles, the soft energies scale as λ. These modes are connected by the hyperbola shown, which can be thought of as tracing out the rapidity in the energies of the two partons (see eq. (3.1)). In our calculation, we encounter divergences in this rapidity variable for each mode depicted. This picture has a strong analog in the divergences in physical rapidity of single particles in SCET II .

Rapidity-like divergences
As mentioned in section 2.2, the calculation of N jet at O(α 2 s ) features divergences not regulated by dimensional regularization (dim reg) when the calculation is divided into unique soft limits. These divergences have a strong resemblance to rapidity divergences in SCET II , where the large rapidities of soft and collinear modes in the effective theory generate divergent integrals not regulated by dim reg [21,22,56]. The similar divergences appearing in the N jet calculation are not from physical rapidities, but instead originate from the energy sharing between particles and are easily cast in terms of a "rapidity" of this energy sharing: In this section, we show how these divergences arise in the N jet calculation and how we can adapt standard rapidity regulators to our case. Instead of the complete N jet calculation, consider the simplified observable where particles 1 and 2 are particles that may be soft. ∆ has the same soft scaling properties as the near-integer behavior of N jet (see eqs. (2.10) and (2.11)), and the other complications from the complete N jet calculation are irrelevant for the discussion here. Like ∆ 2+ to be calculated in section 4.1, ∆ receives contributions from both single-soft and double-soft regions of phase space. Also like ∆ 2+ , ∆ is a non-additive observable that does not get a linear contribution from each soft emission. The interplay between the single-and double-soft limits is illustrated in figure 5. In order to have a consistent power counting for ∆, we need to consider two different scalings JHEP05(2015)008 of the soft modes. Let λ 1 be the power counting parameter, with ∆ ∼ λ 2 . The phase space regions with the same parametric contribution to ∆ are: Thus, there are two types of soft modes -single-soft modes scale as λ 2 , whereas doublesoft modes scale as λ -and there is a unique soft power counting within each sector. This relative scaling is the same as ultrasoft modes in SCET I (which scale as λ 2 ) and soft modes in SCET II (which scale as λ). This indicates that the calculation has contributions to the observable from soft modes in both SCET I and SCET II . The "energy rapidity" variable y in eq. (3.1) separates these phase space regions by measuring the relative energy sharing between particles 1 and 2. Like an ordinary rapidity, the range is y ∈ (−∞, ∞). The sc region has z 1 z 2 (y → −∞), the ss region has z 1 z 2 (|y| 1), and the cs region has z 1 z 2 (y → ∞). Each of the sc/cs/ss sectors gives an independent contribution to the ∆ distribution at O(α 2 s ) which should be properly summed (i.e. one has to remove potential double-counting). As we will now see, unbounded integrals in y produce divergences, not regulated by dim reg.
In full QCD with dim reg in d = 4 − 2 dimensions, the soft divergences of ∆ are encapsulated by the integral where L 1 is a logarithmic plus-function defined in eq. (B.3). If we take soft limits of the phase space, where the boundaries scale as [0, 1] → [0, ∞], we get:

JHEP05(2015)008
Each integral is divergent and unregulated, but the ∆ < 1 regime of full QCD is reproduced by the combination 10 This result is as expected, since the double-soft limit should remove the double-counting of the single-soft limits, even if each contribution is not well-defined individually. Note that the unregulated divergences appear only in y. This structure of divergences in y is identical to those in physical rapidities of single particles in SCET II . These divergences may be handled in a number of ways, and we will use the rapidity renormalization group (rapidity RG) [21,22]. In the rapidity RG, divergences are regulated analogously to dim reg, using a scale ν and infinitesimal parameter η that correspond to the usual µ and in dim reg. At one loop, these regulator factors are [21,22] 1 soft: where s = z 1 z 2 , and E J is the total jet energy such that z i = E i /E J . These regulators give well-defined terms for each soft limit, Now the sum of contributions is independent of the rapidity regulator and matches the full QCD results for ∆ < 1: Comparing eq. (3.9) to eq. (3.6), we see that with the rapidity regulator, the full result is reproduced by the sum of single-and double-soft contributions, instead of the difference.

JHEP05(2015)008
This happens because the single-and double-soft limits produce canceling poles in the large positive and negative rapidity regions. For example, the large rapidity limit is allowed in both the sc and ss contributions, where the rapidity regulators scale as which lead to canceling poles: Thus the cancellation of the large rapidity regimes happens at the level of the sum of contributions, instead of the difference. Not only does the rapidity RG make the various soft contributions well defined, but it also separates the double-soft limit from the zero-bin subtraction (needed to remove double counting [65]) of the single-soft limits. 11 Because the rapidity regulators in the single-and double-soft limits are not related by scaling in rapidities, the double-soft contribution is not the zero-bin of the single-soft limit. In fact, the zero-bin limit is scaleless; the rapidity regulator is unchanged in the zero-bin limit but now all rapidities are allowed, leading to the scaleless integral [22] ∞ −∞ dy e ±yη = 0 . (3.12) In SCET II applications where the collinear and soft contributions to the distribution can be factorized into separated jet/beam and soft functions, this factorization allows these rapidity divergences to be divided into the collinear and soft functions and the resulting logarithms to be resummed using standard techniques (see, e.g., [21,22]). The features of rapidity divergences in this simple example of ∆ repeat themselves in the calculation of near-integer behavior of N jet (specifically, ∆ 2+ ) in section 4.1 below.

Calculating the near-integer behavior
As seen in figure 4, fractional values of N jet arise when radiation in a jet region extends beyond a radius R. In particular, near-integer values of N jet come from soft radiation beyond R, and the deviation from integer values grows as emissions become harder. In this section, we study the leading-order near-integer behavior of N jet in e + e − → jets at a center-of-mass energy of Q, which first occurs for e + e − → 4 partons.
Let ∆ 2± be the jet multiplicity near the 2-jet peak as defined in eq. (2.3); we take the power counting ∆ 2± ∼ λ 2 , with λ 1. We will first derive an analytic expression for the cross section at leading order in λ (i.e. the singular contributions), including a discussion of JHEP05(2015)008 the aforementioned rapidity-like divergences. We will then calculate the full O α 2 s result using the Monte Carlo program Event2 [66,67], which allows us to include non-singular terms as well as cross check our results for the singular contributions. We will also calculate the O α 2 s contributions to ∆ 3− , though this is not the focus of our studies.

Singular contributions using splitting functions
The cross section for ∆ 2± can be written as where Φ 4 represents four-body phase space, T is the matrix element for e + e − → 4 partons, and F (∆ 2± , Φ 4 ) is the measurement function, which projects out the slice of phase space corresponding to a constant value of ∆ 2± . The allowed values of N jet for four partons are: 12 Two isolated partons and one merged pair: N jet = 3, (4.3) One isolated parton and one merged triplet: The only non-integer behavior is obtained for the merged triplet in eq. (4.4), which requires three particles to be within an angular distance 2R. Thus, for sufficiently small R, we can use the matrix element in the limit where three partons are collinear, allowing us to take advantage of collinear factorization [66][67][68][69][70]: 13 where k labels one of the parton channels q → ggq ,q → ggq , q → q q q , orq → q q q, s 123 is the squared invariant mass of the three parton system, and P k 1→3 is the spin-averaged 1 → 3 splitting function [76,77]. The factorization in eq. (4.6) implies that the relevant phase space for the collinear splitting is 3-body, meaning we decompose the 4-body phase space of the whole event into the 2-body qq system of the hard interaction and the 3-body phase space of the 1 → 3 splitting. In the collinear limit, the 4-body phase space factorizes as dΦ 4 The collinear matrix elements can be further decomposed according to their color structure. The channel q → ggq (andq → ggq) can be written as a sum of Abelian ("ab") 12 Here we assume that each isolated parton and group of partons are sufficiently energetic to pass the energy cut Ecut. If this does not hold, then the integer values of Njet = 3 or 4 may be reduced to 2 or 3. 13 We use spin-averaged splitting functions, which are sufficient since the value of Njet for the collinear system is independent of its orientation relative to other jets in the event (see, e.g., [71]). This 1 → 3 splitting function approach also appears in ref. [72] for calculating the jet substructure observable planar flow [73][74][75].

JHEP05(2015)008
and non-Abelian ("nab") contributions: while the q → q q q (andq → q q q) channels will give contributions proportional to the C F T R color structure. 14 At O α 2 s the cross section for ∆ 2± is obtained by summing over all channels and color structures. In the rest of this subsection we will discuss the calculation of the Abelian piece of the cross section; results for the other color structures can be found in appendix A.
The Abelian contribution comes from a q → ggq orq → ggq splitting. The two channels give identical contributions and we will focus on the q → ggq case for definiteness, labeling the final state as: (4.9) As discussed in section 2.2, at leading power we can take the soft limit of our observable, which corresponds to either one or both gluons becoming soft. As in the toy calculation in section 3, we will label the limit where gluon 1 (gluon 2) is soft as sc (cs) and the double-soft limit as ss. We again use the energy sharing variables where here E J is the energy of one jet region (with E J Q/2 for a collision at center-of-mass energy Q). We generally assume z cut 1, which allows us to neglect power-suppressed contributions from soft quarks, and if some z i < z cut then that particle may be treated as soft.
In tables 1 and 2 we collect contributions to ∆ 2± in the relevant soft limits and regions of phase space. At this order, we also get a contribution to ∆ 3− = 3 − N jet , and we will carry out the calculation for ∆ 3− as well.
We now gather the relevant pieces to evaluate eq. (4.1) in the collinear limit. There are three relevant regions of phase space: (4.10) where θ ij labels the opening angle between partons i and j. In 14 For same flavor final state quarks q = q, this matrix element contains also terms proportional to C 2 F and CF CA. However, these terms do not contribute to the cross section at leading power.

JHEP05(2015)008
Observable where F ss The single-and double-soft limits for the 1 → 3 matrix elements are , (1 soft, 2 collinear) where we have included a symmetry factor of 1/2! for identical gluons. Notice that in the double-soft limit, the Abelian matrix element simply reduces to the product of two eikonal

JHEP05(2015)008
Observable Region Expression Limit Cuts factors. In the collinear regime, the 3-body phase space can be written as [78] dΦ coll where the angular phase space dΦ Ω is given by 24) and the energy phase space dΦ z is given by (1 soft, 2 soft) (4.25) As discussed in section 3, for certain contributions to the observable, these energy integrals give rise to rapidity divergences which require special regularization to make each term well defined. We use the rapidity regulator to do this, and the sum of all contributions is regulator independent. Specifically, the contribution to ∆ 2+ from R C is the only one with both single-and double-soft limits (see tables 1 and 2), and hence the only contribution which requires this extra regulator. While ∆ 2− does receive contributions from the doublesoft region of phase space, the constraints on the gluon energies from z cut implies that both gluons must be soft, and hence the single-soft limits do not contribute. The limit on the gluon energies also implies there is a kinematic limit on the observable, ∆ 2− < (z cut /2) 2 .
By combining the measurement function with the proper limits of the matrix elements, we get an analytic expression for the Abelian contribution to the ∆ 2± and ∆ 3− distributions JHEP05(2015)008 at leading power in λ. Including the identical contribution fromq →qgg yields where the distributions L n are the usual logarithmic plus distributions, defined in appendix B. The integral over the angular phase space is given by 15 Eqs. (4.26), (4.27), and (4.28) represent the leading order distributions in the small R and z cut limits. Notice that at this order there is no dependence on R, as the 1/θ 2 ij factors in the denominator of the angular integral provide a logarithmic scaling which can be used to explicitly remove the R dependence. Since we are focused on the non-integer behavior of N jet , we have suppressed contributions proportional to δ(∆ n± ) in our results (see, however, eq. (5.5)).
The calculation of the C F C A and C F T R n f contributions are very similar to the Abelian case, albeit more involved as the matrix elements do not simply factorize into separate angular and energy functions as in eq. (4.22). In appendix A, we give the full set of results for the different color structures.
In the O α 2 s calculation performed above, double logarithms of ∆ n± (appearing as L 1 (∆ n± )) arose when both emitted gluons became soft, and single logarithms (L 0 (∆ n± )) arose when a single gluon became soft. This correspondence holds true with more soft emissions: each soft emission results in a single logarithm of ∆ n± (in certain regions of phase space). Therefore, in general the largest logarithm at O α k s appears when all emissions are soft gluons, and the contribution to the cross section is of the form α k s L k−1 (∆ n± ).

Nonsingular contributions from Event2
To go beyond leading power in λ, we need to incorporate the O α 2 s matrix elements from full QCD. For fixed-order calculations of e + e − → partons at low multiplicity, the program Event2 is a particularly useful tool [66,67]. Event2 performs next-to-leading order calculations of e + e − → 2 and 3 partons, so it includes the needed e + e − → 4 parton tree-level matrix elements. Conveniently, it allows for the decomposition of its results order-by-order in α s and by color structure. Crucially, Event2 can probe the far infrared regions of phase space, meaning the (singular) logarithmic terms of ∆ 2± and ∆ 3− are enhanced. This allows us to perform a robust comparison and cross check of our splitting 15 We have written the common angular coefficient IΩ in terms of the phase space region RA. The same angular coefficient appears for regions RA, RB, and RC due to an unexpected symmetry in the angular integrals. This symmetry is only present for certain angular integrals in the non-Abelian case. The ∆ 2± and ∆ 3− distributions extracted from Event2. Shown are the separate C 2 F , C F C A , and C F T F n f contributions to the cross section, plotted as the coefficient of (α s /2π) 2 C, where C is the relevant color factor. Note that the ∆ 2− and ∆ 3− axes run backwards. function calculations above. Having verified the singular logarithmic contributions to ∆ 2± and ∆ 3− , we can then extract the nonsingular O α 2 s contributions directly from Event2. In figure 6, we show the O α 2 s contributions to ∆ 2± and ∆ 3− extracted from Event2. We plot the coefficients of the (α s /2π) 2 C terms in the cross section as a function of ln ∆ n± , where C is the relevant color factor (C 2 F , C F C A , or C F T F n f ). To enhance the logarithmic contributions and minimize the power corrections in R and z cut , we choose the small values R = 0.1 and z cut = 0.02. Plotted this way, double logarithms L 1 (∆ n± ) appear as lines of constant non-zero slope, 16 single logarithms L 0 (∆ n± ) appear as lines with zero slope and non-zero offset, and nonsingular contributions vanish as ∆ n± → 0. It is clear that logarithmically-enhanced terms are indeed present in the full QCD result from Event2.
To make sure the logarithmic behavior from Event2 matches our analytic calculations in section 4.1 and appendix A, we can extract the nonsingular contribution to the cross section, which are the residual fixed-order terms after the logarithmic contributions are subtracted: These are shown in figure 7, again separated by color structure, and confirm that our splitting function calculation, which includes the leading contributions in a small R and z cut expansion, captures the leading-order near-integer behavior of N jet correctly. 17 For N jet < 2, the singular terms tend to dominate over the nonsingular ones. The reason is that the ∆ 2− distribution has a kinematic endpoint at (z cut /2) 2 1 at this order, so the distribution only has support in the singular region of phase space. 16 Note that these are not Sudakov double logarithms, since they appear at O α 2 s , not O(αs). For 2 < N jet < 3, there is an ambiguity in determining the nonsingular terms in the cross section, related to the fact that singular terms exist at multiple points in the N jet spectrum. In this range, there are singular terms from ∆ 2+ that dominate near N jet = 2 and singular terms from ∆ 3− that dominate near N jet = 3. In figure 7, we defined the nonsingular term by removing the singular terms from the nearest singular point in the N jet distribution, using N jet = 2.5 as an arbitrary dividing line. 18 Alternatively, we could define the nonsingular term by removing both sets of singular terms over the entire spectrum. 19 This is a valid approach as well, since the singular ∆ 2+ and ∆ 3− distributions are governed by different soft limits of phase space, hence there is no double-counting by including both singularities.
Since we are mainly interested in describing the behavior in the vicinity of N jet = 2, though, for the rest of the paper we will simply adopt the nonsingular definition in eq. (4.30), even in the vicinity of N jet = 3:  Figure 8. The 2 < N jet < 3 distribution in full QCD (black, solid), decomposed into singular contributions from ∆ 2+ and ∆ 3− (blue, dashed/dot-dashed) and residual nonsingular contributions (red, dotted). The coefficient of (α s /2π) 2 is plotted for each contribution. Note that the sum of singular contributions is numerically dominant over the entire range.
That said, this non-singular term turns out to be dominated by the singular ∆ 3− piece. Writing the nonsingular term as the residual term is quite small, even in the vicinity of N jet = 2.5. This is shown in figure 8, where the full QCD result between 2 < N jet < 3 is decomposed into the ∆ 2+ singular, ∆ 3− singular, and residual terms for R = 0.4 and z cut = 0.2. The fact that the ∆ 2+ nonsingular term is nearly saturated by the ∆ 3− singular term suggests that higher-order logarithmic terms can play an important role in determining the shape of the N jet distribution, even for large deviations from integer values of the observable.

Complete results to order α 2 s
Summarizing the results from sections 4.1 and 4.2, the full O α 2 s result for the near-integer ∆ 2± distributions can be written as The first three terms in eq. (4.33) are the singular contributions to the cross section. The Abelian terms are given in eqs. (4.26) and (4.27), while the contributions of non-Abelian and C F T R n f color structures can be found in appendix A. The last term in eq. (4.33) is the nonsingular contribution that we extract numerically from Event2 as discussed above.

Towards a factorization theorem
In the previous section, we have seen that soft and collinear matrix elements govern the near-integer behavior of N jet in fixed-order QCD. In this section, we explore the all-orders behavior at and near N jet = 2, building a candidate factorization theorem to describe the logarithmically-enhanced contributions to the cross section.
Near N jet = 2, energetic collinear modes must be confined to two jet regions of diameter R, such that no two energetic particles are more than R apart. This suggests that the collinear radiation may be described by jet functions. Additional radiation must be sufficiently soft so as not to create an additional jet, meaning there is a local veto of size z cut outside of the primary jets. This suggests that additional radiation may be described by a soft function. However, while one might hope that a factorization theorem of the form would hold, it fails on several fronts. First, this standard factorization picture is challenged by both the non-additivity of the observable and soft-collinear non-factorization (discussed in section 2.3 and examined in more detail below). Additionally, the logarithms of ∆ 2± are non-global, and may not be summed with the above factorization into jet and soft functions. Furthermore, the cross section at N jet = 2 (∆ 2± = 0) behaves similarly to a dijet cross section from a standard discrete jet algorithm, for which a convolution structure does not apply. We will discuss these issues below, en route to a candidate "local" factorization theorem which applies in the small R limit. This factorization theorem will take the form where σ( N jet = 2) is the cross section exactly at N jet = 2 (see section 5.1) and C q,q are "collinear functions" (see section 5.3). This form satisfies a number of plausibility checks, but a formal proof of its validity is beyond the scope of the present paper.

The cross section at N jet = 2
We begin by considering the cross section at exactly N jet = 2. At O(α s ), the constraints to obtain N jet = 2 are the same as for obtaining two jets from a discrete jet algorithm, namely that either two of the three partons must lie within a mutual radius of R or one of the partons must be below E cut . At leading power, the phase space restrictions on the qqg final state are Θ(θ qg < R) + Θ(θq g < R) + Θ(θ qg > R)Θ(θq g > R)Θ(z g < z cut ) .

JHEP05(2015)008
At higher orders, the phase space restrictions differ from discrete jet algorithms (which also differ among themselves), and the N jet = 2 cross section has unique features. Consider an O α 2 s configuration contributing to the near-integer behavior, as discussed in the section 4.1. A concrete example is the ∆ 2+ -R C configuration in table 1, which has a collinear q → qg splitting with another soft gluon (labeled s) emitted such that θ qg < R , θ qs > R , θ gs < R . (5.4) This event will give a non-integer contribution to N jet , and its contribution to the cross section is of the form The plus distribution terms were already calculated in eq. (4.27). The δ(∆ 2+ ) terms were suppressed in our previous discussion (since we were focused on the non-integer behavior), but both the 1/ 2 term and the ζ term follow directly from the calculations in section 4.1.
The structure of eq. (5.5) reflects the hybrid jet algorithm/event shape behavior of N jet . The plus distribution terms have no support at N jet = 2, since the plus function prescription removes any divergence there. 20 This feature is similar to other event shape variables with a singular limit, such as thrust, where there is zero cross section in the singular limit at higher orders. By contrast, the delta function terms behave more like a jet algorithm, as anticipated in eq. (2.4), with a finite cross section at exactly N jet = 2. The IR pole from the real radiation at N jet = 2 will cancel divergences from virtual matrix elements, such that the N jet = 2 cross section is IR finite. 21 The additional finite ζ term at N jet = 2 implies that the non-global structure in the near-integer part of the cross section is also contributing at N jet = 2.

Soft-collinear non-factorization
The non-global nature of the near-integer behavior of N jet is suggestive of a non-standard picture of factorization. In fact, the N jet observable does not obey soft-collinear factorization, meaning separate soft and collinear functions cannot be easily (or usefully) defined. This was foreshadowed in section 2.3 and can be illustrated with a simple example at O α 2 s . Consider the same phase space configuration as eq. (5.4), which has a single soft gluon and a pair of collinear (energetic) partons. The soft gluon is in the region of phase space where it is in the jet region of only one of the collinear partons, and the value of the observable is dependent on this fact. This implies that the observable receives contributions that intrinsically depend on the soft and collinear modes in a non-factorizable 20 To see this, we can use the definition of the plus function in eq. (B.1). Integrating [q(x)]+ against δ(x) gives the value at 0, It is straightforward to see that at O α 2 s the cross section at Njet = 2 must be IR finite. Since noninteger values of Njet, as well as Njet = 3 and 4, have IR finite cross sections, and since the inclusive cross section is IR finite, the remaining piece, the cross section at Njet = 2, must be as well.

JHEP05(2015)008
way; the measurement function cannot be separated into separate soft and collinear pieces that do not depend on each other (see figure 4). Therefore, soft-collinear factorization does not hold. We note that, unlike other cases where soft-collinear factorization is not straightforward (see, e.g., [63,79]), here the leading non-integer behavior does not factorize.
Because the non-global structure of ∆ 2± feeds into the N jet = 2 cross section through the ζ term in eq. (5.5), the same non-factorization is also true of the integer value. Of course, for the exact N jet = 2 case, non-factorization of the soft and collinear modes happens only at O α 2 s . Therefore, one can write the N jet = 2 cross section as the sum of terms, one with global contributions that can consistently be resummed and one with non-factorizing contributions. Such a form is The factorized part of the cross section is similar to dijet (or, generally, exclusive jet) cross sections. In e + e − collisions, such cross sections are known to contain non-global logarithms that spoil a standard effective theory picture of the dynamics. These non-global logarithms span the jet and soft functions, affecting the RG evolution in nontrivial ways. Similar to the non-factorizing terms, the non-global contributions start at O α 2 s .

Introducing collinear functions
Let us now try to simultaneously describe the near-integer and exact-integer behavior of N jet . We have established that the logarithms of ∆ 2± are purely non-global, and there is no standard soft-collinear factorization. As mentioned in section 2.3, however, the contributions from each jet region are independent, meaning they can be separated: where σ 0 is an overall prefactor and ⊗ refers to the standard convolution in eq. (B.6). Here, we have introduced "collinear functions" C q and Cq for the two jet regions, which give the contribution to ∆ 2± for each jet region separately. The collinear functions contain only the singular terms and have the general form where κ (n) k,± are coefficients which in general depend on z cut (except for the leading coefficient κ (n) n−1,± ). Recall that L −1 (x) = δ(x), and these delta functions are needed to describe the N jet = 2 cross section. Since ∆ 2± describe different regimes of the same observable ( N jet just above/below 2), each term in the expansion has support for one term or the other.
The convolution between the collinear functions in eq. (5.8) mixes the distributions for ∆ 2+ and ∆ 2− . Therefore, we require not only the usual convolutions between distributions of either ∆ 2+ or ∆ 2− (which we refer to as convolutions of one-sided distributions, e.g.
We can now reinterpret the calculation in section 4.1 directly as a calculation of C q and Cq to O α 2 s (see appendix C). The convolution between C q and Cq in eq. (5.8) then gives part of the higher-order O α 4 s cross section. In addition, we can estimate the O α 4 s structure of an individual collinear function by performing naïve (Abelian) exponentiation to capture some of the multiple emission terms. That is, we assume that each of the collinear functions in eq. (5.9) has the form The convolution terms at O(α 4 s ) are those coming from naïve Abelian exponentiation in one jet region. Note that this does not fully capture the correct higher-order terms (for example, we do not get any term at O(α 3 s ) from this exponentiation), but we will see in section 7 that it is enough to reproduce some of the higher-order effects observed in parton shower Monte Carlo generators. In defining eq. (5.10), we have used the fact that one can absorb corrections to the N jet = 2 cross section into the σ 0 prefactor (see further discussion below).

A "local" factorization theorem
Via eq. (5.8), we can capture the impact on N jet of soft and collinear emissions within the two jet regions. But soft emissions away from the jet regions can still give logarithmicallyenhanced contributions to the cross section, even if they do not change the value of N jet . For example, a soft gluon well-separated from the jets with energy fraction less than z cut will not change N jet , but it will contribute to the cross section. For e + e − → qqg, this wide-angle soft radiation is part of the N jet = 2 phase space. Going to higher orders, there is a contribution to both the exact-integer and near-integer cross sections from wide-angle soft radiation, and up to power corrections for small R, both contributions are identical.
This logic implies that the N jet = 2 cross section should multiply the near-integer contributions, leading to the candidate factorization theorem from eq. (5.2), repeated for convenience: This is the same structure as eq. (5.8), but we have identified σ 0 with σ( N jet = 2). As discussed in eq. (5.7), σ( N jet = 2) itself has its own quasi-factorization theorem. If this JHEP05(2015)008 candidate factorization theorem is indeed true, then the definition of C q,q (∆ 2± ) in eq. (5.10) should be revised such that the coefficient of the δ(∆ 2± ) piece is always 1 (i.e. the sum over k should start at 0 instead of −1). We stress here that eq. (5.2) is only a candidate factorization theorem, and we have not proven that such a form exists. In particular, we do not have an operator definition of the collinear functions C q,q , and without such a definition, the extraction of C q,q is ambiguous from fixed-order calculations alone. The reason is that the structure of C q (∆ 2± ) ⊗ Cq(∆ 2± ) is single-logarithmic order-by-order, without any definite relation between coefficients, so higher-order terms can absorb corrections from lower-order ones. For example, the leading nontrivial terms in σ( N jet = 2) (the O(α s ) term) and C q (∆ 2± ) ⊗ Cq(∆ 2± ) (the O α 2 s L 1 terms) will contribute to κ 1,± (the O α 3 s L 1 (∆ 2± ) terms), but cannot unambiguously determine that coefficient.
Despite these limitations, our candidate factorization theorem does describe important effects (like convolutions between the two jet regions) that go beyond a simple perturbative, log series expansion, which is why we will use eq. (5.2) in our comparison studies in section 7. One thing we can say unambiguously is that if the factorization theorem in eq. (5.2) is valid, then the collinear functions must be RG independent. The reason is that the prefactor σ( N jet = 2) is itself a cross section, so it must be RG independent, and therefore C q ⊗ Cq must be RG independent. Similarly, the modes that contribute to C q and Cq are completely disjoint, so there is no possibility that RG-scale-dependence could cancel between C q and Cq. This does not rule out, however, a further factorization of the collinear functions.

Complete results
Let us summarize our final prediction for the ∆ 2± distributions using the candidate factorization theorem in eq. (5.2). Beyond the calculation summarized in section 4.3, we can include two higher-order effects.
First, as described in section 5.3, the collinear function approach allows us to capture O α 4 s terms coming from convolutions. There are two types of convolutions: convolutions between C q and Cq given by the factorization theorem, and convolutions within an individual collinear function coming from naïve Abelian exponentiation. Though the naïve exponentiation can in principle give results to all orders in α s , we will truncate the convolutions to order O α 4 s , especially since there are known O α 3 s terms missed by this approach.
Second, we can include running coupling effects by evaluating α s (µ) at an energy scale µ = Q √ ∆ 2± . To see why this is the correct scale, note that in section 4.1, the fixed-order expansion in dim reg had a prefactor of where E J Q/2 is the jet energy. This implies that at higher orders, the fixed-order calculation generates terms like ln(µ 2 /E 2 J ∆ 2± ) where µ is the renormalization scale, suggesting that Q √ ∆ 2± is a relevant running coupling energy scale.

JHEP05(2015)008
As in section 4.3, the nonsingular contributions to the cross section enter at fixed O α 2 s and can be simply included additively. Like in section 4.3, we absorb the entirety of the ∆ 3− singular distribution into the ∆ 2+ nonsingular distribution. To try to capture some higher-order effects in ∆ 3− , though, we make use of the nonsingular decomposition in eq. (4.32), repeated for convenience: We evaluate the singular ∆ 3− piece at the scale µ = Q √ ∆ 3− and the small ∆ 2+ residual term at a fixed scale µ = Q. While this approach misses out on genuine O α 3 s fixedorder corrections from 3-jet events, they have an endpoint at ∆ 3− = (z cut /2) 2 analogous to eq. (4.4), so we will simply not make a prediction for ∆ 3− < (z cut /2) 2 .
Finally, while the singular terms for ∆ 2± come with an overall prefactor of σ( N jet = 2), the nonsingular terms do not. However, since the nonsingular terms start at O α 2 s , we can multiply them by σ( N jet = 2)/σ 0 , which introduces corrections beyond the order to which we are working. This allows us to factor out a global σ( N jet = 2) from our final predictions.
Putting these pieces together, our final analytic prediction for the ∆ 2± distribution is: In figure 9, we compare the pure fixed O α 2 s distributions for ∆ 2± and ∆ 3− from eq. (4.33) with the final prediction from eq. (5.16). At small values of ∆ n± , the dominant differences come from the higher-order logarithms included in our final prediction, while at larger values of ∆ n± the running effects from our ∆ n± -dependent scale choice give most of the difference from the fixed-order prediction. Since we do not have a candidate factorization theorem for the N jet = 3 region of phase space, the ∆ 3− distribution is really just eq. (5.16) evaluated at ∆ 3− = 1 − ∆ 2+ .

Looking towards the LHC
Given the interesting features of N jet and the wide-ranging and robust measurements of jet substructure at the LHC [80][81][82], a natural consideration is the N jet distribution in hadronic collisions. In this case, it is more convenient to use the original definition of N jet from eq. (1.1) based on transverse momenta and rapidity-azimuth distances (instead of JHEP05(2015)008 where p T i,R = j p T j Θ(R − R ij ) and R ij = ∆y 2 ij + ∆φ 2 ij . At first glance, the calculation of N jet at the LHC would seem to be much more difficult than at an e + e − collider. After all, N jet depends sensitively on soft radiation, and soft QCD is notoriously complicated at a hadron collider. However, N jet only depends on soft radiation in a region of size 2R around energetic partons. To the extent that we can exploit color coherence at small R, we can make predictions for N jet by simply knowing the collinear functions for quark-and gluon-initiated jets, C q and C g . In particular, the 1 → 3 splitting function formalism that we used in section 4.1 to calculate the collinear function for quarks in e + e − may also be used to calculate the quark and gluon collinear functions for LHC processes.
Conveniently, when switching between the e + e − and hadronic definitions of N jet , the collinear functions C q,g only differ by terms of O(R). For small jet radii, one can therefore neglect those corrections and use the e + e − collinear functions calculated in this paper also for the hadronic case, though one has to be careful to match the z cut value to the outgoing parton momentum. To see why this is the case, notice that where p T J is the scalar sum of transverse momenta in a jet region. Thus, the energy integrals in C q,g would differ at most by O(R) terms. The angular phase space regions in JHEP05(2015)008 eq. (4.10) would be written as constraints on R ij instead of θ ij , but notice that where y J is some characteristic rapidity associated with the jet region, for example the rapidity of the summed jet region momenta. At leading power, the logarithmic scaling of the angular integrals can be used to remove any dependence on cosh y J , so that the differences in the angular integrals are again at most an O(R) effect. In order to extend the candidate dijet factorization formula in eq. (5.2) to dijet events from hadronic collisions, we have to sum over all relevant partonic channels: where k, = q,q, or g, and h 1,2 represents the colliding hadrons. In general, σ h 1 h 2 →k , C k , and C depend on the outgoing parton kinematics, though we have suppressed that dependence in eq. (6.4) for readability. Note that the incoming beams only create additional high-p T jets through hard, perturbative emissions. We can write the total rate σ h 1 h 2 →k ( N jet = 2) schematically as where f a/h 1 and f b/h 2 are parton distribution functions for partons a and b carrying momentum fractions x a,b of the initial hadrons h 1,2 , and σ ab→k ( N jet = 2) is the total rate at N jet = 2 for the partonic channel ab → k . As in the e + e − case, this cross section can have large logarithms of R and p T cut that require resummation to obtain reliable predictions. Compared to the e + e − case, the main new ingredient is the collinear function C g (∆ 2± ) for a gluon-initiated jet. For completeness, we have calculated the gluon collinear function using the same approach as section 4.1, and we present both C q and C g in appendix C. The structure in eq. (6.4) can be easily extended to handle the ∆ n± cross section for n-jet processes by using n collinear functions multiplied by the N jet = n cross section, appropriately summed over the various partonic channels.
At a hadron collider, there is also a new potential source of non-global logarithms from initial state partons. In general, non-global logarithms of ∆ n± only appear in the collinear functions C q,g for each final state jet, and are associated with correlated emissions from final state partons. For events with well-separated jets, the collinear functions are universal and independent of other jets in the event. However, there are also non-global logarithms of z cut and R which appear both in the collinear functions and in the exact integer N jet cross section (e.g. σ h 1 h 2 →k ( N jet = 2), see eq. (5.7)). For the exact integer cross section at a hadron collider, one also has to include correlated emissions from initial state partons, which may introduce super-leading logarithms [83][84][85] at sufficiently high α s order. Ideally, one would want to understand the resummation of non-global effects in the collinear functions and the exact integer cross section to achieve accurate predictions.

JHEP05(2015)008
Because a detailed study for the LHC is beyond the scope of this paper, we will not present any results for hadronic collisions. An important effect to account for in future LHC predictions and comparisons with measurements is hadronization, which we have not considered here. Additionally, for hadronic collisions, effects from the underlying event and initial state radiation are not present in the e + e − case. To lowest order in R, we expect those effects to be captured by color coherence, and one could imagine using the techniques of ref. [86] to understand the R dependence. A similar concern is pileup contamination, though the closed-form nature of N jet makes it well-suited to analytic studies using area subtraction [87][88][89]. One could further mitigate the impact of pileup by using a version of N jet that includes jet trimming [90] in closed form [15], and we expect the collinear function for the trimmed N jet version could be calculated using the same techniques used here.

Monte Carlo comparisons
We now compare our analytic prediction in eq. (5.16) with parton shower Monte Carlo generators. As discussed in section 5.4, the rate σ( N jet = 2) only enters as an overall normalization factor, and it in principle requires resummation of logarithms of R and z cut to obtain a reliable prediction. Since we are mainly interested in describing only the nearinteger (and not the exact-integer) behavior, we will divide out by an overall normalization factor and perform only a shape comparison. In all of the plots below, we normalize the cross section to the region ∆ 2+ ∈ (10 −4 , 10 −2 ).
To make an apples-to-apples comparison of our analytic results to Pythia 8 and Herwig++, we turn off hadronization in the Monte Carlo generators. In principle, the collinear functions C q and Cq should get hadronization corrections, but the non-factorizing nature of N jet means that we cannot adopt a standard shape function analysis [91,92]. We run at a sufficiently high energy to allow for a comparison over a wide logarithmic range in ∆ 2± . The large collision energy we choose, Q = 10 TeV, also mitigates the effect of low energy cutoffs ( 1 GeV) on the parton shower. Because the singular terms in our calculation are numerically dominant for small ∆ 2± , higher-order logarithms may be important in determining the shape of the distribution in this regime. Some of the O α 4 s effects are captured by the convolution structure in eq. (5.16), but they are not fully reliable. Therefore, we include an uncertainty in our predictions derived from the addition of O α 3 s L 2 (∆ 2± ) and O α 4 s L 3 (∆ 2± ) terms with unknown coefficients that we vary. These terms represent the leading logarithms at each order, and are an estimate of missing higher-order terms. The form we use to determine the uncertainty is 3,± L 3 (∆ 2± ), (7.1) and we vary independently the coefficients in the ranges κ 2,± ∈ (−5, 5) and κ 3,± ∈ (−15, 15). These coefficients are of similar size as the leading coefficients κ 3,± ∼ O(10). The uncertainty band is given by the envelope of these variations, including effects both on the cross section normalization (where we just add eq. (7.1)) and on the cross section shape (where we add eq. (7.1) and readjust the normalization in the ∆ 2+ ∈ (10 −4 , 10 −2 ) window).
In figure 10, we show the shape comparison between our result and Pythia 8/Herwig++ over the range 2 < N jet < 3. The singular cross section for ∆ 3− is included as part of the nonsingular correction to the ∆ 2+ distribution (see eq. (5.15)). Overall, we find good agreement with the Monte Carlo generators within uncertainties. It is amusing that there is such close agreement with Pythia 8 in the N jet → 2 region and with Herwig++ in the N jet → 3 region, with the analytic result effectively interpolating between the two. Given that the ∆ 2+ and ∆ 3− parts of the distribution are dominated by different phase space configurations (see table 1), it is possible that we are seeing the impact of different shower ordering variables in Pythia 8 (p ⊥ -ordered) versus Herwig++ (angular-ordered). Of course, the theoretical uncertainties in our calculation are too large to make a definitive statement.
In figure 11, we compare distributions of ∆ 2± over the range (10 −4 , 1), using the triptych format to also see the ∆ 3− region. Again, our analytic prediction reproduces the shape of the Pythia 8 and Herwig++ distributions remarkably well over the whole range. The Monte Carlo distributions, which include higher-order logarithms of ∆ 2± through multiple emissions, generally lie within the uncertainty band of our prediction for N jet > 2. This indicates that we have made a reasonable estimate of the higher-order corrections not included in our calculation. Because we normalize to the window ∆ 2+ ∈ (10 −4 , 10 −2 ), it is JHEP05(2015)008  Figure 11. Same distributions as figure 10, but in triptych form to highlight the near integer behavior. Because we normalize only to the ∆ 2+ ∈ (10 −4 , 10 −2 ) window, the normalization differences in the ∆ 2− region are accentuated; the shape agreement is much better. For the ∆ 3− distribution, we extend our central prediction (with a dotted blue line) to ∆ 3− < (z cut /2) 2 , where genuine 3-jet events, whose contribution we have not calculated, contribute to the observable. Note that the ∆ 2− and ∆ 3− axes run backwards. not surprising that there is a normalization discrepancy in ∆ 2− , though one can see that the shape agreement for N jet < 2 is excellent. We suspect that the normalization issues for N jet < 2 may be due to the treatment of correlated soft radiation (and angular ordering) in the Monte Carlo generators, since the ∆ 2− distribution is dominated by double-soft emissions with a C F C A color structure. The overall shape agreement between Pythia 8 and Herwig++ is quite surprising, given that they exhibit a factor of 3 difference (not shown) in their predicted value of σ( N jet = 2), suggesting that the shape of the non-integer distributions is more robust than the cross section at integer values.

Conclusions
In this paper, we studied the analytic properties of N jet , a jets-without-jets event shape that can return a fractional value of jet multiplicity. Focusing on e + e − → jets, we calculated the distribution of N jet in the vicinity of a dijet configuration at O α 2 s . A fractional number of jets requires at least three collimated partons, such that for e + e − → jets, the first nontrivial contribution requires at least two emissions. The singular parts of this emission structure can be captured using 1 → 3 splitting functions, and we validated this approach (and included nonsingular contributions) using the fixed-order code Event2. To partially capture higher-order effects out to O α 4 s , we included convolutions from different phase space regions and running coupling effects. We found very good agreement between our calculation and Monte Carlo distributions from both Pythia 8 and Herwig++.

JHEP05(2015)008
Fractional jet multiplicity exhibits unique analytic features that are not shared by other jet observables. At O α 2 s , we showed how rapidity-like divergences, related to the energy sharing between emissions, appear when one or two partons become soft, and we explained how to regulate them. Beyond the fixed-order result, we proposed a candidate local factorization theorem and used it to predict a hybrid jet algorithm/event shape behavior for the N jet distribution in the vicinity of an integer. At exact-integer values, N jet behaves much like a standard jet algorithm, yielding spikes in the N jet cross section. The near-integer behavior of N jet is more characteristic of event shapes, where towers of higher-order logarithms give rise to (single-logarithmic) suppression of the singular phase space, yielding shoulders in the N jet cross section. Finally, as opposed to typical event shapes, collinear emissions do not generate logarithms of N jet , so the shape of the near-integer distribution is entirely determined by soft logarithms. These soft logarithms are purely non-global, as near-integer values force soft partons to lie in a restricted angular region of phase space and correlate different emissions (even if generated from an Abelian matrix element).
Beyond our present understanding of N jet , there are three key directions to pursue. The first is to extend our calculations to O α 3 s . Though the differential cross section shapes were largely within our uncertainty estimates, the normalization differences seen between our analytic calculations and Monte Carlo generators suggest that higher-order terms might be relevant. While we were able to estimate some O α 4 s effects through convolutions, there are genuine O α 3 s effects that may get a phase space enhancement in the merged jet region (relative to the O α 2 s phase space) to partially overcome the α s suppression. The second is to attempt an understanding of logarithmic resummation. Because of the non-global nature of the observable, standard renormalization group methods will not work, but there may be a way to exploit the rapidity-like divergences seen in section 3 or the techniques introduced in refs. [93,94]. The third is to perform a detailed study for hadron colliders like the LHC, as discussed in section 6, including the important effect of hadronization. Fractional jets offer a more nuanced understanding of jet formation than is possible with standard jet algorithms, and a measurement of N jet seems feasible given the increasingly sophisticated approach to jet physics at the LHC.

JHEP05(2015)008
A Results for non-Abelian contributions Following the Abelian results in section 4.1, here we show the singular C F C A and C F T R n f contributions to the ∆ 2± and ∆ 3− cross sections. The C F C A terms are where I Ω is defined in eq. (4.29), and the remaining angular integrals can only be done numerically: I The C F T R n f terms are The C F T R n f contribution to ∆ 3− is purely power suppressed.

B Properties of distributions
In this appendix we collect useful formulae for the convolution of one-sided distributions, and then discuss two-sided distributions and their convolutions. The relations given here are straightforwardly derived from the results and techniques in ref. [95]. Throughout this paper, we use the standard plus distribution notation. For a function q, For many applications, one often makes use of convolutions between plus distributions, especially L n . The convolution is defined as We can take the Fourier transform F to make the convolution multiplicative, and also use the Fourier transform and its inverse to determine a convolution: It is straightforward to determine the convolution between two plus functions L k and L n by relating it to eq. (B.4): (B.8) Using the Fourier transform with s as the conjugate variable, it is straightforward to show This approach is similar to the one used in ref. [95] to give the general form of the convolution (L k ⊗ L n )(x), where an equivalent result to eq. (B.10) is given.

JHEP05(2015)008
We give the first few convolutions here for convenience: These distributions only have support for x > 0, and appear in many applications.

B.2 Two-sided distributions and their convolutions
In this work, we encountered observables whose cross sections have support for all x and behave like distributions as x → 0 ± . We will refer to them as two-sided distributions (one may think of the usual distributions as one-sided), and we now discuss convolutions of them.
Consider an observable x with singular behavior for x → 0 ± . The fixed-order singular behavior of x is described by distributions L n (x ± ), where x + = x for x ≥ 0 and x − = −x for x ≤ 0, so that Convolutions between various L n (x ± ) will mix the x ± distributions.
First, we note that convolutions between two x + distributions or two x − distributions are effectively one-sided, meaning the above results can be used without modification. Convolutions between an x + and an x − distribution are the novel ones we derive here, adapting the technique in appendix B.1 to find the general form of two-sided convolutions.
Using the Fourier transforms of the x ± distributions

C Quark and gluon collinear functions
We summarize here results for the collinear functions for quark-and gluon-initiated jets through O α 2 s . We express our results for dijet observables (∆ 2± and ∆ 3− ), but they can be applied to the more general n-jet case of ∆ n± as well. The quark collinear function can be derived from the calculations in section 4.1 and appendix A. The gluon collinear function calculation proceeds in analogy with the quark case, and there is a strong relation between the quark and gluon results for nearly all coefficients.

JHEP05(2015)008
To O α 2 s , the quark and gluon collinear functions contribute to ∆ 2± and ∆ 3− , with where i = q, g for the different parton types, and the subscripts on K (i) indicate the contribution of an individual jet region to N jet . The O α 2 s terms are given by The angular integral factor I Ω is defined in eq. (4.29) and I

JHEP05(2015)008
Open Access. This article is distributed under the terms of the Creative Commons Attribution License (CC-BY 4.0), which permits any use, distribution and reproduction in any medium, provided the original author(s) and source are credited.