Non-Global Logarithms in Filtered Jet Algorithms

We study, analytically and numerically, the effect of perturbative gluon emission on the "Filtering analysis", which is part of a subjet analysis procedure proposed two years ago as a possible way to identify a low-mass Higgs boson decaying into b\bar{b} at the LHC. This leads us to examine the non-global structure of the resulting perturbative series in the leading single-log large-N_c approximation, including all-orders numerical results, simple analytical approximations to them and comments on the structure of their series expansion. We then use these results to semi-analytically optimize the parameters of the Filtering analysis so as to suppress as much as possible the effect of underlying event and pile-up on the Higgs mass peak reconstruction while keeping the major part of the perturbative radiation from the b\bar{b} dipole.


Introduction
In recent years, there has been growing interest in jet studies aimed at identifying boosted massive particles that decay hadronically, for instance W bosons [1,2,3,4], top quarks [5,6,7,8], supersymmetric particles [9,10] and heavy resonances [11,12,13] (see also [14] for related work on general massive jets). Some of these studies proved successful in looking for a boosted light Higgs boson decaying into b\bar{b} at the LHC [8,15,16,17]. That of [15,16] can be briefly summed up as follows: after having clustered the event with a radius R large enough to catch the b and \bar{b} from the Higgs decay in a single jet, this jet is analysed in 2 steps:
• A Mass Drop (MD) analysis that allows one to identify the splitting responsible for the large jet mass, i.e. to separate the b and \bar{b} and thus measure the angular distance R_bb between them, while suppressing as much QCD background as possible.
• A Filtering analysis where one reclusters the 2 resulting subjets with a smaller radius and takes the 3 highest-p_t subjets obtained, in order to keep the major part of the perturbative radiation while getting rid of as many underlying event (UE) and pile-up (PU) particles as possible (used also in [8,17,18]; a variant is proposed in [19]).
Concerning the MD analysis, the only thing we need to know is that we end up with 2 b-tagged jets, each with a radius roughly equal to R_bb. Notice that, due to angular ordering [20,21,22,23,24], these 2 jets should capture the major part of the perturbative radiation from the b\bar{b} dipole. The whole procedure is depicted in figure 1 (taken from [15]).

In this paper, we focus on the Filtering analysis. One can generalize it with respect to its original definition using 2 parameters, n_filt and η_filt (as discussed also in [19]): after the MD analysis has been carried out, one reclusters the 2 resulting subjets with a radius R_filt = η_filt R_bb and takes the n_filt hardest jets obtained. Obviously, the larger the value of η_filt, the more perturbative radiation we keep, but also the more the UE/PU degrades the Higgs peak. The same holds for n_filt. So there is a compromise to make between losing too much perturbative radiation and being contaminated by soft particles from UE/PU. In [15], the values that were found to give good results were n_filt = 3 and η_filt = min(0.3/R_bb, 1/2). However, these values were chosen on the basis of a brief Monte-Carlo event generator study, and one would like to gain a little more analytical control over them. One question, for instance, is to understand even approximately how the optimal (n_filt, η_filt) values change when one increases the cut on the Higgs transverse momentum p_tH, or when the PU becomes more and more important during the high luminosity running of the LHC. Though the MD and Filtering analyses were originally designed to identify a light Higgs boson, one should be aware that similar calculations may apply in other uses of filtering, for instance to study any boosted colorless resonance decaying hadronically, including W and Z bosons.
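The choice of filtering radius described above can be sketched in a few lines. This is only an illustration of the relation R_filt = η_filt R_bb with the default η_filt = min(0.3/R_bb, 1/2) quoted from [15]; the function name and its keyword arguments are hypothetical and do not come from any jet-finding package.

```python
def filtering_radius(r_bb, eta_cap=0.5, r_abs=0.3):
    """Return (eta_filt, R_filt) for eta_filt = min(r_abs / R_bb, eta_cap).

    r_bb    : angular distance between the b and bbar subjets
    eta_cap : upper bound on eta_filt (1/2 in the original analysis)
    r_abs   : absolute radius scale (0.3 in the original analysis)
    """
    if r_bb <= 0:
        raise ValueError("R_bb must be positive")
    eta_filt = min(r_abs / r_bb, eta_cap)
    return eta_filt, eta_filt * r_bb


# Illustrative values of R_bb (hypothetical inputs).
for r_bb in (0.4, 0.6, 1.0):
    eta, r_filt = filtering_radius(r_bb)
    print(f"R_bb = {r_bb:.1f}  ->  eta_filt = {eta:.2f}, R_filt = {r_filt:.2f}")
```

With these defaults the filtering radius is simply R_filt = min(0.3, R_bb/2): it never exceeds 0.3 and shrinks with R_bb once the b and \bar{b} are closer than 0.6.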
The article will be devoted in large part to the study of how the loss of perturbative radiation depends on the filtering parameters. As usual in this kind of work, large logarithms arise due to soft or collinear gluon emissions, and one is forced to deal with them in order to obtain reliable results in the region where the observable is sensitive to this kind of emission. We will thus compute analytically the first two orders in the leading soft logarithmic (LL) approximation when n_filt = 2, and the all-orders result in the large-N_c limit when n_filt = 2 or 3, for small enough values of η_filt (section 2). With these in hand, and using a program that allows one to make all-orders leading-log calculations in the large-N_c approximation, we check the analytical results and examine whether the small-η_filt limit and/or the truncation of the LL expansion can be trusted to estimate the loss of perturbative radiation in practice (section 3). Finally, in section 4, we will analyse the Higgs mass peak width due respectively to the loss of perturbative radiation and to the presence of UE/PU, before combining them in a simple and approximate but physically reasonable way in order to draw conclusions about the optimal parameter choices.

The filtered Higgs mass: a Non-Global observable
It is now very well known [25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40] that, in multiscale problems, soft or collinear gluons can give rise to large logarithms in the perturbative expansion of an observable, more precisely in a region of phase space where it is sensitive to the soft or collinear divergences of QCD. In this article, the observable considered is ∆M = M_H − M_filtered jet, where M_filtered jet is the reconstructed Higgs-jet mass and M_H is its true mass. ∆M has the property that it is 0 when no gluon is emitted. We are interested in Σ(∆M), the probability for the difference between the true and reconstructed Higgs masses to be less than a given ∆M. In this case, large soft logarithms have to be resummed to all orders to obtain a reliable description of the distribution at small ∆M.
For this observable, soft gluon emissions lead to powers of ln(M_H/∆M), whereas collinear gluon emissions lead to powers of ln(R_bb/R_filt). In this study, gluons are strongly ordered in energy (the first emitted gluon being the most energetic one, and so on), and we aim to control the (α_s ln(M_H/∆M))^k series in the region where these soft logarithms are large. Therefore, at leading-log accuracy, one has to resum terms of the form f_k (α_s ln(M_H/∆M))^k, where all the f_k are functions to be computed. We thus disregard all the subleading terms, i.e. those suppressed by at least one power of ln(M_H/∆M).

Unfortunately, such a calculation is highly non-trivial due to the fact that the observable is non-global. This property, first studied in [35], means that it is sensitive to radiation in only a part of the phase space. In the case of ∆M, only emissions of gluons outside the filtered jets region contribute to the observable (cf. figure 2). As a consequence of this property, one must consider soft gluon emissions not just from the b\bar{b} dipole (usually called primary emissions, the only ones that would be present in QED) but also from the whole ensemble of already emitted gluons [35,41]. As the number of gluons increases, the geometry and the colour structure of all these gluons rapidly become too complex for an analytical calculation. Therefore, one is forced to use numerical Monte-Carlo calculations, which can easily take care of the geometry. But the colour structure remains prohibitive, and one must usually also resort to the large-N_c approximation in order to go beyond the first 2 orders in perturbation theory [35,41,42,43] (though some authors have derived analytical results in special cases [44,45] and others have examined contributions beyond the leading large-N_c approximation [46,47]).
However, before considering numerical calculations, some results can be derived analytically at second order for n_filt = 2 (where f_1 and f_2 are computed exactly) and n_filt = 3 (where only the leading behaviour of the f_k in ln(R_bb/R_filt) and N_c is looked for). In each case, the hardest gluon 1, which is inside the filtered jet region, emits a softer gluon 2 outside the filtered jet region.

Some results for n_filt = 2
Perturbatively, one can write Σ(∆M) as

$\Sigma(\Delta M) = 1 + \sum_{k=1}^{\infty} I_k(\Delta M)$, (3)

where I_k(∆M) is the O(α_s^k) contribution to the observable. To simplify the calculation, Σ(∆M) will be computed using the anti-k_t algorithm [48], even though the numerical study will be done using the C/A algorithm [49,50], in accordance with the choice made in [15]. The anti-k_t algorithm is nevertheless enough to catch the dominant behaviour of the leading-log series, in the sense that it does not affect the leading large collinear logarithm in the function f_k at small R_filt: the coefficient a_k of ln^k(R_bb/R_filt) in f_k is unchanged when moving from C/A to anti-k_t (footnote 4). This jet algorithm gives simpler results because the gluons outside the filtered jet region tend not to cluster with the ones inside. It is this property which ensures that the hardest jets in an event are generally perfect cones, as particles usually cluster with the hardest ones in their neighbourhood first [48]. As a first step, primary emissions are considered, defined to be those one would obtain if gluons were only emitted from the b\bar{b} dipole (as for photons in QED).

Primary emissions
Due to the use of the anti-k_t algorithm, the result of the primary emissions can easily be shown to exponentiate, as will be roughly seen in the next section with the O(α_s^2) analysis.

Footnote 4: When R_filt ∼ R_bb/2, discarding the O(ln^{k−1}(R_bb/R_filt)) terms is not a priori justified, but fig. 5, which compares numerical results obtained using C/A with analytical estimates using anti-k_t, supports the dominance of the leading collinear logarithms.
Here, we just review the very well known result that the contribution to Σ(∆M) from primary emissions, denoted Σ^(P)(∆M) (footnote 5), exponentiates the first-order term, eq. (5):

$\Sigma^{(P)}(\Delta M) = e^{I_1(\Delta M)}$,

with I_1 given in eq. (6) by the integral over the gluon phase space of M(k_1) multiplied by the bracket $[\Theta(k_1 \in J_{b\bar{b}}) + \Theta(k_1 \notin J_{b\bar{b}})\,\Theta(\Delta M - \Delta M(k_1)) - 1]$. Here M(k_1) is the matrix element squared for emitting one soft gluon from the b\bar{b} dipole (the b quark is taken to be massless).

Concerning the notation, Θ(k_1 ∈ J_bb) equals 1 when gluon 1 is emitted inside the jet regions around b and \bar{b}, denoted by J_bb (and is 0 otherwise), which, for R_filt < R_bb, is just 2 cones of radius R_filt centered on b and \bar{b} (figure 2(a)). Then, concerning the expression in brackets in eq. (6), we separate the 2 different regions where the gluon can be: either inside or outside the filtered Higgs jet. The first term Θ(k_1 ∈ J_bb) means that the gluon does not contribute to the observable (as it is kept in the Higgs jet, the reconstructed Higgs mass is the true Higgs mass: ∆M(k_1) = 0). If the gluon is outside the filtered jet region (second term), then it does contribute to the observable, up to prefactors that can be neglected in the leading-log approximation (see appendix A). Finally, the −1 stands for the virtual corrections, for which there is obviously no loss of mass for the Higgs, and whose matrix element is just the opposite of the soft real one (footnote 6). The three terms combine so that I_1 only receives a contribution, with a minus sign, from gluons emitted outside the filtered jet region with ∆M(k_1) > ∆M.

The computation of this integral in the boosted regime, where p_tH ≫ M_H, or equivalently R_bb ≪ 1, is done in appendix A. From now on, we will essentially use η_filt = R_filt/R_bb instead of R_filt, and we define n ≡ n_filt and η ≡ η_filt for more clarity in mathematical formulae. In order to keep in mind that it depends on the 2 parameters of the Filtering analysis, the distribution Σ(∆M) is renamed Σ^(n)(η, ∆M). The result obtained at fixed coupling, eqs. (10)-(12), is expressed in terms of a function J(η) (footnote 7). The value of J(1) is important for discussing some aspects of the results obtained in the following sections. Notice that the case η > 1 will not be used in this study, but is mentioned in appendix A. The function J(η) is plotted in figure 3. Two remarks can be made:
1. The result does not depend on the energy fraction z of the Higgs splitting into b\bar{b}.

Footnote 5: The superfix (P) serves as a reminder that only primary emissions are being accounted for.

Footnote 6: Even if the result seems obvious here, this way of doing the calculation can be easily generalised to higher orders and other kinds of jet algorithms.

Footnote 7: To obtain the result at running coupling, one simply makes the replacement of eqs. (29,30) given later in the article.

Non-global contributions
Now, we turn to the O α 2 s term, and more precisely to the contribution of the non-global terms that have to be added to the primary logarithms computed in the previous section. That corresponds to the analysis of I 2 (∆M ) in the perturbative expansion of Σ(∆M ) from eq. (3). The matrix element squared for 2 real gluons emission is expressed as [24,35,51,52]: with This expression is valid when there is a strong energy ordering between the two real gluons 1 and 2, either E 1 ≫ E 2 or E 2 ≫ E 1 (the formula is completely symmetric under the interchange k 1 ↔ k 2 ). For the cases with one or both gluons being virtual, the following matrix elements are obtained, valid only when E 1 ≫ E 2 [52]: Using these properties, separating the 4 phase space regions depending on whether the gluons are inside or outside the filtered jet region in the same way as was done for I 1 , and defining dk as: we can then write I 2 in the following form: For each phase space region, the 4 terms (k 1 , k 2 ) = (real,real) − (real,virt) − (virt,real) + (virt,virt) are considered. The strong energy ordering E 1 ≫ E 2 implies that ∆M (k 1 , k 2 ) = ∆M (k 1 ), and one immediately gets: where I is just the second order contribution to the primary emissions, already computed above. To be convinced, one can notice that (4πα s ) 2 W 1 can be expressed as the product of 2 one-gluon matrix elements M (k 1 )M (k 2 ) and, when E 1 ≫ E 2 , if k 1 and k 2 belong to the same phase space region. Therefore I (P ) 2 can be written in a more symmetric way: so that it corresponds to the second order perturbative expansion of the result eq. (5), obtained with primary emissions only. The important term for this section is the one containing W 2 , denoted by I . As mentioned in section 2.1, it receives a non-zero contribution when the hardest gluon 1 is emitted inside the filtered jet region whereas the softest gluon 2 is emitted outside. 
For the opposite configuration, there is an exact cancellation between gluon 2 being real and virtual.
Here again, the computation of I_2^(NG) is postponed to appendix A, and we directly give what will help to interpret some results later. S_2 is defined from I_2^(NG) by making the dependence on η explicit and factorizing out the soft divergence, which is still revealed in the large logarithm ln(M_H/∆M). The result for S_2 when η < 1/2 is given in eq. (24). The important point to notice in this result is the absence of collinear logarithms, which would appear as ln(1/η), contrary to the primary emission case (eq. (11)). The primary emissions therefore dominate for this observable, at least for η sufficiently small.
As mentioned in previous studies [35,53], one notices the presence of "π² terms" in non-global results at second order.

Some results for n_filt = 3
The goal in this section is to estimate the analytical behaviour, in the large-N_c limit, of Σ^(n)(η, ∆M) for n = 3, which is the probability of having no second gluon emission leading to a ∆M′ greater than ∆M. Notice that, contrary to the previous part where we obtained the function Σ^(2), only the leading behaviour in L = ln(1/η) and N_c will be derived, so that in this context Σ^(2) can be simply written (eq. (25)):

$\Sigma^{(2)}(L, t) = e^{-4 N_c L t}$,

where for further convenience we introduce the parameter t = (α_s/2π) ln(M_H/∆M) and we change the arguments of Σ, which now becomes a function of L and t. In this formula, 2L = 2 ln(R_bb/R_filt) can be interpreted as the "logarithmic size" of the b\bar{b} dipole, i.e. the allowed phase space in rapidity for an emission from this dipole (in its center of mass frame) outside the jet region. The parameter t means that this emission cannot occur with a t′ between 0 and t.

Now we turn to Σ^(3)(L, t). To have no second gluon emission in [0, t], either there is no first gluon emission in [0, t] outside the jet region (which corresponds to Σ^(2)(L, t)), or there is such an emission but the new dipole configuration is prohibited from emitting a second gluon in [0, t] outside the jet region. This is depicted in figure 4. As the calculation is done in the large-N_c limit, after the emission of a first gluon, the second one cannot be emitted from the b\bar{b} dipole, but only from the bg and \bar{b}g ones. Fig. 4 can be translated into an integral equation for Σ^(3). Notice that L_bg, the logarithmic size of the bg dipole in figure 4, does not depend on l in the leading collinear log approximation (footnote 9). In this equation, 4N_c L Σ^(2)(L, t′) dt′ is the probability not to emit the first gluon in [0, t′] and to emit it only in [t′, t′ + dt′]. The remaining part is the probability to emit no second gluon from the bg and \bar{b}g dipoles in [t′, t]. Using eq. (25) for Σ^(2), Σ^(3) is then given by eq. (27).

Footnote 9: One can easily show a relation that gives the dependence of L_bg on l. After expanding the exponential and keeping the term of order k, the leading O((Lt)^k) term is already taken into account in eq. (27). Therefore, including the l-dependent component of L_bg gives rise to terms of the form N_c^k L^{k−2} t^k at order k, suppressed by 2 powers of L with respect to the leading one.
Two limits can be considered (eq. (28)). The limit 4N_c L t ≪ 1 reveals two important aspects:
1. One can notice the absence of the O(Lt) term, which is indeed the goal of the filtering analysis as it was presented in its original version [15]: it is intended to catch the major part of the O(α_s) perturbative radiation. It cannot catch all the O(α_s) contribution because a hard gluon emitted at an angle θ > R_bb from the b and \bar{b} escapes the filtering process, as it is rejected by the Mass Drop analysis. Therefore, when 4N_c L t ≪ 1, the expansion eq. (28) misses a term O(N_c t), but this is legitimate in a leading collinear log estimate. Notice that the missing term is simply −J(1) N_c t, where J(η) was given in eq. (12).
2. It shows that the purely non-global result for n = 3 contains large collinear logarithms L, contrary to the case n = 2 (eq. (24)). Indeed, the primary result for n = 3 can be proven to behave as −32 C_F² (Lt)² at order α_s² (footnote 10), so that the S_2 term for n = 3 should be equivalent to −8 C_F C_A (Lt)² at large L.
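The structure of the n = 3 estimate can be checked numerically with a short sketch. The code below assumes that, after the first emission, the bg and \bar{b}g dipoles together emit at a constant rate κ = 6 N_c L. This value is an assumption made here for illustration: it is the one that reproduces the second-order coefficients quoted above, since −32 C_F² − 8 C_F C_A becomes −12 N_c² in the large-N_c limit (C_F → N_c/2, C_A = N_c). Under that assumption the integral has the closed form 3 e^{−4 N_c L t} − 2 e^{−6 N_c L t}, which the sketch compares with a direct numerical integration.

```python
import math

N_C = 3.0  # number of colours


def sigma2(big_l, t):
    """Leading-log estimate for n = 2: exp(-4 N_c L t)."""
    return math.exp(-4.0 * N_C * big_l * t)


def sigma3_closed(big_l, t):
    """Closed form obtained under the assumed rate kappa = 6 N_c L."""
    x = N_C * big_l * t
    return 3.0 * math.exp(-4.0 * x) - 2.0 * math.exp(-6.0 * x)


def sigma3_numeric(big_l, t, steps=20000):
    """Sigma2(L,t) + int_0^t dt' 4 N_c L Sigma2(L,t') exp(-kappa (t - t')).

    The first term is 'no first emission'; the integrand is the probability
    of a first emission at t' times the (assumed) no-emission probability
    of the bg and bbar-g dipoles between t' and t. Midpoint rule.
    """
    rate_first = 4.0 * N_C * big_l
    kappa = 6.0 * N_C * big_l  # assumption, see the text above
    h = t / steps
    total = 0.0
    for i in range(steps):
        tp = (i + 0.5) * h
        total += rate_first * sigma2(big_l, tp) * math.exp(-kappa * (t - tp)) * h
    return sigma2(big_l, t) + total


big_l = math.log(1.0 / 0.3)  # eta = 0.3
for t in (0.01, 0.05, 0.1):
    print(t, sigma3_numeric(big_l, t), sigma3_closed(big_l, t))
```

Expanding the closed form at small N_c L t gives 1 − 12 (N_c L t)² + ..., so the O(Lt) term is indeed absent, in line with the first remark above.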
Having understood some analytical features of the Filtering analysis, we now examine what can be learnt from a numerical calculation of the reconstructed Higgs mass observable.

Non-Global structure: numerical results
In all that follows, t is defined in eq. (29) so as to gather all the information about the soft logarithms in a running coupling framework; the last equality in that definition holds at the one-loop level, with β_0 the one-loop coefficient of the β function. The argument of α_s was taken as the gluon's transverse momentum with respect to the Higgs boson direction, of order k_t M_H/p_tH, k_t being its transverse momentum with respect to the beam. In the case of a fixed coupling constant α_s, the definition for t here coincides with that of section 2.3 (eq. (30)):

$t = \frac{\alpha_s}{2\pi} \ln\frac{M_H}{\Delta M}$.

But from now on, and unless stated otherwise, t is given in the running coupling framework, eq. (29), and the function Σ(η, ∆M) is rewritten as Σ(η, t).

Footnote 10: In fact, one can show a general estimate for the primary emissions in the leading soft and collinear approximations.
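The map between ∆M and t can be sketched as follows. The running-coupling formula used here is the standard one-loop result, obtained by integrating α_s(μ)/(2π) over ln μ between ∆M and M_H with β_0 = (11 C_A − 2 n_f)/(12π); this convention, as well as the numerical value of α_s(M_H), are assumptions of the sketch rather than quantities taken from the text. The fixed-coupling function is the definition of section 2.3.

```python
import math


def beta0(n_f=5, c_a=3.0):
    """One-loop beta-function coefficient (assumed convention)."""
    return (11.0 * c_a - 2.0 * n_f) / (12.0 * math.pi)


def t_fixed(delta_m, m_h, alpha_s):
    """Fixed coupling: t = alpha_s / (2 pi) * ln(M_H / Delta M)."""
    return alpha_s / (2.0 * math.pi) * math.log(m_h / delta_m)


def t_running(delta_m, m_h, alpha_s_mh, n_f=5):
    """One-loop running coupling, integrated from Delta M up to M_H."""
    b0 = beta0(n_f)
    lam = 2.0 * b0 * alpha_s_mh * math.log(m_h / delta_m)
    if lam >= 1.0:
        raise ValueError("Delta M is below the one-loop Landau pole")
    return -math.log(1.0 - lam) / (4.0 * math.pi * b0)


def delta_m_from_t(t, m_h, alpha_s_mh, n_f=5):
    """Inverse of t_running: the Delta M corresponding to a given t."""
    b0 = beta0(n_f)
    log_ratio = (1.0 - math.exp(-4.0 * math.pi * b0 * t)) / (2.0 * b0 * alpha_s_mh)
    return m_h * math.exp(-log_ratio)


M_H = 115.0        # GeV, Higgs mass used in the text
ALPHA_S_MH = 0.114  # hypothetical value of alpha_s(M_H)
for dm in (50.0, 10.0, 2.0):
    print(dm, t_fixed(dm, M_H, ALPHA_S_MH), t_running(dm, M_H, ALPHA_S_MH))
```

In the limit of small coupling the two definitions coincide, and `delta_m_from_t` is the inversion needed to translate a value of t back into a mass difference.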
To get an idea of the range of values covered by t, see table 1. To numerically investigate non-global observables, two approaches can be followed:
• an all-orders approach where one resums the leading logs to all orders in the large-N_c limit, the output being the function Σ(t), i.e. the probability that the loss of perturbative emission results in a Higgs mass in the range [M_H − ∆M, M_H], ∆M being simply obtained by inverting the relation eq. (29).
• a fixed-order approach where the first few coefficients of the expansion of Σ(t) are computed: if one writes $\Sigma(t) = \sum_{k} c_k t^k$, then the program returns the first few coefficients c_k.
From a numerical point of view, the way to write an all-orders program was explained in [35]. On the other hand, a fixed-order result may be obtained by developing a systematic approach like the one presented at second order in eq. (19). For the filtered Higgs jet mass observable, we used the FastJet package [54] to perform the clustering (and mass-drop + filtering) with the C/A algorithm, consistently with the choice made in [15].
As the all-orders program immediately gives what we are looking for, namely Σ(t), we will use it (section 4.1) to compute the perturbative Higgs width. But in order to check it and be confident in the results obtained, we compare them with the previous analytical estimates and see how well the perturbative leading-log series fits them. This leads us to study the behaviour of the higher order terms and to gain a better understanding of the convergence and structure of the non-global series. Though treated in more detail in appendix C, the main points are mentioned in this section.

Comparison with analytics
Using the all-orders Monte-Carlo program, a comparison between the all-orders numerical curves obtained using the C/A algorithm and their corresponding analytical estimates obtained previously with anti-k t in eqs. (25,27) can be done. The results are presented in figure 5 and show good agreement, at least in the region of physical t values.
Notice that the slight discrepancy between analytical estimates and numerics starts to occur at t > 0.1, which is at the edge of the physical region (cf. table 1), beyond which ∆M would be below the perturbative scale of around 1 GeV. This agreement shows that:
• In the physical region, the leading terms in (α_s L t)^k, with L = ln(1/η), seem to completely dominate and we do not need to compute the subleading corrections.
• One can use these analytical expressions to get an accurate estimate of the reconstructed Higgs peak width.

Comparison with fixed-order results
The structure of the non-global series at fixed-order is now examined so as to independently cross-check the all-orders program and to understand if the perturbative leading-log series can be usefully truncated.
As an example, figure 6 compares the all-orders result to the fixed-order ones up to α_s^5 for n = 2 and two different values of η (only the coefficients with an uncertainty of at most a few percent are plotted). The curves are shown up to t = 0.3, which is far beyond the physical region but is instructive for studying the convergence of the series. The left plot, for η = 0.3, shows a nice convergence of the perturbative series eq. (3), as the t range over which the all-orders and fixed-order curves coincide grows with k. However, the second plot, for η = 0.9, gives an unexpected result: the fourth order diverges with respect to the third one, in the sense that the point of disagreement is shifted to smaller t. The question arises whether this divergence will remain at higher orders. To answer it, one needs to go further in perturbation theory. In appendix C, a parallel is made between the filtered Higgs jet observable and the slice observable, studied for instance in [41], for which, thanks to computational speed, it is possible to obtain reliable coefficients up to order 6. The same effect is observed and is even enhanced at orders 5 and 6. Therefore, it seems that the fixed-order information cannot be safely used in general: one has to be aware that the leading-log large-N_c non-global series may be divergent for any value of t.

Choice of the filtering parameters
In the previous sections we examined the structure and convergence of the perturbative leading-log series, analytically and numerically. We could then cross-check the analytical expressions and the fixed-order approach with the all-orders program, which we are going to use throughout this part. We would like to decide how one should choose the filtering parameters (n,η) depending on the level of UE and PU as well as the p t of the Higgs boson. Here, we do not claim to make an exact and complete analysis, but we want to obtain some estimates. First, we consider the width of the Higgs mass distribution separately in presence of perturbative radiation (using the all-orders results) and UE/PU (using a simple model for it). Then, we try to minimize the Higgs width in presence of both of these effects. Finally, we will estimate hadronisation corrections.
In all this part, we set the Higgs mass M H at 115 GeV, as in [15].

Study of the Higgs perturbative width
As we could see in the previous sections, even without considering additional particles from UE/PU, ∆M ≡ M_H − M_filtered jet ≠ 0 because of the loss of perturbative radiation. The Higgs boson thus acquires a perturbative width, denoted δM_PT. At first sight, knowing the distributions Σ^(n)(η, ∆M), one might simply define it in terms of ⟨∆M⟩ and ⟨∆M²⟩ (eq. (32)), as we do for gaussian distributions for instance. Unfortunately, if we simply take n = 2 as an example and consider the primary emission result eq. (10), the distribution that we deduce for ∆M is such that computing ⟨∆M⟩ and ⟨∆M²⟩ implies dealing with integrals that give a large importance to the ∆M ∼ M_H/2 region, where there should be very few events, and that do not describe what happens in the neighbourhood of the peak near ∆M = 0. Therefore, the definition eq. (32) does not seem adequate for the perturbative width. We shall thus adopt another definition, adapted from [55]: the Higgs perturbative width is defined as the size δM_PT for which a given fraction f of events satisfy 0 < ∆M < δM_PT. Using the all-orders function previously computed, this is equivalent to solving the equation Σ^(n)(η, ∆M) = f for ∆M. This leads to the width function δM_PT(n, η, f).

Figure 7 shows δM_PT as a function of η for n = 2 ... 6. When ∆M ∼ 50 GeV (i.e. ∼ M_H/2), one should be aware that the soft approximation breaks down and the results on these plots should no longer be taken seriously. We chose the values f = 0.68 and f = 0.95, corresponding respectively to 2σ and 4σ for gaussian distributions, to show that the Higgs mass perturbative distribution is not gaussian (otherwise, going from 2σ to 4σ would have multiplied the width by a factor of 2, see also eq. (33)). One important thing to notice is a kind of "saturation" effect that one observes for η close enough to 1, for every fraction f. It manifests itself as a flat curve at a value δM_PT = δM_sat(f), independent of n. For instance, δM_sat(f = 0.68) ≃ 1 GeV and δM_sat(f = 0.95) ≃ 33 GeV. This can be understood simply by considering that when the radius of the filtering is large enough, say η > η_sat(n), it captures (almost) all the particles resulting from the Mass Drop analysis, i.e. all those that are within angular distance R_bb from b or \bar{b}, but it still fails to capture particles outside the Mass Drop region. Of course, the larger n, the smaller η_sat(n), as we keep more jets. This saturation property is equivalent to saying that all the functions Σ^(n)(η, ∆M) become independent of n and η when η > η_sat(n).
For the rest of this analysis we keep the value f = 0.68, even if it is not clear which value should be chosen, and more generally what should be the relevant definition of the Higgs perturbative width. However, we will mention in section 4.4 what happens if we vary f between f = 0.5 and f = 0.8, so as to obtain a measure of the uncertainty of the calculations.
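The fraction-based width definition amounts to a one-dimensional root search, which can be sketched as below. In the paper the function Σ comes from the all-orders program; here, purely for illustration, it is replaced by the small-η estimate Σ^(2) = e^{−4 N_c L t} with a fixed coupling, and the value of α_s is a hypothetical input. The bisection only assumes that Σ increases with ∆M.

```python
import math

N_C = 3.0    # number of colours
M_H = 115.0  # GeV, Higgs mass used in the text


def sigma_n2_small_eta(delta_m, eta, alpha_s):
    """Toy Sigma^(2): exp(-4 N_c L t), L = ln(1/eta), fixed-coupling t."""
    big_l = math.log(1.0 / eta)
    t = alpha_s / (2.0 * math.pi) * math.log(M_H / delta_m)
    return math.exp(-4.0 * N_C * big_l * t)


def width_from_fraction(sigma, f, m_h=M_H, rel_tol=1e-12):
    """Solve sigma(delta_m) = f on (0, M_H] by bisection.

    sigma must be an increasing function of delta_m (it is the probability
    that the mass loss is smaller than delta_m).
    """
    lo, hi = 1e-12 * m_h, m_h
    if not (sigma(lo) < f <= sigma(hi)):
        raise ValueError("f is not bracketed on (0, M_H]")
    while hi - lo > rel_tol * m_h:
        mid = 0.5 * (lo + hi)
        if sigma(mid) < f:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


ALPHA_S = 0.12  # hypothetical fixed coupling
for f in (0.5, 0.68, 0.8):
    width = width_from_fraction(
        lambda dm: sigma_n2_small_eta(dm, 0.3, ALPHA_S), f)
    print(f"f = {f:.2f}  ->  delta M_PT = {width:.2f} GeV (toy model)")
```

For this toy Σ the equation can also be solved by hand, δM_PT = M_H f^{π/(2 N_c L α_s)}, which gives a direct check of the numerical inversion.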
The curves in figure 7 only give us an overview of the scales involved in the Higgs boson width, but one can go a little further. At small η, we should get a large collinear enhancement revealing itself as a large logarithm L = ln(1/η) multiplying t. The perturbative expansion is thus a series in (N_c L t)^k. As a direct consequence, at small η, the all-orders function Σ^(n)(η, t) can be written as a function of the single variable N_c L t. Solving the "width equation" Σ^(n) = f then gives t_PT = C_PT(n, f)/L, where t_PT is simply the value of t corresponding to ∆M = δM_PT, and where C_PT(n, f) is a function, independent of η, which increases with n and decreases when f increases. This is confirmed by figure 8(a), which shows that t_PT L is indeed independent of η as long as η and n are not too large.
As an example, for n = 2, let us take the simple result Σ^(2)(L, t) = e^{−4 N_c L t} from eq. (25) in the small η limit. It was shown in section 3.1 that this result is very close to the all-orders one in the physical t region. Solving Σ^(2)(L, t) = f immediately implies

$C_{PT}(2, f) = \frac{1}{4 N_c} \ln\frac{1}{f}$,

which, for f = 0.68, gives C_PT ≃ 0.032, in accordance with figure 8(a). One observes that t_PT L is not, strictly speaking, a constant for higher n values. This may be due to the saturation effects discussed above. Indeed, even at large L, the perturbative expansion is not only a function of Lt but also of t for the lowest orders, as mentioned at the end of section 2.3. If we only had QED-like emissions, i.e. primary ones, with the use of the anti-k_t jet algorithm, we would obtain a_k = (−J(1) N_c)^k / k! for k ≤ n − 2, where J(η) was derived in section 2.2. As n increases, the term a_1 t becomes more and more important with respect to a_{n−1} (Lt)^{n−1}, leading to larger and larger deviations from the simple law t_PT L = constant. However, up to n = 5, assuming that t_PT L is a constant at small η seems a good approximation. Therefore, using eq. (39), one can model the Higgs perturbative width in the following form (eq. (42)):

$t_{PT}(n, \eta, f) = C_{PT}(n, f)/\ln(1/\eta)$ for η < η_sat(n, f), and $t_{PT} = t_{sat}(f)$ for η ≥ η_sat(n, f).

η_sat(n, f) is given by the intersection between the curve t_PT = C_PT/L and t_PT = t_sat. Therefore:

$\eta_{sat}(n, f) = e^{-C_{PT}(n, f)/t_{sat}(f)}$.

Table 2: C_PT and η_sat as a function of n when f = 0.68.

Table 2 shows C_PT and η_sat for f = 0.68 and different n values. Figure 8(b) shows the curves corresponding to the parametrisation eq. (42). We can see that it works rather well for all values of n except n = 2 in the region η ∼ 0.4 − 0.6. This can be improved using the relation J(η) t_PT = constant, which works better for n = 2 because it is exact for primary emissions with anti-k_t. But implementing it would not change the main conclusions presented in sections 4.3-4.5. Therefore, for the sake of simplicity, we will not use it here: we keep eq. (42) as the expression for δM_PT for the rest of this study.
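The n = 2 example above is simple enough to evaluate directly. The sketch below computes C_PT from the small-η form Σ^(2) = e^{−4 N_c L t} and the saturation point from the intersection of t_PT = C_PT/L with t_PT = t_sat. The function names are hypothetical, and the value of t_sat used in the loop is a placeholder, since the text does not quote it.

```python
import math

N_C = 3.0  # number of colours


def c_pt_n2(f):
    """C_PT for n = 2: solve exp(-4 N_c L t) = f for the product L * t."""
    if not 0.0 < f < 1.0:
        raise ValueError("f must lie strictly between 0 and 1")
    return math.log(1.0 / f) / (4.0 * N_C)


def eta_sat(c_pt, t_sat):
    """Intersection of t = C_PT / ln(1/eta) with t = t_sat."""
    return math.exp(-c_pt / t_sat)


T_SAT_PLACEHOLDER = 0.15  # hypothetical value, for illustration only
for f in (0.5, 0.68, 0.8):
    c = c_pt_n2(f)
    print(f"f = {f:.2f}: C_PT = {c:.4f}, "
          f"eta_sat = {eta_sat(c, T_SAT_PLACEHOLDER):.3f} (placeholder t_sat)")
```

For f = 0.68 this reproduces the value C_PT ≃ 0.032 quoted above, and it makes explicit that C_PT decreases when f increases.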
Of course, were it only for the perturbative radiation, it would be nicer to choose η ≥ η sat in order to catch as many gluons as possible, leading to δM P T → δM sat . But we also have to take into account Initial State Radiation (ISR) from the incoming qq pair and nonperturbative effects like PU and UE that can spoil our Higgs neighbourhood, thus increasing the jet mass.
For the purpose of this article we will only add UE and PU to the Final-State Radiation (FSR) effect studied above. We will thus ignore ISR, partly to keep the analysis simple, but also because the results of work such as [18,56] suggest that, for LHC processes whose hard scales are a few hundred GeV, the crucial interplay is that between FSR and UE/PU. This is evident in the preference for small R values in dijet mass reconstructions in those references, where ISR does not play a major role. Similarly, we believe that ISR will have limited impact on the optimal values of η that we determine here, though we shall not check this explicitly.

Study of the Higgs width due to underlying event and pile-up
This simple analysis does not aim to give precise numbers, but only an estimate of the influence of the UE/PU on the mass of the Higgs jet when we vary for instance η or n, or when the Higgs boson becomes more and more boosted. We model the UE and PU as soft particles uniformly distributed in the (y, φ) plane [57,58], with transverse momentum per unit area denoted by ρ. In order to get this estimate, we consider the simple case of a symmetric (z = 1/2) Higgs decay along the x axis, in the limit M_H ≪ p_tH. The UE/PU momentum, denoted p_UE, is simply the sum of the momenta of all the UE/PU particles g belonging to the filtered jet J. Still in the limit M_H ≪ p_tH, we recall a formula from [15], which depends on z; throughout this section, we apply it with z = 1/2.

We can now write ∆M = M_filtered jet − M_H as a sum over the UE/PU particles in the filtered jet. The last step of this calculation uses an approximation which comes from the fact that the UE and PU particles tend to cluster around the perturbative radiation, which is usually close to the b and \bar{b} because of the collinear logarithmic divergence of QCD. As all the filtered UE/PU particles flow approximately in the same direction, the remaining sum is just the total transverse momentum of the UE which, by definition of ρ, is equal to ρA, A being the total area of the filtered jets. We thus obtain eq. (48) for ∆M, with the area A given by eq. (49) for the C/A jet algorithm, taking into account the anomalous dimension that comes from the fact that there should be some perturbative radiation in the jets (cf. figure 14 in [57]). Notice that eq. (49) is only true if the jets do not overlap, which is usually the case when η is small enough. But this is sufficient for the purpose of our study, and we shall use this formula in all the following calculations.

The correction eq. (48) for ∆M only induces a shift of the Higgs mass peak towards higher masses. However, there are 3 sources of fluctuations that give a width to this Higgs peak:
1. ρ is not strictly uniform in the (y, φ) plane in a given event.
2. ρ is not the same from one event to the next.
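Following [57], we can write the total UE/PU transverse momentum contributing to the Higgs $p_t$ as where $\sigma = \sqrt{\langle\rho^2\rangle - \langle\rho\rangle^2}$ with $\langle\ldots\rangle$ a spatial average in a given event, $\delta\rho = \sqrt{\langle\rho^2\rangle - \langle\rho\rangle^2}$ with $\langle\ldots\rangle$ an average over events, and $\Sigma = \sqrt{\langle A^2\rangle - \langle A\rangle^2}$ with $\langle A\rangle$ the average over events of the filtered jets' area.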
For pure UE events, i.e. without PU, these terms can be estimated [57,58]: Though ρ U E seems to be around 2 GeV/area, the tuning used in [15] was closer to 3 GeV/area, the value that we choose here. In presence of PU, i.e. when there is more than 1 pp collision per bunch crossing at the LHC (thus leading to the emission of other soft particles), ρ, σ and δρ have to be modified. We define N P U to be the number of pp collisions in a bunch crossing except the one at the origin of the hard interaction. We use a simple model to write the parameters of the UE/PU as: Some comments are needed: since ρ measures the level of noise, it should grow like N P U . In the expression 1 + N P U /4, the 1 corresponds to the pp collision that leads to the UE and to the hard interaction, whereas the N P U /4 term simply corresponds to the other pp interactions and could be derived from the numbers given in [59]. The intra and inter events fluctuations of ρ are modelled as growing like √ ρ: we thus just give σ and δρ the factor 1 + N P U /4, though further studies might be of value to parametrize these terms in a more adequate manner. Notice that the value given for δρ ignores the fluctuations in the number of PU events from one bunch crossing to the next, but this is beyond the accuracy of our model here. At high luminosity at LHC, N P U is expected to be ∼ 20, which implies ρ ∼ 10 − 20 GeV [59,60,61].
Assuming Gaussian distributions for these three kinds of fluctuations, one can deduce the Higgs width due to the presence of UE/PU, 15 For a Gaussian peak, defining a 2σ width (i.e. keeping the events within ±1σ of the average) means that we keep roughly 68% of the events, which corresponds to the value f = 0.68 chosen for the perturbative calculation.
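As a quick numerical check of the correspondence between a 2σ width and f = 0.68, one can use the fact that the fraction of a Gaussian distribution contained within ±kσ of its mean is erf(k/√2). The sketch below uses only the Python standard library; the function name is ours and is introduced for illustration.

```python
import math


def gaussian_fraction_within(k_sigma: float) -> float:
    """Fraction of a Gaussian lying within +/- k_sigma standard deviations of the mean."""
    return math.erf(k_sigma / math.sqrt(2.0))


# A full width of 2*sigma corresponds to +/- 1 sigma around the average.
fraction_for_2sigma_width = gaussian_fraction_within(1.0)
print(f"{fraction_for_2sigma_width:.4f}")
```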
We now have all the important results in hand to consider both UE/PU and FSR simultaneously.

Study of the Higgs width in presence of both UE/PU and perturbative radiation
The purpose of this part is to estimate how one should choose the pair of filtering parameters $(n, \eta)$. To do so, one has to convolute the effects of UE/PU and perturbative radiation, compute the resulting reconstructed Higgs peak width, and then minimize it with respect to the filtering parameters. This is highly non-trivial to do analytically and we leave it for future work. The simple choice made here is to say that, for a given $n$, the optimal $\eta$, denoted $\eta_{\rm opt}$, is the one for which the two widths are equal. This is obviously not true in general, but it seems reasonable in order to obtain an estimate (figure 9) and to understand how $\eta_{\rm opt}$ changes when we vary $p_{tH}$ and $N_{PU}$. Notice that, using this method, we have to impose $\eta_{\rm opt} < \eta_{\rm sat}$, where $\eta_{\rm sat}$ is the saturation point (eq. (43)): beyond $\eta_{\rm sat}$, increasing $\eta$ makes $\delta M_{UE}$ larger without decreasing $\delta M_{PT}$, so that solving the equation $\delta M_{PT} = \delta M_{UE}$ makes no sense in this region. Finally, we numerically minimize $\sqrt{\delta M_{PT}^2 + \delta M_{UE}^2}$, calculated at $\eta = \eta_{\rm opt}(n)$, with respect to $n$ in order to find $n_{\rm opt}$.
Figure 9: [...] if the 2 distributions were Gaussians, i.e. $\delta M_{\rm tot} = \sqrt{\delta M_{PT}^2 + \delta M_{UE}^2}$, when $n = 3$. In this case, $\eta_{\rm opt}$, though slightly larger, is approximately given by the intersection of the 2 curves, at least as long as $\eta$ is not in the saturation region.
First, we would like to understand how η opt evolves with respect to the physical parameters. The equality δM P T = δM U E gives an equation in L = ln 1 η : where the coefficients c σ , c δρ and c Σ can be easily calculated using eqs. (49,(54)(55)(56)(57)(58)(59)(60)61): c δρ (n, N P U , R bb ) ≃ 0.8πnR 2 bb 1 + If the solution of eq. (62) for a given n is found to be above η sat (n, f ), then η opt = η sat (n, f ) in order to take the saturation of δM P T into account. We start by solving this equation numerically. In figure 10 we show η opt as a function of p t H and N P U for different values of n. As it should, η opt increases with p t H at fixed N P U . Indeed, if p t H grows at fixed η, R bb decreases and so does the effect of UE/PU, whereas the perturbative radiation is kept fixed (no dependence on R bb ). Notice also, for n = 3, that the values obtained for η opt are roughly consistent with the choice in [15] where we had η = min(0.3/R bb , 1/2). The saturation comes into effect at relatively low p t H , around 400 − 500 GeV. Above this value, the total width is small and hadronisation corrections start to become relevant, so that the results presented on these plots become not very reliable. However, for p t >∼ 500 GeV and η > η sat , the Higgs width due to perturbative radiation and UE/PU vary slowly with η and we thus believe that the precise value chosen for η is not so important: one can take any value above η sat without changing the result too much. The decrease of η opt with N P U seems to be weaker than one might have expected a priori. However, in fig. 9, we can see that the negative slope of the perturbative width is very large, and therefore increasing the noise from PU will not change too much the η opt value. It would be interesting to understand analytically the evolution of η opt with respect to the physical parameters p t H and N P U . Unfortunately, eq. (62) cannot be easily dealt with. 
That's why we have to make an approximation: in this equation, one of the 3 terms under the square root may be dominant when η = η opt . At first sight, one would expect that at low η opt , the c 2 σ e −2L term, which scales like η 2 , should be the largest, whereas at large N P U , it should be the c 2 Σ e −4L term that is the largest one as it scales like N 2 P U . But figure 11 for n = 3 reveals that the c δρ term surprisingly brings the largest contribution to δM U E for physical values of the parameters (the same holds for other values of n). Therefore, to simplify things a little, one can consider eq. (62) and put c σ = c Σ = 0. However, to be more general, and to consider the possible situation where one of the other terms might be dominant, 16 we rewrite eq. (62) in the following approximate form: where p = 1 if the c σ term dominates and p = 2 otherwise. Moreover:  Figure 11: δM U E computed at η = η opt with respect to (a) p t H when N P U = 0 and (b) N P U when p t H = 200 GeV. On these plots is also represented the contribution to δM U E of each term separately. When the UE/PU width falls below the saturation line δM U E = δM sat , then η opt = η sat .
Eq. (66) can be written in a slightly different way: with: Despite its simpler form, eq. (68) for L cannot be solved analytically. Here comes the second approximation, which is to make a perturbative expansion: Neglecting the O (α s L) 2 term, the resulting quadratic equation immediately implies Taking into account the saturation effect, η opt is then given by: We used this expression with C U E corresponding to the δρ term in eq. (67) and p = 2 in order to plot the approximate solutions in figure 10. This reveals that the above relation for η opt (eq. (73)) works rather well, within a few %.
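The numerical determination of $\eta_{\rm opt}$ (equate the two widths, then cap the result at $\eta_{\rm sat}$) can be sketched as a simple bisection. This is only a minimal illustration: the two toy width functions below are hypothetical shapes with arbitrary constants, not the expressions of eq. (62), and the function names are ours. The sketch assumes that the perturbative width decreases with η while the UE/PU width increases with η.

```python
import math
from typing import Callable


def find_eta_opt(dm_pt: Callable[[float], float],
                 dm_ue: Callable[[float], float],
                 eta_sat: float,
                 eta_min: float = 1e-3,
                 tol: float = 1e-10) -> float:
    """Solve dm_pt(eta) = dm_ue(eta) by bisection on [eta_min, eta_sat].

    Assumes dm_pt decreases and dm_ue increases with eta. If the two
    widths do not cross below eta_sat, eta_sat is returned (saturation).
    """
    def diff(eta: float) -> float:
        return dm_pt(eta) - dm_ue(eta)

    if diff(eta_sat) >= 0.0:   # perturbative width still dominates at eta_sat
        return eta_sat
    if diff(eta_min) <= 0.0:   # UE/PU width already dominates at eta_min
        return eta_min
    lo, hi = eta_min, eta_sat
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if diff(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Toy widths in GeV (hypothetical shapes and constants, for illustration only).
def toy_dm_pt(eta: float) -> float:
    return 10.0 * math.log(1.0 / eta)


def toy_dm_ue(eta: float) -> float:
    return 20.0 * eta ** 2


eta_opt = find_eta_opt(toy_dm_pt, toy_dm_ue, eta_sat=0.56)
print(eta_opt)
```

With a smaller `eta_sat`, for which the toy perturbative width is still the larger of the two, the function simply returns `eta_sat`, mimicking the saturation prescription described above.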
As a second step, we would like to find the optimal $n$, denoted $n_{\rm opt}$. This should also depend on the way UE/PU and perturbative radiation are combined. However, as a simple approximation, one can combine them as if they were both Gaussian distributions. Therefore, one should minimize $\delta M_{\rm tot}(n) = \sqrt{\delta M_{PT}^2 + \delta M_{UE}^2}$, computed at $\eta = \eta_{\rm opt}(n)$ for a given $p_{tH}$ and $N_{PU}$. The results are plotted in figure 12. We can notice that the larger $n$, the narrower the peak, and thus the better the result. However, one should keep in mind that when $n$ increases, the optimal $R_{\rm filt} = \eta R_{bb}$ becomes small, and we have to deal with hadronisation corrections that grow as $1/R_{\rm filt}$ [56] as well as with the detector resolution and granularity $\delta\eta \times \delta\phi = 0.1 \times 0.1$, which both start to have an important impact on the reconstructed Higgs width, and thus degrade the results presented here. In section 4.5 we will examine what happens when we include a very rough estimate for hadronisation corrections. However, at first sight, it seems that one should definitely not take $n = 2$. The value $n = 3$ chosen in [15] is good, but it may be possible to do better with $n = 4$. Beyond this value, the optimal $R_{\rm filt}$ falls below ∼ 0.2 (cf figure 10), which is too small for this study to be fully reliable, as we shall see in section 4.5.
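To make the remark on small filtering radii concrete, the sketch below evaluates $R_{\rm filt} = \eta R_{bb}$ together with the reference choice $\eta = \min(0.3/R_{bb}, 1/2)$ quoted from [15]. It assumes the standard boosted-decay estimate $R_{bb} \simeq M_H / (\sqrt{z(1-z)}\, p_{tH})$, valid for $M_H \ll p_{tH}$. The default Higgs mass value and the function names are assumptions made for illustration.

```python
import math


def r_bb(pt_higgs: float, z: float = 0.5, m_higgs: float = 125.0) -> float:
    """Approximate b-bbar angular separation for M_H << p_t (masses and p_t in GeV).

    The default m_higgs is an illustrative assumption.
    """
    return m_higgs / (math.sqrt(z * (1.0 - z)) * pt_higgs)


def r_filt(eta: float, pt_higgs: float, z: float = 0.5, m_higgs: float = 125.0) -> float:
    """Filtering radius R_filt = eta * R_bb."""
    return eta * r_bb(pt_higgs, z, m_higgs)


def eta_reference(pt_higgs: float, z: float = 0.5, m_higgs: float = 125.0) -> float:
    """Reference choice eta = min(0.3 / R_bb, 1/2)."""
    return min(0.3 / r_bb(pt_higgs, z, m_higgs), 0.5)


for pt in (200.0, 300.0, 500.0):
    eta = eta_reference(pt)
    print(pt, round(r_bb(pt), 3), round(eta, 3), round(r_filt(eta, pt), 3))
```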

Variations of the results with z and f
Until now, we have only presented some results for f = 0.68 and z = 1/2, z being defined as with E i the energy of particle i in the Higgs splitting into bb. What happens if we change these values? Let us start with z. Though the Higgs splitting into bb is more often symmetric than in QCD events (and this is what was used in [15] to distinguish it from pure QCD splittings), it still has a distribution in z that is uniform in the range: which follows from simple kinematics in the limit m b = 0. But in order to reduce the large QCD background, one usually cuts on small z, so that with z cut ∼ 0.1. As an example, assume the b quark carries the fraction z of the Higgs splitting. In such a case, b andb are not equidistant from the Higgs direction: they are respectively at an angular distance (1 − z)R bb and zR bb from H (see for instance figure 16 in appendix A). Therefore, as UE/PU particles tend to cluster around the perturbative radiation, eq. (47) has to be modified: for a given UE/PU particle g in the filtered jet. This leads to the modification of eq. (46) according to g being relatively close to b (region called "J 1 ") orb (region called "J 2 "): In this calculation we used eq. (45). To compute the dependence of the fluctuations on z, we take the simplest case n = 2. For the σ and Σ fluctuations, the terms ρA 1 and ρA 2 vary independently, leading to the following contribution to δM U E : 17 where with c σ and c Σ given by eqs. (63,65) for n = 1. Concerning the δρ fluctuations, the 2 terms ρA 1 and ρA 2 vary the same way from one event to another. Therefore, if it were only for the δρ term, we would write ρA 1 = ρA 2 leading to: 17 the factor of 4 comes from the fact that we compute the width at 2σ. where with c δρ given by eq. (64) for n = 1. Adding all these contributions, this apparently leads to an enhancement of the width by a factor of 1/z. But we have to take into account that the coefficients c δρ and c Σ also contain a factor R 2 bb (eqs. 
(63,64)) leading to another factor $1/z$, and thus an enhancement $1/z^2$ at small $z$. 18 Therefore, we can conclude that the effect of $z \neq 1/2$ is to broaden the reconstructed Higgs peak. Such a factor may partly explain the width of ∼ 14 GeV that was observed in [62], to be compared with the various widths found in the previous subsection (see for instance figure 12), and should also lead to a decrease of $\eta_{\rm opt}$. This is illustrated in fig. 13, which was obtained with the results derived in appendix B, where we carry out the above analysis for a general $n$.
Figure 13: (a) [...] The points correspond to the numerical determination of $\eta_{\rm opt}$, found by solving eq. (62) whose $z$ dependence is derived in appendix B, whereas the curves correspond to its approximate analytical solutions. (b) $\delta M_{\rm tot} = \sqrt{\delta M_{PT}^2 + \delta M_{UE}^2}$ computed at $\eta = \eta_{\rm opt}$ as a function of $p_{tH}$ for $f = 0.68$ and $z = 0.2$.
Now, we turn to the $f$ value. As we explained in section 4.1, the choice $f = 0.68$ was made to correspond to a 2σ Gaussian width, as we did for $\delta M_{UE}$, which is somewhat arbitrary. We would like to estimate how the results change when $f$ is modified. We thus also consider a range of values for $f$ between 0.5 and 0.8. In this case the $C_{PT}(n, f)$ constants characterizing $\delta M_{PT}$ are changed (see for instance eq. (40)), and $\delta M_{UE}$ is also changed, i.e. eqs. (61,62,67) have to be slightly modified:
18 This is valid when pt H > M H R 0 , with R0 the radius of the initial clustering of the event, in order for the $b$ and $\bar b$ to be clustered together. For lower $p_{tH}$, there is a kinematic cut on $z$ and the enhancement is less strong.
where erf(x) is the usual error function: Notice that the constants $c_\sigma$, $c_{\delta\rho}$ and $c_\Sigma$ are left unchanged with this convention. However $C_{UE}$ becomes: The bands corresponding to the uncertainties on $\eta_{\rm opt}$ that we obtain including these modifications are presented in figure 14. The uncertainty that we get, ∼ 20−30%, is not larger than the precision of the whole study of this paper, which limits itself to a large-$N_c$ leading-log calculation. Notice that the variation with $N_{PU}$ remains small. One finally observes that $\eta_{\rm sat}(n, f)$ is almost independent of $f$ for $n = 3$. In appendix B, we will show that it can be approximately written as: Because of the small coefficient of its first order correction, $\eta_{\rm sat} = e^{-0.58}$ is a good approximation within less than 1% for a large range of $f$ values. But this seems to be a coincidence with no deep physical reason.
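To illustrate how a Gaussian width rescales when $f$ is varied, recall that the full width containing a fraction $f$ of a Gaussian distribution of standard deviation σ is $2\sqrt{2}\,{\rm erf}^{-1}(f)\,\sigma$, which is close to 2σ for $f = 0.68$. The sketch below evaluates this with the standard library, using the equivalent normal quantile form $2\,\Phi^{-1}((1+f)/2)\,\sigma$. It is only meant as an illustration of the rescaling, under the assumption of exactly Gaussian fluctuations, and the function name is ours.

```python
from statistics import NormalDist


def gaussian_full_width(f: float, sigma: float = 1.0) -> float:
    """Full width of the central interval containing a fraction f of a Gaussian.

    Equivalent to 2*sqrt(2)*erfinv(f)*sigma, written with the normal quantile.
    """
    if not 0.0 < f < 1.0:
        raise ValueError("f must lie strictly between 0 and 1")
    return 2.0 * sigma * NormalDist().inv_cdf(0.5 * (1.0 + f))


# Width, in units of sigma, over the range of f considered in the text.
for f in (0.5, 0.68, 0.8):
    print(f, round(gaussian_full_width(f), 3))
```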

Hadronisation corrections
It is difficult to calculate what happens during the process of hadronisation, though some analytical results concerning jet studies can be found, for instance in [56,63,64]. In particular, it was shown in [56] that such non-perturbative corrections lead to a $p_t$ shift for QCD jets equal on average to $\sim -C_i \Lambda / R_{\rm filt}$, where $\Lambda = 0.4$ GeV and $C_i = C_F$ or $C_A$ depending on whether it is a quark jet or a gluon jet. This can be translated in our study by the following averaged $p_t$ shift for the filtered jet: where the second equality holds in the large $N_c$ limit. Unfortunately, there is no result concerning the dispersion of the $p_t$ distribution, which is the relevant quantity to compute in our case. Therefore, we are going to assume that the spread is of the same order of magnitude as the shift. This is in principle a crude approximation, but the only aim here is to illustrate the consequences of including hadronisation corrections, in order to emphasize the fact that taking $n$ too large is certainly not a good choice. Therefore, we use eq. (46) to estimate very roughly the hadronisation corrections to the reconstructed Higgs mass peak width: when $z = 1/2$. As before, one should know how to combine perturbative radiation with UE/PU and hadronisation corrections in order to minimize the resulting combined width. However, we simply choose to minimize the quantity with respect to $\eta$ and plot the resulting minimal $\delta M_{\rm tot}$ for different values of $n$ (fig. 15).
Figure 15: $\delta M_{\rm tot}$ including hadronisation corrections computed at $\eta = \eta_{\rm opt}$ as a function of (a) $p_{tH}$ and (b) $N_{PU}$ for different values of $n$.
The first thing one can notice on these plots is that increasing n also increases the hadronisation corrections. For n = 5 they become so important that it is now clearly not an optimal filtering parameter contrary to what could be deduced from figure 12. The relevant p t region in our study is roughly 200 − 400 GeV where we find the major part of the Higgs cross-section above 200 GeV and where our results are more reliable (see section 4.3). In this region n = 3 gives the best result. And at high PU, n = 3 and n = 4 both seem optimal, whereas n = 2 is far from being as good.
To conclude, our estimates seem to indicate, within the accuracy of our calculations, that $n = 2$ is not a good choice, nor is $n \geq 5$. Taking $n = 3$ or $n = 4$ gives a mass peak of equally good quality. Increasing the hadronisation effects with respect to eq. (91) would lead to $n_{\rm opt} = 3$, whereas lowering them would give $n_{\rm opt} = 4$. The only thing we can say is that $n = 3$ and $n = 4$ both seem to work rather well.
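The minimization over η performed in this section can be sketched as follows, assuming that the three contributions (perturbative radiation, UE/PU and hadronisation) are combined in quadrature, as was done for the two Gaussian widths in section 4.3. The toy width shapes and constants below are hypothetical and chosen for illustration only; the only feature taken from the text is that the hadronisation term grows as $1/R_{\rm filt}$, i.e. as $1/\eta$ at fixed $R_{bb}$.

```python
import math
from typing import Callable, Iterable, Tuple


def combined_width(dm_pt: float, dm_ue: float, dm_had: float) -> float:
    """Combine three widths in quadrature (Gaussian assumption)."""
    return math.sqrt(dm_pt ** 2 + dm_ue ** 2 + dm_had ** 2)


def scan_eta(width_of_eta: Callable[[float], float],
             eta_values: Iterable[float]) -> Tuple[float, float]:
    """Return (eta, width) minimizing width_of_eta over a grid of eta values."""
    best_eta = min(eta_values, key=width_of_eta)
    return best_eta, width_of_eta(best_eta)


def toy_total_width(eta: float) -> float:
    """Toy total width in GeV; all shapes and constants are hypothetical."""
    dm_pt = 10.0 * math.log(1.0 / eta)   # decreases with eta
    dm_ue = 20.0 * eta ** 2              # increases with the jet area
    dm_had = 1.0 / eta                   # grows as 1/R_filt
    return combined_width(dm_pt, dm_ue, dm_had)


eta_grid = [0.05 + 0.01 * i for i in range(91)]   # eta from 0.05 to 0.95
eta_best, width_best = scan_eta(toy_total_width, eta_grid)
print(eta_best, width_best)
```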
One way to go beyond these results would be to use event generators like Herwig [65,66] or Pythia [67], to compute directly the Higgs width in presence of UE/PU, perturbative radiation, ISR and hadronisation, and to find for which value of the couple (n, η) the reconstructed Higgs mass peak width δM H becomes minimal (that would still depend on p t H and the level of UE/PU). But our study here aimed to understand as much as possible the physical aspects behind such an optimisation, the price to pay being larger uncertainties on the result because of the necessary simplifications that were made.

Conclusions
This work has investigated the effect of QCD radiation on the reconstruction of hadronically decaying boosted heavy particles, motivated in part by the proposal of [15] to use a boosted search channel for the H → bb decay. Though this article took the Higgs boson as an example, all the results presented here can be applied to the W and Z bosons, as well as any new colorless resonance decaying hadronically that might be observed at the LHC. The main effect of the QCD radiation is to distort and spread out the boosted heavy resonance shape well beyond the intrinsic width of the resonance. The aim therefore is to calculate the resulting resonance lineshape. This is a function of the parameters of the reconstruction method, notably of the "filtering" procedure, which aims to limit contamination from underlying event and pile-up, but which causes more perturbative radiation to be lost than would otherwise be the case.
Calculations were performed in a leading (single) logarithmic and leading colour approximation, which is the state of the art for this kind of problem. Analytic results were provided up to $\alpha_s^2 \ln^2\frac{M_H}{\Delta M}$ for $n = 2$, and all-orders analytic results for the cases $n = 2$ and $n = 3$ were given for the terms that dominate in the small $\eta$ limit. Numerical fixed-order results up to $\alpha_s^5 \ln^5\frac{M_H}{\Delta M}$ and all-orders resummed results were also given and are treated in more detail in appendix C for a range of $n$ and $\eta$. For the $n = 2$ and $n = 3$ cases there is quite acceptable agreement between the small-$\eta$ analytic results and the full numerical results, even for values of $\eta \simeq 0.5$.
One unexpected feature that was observed was the behaviour of the order-by-order expansion as compared to the resummed result: indeed there are indications that the series in $\alpha_s \ln\frac{M}{\Delta M}$ has a radius of convergence that is zero (but in a way that is unrelated to the renormalon divergence of the perturbative QCD series). This seems to be a general feature of the non-global logarithm series. Its practical impact seems to be greater for large $\eta$, or equivalently when the coefficients of the "primary" logarithms are small.
With these results in hand, it was then possible to examine how the perturbative width of the resonance peak depends on the parameters of the filtering. Though this was accessible only numerically for the full range of filtering parameters, figure 8(a) lends itself to a simple parametrisation for practically interesting parameter-ranges.
This parametrisation was then used in section 4.3 together with a parametrisation for the effect of UE and PU, so as to examine how to minimize the overall resonance width as a function of the filtering parameters and of the physical parameters of the problem such as the resonance $p_t$ and the level of UE/PU. The approximations used might be described as overly simple, yet they do suggest interesting relations between optimal choices of the filtering parameters and the physical parameters of the problem. Though it is beyond the scope of this article to test these relations in full Monte Carlo simulation, we believe that investigation of their applicability in realistic conditions would be an interesting subject for future work. It should also be noticed that the methods used in this paper may be adapted to other reconstruction procedures like jet pruning [14] and jet trimming [19] as well as filtering as applied to jets without explicit substructure [18].

Figure 16: The α plane, with various variables used in the calculation and throughout this study. In this figure, the $b$ quark is supposed to carry a fraction $z$ of the Higgs energy, and the center of the frame coincides with the direction of the Higgs boson momentum.

A.1 Primary coefficients
Let us go back to the integral of eq. (9) and write it in the α plane: For ∆M (k) = M H − M one easily finds with where we used eq. (45). With this expression, the integration over k t is straightforward: Notice that | α| ∼ (1 − z)R bb or | α| ∼ zR bb , depending on whether the perturbative gluon emission is relatively close to b orb (due to the collinear divergence of QCD). Thus, given that a Higgs splitting is most of the time roughly symmetric (z ∼ 1/2), this leads to A ( α) = O(1). Therefore: and the ln A( α) term can be neglected in a leading-log calculation. One thus obtains: with J(η) the remaining angular integral. Introducing the coordinates (r, ψ) defined in fig. 16, J(η) can be rewritten in the following form: 19 Performing the ψ integration and the r one (for η < 1/2), one arrives at the formulae (11,12), where all the R bb dependence is cancelled. One remark: if η > 1, the b andb quarks cluster together, and the result then depends on the z fraction of the splitting. For instance, if η > 2, J(η) can now be written: 20 . (103)

A.2 Non-Global coefficients
The starting point here is eq. (20) for $I_2^{NG}$: In the same way as in the primary case, using the α plane and integrating over the energies of the 2 gluons, one arrives at: with In all this part, $\eta < 1/2$ is assumed. To deal with this integral, the frame is centered around $\bar b$ for instance, and 2 quantities are computed: $S_{\rm tot}$, where gluon 1 is in the jet region $J_{\bar b}$ around $\bar b$ and gluon 2 covers the whole phase space except $J_{\bar b}$, from which is subtracted $S_{\rm int}$, where gluon 2 covers $J_b$, the jet region around $b$ (fig. 17). Therefore: where the factor 2 is for the symmetric case (gluon 1 in $J_b$).
19 A simple shift of the α coordinates gets rid of the $z$ dependence, for instance $\alpha' = \alpha - \left(z - \frac{1}{2}\right) R_{bb}$.
20 The intermediate case $1 < \eta < 2$ has a more complicated phase space integration and is not presented here.

A.2.1 Calculation of S tot
Using the variables u = r ηR bb , ψ ( fig. 16), S tot can be written .
Doing the angular integrations, one arrives at: 2 remarks about this result:
1. The first and second terms have divergences at $u_1 = 0$ and $u_2 = 1/\eta$, i.e. respectively when gluon 1 is collinear to $\bar b$ and gluon 2 is collinear to $b$, but they cancel when these terms are added.
2. The part of the integral corresponding to $u_2 > 1/\eta$, i.e. $r_2 > R_{bb}$, vanishes, which can be simply interpreted as a manifestation of angular ordering.
Performing this integration with Maple for instance gives the following result: where This is rather complicated, but that expression can be greatly simplified using the relations: The final answer is then: The remarkable point to notice is of course that S tot does not depend on η. But no simple explanation was found to interpret this result.

A.2.2 Calculation of S int
S int must now be subtracted from S tot . The computation is similar to the previous one and is not detailed here. However, contrary to S tot , a simple analytical result was not obtained, only the following expansion: Using eq. (106) with the 2 previous results, one arrives at the expression eq. (24).
B Analytical considerations on the dependence of the results on z and f

B.1 Dependence on z
It is interesting to understand analytically how η opt evolves when the decay of the Higgs boson occurs with a z fraction different from 1/2. In fact, we obtain the same result as eqs. (72,73) up to a modification of the constant C U E (more generally, we have to modify the coefficients c σ , c δρ and c Σ , see eqs. (62)(63)(64)(65)). The starting point is eq. (79) generalized to any value of n: where a i (z) is either 1 z or 1 1−z depending on whether subjet i is in the J 1 region (around the b quark) or in the J 2 region (around theb quark). This is because the UE/PU particles tend to cluster around the perturbative radiation, which itself is emitted close to b andb. We call a "configuration" the set of all the coefficients a i (z).
The result on the fluctuations depends on which ones are considered. Let us start with the fluctuations originating from the σ and Σ terms. In this case the ρA i terms vary independently, thus leading to a contribution to δM U E similar to that of eq. (80) for n = 2: with (see eq. (81)): The coefficients c σ and c Σ are still computed for n = 1 in this formula. But eq. (120) is only valid for a given configuration {a i }. We thus have to average over all the 2 n−2 possible configurations (the b andb subjets are fixed). As the perturbative radiation pattern does not depend on z (for z not too small), each configuration arises with the same probability. Therefore, if k is the number of subjets in the J 1 region and n − 2 − k the number of subjets in the J 2 region apart from the b andb subjets, we obtain: We can follow a similar reasoning for the δρ fluctuations, except that the ρA i terms in eq. (119) vary the same way from one event to the next. Therefore, for a given configuration {a i }, one can deduce the following contribution to δM U E : with δ δρ given by eq. (83). As before, we have to average this result over all the 2 n−2 possible configurations, leading to: One can absorb all the dependence of the resulting δM 2 U E in n and z into the coefficients c σ , c δρ and c Σ , and define new coefficients c ′ such that with With these results in hand, we can easily generalize eq. (72) for any value z of the Higgs splitting. One only has to modify the value of C U E . For instance, the curves in fig. 13(a) were obtained using eqs. (69−73) with: C U E (n, N P U , z) = 1.6π where we just use the dominant c ′ δρ term. There is also another source of fluctuations that we haven't accounted for so far. Indeed, even if the values of ρ and of the jets area were constant, we should consider the fact that the filtered subjets can be either in the J 1 or in the J 2 region. The calculation of its effect is similar to that of the fluctuations in σ, δρ and Σ. 
Its contribution on δM 2 U E can be written: with But we checked that its effect on δM 2 U E is negligible compared to that from the dominant δρ fluctuation, 21 and therefore we did not include it.

B.2 Comments on the uncertainty due to the choice of f
Let us return to the observation of section 4.4 that η sat (n, f ) is almost independent of f for n = 3. To understand why, we have to estimate t sat (f ) and C P T (3, f ) (cf eq. (43)). t sat can be deduced from the equation: for any value of n, as t sat only depends on f (see for instance figs. (7,8(a)). To be simple, we can use the function Σ (2) which was widely studied in this paper. Unfortunately we cannot take the primary emission result eq. (10) because the non-global part becomes important when η = 1. However, as an approximation, we can numerically compute the second order coefficient a 2 ≃ −3 of Σ (2) (η = 1, t) and solve: which simply leads to This expression can be shown numerically to give t sat (f ) with a precision better than 1% for f ∈ [0.5, 0.8]. C P T (3, f ) is harder to evaluate. One must solve in the limit of large L. Using eq. (27), which is valid in this limit, and defining the function  21 This effect is strictly null when z = 1/2. When z = 0.2, we obtain: which gives C P T (3, f ) within a few %. Therefore, eq. (43) leads to which is eq. (89).
To be complete, the same analysis can be done when $n = 2$, using eq. (40) for $C_{PT}(2, f)$, to obtain $\eta_{\rm sat}(2, f)$. We will just mention that

C Convergence of the non-global series
In this appendix, we go back to the study of the convergence of the non-global series that was already briefly examined in section 3.2, trying to understand a little more what may be behind the observed behaviour.

C.1 Case n filt = 2
First, we start by studying the convergence of the non-global perturbative series when n = 2 for η small, where the primary coefficients are known to be enhanced with respect to the purely non-global ones because of the presence of large collinear logarithms (cf section 2.2). Figure 18 compares the fixed-order results to the all-orders one up to α 5 s . On these 2 plots  one can notice that the series seems to converge, as was already shown in section 3.2. In other respects, the convergence looks better when η is larger. This may be understood using the following simple explanation: if we make an expansion up to order k, then the series starts to diverge from the exact result when the term of order k + 1 becomes roughly of the same size as the function itself. In the Σ (2) (L, t) case, using the analytic expression (eq. (25)), this can be translated to with L = ln 1 η . At large k, 23 the solution gives So the convergence is better when k and η increase.  Let us go to larger η, where there is no large collinear logarithm anymore, and check what happens. This is done for η = 0.6 and η = 0.9 on figure 19. One striking feature of these plots is that the convergence seems acceptable up to the third order, but the 4 th order does not give as good a result, and the situation becomes worse as η increases. Said another way, the perturbative expansion can be trusted until the third order, but then it starts to diverge. The fact that the convergence looks better for small η may come from the dominant behavior of the primary series, which converges very well to a nice exponential function. However, if one could go to sufficiently high orders, it might be possible to observe the same divergence as in figure 19, when the purely non-global coefficients become of the same order of magnitude as the primary ones.
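The convergence criterion used above (a truncation at order $k$ breaks down roughly where the term of order $k + 1$ becomes as large as the function itself) can be illustrated on a simple stand-in. The sketch below applies it to $e^{-x}$, whose Taylor series is known exactly, and solves $x^{k+1}/(k+1)! = e^{-x}$ by bisection. The exponential is only a toy replacement for the analytic expression of eq. (25), and the function name is ours.

```python
import math


def breakdown_point(k: int, tol: float = 1e-12) -> float:
    """Return x > 0 where the order-(k+1) Taylor term of exp(-x) equals exp(-x).

    Solves x**(k+1) / (k+1)! = exp(-x) by bisection; the left-hand side
    increases with x while the right-hand side decreases, so the root is unique.
    """
    def excess(x: float) -> float:
        return x ** (k + 1) / math.factorial(k + 1) - math.exp(-x)

    lo, hi = 0.0, 1.0
    while excess(hi) < 0.0:   # enlarge the bracket until it contains the root
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if excess(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# The breakdown point moves to larger x as the truncation order k increases.
points = [breakdown_point(k) for k in range(1, 6)]
print([round(x, 3) for x in points])
```

In this toy case the range of validity grows with the truncation order, in line with the statement that the convergence improves when $k$ increases.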
To get an idea of these coefficients, the series of the plots are explicitly written below: For $\eta$ small, the growth of these coefficients comes from the powers of the large collinear logarithm $L$. For $\eta$ near 1, the small growth that one can start to observe at the 4th order essentially comes from the purely non-global part. Indeed, at the 4th order, the primary coefficient is positive, and it is not the case for $\eta > 0.6$. This non-global growth will be confirmed at higher orders for the slice observable in section C.3. Let us now see what happens if the perturbative series is exponentiated. This means that instead of plotting $g(t) = 1 + \sum_{i=1}^{k} a_i t^i$, which is the perturbative series, one plots $e^{f(t)}$ where so that . . .
23 The derivation was done at large $k$, but the result seems to be reasonable even for $k = 2$.
Notice that the exponentiated first order corresponds to the analytical estimate for n = 2.
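The exponentiation procedure can be sketched as follows, assuming that $f(t)$ is the Taylor expansion of $\ln g(t)$ truncated at the same order $k$ as the perturbative series. Writing $f(t) = \sum_i b_i t^i$ and matching powers of $t$ in $g' = f' g$ gives the recurrence $b_n = a_n - \frac{1}{n}\sum_{j=1}^{n-1} j\, b_j\, a_{n-j}$. The function names are ours, and the coefficients in the example are a simple check on the exponential series rather than values from this paper.

```python
import math
from typing import List, Sequence


def log_series_coefficients(a: Sequence[float]) -> List[float]:
    """Given a = [a_1, ..., a_k] with g(t) = 1 + sum_i a_i t^i,
    return b = [b_1, ..., b_k] with ln g(t) = sum_i b_i t^i + O(t^(k+1))."""
    a_full = [1.0] + list(a)          # a_full[0] is the constant term of g
    b = [0.0] * len(a_full)           # b[0] is unused (ln g has no constant term)
    for n in range(1, len(a_full)):
        convolution = sum(j * b[j] * a_full[n - j] for j in range(1, n))
        b[n] = a_full[n] - convolution / n
    return b[1:]


def exponentiated_series(a: Sequence[float], t: float) -> float:
    """Evaluate exp(f(t)) with f the truncated expansion of ln g."""
    b = log_series_coefficients(a)
    return math.exp(sum(b_i * t ** (i + 1) for i, b_i in enumerate(b)))


# Check on g(t) = exp(t) truncated at third order: ln g = t exactly.
print(log_series_coefficients([1.0, 0.5, 1.0 / 6.0]))
```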
One can observe on figure 20 a nice convergence for $\eta = 0.1$ up to the 4th order. The 5th order seems to diverge a little, but if the coefficients of the series are varied within their respective errors, it can coincide with the all-orders curve. However, one can guess that it should not converge in the end because the all-orders function is not strictly a simple exponential. This will be confirmed later with the slice. When $\eta = 0.9$, the exponentiated 4th order surprisingly almost fits the all-orders result. Even more surprising, if the 5th order coefficient (not shown here) is varied within its error band, it seems that the exponentiated fit is still improved. Is it accidental? Here again, the slice example and common sense lead us to answer yes, but we cannot be completely sure.

C.2 Case n filt = 3
[...] form $g(4N_cLt)$ with: The radius of convergence of the Taylor series for $g$ is infinite, as the coefficients $a_k$ resulting from its expansion can be shown to be bounded for large $k$ by: As the expansion of this function converges, one would expect the same to occur for the curves obtained numerically. However, the expansion does not converge as fast as for the exponential, so that the $t$ window for which there is convergence should be smaller than what was obtained for $n = 2$. This is indeed what is observed in figure 21(a) for $\eta = 0.3$, if it is compared for instance with the plot 18(b). Moreover, the curves on the plot 21(b) for $\eta$ near 1 behave similarly to the $n = 2$ case, i.e. the perturbative series converges up to the 3rd order and starts to diverge from the 4th order. Notice that the same behaviour is observed for $n > 3$.

C.3 The Slice case
The Slice observable, studied for instance in [41], gives an interesting example of the strange behaviour of the non-global leading-log series. It is simply defined as the sum of the energies of all particles flowing into a region Ω of phase space corresponding to a rapidity $y \in [-y_0, y_0]$, with $y_0$ a parameter of the observable (figure 22). Here, we work in the $q\bar{q}$ center-of-mass frame and the quarks are assumed to move along the z axis. This observable is non-global, as shown in fig. 22, and it is interesting in 2 ways:
• A simple change of frame$^{24}$ shows that it is more or less equivalent in the leading-log approximation to the ∆M observable in the filtering analysis using anti-$k_t$ with n = 2 and $\eta = e^{-y_0}$ when η ≪ 1, while being faster to compute numerically due to the absence of clustering.
• When $e^{-y_0} \sim 1$, which should very approximately correspond to η = O(1), the strange behaviour of the non-global series, observed in section 3, is clearly confirmed with the addition of the 5th and 6th orders.
Figure 22: The region Ω between the 2 dashed lines at $y = -y_0$ and $y = y_0$, and the gluon configuration leading to the appearance of non-global logarithms.
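The definition of the slice observable and its rough correspondence with the filtering parameter η can be illustrated as follows. This is only a sketch: the particle representation as (energy, $p_z$) pairs and all function names are hypothetical choices made for the example, and the relation $\eta = e^{-y_0}$ is the approximate one quoted above for η ≪ 1.

```python
import math


def rapidity(energy, pz):
    """Rapidity along the z axis; requires energy > |pz|."""
    return 0.5 * math.log((energy + pz) / (energy - pz))


def slice_energy(particles, y0):
    """Sum of the energies of all particles with rapidity in [-y0, y0].

    `particles` is an iterable of (energy, pz) pairs (hypothetical format).
    """
    return sum(e for e, pz in particles if abs(rapidity(e, pz)) <= y0)


def eta_from_y0(y0):
    """Approximate filtering parameter matching a slice of half-width y0."""
    return math.exp(-y0)
```

For instance, `eta_from_y0(2.3)` is close to 0.1, which is why the slice with $y_0 = 2.3$ is the natural counterpart of the η = 0.1 case.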
The series are represented in figure 23 for 2 different values of $y_0$. The plot for $y_0 = 2.3$ is here for comparison with figure 18(a), i.e. when n = 2 and η = 0.1. One can notice that until order 5 the behaviours of the 2 series are very similar. The slight difference comes from the fact that the C/A algorithm was used there instead of anti-$k_t$. This can also be seen in the expansions eqs. (143,152). However, a remarkable effect is that the 6th order curve no longer improves the fit with the all-orders result (it even makes it worse). When $y_0 = 0.5$ (fig. 23(b)), this effect is enhanced: indeed, one notices that the 3rd order gives the best result, with the 2nd order even worse than the 1st one. And here again, adding more orders shifts the point of disagreement to smaller $t$.
The growth of the coefficients for $y_0 = 2.3$ until the 5th order essentially comes from the powers of $4N_c y_0$ when expanding the primary result, which is the expression of the collinear divergence near $q$ and $\bar{q}$, whereas, for $y_0 = 0.5$, it essentially originates from the purely non-global part: there is no large collinear enhancement that can explain it. As an example, we also show the function $S(y_0, t)$, which is defined as in [35,41] to contain the purely non-global part of the result:
$$S(y_0, t) \equiv \frac{\Sigma(y_0, t)}{\Sigma^{(P)}(y_0, t)},$$
where, for the slice, the primary contribution $\Sigma^{(P)}$ can be written as:
in the large-$N_c$ limit. The plots for $S(y_0, t)$ are shown in fig. 24. One observes the saturation property noticed in [41], which leads to very similar plots for $y_0 = 0.5$ and $y_0 = 2.3$. There is clearly no convergence of the perturbative series.
Therefore, the non-global leading-log large-$N_c$ series seems to behave badly at high orders. Does this mean that it is an asymptotic series, as perturbative series in the Standard Model are known to be [68]?$^{25}$ This study cannot answer such a question but, at least, one should be aware of the strange behaviour of the non-global series.
To finish, let us mention the exponentiated results represented in figure 25. They show that the convergence observed in the case n = 2 is clearly only illusory. One point to notice is that the case $y_0 = 2.3$ is slightly different from its η = 0.1 counterpart, as the divergence starts to be visible at order 4 instead of 5. This may be because using the C/A algorithm in the filtered Higgs mass observable reduces the impact of the non-global logarithms [42,43], so that the primaries still dominate at order 4, whereas this is no longer the case for the slice. Notice also the very nice fit given by the exponentiated 2nd order curve when $y_0 = 0.5$.
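Returning to the ratio that defines $S(y_0, t)$, the separation between primary and non-global contributions can be sketched numerically. The explicit primary form used below, $\Sigma^{(P)}(y_0, t) = e^{-4 N_c y_0 t}$, is an assumption: it is only suggested by the statement that the primary coefficients grow as powers of $4N_c y_0$, and the normalisation of $t$ used in the text may differ. The function names are hypothetical, and the all-orders value `sigma` must be supplied from elsewhere (for example from a numerical evolution).

```python
import math

N_C = 3  # number of colours


def primary_sigma(y0, t, n_c=N_C):
    """ASSUMED primary contribution exp(-4 N_c y0 t) in the large-N_c limit."""
    return math.exp(-4.0 * n_c * y0 * t)


def primary_coefficients(y0, order, n_c=N_C):
    """Taylor coefficients (-4 N_c y0)^k / k! of the assumed primary form,
    for k = 1, ..., order."""
    x = -4.0 * n_c * y0
    return [x ** k / math.factorial(k) for k in range(1, order + 1)]


def non_global_factor(sigma, y0, t, n_c=N_C):
    """S(y0, t) = Sigma(y0, t) / Sigma^(P)(y0, t) for a given all-orders Sigma."""
    return sigma / primary_sigma(y0, t, n_c)
```

Under this assumption the expansion parameter of the primary part is $4N_c y_0 = 27.6$ for $y_0 = 2.3$ but only 6 for $y_0 = 0.5$, which illustrates why the coefficient growth at small $y_0$ cannot be attributed to the primary emissions.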