The mass area of jets

We introduce a new characteristic of jets called mass area. It is defined so as to measure the susceptibility of the jet's mass to contamination from soft background. The mass area is a close relative of the recently introduced catchment area of jets. We define it also in two variants: passive and active. As a preparatory step, we generalise the results for passive and active areas of two-particle jets to the case where the two constituent particles have arbitrary transverse momenta. As a main part of our study, we use the mass area to analyse a range of modern jet algorithms acting on simple one and two-particle systems. We find a whole variety of behaviours of passive and active mass areas depending on the algorithm, relative hardness of particles or their separation. We also study mass areas of jets from Monte Carlo simulations as well as give an example of how the concept of mass area can be used to correct jets for contamination from pileup. Our results show that the information provided by the mass area can be very useful in a range of jet-based analyses.


Introduction
In the present era of the LHC, as at all preceding hadron colliders, jets remain fundamental objects of interest [1,2]. Their importance extends far beyond the domain of the physics of strong interactions, where they are used as proxies for the partons participating in a hard process. They also play a significant role in a whole range of processes involving decays of heavy particles. These include, for example, a top quark decaying into three jets, a W/Z, a Higgs boson or a hypothetical Z′ boson decaying into two jets, as well as a variety of SUSY particles which readily decay into many-jet final states.
A considerable effort is being made to improve our control on jets. On the theoretical side, this comprises, on one hand, improving the precision of calculations involving the canonical set of jet observables like the transverse momentum, mass or thrust. On the other hand new concepts are being developed including additional characteristics, like, for example, the catchment area of jets [3], or new analysis techniques based on subjets [4][5][6][7][8][9][10][11][12][13][14].
Amongst the properties of a jet, its mass turns out to be important in many physical contexts. In the legitimate approximation of massless QCD partons, the jet mass arises from its substructure. One source of this substructure is of course the radiation of gluons and quarks, which leads to the well-known mass distribution of QCD jets, with a significant fraction of jets having large masses. Consider, however, a process involving a hadronic decay of a heavy object of mass $m$. If, in addition, this object has transverse momentum $p_t \gg m$, a situation not unusual at the LHC, the decay products will end up in a single jet. The reconstructed mass of such a jet will be an important signature of its origin. Moreover, such a fat jet can be analysed further with techniques that study the masses of its subjets. The jet-based reconstruction of heavy particles has been the subject of numerous studies devoted to the decay of W [15], WW scattering [4], the decay of top [7,11,16,17] and Higgs [6,9], as well as SUSY searches [5,13,14,18].
The success of the above techniques depends crucially on the ability to determine precisely the mass of jets measured in experiment. At hadron colliders, however, particles that contribute to the jet's substructure may also come from soft radiation unrelated to the genuine hard process of interest. Such radiation appears, for instance, due to independent minimum-bias collisions that happen in the same bunch crossing, a phenomenon known as pileup (PU). But even in the absence of pileup, each hard process from a single hadron-hadron collision is accompanied by a soft underlying event (UE), which can easily modify the jet's transverse momentum by a few GeV [19,20].
A major step towards quantifying the effects of UE/PU and correcting for them was made in [3,21], where the concept of the jet area was introduced. The jet area is a measure of how much the transverse momentum of a jet from a given clustering algorithm is prone to be affected by soft radiation. We briefly review the corresponding results in section 2.
In this paper, we introduce a related characteristic of a jet, which we will call the mass area and which will represent the susceptibility of a jet's mass to a soft background like UE or PU. In line with [3] we will introduce two types of the mass area: passive and active. The former will correspond to pointlike background whereas the latter will be appropriate to measure the susceptibility of the jet mass to the soft radiation which is diffuse and uniform.
We will analyse the passive and active mass areas of jets from four modern clustering algorithms: $k_t$ [22,23], Cambridge/Aachen (C/A) [24][25][26], anti-$k_t$ [27] and SISCone [28]. The first three belong to the class of sequential recombination algorithms. They introduce a distance $d_{ij}$ for each pair of particles and a distance $d_{iB}$ between each particle and the beam. The distances depend on the basic parameter, the jet radius $R$. The algorithms start by computing these distances for all final-state particles. If the smallest distance involves two particles, they are recombined and replaced in the list of particles by the product of this recombination. If the smallest distance is that between a particle and the beam, the particle is called a jet and removed from the list. The procedure is repeated until the list is empty. The SISCone algorithm belongs to a different class, that of the so-called cone algorithms. These look for stable cones of radius $R$ and subsequently apply the Tevatron Run II procedure [29] to split or merge the overlapping cones. All the above algorithms are infrared and collinear safe and are easily accessible via the FastJet package [30,31]. Further details on each of them are given in section 3.1.
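The sequential recombination loop described above can be sketched in a few lines. This is a minimal illustration, not the FastJet implementation: it assumes the common parametrisation $d_{ij} = \min(p_{ti}^{2p}, p_{tj}^{2p})\,\Delta_{ij}^2/R^2$, $d_{iB} = p_{ti}^{2p}$ with $p = 1, 0, -1$ for $k_t$, C/A and anti-$k_t$ respectively, uses E-scheme recombination, and runs in naive cubic time. The function names (`massless`, `cluster`, and so on) are hypothetical helpers introduced only for this example.

```python
import math

def massless(pt, y, phi):
    """4-momentum (E, px, py, pz) of a massless particle."""
    return (pt * math.cosh(y), pt * math.cos(phi),
            pt * math.sin(phi), pt * math.sinh(y))

def pt(p):
    """Transverse momentum of a 4-vector."""
    return math.hypot(p[1], p[2])

def rap(p):
    """Rapidity of a 4-vector."""
    return 0.5 * math.log((p[0] + p[3]) / (p[0] - p[3]))

def delta2(a, b):
    """Squared distance in the (y, phi) plane, with phi wrapped."""
    dphi = abs(math.atan2(a[2], a[1]) - math.atan2(b[2], b[1]))
    if dphi > math.pi:
        dphi = 2 * math.pi - dphi
    return (rap(a) - rap(b)) ** 2 + dphi ** 2

def cluster(particles, R, power):
    """Sequential recombination; power = 1 (kt), 0 (C/A), -1 (anti-kt)."""
    objs = [tuple(p) for p in particles]
    jets = []
    while objs:
        best_i, best_j, best_d = None, None, math.inf
        for i, a in enumerate(objs):
            d_iB = pt(a) ** (2 * power)          # particle-beam distance
            if d_iB < best_d:
                best_i, best_j, best_d = i, None, d_iB
            for j in range(i + 1, len(objs)):
                b = objs[j]
                d_ij = (min(pt(a) ** (2 * power), pt(b) ** (2 * power))
                        * delta2(a, b) / R ** 2)  # pairwise distance
                if d_ij < best_d:
                    best_i, best_j, best_d = i, j, d_ij
        if best_j is None:
            jets.append(objs.pop(best_i))         # promote to a final jet
        else:
            merged = tuple(u + v for u, v in zip(objs[best_i], objs[best_j]))
            objs.pop(best_j)                      # best_j > best_i, pop it first
            objs.pop(best_i)
            objs.append(merged)                   # E-scheme: add 4-momenta
    return sorted(jets, key=pt, reverse=True)
```

For instance, two particles separated by 0.3 in rapidity are merged into one jet for $R = 0.6$ by all three choices of `power`, whereas a separation of 1.0 yields two jets.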
In [3] the jet areas were calculated for 1- and 2-particle systems. In the latter case the results were obtained in the limit of strong ordering of the transverse momenta of the two particles. In this paper we relax the assumption of strong ordering and start by presenting, in section 3, the corresponding general results for passive and active areas of 2-particle jets. Subsections 3.1 and 3.2 are quite technical. Though they contain very useful material, the reader interested in the main part of our study may skip them on a first reading.
In section 4 we introduce the concept of the mass area of a jet and define its passive (subsection 4.2) and active (subsection 4.3) variants. There, we also analyse their properties for 1- and 2-particle systems. In particular, we compare results from the four algorithms and examine the dependence on the relative hardness of the constituents of 2-particle jets. At the end of each subsection we discuss the logarithmic dependence of the mass area of QCD jets on the jet's transverse momentum. We give it a quantitative description in terms of the anomalous dimension and compare the results across the jet algorithms. Throughout the paper we work in the small-$R$ approximation, which is justified by the observation that the corrections from higher powers of $R$ are accompanied by small coefficients [19,32].
In section 5, we turn to a study of jets simulated with Pythia. We illustrate how the features found for simple 1- and 2-particle systems help in understanding the mass areas of more realistic jets (subsection 5.1). Then, we give an example of a practical application of mass areas to correct the jet mass for contamination from pileup (subsection 5.2). Finally, we summarize our results in section 6 and provide some extra details in two appendices, A and B.

2 Essential definitions, notation and brief review of jet areas

2.1 Passive area
Consider a set of particles $\{p_i\}$ which are clustered with an infrared-safe jet algorithm into a set of jets $\{J_i\}$. Suppose now that we add to the set $\{p_i\}$ a single infinitely soft particle $g$, which hereafter we shall call the ghost, and repeat the clustering on the new set of particles $\{p_i, g\}$. Because the algorithm is infrared safe and because the extra particle $g$ has infinitely small transverse momentum, this clustering will not change the set of jets $\{J_i\}$. The ghost particle $g$ can either be clustered with one of the real particles, in which case it ends up in one of the jets $J_i$, or it can form a new jet with $g$ as its only constituent.
The passive scalar area of the jet $J$ is defined [3] as the area of the region in the $(y, \phi)$ plane in which the ghost particle $g$ is clustered with $J$,
$$a(J) \equiv \int dy\, d\phi\, f(g(y, \phi), J), \qquad f(g, J) = \begin{cases} 1 & \text{for } g \text{ clustered with } J, \\ 0 & \text{for } g \text{ not clustered with } J. \end{cases} \tag{2.1}$$
This definition provides a measure of the susceptibility of the jet to soft radiation in the limit in which this radiation is pointlike. For a set of particles that consists only of a single particle $p_1$, the passive area of the corresponding jet $J_1$ is $a(J_1) = \pi R^2$ for all four jet clustering algorithms: $k_t$, C/A, anti-$k_t$ and SISCone.
Adding a second particle $p_2$ leads to a result which depends on the jet definition (i.e. jet algorithm and jet radius) and on the geometrical distance between the particles $p_1$ and $p_2$ in the $(y, \phi)$ plane, $\Delta_{12}^2 = (y_1 - y_2)^2 + (\phi_1 - \phi_2)^2$. The analytic results for $a(\Delta_{12})$ of the harder jet in the limit $p_{t1} \gg p_{t2} \gg \Lambda_{\rm QCD} \gg p_{tg}$ for all four algorithms were obtained in [3,27]. In Fig. 1 (left) we show the corresponding functions, normalised to the 1-particle passive area. We notice a substantial dependence on the algorithm, especially in the region $\Delta_{12} < R$ where the two particles form a single jet. There, the areas from the $k_t$ and C/A algorithms are notably different from $\pi R^2$ and vary significantly with the distance between the particles. In contrast, the areas from SISCone and anti-$k_t$ are identical to the 1-particle area for $\Delta_{12} < R$, and in the latter case also for $\Delta_{12} > R$. All results recover the correct limit of $\pi R^2$ when $\Delta_{12}$ goes either to 0 or to $2R$.

Active area
Suppose that we add to the set of particles $\{p_i\}$ not a single ghost, as in the case of the passive area, but a dense coverage of ghost particles randomly distributed in the $(y, \phi)$ plane. Again, the original jets $\{J_i\}$ are not modified, but they can contain many ghosts which are clustered together with real particles. In addition, ghosts may now also cluster among themselves, leading to the formation of jets with no physical particle: the pure ghost jets. The active scalar area of a jet $J$ is defined [3] as the number of ghosts contained in this jet divided by the density of ghosts per unit area, averaged over many sets of ghosts. If the number of ghosts from a particular ghost ensemble $\{g_i\}$ clustered with the jet $J$ is $N_{\{g_i\}}(J)$ and the number of ghosts from this ensemble per unit area is $\nu_{\{g_i\}}$, then the active scalar area is given by
$$A(J) \equiv \lim_{\nu_{\{g_i\}} \to \infty} \left\langle \frac{N_{\{g_i\}}(J)}{\nu_{\{g_i\}}} \right\rangle_g ,$$
where, in addition to the limit of infinite ghost density, the average over many sets of ghosts is taken. The latter is necessary since the ratio $N_{\{g_i\}}(J)/\nu_{\{g_i\}}$ depends on the particular set of ghosts even in the limit of high $\nu_{\{g_i\}}$. Therefore, one also defines the standard deviation of the distribution of the active area over many ghost ensembles,
$$\Sigma^2(J) \equiv \lim_{\nu_{\{g_i\}} \to \infty} \left\langle \left( \frac{N_{\{g_i\}}(J)}{\nu_{\{g_i\}}} \right)^2 \right\rangle_g - A^2(J) .$$
The active area is meant to measure the susceptibility of a jet to soft radiation which is uniform and whose density is high. Similarly to the scalar area, the 4-vector active area may be defined as
$$A_\mu(J) \equiv \lim_{\nu_{\{g_i\}} \to \infty} \left\langle \frac{1}{\nu_{\{g_i\}} \langle p_{tg} \rangle} \sum_{g_i \in J} p_{\mu g_i} \right\rangle_g ,$$
where $\langle p_{tg} \rangle$ is the average ghost transverse momentum. The 4-vector area will prove useful in section 4.3, where we shall discuss the active mass area. For small jets, the scalar area and the transverse component of the 4-vector area are virtually equal, $A(J) \simeq A_t(J)$, and $A_\mu$ is a massless vector which points in the direction of the jet. For larger jets, the 4-vector area becomes massive and its direction differs from that of the jet.

Figure 1: Passive (left) and active (right) area of the hardest jet in a 2-particle event with $p_{t2} \ll p_{t1}$ and interparticle separation $\Delta_{12}$. All curves for passive areas, as well as the anti-$k_t$ and SISCone curves for active area, represent the analytic formulae obtained in [3,27]. The active area results for the $k_t$ and C/A algorithms were computed using the FastJet 2.4.2 package [30,31].

Table 1: Results from [3,27] for active areas, $A/(\pi R^2)$, and their fluctuations, $\Sigma/(\pi R^2)$, in the case of 1-particle and pure ghost jets. The numbers for the $k_t$ and C/A algorithms were obtained from a numerical study with FastJet [30,31], whereas those for anti-$k_t$ and SISCone represent exact values from analytic calculations. All results are normalised to $\pi R^2$. The results for pure ghost jet areas are not shown for SISCone and anti-$k_t$. In the first case they depend strongly on the split-merge parameter, $f$, while in the second case the distribution has two peaks at 0 and $\pi R^2$.
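The bookkeeping behind the active area (ghost count divided by ghost density, averaged over ensembles, together with its standard deviation) can be illustrated with a toy example. The sketch below performs no real clustering: as a loudly stated assumption, the "jet" is simply a disc of radius $R$, which mimics only the idealised 1-particle anti-$k_t$ case. Ghosts are thrown uniformly in a square, and the residual fluctuation comes purely from the finite ghost density. All function names and default parameters are hypothetical.

```python
import math
import random

def active_area_stats(ratios):
    """Mean and standard deviation of N/nu over ghost ensembles."""
    mean = sum(ratios) / len(ratios)
    var = sum(r * r for r in ratios) / len(ratios) - mean ** 2
    return mean, math.sqrt(max(var, 0.0))

def toy_disc_ratios(R=0.6, n_ghosts=20000, n_sets=20, half_width=1.0, seed=1):
    """N/nu for several ghost ensembles; the 'jet' is a disc of radius R.

    Assumption: disc membership stands in for the jet algorithm.
    """
    rng = random.Random(seed)
    nu = n_ghosts / (2.0 * half_width) ** 2       # ghosts per unit area
    ratios = []
    for _ in range(n_sets):
        n_in = 0
        for _ in range(n_ghosts):
            y = rng.uniform(-half_width, half_width)
            phi = rng.uniform(-half_width, half_width)
            if y * y + phi * phi < R * R:
                n_in += 1
        ratios.append(n_in / nu)
    return ratios

A, Sigma = active_area_stats(toy_disc_ratios())
print(A / (math.pi * 0.6 ** 2), Sigma / A)
```

In a realistic study the membership test would be replaced by an actual clustering of particles plus ghosts, for example with FastJet, and the spread would then also reflect the algorithm-dependent fluctuations discussed in the text.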
The active area can be studied numerically for any infrared safe jet clustering algorithm, most easily using the FastJet package [30,31]. In addition, the analytic results can be obtained in some cases for the anti-k t algorithm and for SISCone.
Unlike the passive area, the active area of the 1-particle jet may differ significantly from the naive expectation of $\pi R^2$: firstly, in that it is in general a rather broad distribution over many ghost ensembles, and secondly, in that the average value of this distribution may lie below $\pi R^2$. This is illustrated in table 1, which summarises the results for the average active scalar areas of 1-particle and pure ghost jets and the corresponding standard deviations from the four clustering algorithms obtained in [3,27]. We see that the average values for the $k_t$ and C/A algorithms are significantly lower than $\pi R^2$, with pure ghost jets having smaller area than jets with one hard particle. Moreover, the values of the standard deviations indicate that the distribution of active jet areas is rather broad. The anti-$k_t$ algorithm is special in that its 1-particle-jet active area is equal to the passive area $\pi R^2$ and does not fluctuate [27]. For the SISCone algorithm, the active area of a single-particle jet can be calculated exactly [3], and it turns out to be four times smaller than the passive area. The active areas of ghost jets for SISCone and anti-$k_t$ exhibit somewhat more complex behaviour. For the former, the results depend on the split-merge parameter, $f$, and for the latter the distribution has two peaks at 0 and $\pi R^2$. That is why we do not show them in table 1.
As in the case of the passive area, discussed in the preceding subsection, adding a second particle to the system has a significant effect on the active area of the hardest jet. This is illustrated in Fig. 1 (right) for the case of $p_{t2} \ll p_{t1}$, which was considered in [3]. As we see, the behaviour depends significantly on the algorithm. The active areas from $k_t$ and C/A have a shape similar to that of the passive areas, differing from the latter mostly by a normalisation that is about 20% lower. The anti-$k_t$ algorithm, as expected, gives the same result for the passive and active area. The most drastic change is seen for the SISCone algorithm, for which the active area is almost a factor of four smaller than the passive one (cf. table 1).

3 Areas for general case of 2-particle system
In the current study we are interested in the mass of a jet and the way it is affected by soft background. The main contribution to the jet mass comes from its substructure. This substructure may originate, e.g., from QCD splittings. In this case, the results for the areas of jets consisting of two strongly ordered particles, obtained in [3] and reviewed briefly in the previous section, are adequate. However, if the two constituents of our simple jet come from a decay, then their transverse momenta are comparable and one expects such a jet to have different properties.
In order to be able to discuss the problems related to jet masses for the whole spectrum of cases between those two extremes of p t2 ≪ p t1 (QCD jets) and p t2 ∼ p t1 (jets from decay), as a preparatory step, we will generalise the results for jet areas from [3] to the case of two particles with arbitrary transverse momenta.
It will be convenient to quantify the relative hardness of the two particles, $p_1$ and $p_2$, in terms of the variable
$$z = \frac{\min(p_{t1}, p_{t2})}{p_{t1} + p_{t2}} , \tag{3.1}$$
which, by definition, is always in the range $0 \le z \le 1/2$. The main difference with respect to the case discussed in the previous section is that now, when the particles $p_1$ and $p_2$ are combined, the jet $J_{12}$ may be centred anywhere between the positions of these two particles. Before, such a jet was always centred at the harder particle.
The exact values of (y J 12 , φ J 12 ) will depend on the recombination scheme used as part of a jet definition. Out of several existing schemes, we adopt for the study presented in this paper the widespread E-scheme which combines particles by simply adding their 4-momenta. Apart from being very intuitive and preserving Lorentz symmetry, it has been also recommended in [29].
For the 2-particle system, the centre of the jet will lie on the line segment bounded by the positions of the particles. Therefore, the results for mass areas from the four algorithms that we are going to study can depend only on the distance along this line, from one (any) of the particles to the centre of the jet. We will denote the distance from the softer particle by ∆ J . It will depend on ∆ 12 and on the asymmetry parameter z.
For convenience we also introduce the versions of $\Delta_{12}$ and $\Delta_J$ normalised to the jet radius,
$$x = \frac{\Delta_{12}}{R}, \qquad x_J = \frac{\Delta_J}{R} .$$
Employing the above definitions, we may write the explicit formula for $x_J$ in the E-scheme, valid in the small-$R$ limit,
$$x_J = (1 - z)\, x .$$
One notices that, since $z \le 1/2$ according to the definition (3.1), the softer particle is always further away from the centre of the jet than the harder one, the distance between the latter and the jet's centre being $x - x_J$. In the limit of strongly ordered transverse momenta, $x_J \to x$ and the jet is centred at the harder of the two particles.
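As a quick numerical cross-check of the statement that the E-scheme jet centre sits closer to the harder particle, the sketch below adds the 4-momenta of two massless particles exactly and compares the resulting distance from the softer particle with the small-$R$ estimate $x_J \simeq (1 - z)\,x$, i.e. a $p_t$-weighted centre. The helper names and the chosen kinematics are illustrative assumptions.

```python
import math

def massless(pt, y, phi):
    """4-momentum (E, px, py, pz) of a massless particle."""
    return (pt * math.cosh(y), pt * math.cos(phi),
            pt * math.sin(phi), pt * math.sinh(y))

def jet_axis(p1, p2):
    """(y, phi) of the E-scheme sum of two 4-momenta."""
    E, px, py, pz = (a + b for a, b in zip(p1, p2))
    return 0.5 * math.log((E + pz) / (E - pz)), math.atan2(py, px)

def x_J_small_R(x, z):
    """Small-R estimate of the distance from the softer particle, in units of R."""
    return (1.0 - z) * x

R = 0.6
pt_soft, pt_hard, delta12 = 20.0, 80.0, 0.3
z = pt_soft / (pt_soft + pt_hard)               # asymmetry parameter, 0 <= z <= 1/2
soft = massless(pt_soft, 0.0, 0.0)              # softer particle at the origin
hard = massless(pt_hard, delta12, 0.0)          # harder particle along the rapidity axis
y_J, _ = jet_axis(soft, hard)
print(y_J / R, x_J_small_R(delta12 / R, z))     # exact vs. small-R estimate
```

The two numbers agree up to corrections suppressed by powers of the separation, in line with the small-$R$ approximation used throughout.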

Passive areas
Since the system under consideration is simple and the hardest jet may consist of at most two particles, its passive area can be calculated analytically. The result will depend on the order of clustering of particles p 1 , p 2 and the ghost g. This order is different for each algorithm hence the passive areas will vary across them. As mentioned in the introduction, we work in the small R approximation. One consequence of that is that we treat the directions y and φ in the (y, φ) plane on equal footing.
The $k_t$ algorithm, with its 2-particle distance measure, $d_{ij} = \min(p_{ti}^2, p_{tj}^2)\,\Delta_{ij}^2/R^2$, and beam-particle measure, $d_{iB} = p_{ti}^2$, will always cluster the ghost first, with either one of the particles $p_1$, $p_2$ or the beam. In the latter case, the contribution to the area is zero. In the former case, the ghost clusters with the particle which is geometrically closer according to the distance $\Delta_{ig}$, regardless of the relative hardness of the particles $p_1$ and $p_2$. Therefore, the result is independent of $z$ and coincides with that found in [3] and shown already in Fig. 1 (left) of section 2.1. The corresponding formula can be found in appendix A.

The Cambridge/Aachen algorithm does not take into account the hardness of the particles undergoing the clustering but solely the geometric distance $\Delta_{ij}$ between them, according to the measures $d_{ij} = \Delta_{ij}^2/R^2$ and $d_{iB} = 1$. The clustering of the system of two particles $p_1$ and $p_2$ and the ghost proceeds as follows. If the ghost is closer than $\Delta_{12}$ to either of the perturbative particles, then it is clustered first with the closer one. Subsequently, the particles $p_1$ and $p_2$ are clustered. If, however, the distance between the ghost and the closer particle is greater than $\Delta_{12}$, then the two perturbative particles are clustered first, forming the jet $J_{12}$ centred at the point on the line segment between the positions of the particles $p_1$ and $p_2$ at the distance $\Delta_J$ from the softer particle. Then, the ghost may cluster with $J_{12}$ if its distance to the jet's centre is smaller than $R$. Therefore, the area of a 2-particle jet in the C/A algorithm is the union of two smaller circles of radius $\Delta_{12}$, centred respectively at the particles $p_1$ and $p_2$, and the big circle of radius $R$ centred at the jet $J_{12}$.
The range $0 < x < 2$ consists of four distinct sub-ranges. The two critical values of $x$, which we denote as $x_{c1}$ and $x_{c2}$, correspond to the situations where one or two of the small circles start sticking out of the big circle, as depicted in Fig. 2 (a) and (b). For $x$ below $x_{c1}$ or above 1 the results do not depend on the asymmetry parameter, $z$, and are identical to those found in [3].
Geometrically, the circle around the softer particle starts to protrude when $\Delta_{12} + \Delta_J = R$, and the circle around the harder particle when $2\Delta_{12} - \Delta_J = R$. In the limit of small $R$ the approximate solutions have the following simple forms,
$$x_{c1} \simeq \frac{1}{2 - z}, \qquad x_{c2} \simeq \frac{1}{1 + z} .$$
The analytic result for the passive area from the C/A algorithm is given in appendix A. The corresponding curves are shown in Fig. 3 (left) for $R = 0.6$ and several values of the asymmetry parameter $z$. We note that the dependence on $z$ is mild. In the limit $z \to 0$, $x_{c1} \to 1/2$ and $x_{c2} \to 1$, and one recovers the result for the system of two particles with strongly ordered transverse momenta from [3].
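The union-of-circles picture for the C/A passive area at $x < 1$ lends itself to a simple numerical check. The sketch below estimates the area on a grid, assuming the small-$R$ E-scheme jet centre at a distance $(1 - z)\,\Delta_{12}$ from the softer particle, and compares it with $\pi R^2$ below and above the thresholds $1/(2 - z)$ and $1/(1 + z)$ implied by that geometry. It is an illustration under these stated assumptions rather than the analytic formula of appendix A; function names and grid size are hypothetical.

```python
import math

def ca_thresholds(z):
    """Values of x at which the small circles start to protrude (small-R geometry)."""
    return 1.0 / (2.0 - z), 1.0 / (1.0 + z)

def ca_passive_area(x, z, R=1.0, n=400):
    """Grid estimate of the union of the three circles for 0 < x < 1."""
    if not 0.0 < x < 1.0:
        raise ValueError("the union-of-circles picture is used here only for 0 < x < 1")
    d12 = x * R                     # separation of the two particles
    dJ = (1.0 - z) * d12            # jet centre, measured from the softer particle
    h = 2.0 * R / n                 # grid spacing
    count = 0
    for i in range(3 * n // 2):     # u in [-R, 2R] covers all three circles
        u = -R + (i + 0.5) * h
        for j in range(n):          # v in [-R, R]
            v = -R + (j + 0.5) * h
            in_soft = u * u + v * v < d12 * d12
            in_hard = (u - d12) ** 2 + v * v < d12 * d12
            in_jet = (u - dJ) ** 2 + v * v < R * R
            if in_soft or in_hard or in_jet:
                count += 1
    return count * h * h

xc1, xc2 = ca_thresholds(0.2)
print(xc1, xc2)
print(ca_passive_area(0.3, 0.2) / math.pi)   # below x_c1: close to 1
print(ca_passive_area(0.9, 0.2) / math.pi)   # above x_c2: larger than 1
```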
The SISCone algorithm looks for stable cones of radius $R$, which are the cones whose direction coincides with the E-scheme sum of the momenta of the particles inside. Those cones which overlap are subsequently split or merged according to the Tevatron Run II type procedure [29]. This procedure starts by ordering the stable cones according to the scalar sum of the transverse momenta of their constituents, $\tilde p_t$. Then, the $\tilde p_t$ shared between the hardest jet and the next-to-hardest jet that overlaps with it (the latter having $\tilde p_{tj}$) is compared with $f \tilde p_{tj}$, where $f$ is the overlap threshold parameter. The cones are merged if the shared $\tilde p_t > f \tilde p_{tj}$ and split otherwise.
For x < 1 only one stable cone is found, with its centre between the particles p 1 and p 2 . Any ghost within this cone belongs to the jet. Therefore the area is identical to that of a single particle jet.
For $1 < x < 2$ two stable cones are always found, centred at the particles $p_1$ and $p_2$ respectively. In addition, for $1 < x < x_{c4}$ a third stable cone is found containing both particles. The third cone is stable as long as the distance between the jet's centre and the softer particle is smaller than $R$. This gives the condition $\Delta_J = R$ for $x_{c4}$, which in the limit of small $R$ leads to
$$x_{c4} \simeq \frac{1}{1 - z} ,$$
and since, according to the definition (3.1), $0 \le z \le 1/2$, the above critical value stays in the range $1 \le x_{c4} \le 2$. As a next step, one has to check whether the overlapping cones can be merged. As shown in Fig. 2 (c), all three cones overlap in the region $1 < x < x_{c4}$. The central cone has the largest $\tilde p_t$, and the amount of $\tilde p_t$ shared with the left jet is $p_{t1}$, since the two jets have only one common particle. The condition for merging the left and the central cone is $p_{t1} > f p_{t1}$, which is always satisfied for $f < 1$. Similarly, the right cone will always be merged with the middle cone. Therefore, in the region $1 < x < x_{c4}$ the jet area is given by the area of the union of the three circles depicted in Fig. 2 (c). In the region $x_{c4} < x < 2$, only two stable cones are found, with no common particle, so they are never merged. The two particles end up in different jets. The area of the harder one is the same as for the $k_t$ and C/A algorithms in this range of $x$.
The final formula for the passive area in the SISCone algorithm is given in appendix A. The corresponding curves are shown in Fig. 3 (right) for $R = 0.6$ and four values of $z$. Contrary to the $k_t$ and C/A algorithms, here the dependence on the asymmetry parameter, $z$, is very strong for $x > 1$. We note that the average area in this region is larger by a factor of around two for jets consisting of two subjets with comparable $p_t$ than for jets whose constituents are strongly ordered in transverse momentum. As expected, in the limit $z \to 0$ one recovers the result from Fig. 1 (left), since $x_{c4} \to 1$ and the third stable cone cannot exist for any value of $x$.
The anti-$k_t$ algorithm is a sequential recombination algorithm with the hierarchy inverted with respect to the $k_t$ algorithm by using the measures $d_{ij} = \min(p_{ti}^{-2}, p_{tj}^{-2})\,\Delta_{ij}^2/R^2$ and $d_{iB} = p_{ti}^{-2}$. The hardest particle in a system will cluster first with anything within the geometric distance $\Delta < R$. In an event with two particles of arbitrary transverse momenta and a ghost, the three competing distances are $\Delta_{1g}$, $\Delta_{2g}$ and $\Delta_{12}$.
For $0 < x < 1$, the events in which the distance between the ghost and one of the physical particles is the smallest lead to the formation of two small circles around the particles $p_1$ and $p_2$, as depicted in Figs. 2 (d) and (e). If, however, $\Delta_{12} < \Delta_{1g}, \Delta_{2g}$, then the real particles are clustered first, leading to the jet $J_{12}$, which subsequently clusters with the ghost provided that the distance between the two is smaller than $R$. Up to a certain value of $x$, which we denote as $x_{c2}$, the two small circles are contained in the big circle of radius $R$ and the area is simply that of a 1-particle jet, i.e. $\pi R^2$. Above $x_{c2}$ the circle centred at the harder of the two particles protrudes and one gets the configuration shown in Fig. 2 (d). Then, above $x_{c3}$, the second of the two small circles starts sticking out, leading to the jet depicted in Fig. 2 (e). The approximate solutions to the conditions for these critical values of $x$ can be found for $R \ll 1$, Eq. (3.13).
For $1 < x < 2$ the particles $p_1$ and $p_2$ form two separate jets. However, the shape of the jet centred at the harder of the two particles will not be entirely conical. The presence of the second particle will cause it to be clipped. This situation is shown in Fig. 2 (f). The boundary $b$ between the jets $J_1$ and $J_2$ is defined by $z \Delta_{1b} = (1 - z) \Delta_{2b}$. Hence, the area of the jet $J_1$ is reduced with respect to $\pi R^2$ by the area of the overlap between the circle of radius $R$ around that jet and the circle of radius $\frac{z(1 - z)}{1 - 2z} \Delta_{12}$ whose centre lies at a distance $\frac{(1 - z)^2}{1 - 2z} \Delta_{12}$ from the centre of the jet $J_1$. Above $x = 1/(1 - z)$ the two circles do not overlap and the area of the harder jet becomes perfectly conical.

Figure 2 (caption, partially recovered): $x$ is the distance between the two particles in units of $R$.

In Fig. 4, we show the curves corresponding to the analytic results for the passive area from the anti-$k_t$ algorithm, which can be found in appendix A. One notices that, in general, the anti-$k_t$ jets are not perfectly conical. If a jet consists of two particles of comparable hardness separated by $\Delta_{12} \sim R$, its area deviates from $\pi R^2$, the more so the closer the transverse momenta of the two constituents are to each other. On the other hand, if the separation $x$ between the two particles is smaller than $1/(1 + z)$ or greater than $1/(1 - z)$, or if their transverse momenta are strongly ordered, the resulting area of the harder jet is equal to that of a single-particle jet. For the maximally symmetric system, corresponding to $z = 0.5$, the anti-$k_t$ result for the passive area coincides with that from the C/A algorithm (cf. the formulae in appendix A). However, the two algorithms behave very differently for $z < 0.5$.
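The clipping of the harder anti-$k_t$ jet for $1 < x < 2$ can be checked numerically. The sketch below takes the boundary condition $z \Delta_{1b} = (1 - z) \Delta_{2b}$ from the text, builds the corresponding circle (radius $z(1-z)\Delta_{12}/(1-2z)$, centre at $(1-z)^2 \Delta_{12}/(1-2z)$ from the harder particle), verifies that points on it satisfy the boundary condition, and subtracts the standard circle-circle lens area from $\pi R^2$. It assumes $0 < z < 1/2$ and is an illustration, not the formula of appendix A; all function names are hypothetical.

```python
import math

def boundary_circle(z, d12):
    """Centre offset (from the harder particle) and radius of the boundary circle."""
    centre = (1.0 - z) ** 2 / (1.0 - 2.0 * z) * d12
    radius = z * (1.0 - z) / (1.0 - 2.0 * z) * d12
    return centre, radius

def lens_area(R, r, c):
    """Overlap area of two discs of radii R and r whose centres are c apart."""
    if c >= R + r:
        return 0.0
    if c <= abs(R - r):
        return math.pi * min(R, r) ** 2
    a1 = r * r * math.acos((c * c + r * r - R * R) / (2.0 * c * r))
    a2 = R * R * math.acos((c * c + R * R - r * r) / (2.0 * c * R))
    a3 = 0.5 * math.sqrt((-c + r + R) * (c + r - R) * (c - r + R) * (c + r + R))
    return a1 + a2 - a3

def antikt_harder_area(x, z, R=1.0):
    """Passive area of the harder jet for 1 < x < 2 and 0 < z < 1/2."""
    centre, radius = boundary_circle(z, x * R)
    return math.pi * R * R - lens_area(R, radius, centre)

def on_boundary(z, d12, t):
    """Check z*Delta_1b == (1-z)*Delta_2b for the point at angle t on the circle."""
    c, r = boundary_circle(z, d12)
    bx, by = c + r * math.cos(t), r * math.sin(t)
    d1 = math.hypot(bx, by)            # distance to the harder particle (origin)
    d2 = math.hypot(bx - d12, by)      # distance to the softer particle
    return abs(z * d1 - (1.0 - z) * d2) < 1e-9

print(antikt_harder_area(1.2, 0.3) / math.pi)   # clipped: below 1
print(antikt_harder_area(1.5, 0.3) / math.pi)   # x > 1/(1-z): equal to 1
```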

Active areas
Since the computation of the active area involves clustering of a very complex system with a large number of ghost particles, in general, one needs to rely on the numerical analysis. As was the case for the passive areas also the active area results are expected to vary between the algorithms. This is because each algorithm comes with a different order of clustering of real particles and ghosts and it is this order that governs the behaviour of jet areas.
The numerical analysis may produce slightly different results for jets oriented along the $y$ or the $\phi$ axis. This is because it operates on the real phase space, for which these directions are not equivalent. One expects, however, that for sufficiently small values of $R$ the corresponding differences should be largely subdominant. In practice, all the results shown in this and the following sections correspond to jets with the two constituent particles aligned along the rapidity axis. We have checked explicitly that the opposite extreme case of the particles oriented along the $\phi$ axis gives virtually the same results for the jet areas. The situation for the mass areas discussed in section 4 is similar, except for certain configurations studied with the SISCone algorithm, which we will comment on in due course.

Figure 5: Active areas of the hardest jet for the system of two particles separated by the distance $xR$ in the $(y, \phi)$ plane and having arbitrary transverse momenta. The plots correspond to the $k_t$ (top left), C/A (top right), SISCone (bottom left) and anti-$k_t$ (bottom right) algorithms. The asymmetry parameter, $z$, is defined in Eq. (3.1). The anti-$k_t$ panel shows the active area versus $x$ for $R = 0.6$ and $z = 0.5, 0.3, 0.2, 0.001$. All the results were obtained with FastJet [30,31].
We have studied the active areas of the hardest jet in the system of two particles of arbitrary relative hardness. We performed analyses with the same four jet definitions as discussed in the previous subsection, i.e. k t , C/A, SISCone and anti-k t together with the E-scheme for particle recombination. The results are presented in Fig. 5. The overall picture is very similar to that from the study of passive areas of preceding subsection.
The "non-conical algorithms", $k_t$ and C/A, exhibit either no dependence on $z$, in the case of $k_t$, or only a weak $z$-dependence, in the case of C/A. The active areas from both algorithms behave very similarly to their passive area counterparts. For $x < 1$, where the two particles form a single jet, the $k_t$ active area grows steadily, whereas the C/A active area stays practically constant for low $x$ and starts growing rapidly above a certain value of $x$. The pattern of mild $z$-dependence in the latter region is also the same as that seen for the passive area, i.e. the results are slightly smaller for a more symmetric system of two particles. For $x > 1$, the hardest jet consists of a single particle and its active area from $k_t$ and C/A does not depend on the asymmetry parameter $z$. Although similar in shape to the passive area results, the 2-particle active areas are smaller by about 20%. This has already been observed for the 1-particle jets and the 2-particle jets with strong $p_t$-ordering in [3], and we have recalled those results in table 1 and Fig. 1.
The SISCone algorithm gives an active area which depends quite strongly on $z$. We see that, for $z \sim 0.5$, it stays well above the 1-particle result, $\pi R^2/4$, also for $x > 1$, and then it drops at some value, just as was the case with the passive area. Again, this is related to the existence of the third stable cone containing both particles $p_1$ and $p_2$. Below a critical value of $x$, the same as that given in Eq. (3.9), this third stable cone is merged, leading to large jets. However, there is also a difference between the passive and active areas for large $z$ and $x > 1$: the active area falls with $x$ for $1 < x < x_{c4}$, whereas the passive one keeps growing in this region. The mechanism responsible for this effect is the same as that which leads to the reduction of the 1-particle active SISCone area by the factor 1/4 with respect to the passive area of a 1-particle jet, as explained in [3]. It is related to additional splittings of stable cones containing physical particles which overlap with stable cones built solely of ghosts. Such splittings involving the central stable cone from Fig. 2 (c) narrow the jet with increasing $x$. This may lead to the active area of a jet containing two particles being smaller than the active area of a 1-particle jet. As shown in Fig. 5 (bottom left), this situation indeed occurs for systems with $z$ close to its maximal value 1/2 (identical transverse momenta of the particles $p_1$ and $p_2$). In this case, the critical value $x_{c4}$ is reached only for very high $x$ (or never in the case of $z = 1/2$) and the 2-particle active area can smoothly decrease below the 1-particle result.
The anti-$k_t$ active area results shown in Fig. 5 (bottom right) are identical to the passive areas from Fig. 4. This comes from the fact that the ghost particles cluster among themselves only after all clusterings involving perturbative particles. The equivalence of the passive and active areas from anti-$k_t$ for the 2-particle jets with strongly ordered transverse momenta of the two constituents, corresponding to $z = 0$, has been pointed out in [27]. Their equivalence for arbitrary $z$, illustrated in Figs. 4 and 5 (bottom right), is also known and has been taken into account in the FastJet program (see the code accessible in [31]). As in the case of SISCone, the anti-$k_t$ algorithm also has a region of strong $z$-dependence.
For all the algorithms and all the $z$ values shown in Fig. 5, the 2-particle active areas tend to the 1-particle results in the limit $x \to 0$. However, in the limit $x \to 2$ the results converge to the 1-particle area only for the "conical algorithms", i.e. SISCone and anti-$k_t$. For $k_t$ and C/A the 2-particle jet areas differ from the values given in table 1 even if the separation $x > 2$. This is related to the fact that these algorithms build up the jets starting from the formation of local structures, which are subsequently merged, leading to jets of very irregular areas.

Mass area

Jet mass
The mass of a light quark jet arises due to its substructure. If a jet $J_{12}$ is obtained from clustering two subjets $J_1$ and $J_2$ with masses much smaller than their transverse momenta, $m_{J_{1,2}} \ll p_{tJ_{1,2}}$, then the mass of the jet $J_{12}$ in the small-$R$ limit is given by

$$m_{J_{12}}^2 \simeq p_{tJ_1}\, p_{tJ_2}\, \Delta_{12}^2 , \qquad (4.1)$$

where $\Delta_{12}$ is the distance between the two subjets in the $(y, \phi)$ plane. Jet mass is an infrared and collinear safe quantity that can be calculated order by order in perturbation theory. Because of the soft and collinear singularity of the QCD matrix element for gluon emission, the distribution of masses of QCD jets is strongly enhanced at low values of $m_J$. At the lowest non-trivial order (i.e. NLO of the perturbative $\alpha_s$ expansion) the approximate result for the mass distribution of QCD jets is given by Eq. (4.2) [1,2,17], where $C$ is the colour factor of the initiating parton. The higher order terms are enhanced by further powers of $\ln(R\, p_{tJ}/m_J)$. The resummed corrections are known for jets from $e^+e^-$ [33][34][35] and DIS [36]. In contrast to the QCD case, the distribution of jets coming from the decay of a heavy object is flat in $z$, and therefore the mass distribution of such jets is peaked around the mass of the heavy object from which they originate.
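The small-angle expression for the mass of two nearly massless subjets, $m^2 \simeq p_{t1} p_{t2} \Delta_{12}^2$, can be compared numerically with the exact invariant mass of two massless particles, $m^2 = 2 p_{t1} p_{t2} (\cosh\Delta y - \cos\Delta\phi)$. The following is a minimal sketch for illustration only; the function names and the sample kinematics are hypothetical and are not taken from the analysis described here.

```python
import math

def exact_mass2(pt1, pt2, dy, dphi):
    """Exact squared invariant mass of two massless particles."""
    return 2.0 * pt1 * pt2 * (math.cosh(dy) - math.cos(dphi))

def small_angle_mass2(pt1, pt2, dy, dphi):
    """Small-R approximation: m^2 ~ pt1 * pt2 * Delta^2."""
    return pt1 * pt2 * (dy * dy + dphi * dphi)

if __name__ == "__main__":
    pt1, pt2 = 80.0, 20.0  # GeV, hypothetical values
    for dy, dphi in ((0.1, 0.0), (0.4, 0.0), (0.0, 0.4), (1.0, 0.0)):
        exact = exact_mass2(pt1, pt2, dy, dphi)
        approx = small_angle_mass2(pt1, pt2, dy, dphi)
        print(f"dy={dy:.1f} dphi={dphi:.1f} exact/approx={exact / approx:.4f}")
```

For separations well below unity the two expressions agree at the per cent level, which is the regime in which the small-$R$ formula is meant to be used.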
As discussed in the preceding sections, the area of a jet provides a measure of the susceptibility of the jet's momentum to soft background. Such a measure, combined with a method of determination of the level of this background, like the one discussed in [20,21], allows one to account for the contamination from UE/PU and correct the momentum of the jet accordingly.
Similarly, one can define a quantity which measures how much the mass of a jet can be modified by soft radiation for jets defined with a given algorithm. In what follows, we define such a new characteristic of jets, which we call the mass area, and use it to study 1- and 2-particle jets from the four jet-clustering algorithms.

Passive mass area
In analogy to the passive jet area from section 2.1, the passive mass area $a_m(J)$ of the jet $J$ can be defined as in Eq. (4.3), where $m_J$ is the mass of the jet $J$, $m_{Jg}$ is the mass of a jet that consists of the jet $J$ and the ghost $g$, and $p_{tg}$ is the transverse momentum of that ghost. The passive mass area defined in this way is dimensionless. Its value reflects the susceptibility of the mass of a jet to contamination from soft radiation in the limit in which this radiation is infinitely soft and pointlike.
For a jet consisting only of a single hard particle with transverse momentum $p_{t1}$, the passive mass area for all four algorithms is given by

$$a_m(\text{1-particle-jet}) = \frac{\pi R^4}{2} . \qquad (4.4)$$

The above result coincides with the polar moment of inertia of a disk (or cylinder) of radius $R$. This correspondence is general and, in fact, the passive mass area defined in Eq. (4.3) is nothing but the polar moment of inertia, i.e. the measure of resistance of an object to torsion. This resistance is small if the mass is distributed close to the rotation axis (here, the jet centre) and large if the mass extends far away from the rotation axis.
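The correspondence with the polar moment of inertia can be illustrated numerically. For a single massless hard particle and a massless ghost, the small-angle increase of the squared mass per unit $p_{t1} p_{tg}$ is $\Delta_{1g}^2$, the squared distance of the ghost from the particle in the $(y, \phi)$ plane; we take this as an assumption here. Integrating it over a disc of radius $R$ should then give $\pi R^4/2$. The helper name and the grid size below are hypothetical choices.

```python
import math

def disc_mass_area(R, n=400):
    """Midpoint-rule integral of Delta^2 over a disc of radius R in (y, phi)."""
    h = 2.0 * R / n  # cell size of the square grid covering the disc
    total = 0.0
    for i in range(n):
        y = -R + (i + 0.5) * h
        for j in range(n):
            phi = -R + (j + 0.5) * h
            d2 = y * y + phi * phi
            if d2 <= R * R:  # keep only cells whose centre lies inside the disc
                total += d2 * h * h
    return total

if __name__ == "__main__":
    R = 1.0
    print(disc_mass_area(R), math.pi * R**4 / 2.0)
```

The $R^4$ scaling is visible directly: halving $R$ reduces the result by a factor of 16.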

Passive mass areas for general case of 2-particle system
The calculation of the passive mass areas for the system of two particles with arbitrary $z$ proceeds in close analogy with the calculation of passive areas for that system, which led to the results presented in section 3.1. In particular, all the subranges of the separation variable $x$ and the corresponding pictures from Fig. 2 are valid also for passive mass areas. However, the integrand in the definition given in Eq. (4.3) is now less trivial. The total squared mass of the system composed of two massless perturbative particles and the ghost $g$ is given by

$$m_{Jg}^2 \simeq p_{t1}\, p_{t2}\, \Delta_{12}^2 + p_{t1}\, p_{tg}\, \Delta_{1g}^2 + p_{t2}\, p_{tg}\, \Delta_{2g}^2 . \qquad (4.5)$$

If the two particles $p_1$ and $p_2$ come from a soft QCD splitting, the last term is negligible. If, however, they come from the decay of a heavy object, the last two terms are commensurate. Plugging the above expression (4.5) into the definition (4.3) allows one to obtain analytic results for passive mass areas from all four algorithms. We give the corresponding formulae in appendix A. Below, we comment on the results for each of the four algorithms, which are shown in Figs. 6 and 7.
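To make the structure of the integrand concrete, the sketch below evaluates the exact squared mass of two massless particles plus a massless ghost and extracts the ghost-induced increment per unit ghost $p_t$ and per unit jet $p_t$. In the small-angle limit this increment is expected to reduce to $(1 - z)\,\Delta_{1g}^2 + z\,\Delta_{2g}^2$, assuming that the asymmetry parameter is $z = p_{t2}/(p_{t1} + p_{t2})$ (our reading of Eq. (3.1), which is not reproduced in this section). All function names and kinematic values are illustrative.

```python
import math

def four_vector(pt, y, phi):
    """Massless 4-momentum (E, px, py, pz) from pt, rapidity and azimuth."""
    return (pt * math.cosh(y), pt * math.cos(phi),
            pt * math.sin(phi), pt * math.sinh(y))

def mass2(*momenta):
    """Squared invariant mass of the sum of the given 4-momenta."""
    e, px, py, pz = (sum(c) for c in zip(*momenta))
    return e * e - px * px - py * py - pz * pz

def ghost_increment(p1, p2, ghost):
    """(m_Jg^2 - m_J^2) / (pt_g * pt_J); inputs are (pt, y, phi) tuples."""
    v1, v2, vg = four_vector(*p1), four_vector(*p2), four_vector(*ghost)
    pt_jet = p1[0] + p2[0]  # scalar sum, adequate at small angles
    return (mass2(v1, v2, vg) - mass2(v1, v2)) / (ghost[0] * pt_jet)

def small_angle_increment(p1, p2, ghost):
    """(1 - z) * Delta_1g^2 + z * Delta_2g^2 with z = pt2 / (pt1 + pt2)."""
    z = p2[0] / (p1[0] + p2[0])
    d1 = (ghost[1] - p1[1]) ** 2 + (ghost[2] - p1[2]) ** 2
    d2 = (ghost[1] - p2[1]) ** 2 + (ghost[2] - p2[2]) ** 2
    return (1.0 - z) * d1 + z * d2

if __name__ == "__main__":
    p1, p2 = (80.0, 0.0, 0.0), (20.0, 0.3, 0.0)  # hypothetical hard particles
    ghost = (1e-3, 0.1, 0.2)                      # hypothetical ghost
    print(ghost_increment(p1, p2, ghost), small_angle_increment(p1, p2, ghost))
```

The weights $1 - z$ and $z$ multiplying the two distances are the same factors that appear in the discussion of the $z$-independence of the $k_t$ result.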
The $k_t$ algorithm produces jets whose passive mass areas do not depend on the relative hardness of the two constituent particles. This occurs in spite of the fact that the integrand in the definition (4.3), with $m_{Jg}$ taken from Eq. (4.5), does depend on $z$. However, because the $k_t$ algorithm always clusters the ghost first with one of the particles $p_1$ and $p_2$, the shape of the jet in the $(y, \phi)$ plane has an additional reflection symmetry. This, in turn, implies that the integrated contributions from each particle differ only by a multiplicative factor, $1 - z$ for $p_1$ and $z$ for $p_2$. Therefore, the $z$-dependence cancels in the sum. As shown in Fig. 6 (left), the qualitative behaviour of the mass area is the same as that of the area from Fig. 1 (left). As long as the separation between the particles satisfies $x < 1$, the passive mass area grows fast with increasing $x$. Quantitatively, however, the change of the passive mass area of the 2-particle jet with respect to the 1-particle jet is much bigger than the corresponding change for the passive jet area. As we see by comparing the results from Figs. 1 (left) and 6 (left), the mass area changes by a factor $\sim 3.6$ in the range $0 < x < 1$, while the area changes only by a factor $\sim 1.6$. For $x > 1$ the hardest jet consists solely of a single particle. However, the presence of the second jet in the neighbourhood causes its mass area to be slightly smaller than $\pi R^4/2$. The value of the single-particle mass area is slowly approached as we go to $x = 2$.
The Cambridge/Aachen algorithm gives jets with mass areas weakly dependent on the asymmetry parameter $z$, as depicted in Fig. 6 (right). As for the $k_t$ algorithm, the overall shape of the mass area as a function of the distance between the constituent particles $p_1$ and $p_2$ is very similar to the shape found for the passive area (cf. Fig. 3). The quantitative change is, however, again much bigger for the mass area, whose 1-particle value (4.4) can be modified by up to a factor $\sim 3.6$ by the presence of the second particle (compared to the corresponding factor of $\sim 1.6$ for the area). As can be found by inspecting Fig. 6 (right) or the corresponding formulae from appendix A, there is also a small qualitative difference between the passive area and the passive mass area in the behaviour for $0 < x < x_{c1}$. The mass area starts growing with $x$ right from the beginning, contrary to the area, which is constant for $x < x_{c1}$.
The SISCone algorithm returns jets with mass area strongly dependent on the separation $x$ between the two constituent particles. For $x < 1$ the mass area differs from the 1-particle result only mildly, growing slightly with $x$ (unlike the area, which is constant in this region, cf. Fig. 3). For $x > 1$, however, the mass area of 2-particle SISCone jets jumps by a factor of four and continues growing very fast with $x$, reaching the value $\sim 11$ for the 2-particle system with $z = 1/2$ and the separation $x = 2$. We have already seen a similar behaviour for the areas of SISCone jets shown in Fig. 3 (right), but the effect is much bigger in quantitative terms for the mass area. The cause of the big change of the mass area at $x = 1$ is the same here: for two particles with comparable transverse momenta there is a region of $x$ where there are three stable cones which all get merged, leading to a gigantic jet. One can exploit this property in two different ways. If one is interested just in measuring the jet mass with an algorithm which is as insensitive to soft pointlike radiation as possible then, clearly, the result from Fig. 7 (left) strongly disfavours SISCone. This is especially true if the jet comes from the decay of a heavy object, in which case its subjets have similar hardness and the separation $x$ may easily be greater than 1.
The result from Fig. 7 (left) could alternatively be regarded as a useful additional characteristic of a jet. It could be used to devise a discriminating variable which would help to separate QCD jets, which have small mass area, from jets coming from a heavy object decay, which exhibit significantly larger passive mass area.
The anti-$k_t$ algorithm produces jets with mass area growing slowly with $x$ up to the critical value $x_{c2}$ from Eq. (3.12). Between $x_{c2}$ and 1 the growth becomes much faster, the more so the closer the transverse momenta of the two constituent particles are to each other.
For $x > 1$ the hardest jet consists of a single particle and its mass area slowly approaches $\pi R^4/2$ with increasing $x$. Hence, qualitatively, the behaviour is not very different from that seen in Fig. 4 for the passive area, except in the region below $x_{c2}$, where there was no growth in the latter case. But, as for the three algorithms discussed above, also for anti-$k_t$ the quantitative effect of adding a second particle is much bigger for the mass area than for the area of a jet. Overall, the passive mass area from the anti-$k_t$ algorithm may be substantial, especially for symmetric configurations ($z \sim 0.5$) with interparticle separation $x \sim 1$. One notices also that the anti-$k_t$ result for $z = 0.5$ coincides with that from the C/A algorithm for the same $z$ value. This is reflected as well in the exact formulae given in appendix A. However, as we go away from $z = 0.5$ the two algorithms behave very differently, as seen from Figs. 6 (right) and 7 (right).

Scaling violation of passive mass area of QCD jets
Mass area is sensitive to the substructure of a jet. For QCD jets this substructure arises due to radiative emissions of gluons. Therefore, we expect that the average mass area of a QCD jet will acquire a logarithmic dependence on the jet's transverse momentum. The coefficient in front of this logarithm, which we will call the anomalous dimension, can be easily found in the small-$R$ approximation. The results for jet areas were obtained in [3]. Here, we determine their passive mass area counterparts. The mean mass area at order $\alpha_s$ for a given jet algorithm and a given $R$ value can be written as in Eq. (4.6). The $O(\alpha_s)$ correction in the limit of strongly ordered transverse momenta of the particles, $p_{t2} \ll p_{t1}$, adequate for QCD jets, is given by

$$\langle \Delta a_m \rangle \simeq \int_0^{2R} d\Delta_{12} \int_{Q_0/\Delta_{12}}^{p_{t1}} dp_{t2}\, \frac{dP}{dp_{t2}\, d\Delta_{12}} \left( a_m(\Delta_{12}) - a_m(0) \right) , \qquad (4.7)$$

with $dP/(dp_{t2}\, d\Delta_{12})$ being the probability for emitting a gluon with transverse momentum $p_{t2}$ at relative angular distance $\Delta_{12}$, and the second term in the bracket accounting for virtual corrections. The lower limit of the integration over $p_{t2}$ contains a cut-off $Q_0$ for the relative transverse momentum of the particle $p_2$ with respect to the particle $p_1$. The need for such a cut-off comes from the fact that the mass area, just like the area of jets, is not an infrared safe quantity and its value depends on non-perturbative effects. The convergence of the integral over $\Delta_{12}$ is guaranteed by the property that the passive mass area of the hardest jet in the 2-particle system tends to the 1-particle result both when $\Delta_{12} \to 0$ and when $\Delta_{12} \to 2R$. Taking the QCD matrix element $dP/(dp_{t2}\, d\Delta_{12})$ in the soft and collinear approximation, Eq. (4.8), and performing the integration in Eq. (4.7), one finds the results of Eq. (4.9) in the fixed and in the running coupling approximation, respectively. In the latter case $\Delta_{12}$ was replaced by $R$ in the argument of the coupling, which affects only the terms not enhanced by the logarithm of $R$. $C_i$ is a colour factor corresponding to the parent particle and $b_0 = (11 C_A - 2 n_f)/(12\pi)$.
anti-$k_t$: 0, 0, 0, 0

Table 2: Coefficients governing the logarithmic scaling violation of passive mass areas with transverse momentum of a jet for 2-particle QCD jets. The analytic results are normalised to $R^4$. We use a shortcut notation $\xi$ whose definition involves $\psi'(1/6)$, where $\psi'$ is the trigamma function. In the results for $s_m^2$, $\zeta(3) \simeq 1.202$ is a special value of the Riemann zeta function. The numerical results are normalised to the passive mass area of a 1-particle jet.
The coefficient $d_m$, which depends on the jet definition, is the aforementioned anomalous dimension and it is given by Eq. (4.10). In a similar manner, one can compute the fluctuations of mass areas, defined as in Eq. (4.11), where we have dropped $\sigma_m^2(0)$, which is identically zero, and $\langle \Delta a_m \rangle^2$, as it gives higher order corrections in $\alpha_s$. A calculation similar to the above leads to results identical to those given in Eq. (4.9), with just $d_m$ replaced by $s_m^2$, where the latter is defined in Eq. (4.12). The analytic results for the coefficients $d_m$ and $s_m^2$, normalised to $R^4$, for all four algorithms are given in table 2. There, we also quote their approximate numerical values normalised to the 1-particle passive mass area. One notices that the coefficients $d_m$ depend strongly on the jet algorithm. The largest value is found for the $k_t$ algorithm. Next in the hierarchy is the C/A algorithm, with its $d_m$ coefficient already more than a factor of four smaller than that from $k_t$. SISCone produces a fairly small and negative result, whereas anti-$k_t$ yields identically zero. The observed hierarchy is consistent with the behaviour of passive mass areas of the strongly ordered system (i.e. $z = 0$) from Figs. 6 and 7. The large coefficient for the $k_t$ algorithm comes about due to the strong rise of the passive mass area in the region of small interparticle separations, which is enhanced in the integral (4.10). The smaller $d_m$ from C/A is related to the fact that the mass area in this algorithm becomes significantly different from the 1-particle result only at $x > 1/2$, hence in the range which is less favoured by (4.10). Similarly, the small and negative $d_m$ from SISCone comes from the fact that the mass area in this algorithm deviates from $\pi R^4/2$ only for $x > 1$, where it becomes lower than the mass area of a 1-particle jet. One practical conclusion from table 2 is that the passive mass areas of jets from the $k_t$ algorithm will depend much more strongly on those jets' $p_t$ than will the passive mass areas from the other algorithms.
This is a similar conclusion to that found in [3] for areas of jets. The values of the $s_m$ coefficients from table 2 suggest significant fluctuations of the passive mass areas of QCD jets. Here the pattern essentially follows that of the $d_m$ coefficients with, however, a somewhat smaller difference between the $k_t$ and C/A algorithms.
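The origin of the logarithm and of the enhancement of small separations can be sketched as follows. Assuming the standard fixed-coupling soft and collinear emission probability, $dP/(dp_{t2}\, d\Delta_{12}) \simeq (2\alpha_s C_i/\pi)/(p_{t2}\, \Delta_{12})$ (an assumption on our part, since the explicit form of the matrix element is not reproduced in this section), the $p_{t2}$ integral in Eq. (4.7) gives $\ln(p_{t1}\Delta_{12}/Q_0)$, and the coefficient of $\ln(R\, p_{t1}/Q_0)$ is then proportional to $\int_0^{2R} (d\Delta/\Delta)\, (a_m(\Delta) - a_m(0))$. The code below evaluates this $1/\Delta$-weighted integral for a user-supplied profile. The toy profile is purely hypothetical and is not one of the curves of Figs. 6 and 7; the overall normalisation of $d_m$ is not fixed by this sketch.

```python
def log_weighted_integral(delta_a, R, n=20000):
    """Midpoint estimate of int_0^{2R} dD / D * delta_a(D).

    delta_a(D) must return a_m(D) - a_m(0) and vanish as D -> 0,
    otherwise the integral diverges at the lower end.
    """
    h = 2.0 * R / n
    total = 0.0
    for i in range(n):
        d = (i + 0.5) * h
        total += delta_a(d) / d * h
    return total

def toy_profile(R, c=1.0):
    """Hypothetical profile: grows as c * D^2 while D < R, zero beyond."""
    return lambda d: c * d * d if d < R else 0.0

if __name__ == "__main__":
    R = 1.0
    print(log_weighted_integral(toy_profile(R), R))   # analytic value: c * R^2 / 2
    print(log_weighted_integral(lambda d: 0.0, R))    # flat profile gives zero
```

The $1/\Delta$ weight is what makes a deviation from the 1-particle value at small separation count more than the same deviation at large separation, and a profile that never departs from the 1-particle value gives a vanishing coefficient.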

Active mass area
We define the active mass area $A_m$ as in Eq. (4.13), where $m_J$ is the mass of the pure jet $J$ and $m_{J\{g_i\}}$ is the mass of the jet consisting of $J$ and a dense coverage of ghosts from some random ensemble $\{g_i\}$. Similarly, $p_{tJ\{g_i\}}$ is the transverse momentum of the whole jet with real and ghost particles. The ghosts have density $\nu_{\{g_i\}}$ and an infinitesimally small average transverse momentum $p_{tg}$. The limit of infinite density of ghosts is taken and, in addition, the result is averaged over many sets of ghosts. The standard deviation of the distribution across these ghost ensembles is given by Eq. (4.14). Consider a system with one or more particles whose transverse momenta are well above the ghost scale $p_{tg}$. In the case in which such particles are massless, one obtains Eq. (4.15), where we used the definition of the 4-vector active area $A_\mu(J|\{g_i\})$ from Eq. (2.4). Note also that $p^\mu_{J\{g_i\}}$ is the 4-momentum of the whole jet consisting of physical and ghost particles. The two terms on the right hand side are of two fundamentally different scales. The second term is itself an interesting characteristic of a jet and, as we shall see in Section 5.2, there are cases in which it is useful to know it. However, because of an extra power of an arbitrarily small ghost transverse momentum, $p_{tg}$, the contribution of this second term to the mass area, as defined in Eq. (4.13), is negligible and that is why we drop it here. This, together with combining Eqs. (4.13) and (4.15), leads to the formula for the active mass area given in Eq. (4.16), which is particularly convenient to work with. In what follows, we compute active mass areas of jets using that equation, with the 4-vector area $A_\mu(J)$ calculated with FastJet. For the definition of the latter quantity we refer to Section 2.2, the original paper [3] or the FastJet documentation [31].
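As a cross-check of the structure of this construction, one can verify on a toy configuration that twice the contraction of the jet momentum with a 4-vector area, divided by the jet $p_t$, reproduces $\pi R^4/2$ for a circular 1-particle jet. The sketch assumes that, once the term quadratic in the ghost momentum is dropped, the active mass area reduces to $2\, p_J \cdot A(J) / p_{tJ}$, and that $A_\mu$ is the integral of massless unit-$p_t$ 4-vectors over the jet region; both are our reading of Eqs. (4.16) and (2.4), which are not reproduced here. No FastJet call is used: the disc is integrated on a plain grid and the function names are hypothetical.

```python
import math

def disc_four_vector_area(R, n=400):
    """4-vector area (E, px, py, pz) of a disc of radius R centred at y = phi = 0."""
    h = 2.0 * R / n
    area = [0.0, 0.0, 0.0, 0.0]
    for i in range(n):
        y = -R + (i + 0.5) * h
        for j in range(n):
            phi = -R + (j + 0.5) * h
            if y * y + phi * phi <= R * R:
                # massless 4-vector with unit pt at (y, phi)
                unit = (math.cosh(y), math.cos(phi), math.sin(phi), math.sinh(y))
                for k in range(4):
                    area[k] += unit[k] * h * h
    return tuple(area)

def mass_area_from_four_vector(p_jet, area):
    """2 * (p_J . A) / pt_J with metric (+, -, -, -); p_jet = (E, px, py, pz)."""
    dot = (p_jet[0] * area[0] - p_jet[1] * area[1]
           - p_jet[2] * area[2] - p_jet[3] * area[3])
    pt = math.hypot(p_jet[1], p_jet[2])
    return 2.0 * dot / pt

if __name__ == "__main__":
    R = 0.4
    p_jet = (50.0, 50.0, 0.0, 0.0)  # hypothetical massless particle at y = phi = 0
    print(mass_area_from_four_vector(p_jet, disc_four_vector_area(R)),
          math.pi * R**4 / 2.0)
```

For an irregular jet the same contraction applies, with the disc replaced by whatever region the ghosts clustered into the jet actually cover.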

Active mass area for 1-particle jet
The $k_t$ and Cambridge/Aachen algorithms allow only for a numerical study of active mass areas. The formula (4.16) can be applied directly to a 1-particle jet. The distributions of active mass areas from the two algorithms, normalised to $\pi R^4/2$, are shown in Fig. 8 (left). The results from $k_t$ and C/A are very close to each other. Similarly to the case of the jet areas [3], the maxima of the distributions lie significantly below 1. The corresponding results for the average mass areas and their standard deviations are given in table 3. The 1-particle active mass area is very close to the 1-particle passive mass area, but it fluctuates significantly across the ghost ensembles. This is partly different from the jet area case, where the values of active areas were consistently 20% below those of passive areas for both algorithms [3] (cf. table 1). Qualitatively, however, the results shown in Fig. 8 (left) and in the first two rows of table 3 are similar to those found in [3] for jet areas.

Figure 8: Distribution of the active mass area $A_m$ of 1-particle jets from the $k_t$ and C/A algorithms (left) and from SISCone and anti-$k_t$ (right). The curves correspond to numerical results obtained from FastJet. The width of the SISCone and anti-$k_t$ distributions arises solely due to finite binning.

Table 3: Average active mass areas for a 1-particle jet together with the corresponding standard deviations for the four algorithms. The numbers for $k_t$ and C/A correspond to the distributions from Fig. 8 (left), whereas for SISCone and anti-$k_t$ analytic results are given. The latter are confirmed by the numerical study, as shown in Fig. 8 (right).
The SISCone and anti-$k_t$ algorithms allow for an analytic study of the mass areas of 1-particle jets. As pointed out in [3], the split-merge procedure used in SISCone always results in a split between two stable cones, whether or not one of them contains a hard particle. This, in turn, reduces the radius of the hard jet by the factor 1/2 and therefore, from (4.4), the active mass area by the factor 1/16. This result does not depend on the ghost ensemble, assuming that the coverage of ghosts is sufficiently dense, hence the fluctuations vanish. The anti-$k_t$ algorithm leads to 1-particle jets of a circular shape with radius $R$. Therefore, the active mass area of such jets coincides with the passive mass area result (4.4) and the fluctuations of the active mass area are identically zero.
These analytic results are summarised in table 3. The corresponding distributions from the numerical study are shown in Fig. 8 (right). We see that both algorithms give distributions of active mass areas for 1-particle jets which are close to a $\delta$-function. The width comes solely from finite binning.
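The factor 1/16 quoted for SISCone follows directly from the $R^4$ scaling of the 1-particle mass area. Written out, assuming that the SISCone 1-particle jet is a disc of radius $R/2$ and the anti-$k_t$ one a disc of radius $R$, as stated in the text:

```latex
A_m^{\text{SISCone}}(\text{1-particle-jet})
  = \frac{\pi (R/2)^4}{2}
  = \frac{1}{16}\,\frac{\pi R^4}{2},
\qquad
A_m^{\text{anti-}k_t}(\text{1-particle-jet})
  = \frac{\pi R^4}{2}.
```

For comparison, the same halving of the radius reduces the ordinary area only by the factor 1/4, which illustrates how much more strongly the mass area weights the periphery of the jet.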

Active mass areas for general case of 2-particle system
The results for the active mass area of the hardest jet in a system with two particles of arbitrary relative hardness are given in Fig. 9. As before, we present the active mass areas normalised to $\pi R^4/2$ as functions of the separation (in units of $R$) between the two particles. All curves correspond to numerical computations with FastJet. One has to keep in mind that in general, as was the case for 1-particle mass areas, the 2-particle mass area is a distribution and the curves shown in Fig. 9 correspond to its mean value. The active mass area from the $k_t$ algorithm does not depend on the asymmetry parameter $z$, just as the passive area. For C/A this dependence is weak. On the other hand, similarly to the case of passive mass areas, the active mass areas of SISCone and anti-$k_t$ vary strongly depending on whether the two constituent particles are of comparable hardness or whether their transverse momenta are significantly different.
The active mass areas from the sequential recombination algorithms are virtually, for $k_t$ and C/A, or exactly, for anti-$k_t$, identical to their passive mass area counterparts. Regarding only the shape, the situation was quite similar for the 2-particle areas from $k_t$ and C/A, except that there the normalisation of the active area was different. Since, as seen from the first two rows of table 3, the active and passive 1-particle mass areas are almost identical for $k_t$ and C/A, the results from Fig. 9 and Figs. 6 and 7 also coincide to a large extent for those algorithms. The identity of passive and active mass areas for anti-$k_t$ has the same origin as the analogous identity for the areas observed in section 3: it comes from the fact that the ghosts cluster among themselves only after all clusterings with physical particles have occurred. Altogether, for the active mass areas from the sequential recombination algorithms, one observes the same pattern in the relation of these results to the active areas as seen earlier for the passive quantities. Specifically, while the qualitative picture for the active areas and active mass areas is very similar, quantitatively the effects seen for the latter are much stronger.
The case of SISCone is quite special, firstly in that its 1-particle active mass area gets modified very strongly by the presence of a second particle of comparable hardness. A similar conclusion was already drawn for the passive mass areas (cf. Fig. 7). However, in absolute terms, the active mass area of a 2-particle SISCone jet still remains much smaller than both its passive counterpart and the active mass areas from all the other algorithms shown in Fig. 9. This implies that the sensitivity of the jet mass to soft background should be the lowest for SISCone. This, in turn, translates into the particularly good mass resolution of this algorithm seen e.g. in [6].
The mechanism responsible for the strong relative change of the 2-particle jet active mass area with respect to the 1-particle case for SISCone is the same as discussed in section 3.1 for the passive area and refers to the existence of the third stable cone containing the two physical particles. This cone disappears at $x = 1/(1 - z)$; two particles separated by a greater distance form two distinct jets, hence the drop of the active mass area seen in Fig. 9 (bottom left). If the value of $z$ is sufficiently large and the drop occurs at $x \simeq 2$, the active mass area of a 2-particle jet from SISCone starts falling at some $x$, an effect of the split-merge procedure involving the third stable cone (with particles $p_1$ and $p_2$) and the pure ghost cones. This is seen in Fig. 9 (bottom left) for the 2-particle system with $z = 1/2$. A similar effect was discussed in section 3.2 for the active area.
For all algorithms the value of the 1-particle passive area from table 3 is recovered at $x = 0$. However, at $x = 2$ only SISCone and anti-$k_t$ yield $\pi R^4/2$, and a larger value is given by $k_t$ and C/A. As mentioned in section 3.2, this comes from the fact that those algorithms build jets starting from the formation of local structures.
A general comment concerning the results of Fig. 9 is that the sensitivity of the active mass area to the relative hardness of the constituent particles (reflected in the value of $z$) is related to the shape of the jets produced by a given algorithm. Namely, the algorithms which depend strongly on $z$ are those belonging to the class of "conical algorithms", i.e. SISCone and anti-$k_t$. Though, as noted earlier, their jets are in general not ideally conical, their shapes in the $(y, \phi)$ plane are usually quite regular. Conversely, the $k_t$ and C/A algorithms, whose jets are highly irregular in shape, show either no or only a weak dependence on $z$.
The whole variety of behaviours of the active mass areas observed in Fig. 9, depending on the algorithm, the asymmetry parameter $z$ or the interparticle distance $x$, encourages one to exploit it on an analysis-by-analysis basis.

Scaling violation of active mass area of QCD jets
We conclude this section with a study of the average leading effect of perturbative radiation on the active mass areas of QCD jets. In analogy to the passive mass area, we write the mean active mass area as in Eq. (4.17), where $A_m(\text{1-particle-jet})$ depends on the jet algorithm as summarised in table 3. The perturbative correction to the 1-particle-jet result can be computed from a formula analogous to Eq. (4.7), with $a_m$ replaced by $A_m$ and the upper limit $2R$ removed. The latter is related to the fact that the active mass area of a 2-particle jet may in general be different from $A_m(\text{1-particle-jet})$ for $\Delta_{12} > 2R$, as noted in the previous subsection. Simple integration gives Eq. (4.18) and the analogous fixed-coupling result, as in section 4.2.2.

Table 4: Coefficients governing the logarithmic scaling violation of active mass areas with transverse momentum of a jet for 2-particle QCD jets, normalised to the passive mass area of a 1-particle jet. The numerical results are obtained by performing the integrations from Eqs. (4.18) and (4.22) using the functions corresponding to those shown in Fig. 9.
As discussed at the beginning of section 4.3, the active mass area comes with intrinsic fluctuations due to the fluctuations of ghosts. Therefore, the fluctuations of the active mass area of QCD jets can be separated into two components. The first one, being just the contribution from 1-particle jets, is given for each of the four algorithms in table 3. The second term comes from 2-particle configurations. It acquires contributions both from the change of the mass area caused by the perturbative radiation (as in the passive case) and from the fluctuations of ghosts used to determine the mass area of those configurations (absent in the passive case). The corresponding formulae for active areas were derived in [3]. A straightforward, analogous derivation gives the logarithmically enhanced, $O(\alpha_s)$ contribution to the active mass area. The results for the coefficients $D_m$ and $S_m$, normalised to $\pi R^4/2$, are given in table 4. The numbers come from integration of the numerical results for active mass areas in the case of the $k_t$ and C/A algorithms and of the analytic results in the case of SISCone and anti-$k_t$. One notices that the anomalous dimension and its fluctuations for active mass areas are very close to those found for passive mass areas, with the exception of SISCone, whose active mass area anomalous dimension, though of similar absolute magnitude, comes with the opposite sign. Therefore, most of the discussion from section 4.2.2 related to table 2 remains valid also for the results from table 4. The only qualitative difference, the positive $D_m$ versus the negative $d_m$ for SISCone, comes from the fact that in the active case the mass area of the hardest jet in the 2-particle system never goes below the 1-particle result (see appendix B).

Mass areas of simulated jets
Real jets are of course more complex than the 1- or 2-particle systems that we have studied so far. Nevertheless, we believe that a number of the features of mass areas found in the preceding sections will also be present in jets measured in real life. To provide some support for that statement, in this section we perform a brief study of mass areas of jets from Monte Carlo (MC) simulation. Compared to the 1- or 2-particle jets discussed earlier, the simulated jets have a more accurate modelling of QCD radiation, in particular that associated with the parton shower, as well as hadronization.
We examine jets from Pythia 6.4 [37] dijet events with the underlying event switched off. As before, we are interested in the hardest jet in an event. We do not, however, impose any rapidity or transverse momentum cuts. Our aim is to obtain an analog of Fig. 9 for the MC jets. In the case of the 2-particle system each event was characterized by the asymmetry parameter and by the angular distance between the particles, defined respectively in Eqs. (3.1) and (3.2). Those particles were meant as an approximation to two subjets of a realistic jet. A meaningful subjet analysis of real jets is, however, possible only for some jet algorithms. Moreover, it is not very useful in the region of $x > 1$. Therefore, for the purpose of this Monte Carlo study, we need to use a slightly more sophisticated strategy. It is based on the procedure from [2,28], where it was employed to study the reach of jet algorithms. It involves using a "reference algorithm", for which we choose C/A with $R = 1.2$. First, an event is clustered with this algorithm and the hardest jet is decomposed into its two main subjets $S_1$ and $S_2$. Those subjets are used to determine $x$ and $z$. Then, the same event is clustered with one of the four "test algorithms", $k_t$, C/A, anti-$k_t$ or SISCone, with $R = 0.6$. Subsequently, one looks for the hardest "test jet" which belongs to the same hemisphere as the hardest jet from the reference algorithm. The mass area of this jet is assigned to the $(x, z)$ pair determined in the first step. If the separation $x$ between the two subjets, $S_1$ and $S_2$, is small, they will predominantly both end up in the hardest test jet. If, on the other hand, this separation is large, only one of them, either $S_1$ or $S_2$, will have significant overlap with the test jet. Each of the two situations should be reflected in the value of the mass area of the test jet.
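The first step of this procedure, turning the two reference subjets into an $(x, z)$ pair, can be sketched as follows. We assume here that $z$ is the $p_t$ fraction carried by the softer subjet and that $x$ is the subjet separation in the $(y, \phi)$ plane in units of the test-algorithm radius; these are our reading of Eqs. (3.1) and (3.2) and should be checked against them. The clustering itself is not shown, and the `Subjet` type and the function name are hypothetical.

```python
import math
from dataclasses import dataclass

@dataclass
class Subjet:
    pt: float   # transverse momentum
    y: float    # rapidity
    phi: float  # azimuth in radians

def subjet_x_z(s1, s2, R=0.6):
    """Return (x, z) for two subjets; R is the assumed test-algorithm radius."""
    dphi = abs(s1.phi - s2.phi) % (2.0 * math.pi)
    if dphi > math.pi:           # wrap the azimuthal difference into [0, pi]
        dphi = 2.0 * math.pi - dphi
    dy = s1.y - s2.y
    x = math.hypot(dy, dphi) / R                  # separation in units of R
    z = min(s1.pt, s2.pt) / (s1.pt + s2.pt)       # softer-subjet pt fraction
    return x, z

if __name__ == "__main__":
    s1 = Subjet(pt=30.0, y=0.0, phi=0.1)                    # hypothetical subjets
    s2 = Subjet(pt=10.0, y=0.3, phi=2.0 * math.pi - 0.3)
    print(subjet_x_z(s1, s2))
```

With this convention $z$ never exceeds 1/2, in line with the maximal value quoted for the 2-particle system, and the mass area of the matched test jet would then be histogrammed in bins of the returned pair.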
The results obtained after applying the above procedure to the dijet events from Pythia are shown in Fig. 10, where the mass areas are presented as functions of $x$ in bins of the asymmetry parameter $z$. Assigning the correct substructure is difficult for cases with large asymmetries, which is why we do not go below $z = 0.1$. Otherwise, as shown in [2,28], the method works well, with perhaps slightly higher uncertainties for $x \sim 1$ and $z$ close to either its lower ($k_t$) or its upper (anti-$k_t$) limit.
The first observation from Fig. 10 is that all four algorithms give results which are in qualitative agreement with the 2-particle picture of Fig. 9. The general pattern of the growth of the mass area with $x$, followed by a drop at some point for $x \geq 1$, is well reproduced. Also the sensitivity to the $z$ value, low for $k_t$ and C/A, noticeable for anti-$k_t$ and large for SISCone, is consistent with the 2-particle picture. There are, however, quantitative differences between Figs. 9 and 10. They are clearly related to the extra amount of perturbative radiation which builds up the structure of physical jets.
Let us begin with the three sequential algorithms, $k_t$, C/A and anti-$k_t$. In the 2-particle case the active mass areas were in the same ballpark. As seen from Fig. 10, for MC jets the three algorithms exhibit a clear hierarchy, with the mass areas from $k_t$ being on average significantly higher than those from C/A, which in turn are much larger than the mass areas from anti-$k_t$. This can be understood by noticing that the above hierarchy is consistent with the one found in section 4.3.3 for the scaling violation coefficients $D_m$ (table 4). The large coefficient for the $k_t$ algorithm means that even collinear emissions can lead to a significant increase of the mass area. For realistic MC jets multiple such emissions are provided by the parton shower. This also explains the somewhat smaller, but still significant, difference in mass areas between the 2-particle and MC jets from C/A and the very small difference in the case of anti-$k_t$, whose scaling violation coefficient is identically zero. Another quantitative difference is related to the value of the mass area for $x > 1$. In the 2-particle system this value was close to the passive area of a 1-particle jet. For MC jets, though there is a significant drop above $x = 1$, which we interpret as the two subjets not being merged, the mass area in that region is not necessarily close to the 1-particle mass area. The latter is related to the fact that the two widely separated subjets, $S_1$ and $S_2$, with $x > 1$ have enough room to develop their own substructure and hence cannot be approximated by a single particle. The extent to which their mass area differs from $\pi R^4/2$ is again related to the scaling violation coefficient and the same hierarchy is observed. We have also checked that the values of mass areas for $x = 2$ are indeed very close to the average mass area of the hardest jet in the system.
The case of the SISCone algorithm is special because of its highly nontrivial dynamics involving the split-merge procedure, and it should therefore be discussed separately. As already mentioned, the general pattern found for MC jets is the same as the one from the 2-particle results. In particular, we see that the SISCone jets with finite z often have very large mass areas even for x > 1, which points to the interpretation that their subjets are likely merged in this region. We note also that this observation is compatible with the study of the reach of the SISCone algorithm from [2,28], and in particular with the discussion therein related to the R sep parameter. As in the case of k t and C/A, the SISCone jets from Pythia also exhibit somewhat larger mass areas than the 2-particle jets. Part of the reason is again some sensitivity to additional radiation from the parton shower, though that must be moderate given the fact that the D m coefficient in table 4 is not very big. Another important mechanism which leads to larger mass areas of the SISCone MC jets is related to the split-merge procedure. As discussed in section 4.3.1, the small value of the mass area of the 1-particle jet arises because the cone around that particle overlaps with cones of pure ghost jets and, since there is no other particle that they could share, such overlapping cones always split. This must be somewhat different in a realistic event, which is populated with many physical particles. As a consequence, the number of pure ghost jets is greatly reduced and therefore the above mechanism, which led to the reduction of the mass area of the 1-particle jet, is not that efficient here. A similar conclusion can be drawn from the study of the areas of realistic jets from [3].
To further test this reasoning we varied the split-merge parameter f and observed that a lower value of this parameter (corresponding to easier merging of overlapping cones) leads to a larger average value of the mass area of the hardest jet, and vice versa.
The mass areas of realistic jets from Monte Carlo simulations deserve a detailed study. In this section we gave a brief illustration of what sort of effects one may expect if one goes beyond the 2-particle approximation of a jet. The main conclusion from our MC study is that the 2-particle results for the x and z dependence, highlighted in Fig. 9, together with the study of the sensitivity of the algorithms to perturbative radiation, allow one to explain most of the features of mass areas of the simulated jets. Fig. 10 provides additional guidance for the choice of the jet algorithm which minimizes background contamination. Consistently with the results of the 2-particle study from preceding sections, it points at SISCone and anti-k t , disfavouring, in that particular respect, the k t and C/A algorithms.

Correcting jet mass for pileup contamination
The definition of the active mass area from Section 4.3 suggests its practical application. Suppose that instead of ghosts we have in our event a dense set of soft particles distributed fairly uniformly in rapidity and azimuthal angle. Such particles may come, for instance, from pileup (PU). They will normally be clustered together with genuine jet particles, leading to contamination of the jet. This contamination, in turn, will cause a systematic shift of the squared mass of the jet,
$$ m^2_{J_{\rm PU}} - m^2_J , $$
where $m^2_{J_{\rm PU}}$ is the squared mass of the jet J from an event with pileup whereas $m^2_J$ is the squared mass of the same jet from an event without pileup.
Consider an event with the average density of the transverse momentum of PU particles per unit area equal to ρ. Then, in the case of massless hadrons, the magnitude of the shift of the mass of jet $J_{\rm PU}$ due to pileup is given by Eq. (5.2). As we see, the leading correction in ρ comes from a term involving the active mass area. It is important to notice that the mass area is computed from Eq. (4.16) using the uncorrected jet $J_{\rm PU}$, containing both genuine jet particles and the contamination from pileup. Qualitatively, a contribution to $m_{J_{\rm PU}}$ proportional to $\rho^2$ is needed since it accounts for the fact that the change of mass of a jet due to pileup comes not only from clustering pileup particles with jet particles but also from clustering pileup particles among themselves. The latter is a subleading effect in powers of ρ. Quantitatively, to get a mass correction which is valid also at large ρ one needs a second, negative term in Eq. (5.2), which combined with the $O(\rho^2)$ component from the first term gives the full subleading contribution.
We have studied the effects of the mass shift due to pileup using dijet events from Pythia 6.4 [37] combined with pileup from Pythia 8 [38]. All hadrons were passed through a simple calorimeter with cells in the (y, φ) plane of size 0.1 and rapidity coverage |y| < 4.5. Then, the calorimeter towers, which correspond to massless 4-vectors, were used as input to clustering algorithms. Jets were found with the anti-k t algorithm with R = 0.7 and only events with p t of the hardest jet greater than 150 GeV were accepted.
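The calorimeter step described above can be illustrated with a minimal Python sketch. It is not the code used in the study: the function name `build_towers`, the choice to sum hadron energies per cell, and the placement of each massless tower at the cell centre are all assumptions made here for illustration; the text only specifies the cell size 0.1, the coverage |y| < 4.5 and the masslessness of the towers.

```python
import math
from collections import defaultdict

def build_towers(hadrons, cell=0.1, y_max=4.5):
    """Group hadrons into (y, phi) cells and return massless tower 4-vectors.

    hadrons: iterable of (E, y, phi) tuples.
    Returns a list of (px, py, pz, E) tuples, one per non-empty cell.
    Assumption (not from the paper): energies are summed per cell and the
    tower direction is taken at the cell centre.
    """
    energy = defaultdict(float)
    for e, y, phi in hadrons:
        if abs(y) >= y_max:          # outside the rapidity coverage
            continue
        iy = math.floor((y + y_max) / cell)
        iphi = math.floor((phi % (2.0 * math.pi)) / cell)
        energy[(iy, iphi)] += e

    towers = []
    for (iy, iphi), e in sorted(energy.items()):
        y_c = -y_max + (iy + 0.5) * cell      # cell-centre rapidity
        phi_c = (iphi + 0.5) * cell           # cell-centre azimuth
        pt = e / math.cosh(y_c)               # massless: E = pt * cosh(y)
        towers.append((pt * math.cos(phi_c),
                       pt * math.sin(phi_c),
                       pt * math.sinh(y_c),
                       e))
    return towers

# Two hadrons falling into the same cell and one outside the acceptance.
example = build_towers([(10.0, 0.02, 0.03), (5.0, 0.07, 0.08), (3.0, 5.0, 1.0)])
```

Because each tower is built with E = p t cosh y, its invariant mass vanishes by construction, which is the property the clustering input relies on.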
To correct jet masses for pileup contamination one needs to know the level of the pileup transverse momentum per unit area, ρ, for each event and then apply formula (5.2). To determine ρ, we used the area/median method proposed in [21] and implemented in the FastJet package [31]. The method measures ρ on an event-by-event basis. It starts by adding ghost particles to an event. Then, all particles (physical and ghosts) are clustered with an infrared and collinear safe jet algorithm. That leads to a set of jets, {j}, ranging from hard to very soft. That set is used to determine ρ, which is defined as the median of the distribution of {p tj /A j }, where p tj is the transverse momentum of the jet j and A j is its scalar area. Using the median is a way to dynamically separate the soft and hard parts of an event. The method leaves some freedom in the choice of the jet algorithm, the rapidity range in which ρ is measured, as well as the treatment of the hardest jets. Following suggestions from the literature [20,21], we used the C/A algorithm with R = 0.5 and the active area definition. Then, the median was determined taking all jets in the range |y| < 4. As shown in [20,21], for a large rapidity range the influence of the two hardest jets on the value of ρ is very small; therefore we did not remove them from the set of jets used for the ρ determination.
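The last step of the area/median method, taking the median of p tj /A j over all jets in the chosen rapidity range, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the jets (with ghosts already clustered and areas already computed) are taken as given input, and the function name `estimate_rho` and the tuple layout are hypothetical, not part of FastJet.

```python
from statistics import median

def estimate_rho(jets, y_max=4.0):
    """Median of pt/area over jets with |y| < y_max.

    jets: iterable of (pt, y, area) tuples for all jets in the event,
    hard and soft alike. Jets with non-positive area are skipped
    (an assumption made here to avoid division by zero).
    """
    ratios = [pt / area for pt, y, area in jets
              if abs(y) < y_max and area > 0.0]
    return median(ratios)

# Toy event: two hard jets on top of five soft, pileup-like jets.
toy_jets = [
    (150.0, 0.3, 1.5), (140.0, -0.8, 1.4),          # hard dijet system
    (2.0, 1.2, 1.0), (2.2, -2.5, 1.0), (1.8, 3.1, 1.0),
    (2.1, -3.6, 1.0), (1.9, 0.1, 1.0),              # soft jets
]
rho = estimate_rho(toy_jets)
```

In the toy event the two hard jets sit at the upper end of the p t /A distribution and leave the median at the soft-jet level, which illustrates why they need not be removed when the rapidity range is large.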
One has to remember that ρ characterises pileup in a given event only on average and that there are always point-to-point fluctuations. Therefore, it may happen, especially for light jets, that our procedure occasionally leads to negative values of $m^2_J$. This corresponds to the cases with a negative fluctuation, in which the actual contamination from pileup to our jet is locally smaller than the typical level of PU in that event represented by the value of ρ. In such events, our procedure subtracts too much from a jet. There is, however, a second class of events with a positive point-to-point fluctuation in the vicinity of a jet, and for those events the correction based on the global ρ is slightly underestimated. The above errors of under- or overestimating the mass correction will cancel in quantities averaged over many events, like $\langle m^2_J \rangle$ or $\langle m_J \rangle$. To make sure that this happens for the latter observable, for events with $m^2_J < 0$ one needs to set $m_J = -\sqrt{-m^2_J}$.
In Fig. 11, we show the average mass of the hardest jet as a function of the number of pileup events, n, i.e. additional min-bias events accompanying the production of the hard dijet. The plots correspond to the LHC at √s = 7 TeV (left) and 14 TeV (right). We see that the average mass of the hardest jet grows approximately linearly with n. That is easily understood with our formula (5.2), in which the main contribution comes from the term linear in ρ, and it is natural to expect that ρ of pileup will scale linearly with n, which is just the number of alike min-bias events. We see also that the effect of pileup contamination is strong, reaching up to a 70% shift in the mass for the cases with high n. For reference, we show a horizontal line corresponding to the case without pileup (n = 0). Then we apply the "mass area correction" using the first term from Eq. (5.2) as well as the full correction from that equation involving both the mass area and the $\rho^2$ term.
We see that the mass area term dominates the mass shift and the correction involving this term alone works very well up to fairly high pileup, n ∼ 12-15. For larger n, however, the second term, subleading in ρ, becomes necessary to get a decent value of the corrected jet mass. We note that using both terms is equivalent to directly correcting the 4-vector of a jet with the help of the 4-vector area and then calculating its mass [39]. Our study from this section gives, however, significant additional insight into the structure of these corrections. In particular, we have gained an understanding of which contributions to the mass shift are dominant and why. Overall, the example from this section shows that most of the contamination of the jet mass due to pileup comes from the term involving the mass area. On one hand, that confirms that the mass area is a robust characteristic of the susceptibility of the jet mass to contamination. On the other hand, it provides a simple method to correct for that contamination and recover, with excellent accuracy, the value of the original jet mass. More studies of jet mass corrections, in particular a systematic analysis and optimisation along the lines of [3,21,40,41], are left for future work.
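The equivalence with a direct 4-vector correction can be made concrete with a small numerical sketch. It assumes the usual subtraction p → p − ρA, with A the 4-vector area of the jet, and simply expands the squared mass of the corrected jet into a piece linear in ρ and a negative piece quadratic in ρ. This grouping is purely algebraic and is not claimed to reproduce Eq. (5.2) term by term. All function names and the numbers in the example are hypothetical. The sign convention for negative corrected squared masses discussed in the text is included as `signed_mass`.

```python
import math

def minkowski_dot(a, b):
    """Minkowski product of two 4-vectors stored as (px, py, pz, E)."""
    return a[3] * b[3] - a[0] * b[0] - a[1] * b[1] - a[2] * b[2]

def subtract_pileup(p_jet_pu, area4, rho):
    """Assumed 4-vector correction: p_corrected = p_jet_pu - rho * area4."""
    return tuple(p - rho * a for p, a in zip(p_jet_pu, area4))

def mass2_shift_terms(p_jet_pu, area4, rho):
    """Split m2(jet with PU) - m2(corrected jet) into its two pieces."""
    linear = 2.0 * rho * minkowski_dot(p_jet_pu, area4)
    quadratic = -rho ** 2 * minkowski_dot(area4, area4)
    return linear, quadratic

def signed_mass(m2):
    """m = sqrt(m2) for m2 >= 0 and m = -sqrt(-m2) otherwise."""
    return math.sqrt(m2) if m2 >= 0.0 else -math.sqrt(-m2)

# Illustrative numbers only (GeV for momenta, GeV per unit area for rho).
p_pu = (200.0, 0.0, 0.0, 205.0)
area = (1.5, 0.0, 0.0, 1.6)
rho = 5.0

p_corr = subtract_pileup(p_pu, area, rho)
m2_pu = minkowski_dot(p_pu, p_pu)
m2_corr = minkowski_dot(p_corr, p_corr)
linear, quadratic = mass2_shift_terms(p_pu, area, rho)
m_corr = signed_mass(m2_corr)
```

In this example the linear piece dominates and the quadratic piece is small and negative. Using `signed_mass` rather than clamping negative squared masses to zero keeps over- and under-subtracted jets on an equal footing when the mass is averaged over many events.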

Conclusions
We have proposed a new characteristic of a jet, called mass area, which is supposed to measure the susceptibility of the jet's mass to soft background like pileup or underlying event. It is a close relative of the catchment area of jets introduced in the work of [3,21]. Two complementary definitions of the mass area were given suitable for two different limits of the distribution of UE/PU: the passive mass area, measuring the sensitivity of mass of a jet to contamination from pointlike radiation, and the active mass area, more appropriate if the soft background radiation is diffuse and uniform.
We have investigated the properties of the passive and active mass areas for four jet clustering algorithms, k t , Cambridge/Aachen, anti-k t and SISCone by studying systems with one or two particles of arbitrary hardness. We have also confronted the above results with those obtained with more realistic jets simulated by Pythia.
As a preparatory step, we have generalised in section 3 the results for passive and active areas of 2-particle jets to the case where the two constituent particles have arbitrary transverse momenta. This part of our study shows that even the "conical algorithms", SISCone and anti-k t , rarely produce jets whose shapes in the (y, φ) plane are circular. As discussed in section 3 and illustrated e.g. in Fig. 2, a very simple system of two particles with comparable hardness leads to a whole variety of jet areas depending on the algorithm, the asymmetry parameter z or the distance between the particles x.
The study of mass areas of 1- and 2-particle jets, presented in section 4, reveals that a similar richness exists also for this characteristic of a jet. A general pattern is that the "conical algorithms", SISCone and anti-k t , exhibit a strong dependence on the asymmetry parameter z, which measures how much of the jet's total transverse momentum is taken by the softer particle. In contrast, the k t and C/A algorithms, with jets of highly irregular areas, show virtually no dependence on z.
The dependence on the distance between the two constituent particles in the (y, φ) plane, xR, is substantial for all algorithms, though its character varies across them. The results from k t grow monotonically for x < 1 whereas for C/A and anti-k t they start differing from the 1-particle result for much larger x, in the ballpark of x > 0.5. Finally, SISCone shows a completely different x-dependence, which is the largest for x > 1, though the active mass area changes significantly with x also for x < 1.
In absolute terms, the active areas and mass areas of 2-particle jets from SISCone are much smaller than from the other three algorithms. This is related to the split-merge procedure, which is a part of the SISCone algorithm, and which results in splittings of stable cones with perturbative particles overlapping with cones containing only ghosts. The low active area and mass area of SISCone mean that the jets from this algorithm will be, on average, much less contaminated by a soft and diffuse background than the jets from k t , C/A or anti-k t . This, in turn, will result in a very good p t and mass resolution.
Our study of active mass areas of jets from MC simulation showed the same pattern of x and z dependence as that found for 1- and 2-particle jets. This, together with the results from the study of the sensitivity of mass areas from different algorithms to perturbative radiation, was sufficient to account for most of the features of mass areas of the simulated jets. We used the simulated jets also to study corrections to jet masses due to contamination from pileup. We found that most of the systematic shift in mass caused by soft background comes in the form of a term involving the mass area. That confirms that the mass area indeed captures the essential aspects of the modification of the jet mass and provides a simple method to correct for it and recover its original value.
As for the comparison between the areas and the mass areas of jets, we have seen that, qualitatively, the two characteristics exhibit similar behaviour. Quantitatively, however, the effects observed for mass areas are always significantly bigger, a fact that we associate with an additional power of angle in the definitions of the mass areas with respect to the areas.
We envisage several ways in which the concept of mass area, introduced in this paper, could be used. Firstly, as a measure of the susceptibility of the jet's mass to contamination from soft radiation, it provides guidance for choosing a jet algorithm for a given purpose. For example, to minimise the systematic error in the determination of the mass of a jet one may choose the algorithm whose mass area is smallest. Secondly, knowing how the mass area depends on the relative hardness of the subjets may help in designing a discriminating variable, which in turn could allow one to separate QCD jets from jets coming from decays of heavy objects. In this context, one could consider, for instance, using the mass area as an additional variable entering a Boosted Decision Tree [42]. It would also be very interesting to further study the corrections of jet masses for the contamination from soft background. Possible extensions of the analysis from section 5.2 could involve an optimization of the jet definition as well as correcting for contamination from the underlying event. Finally, it would be worth investigating the effects that the new procedures of noise reduction, namely filtering [6], pruning [11] and trimming [12], have on the mass area of jets. We believe that all the above possibilities are worth investigating with jet events from Monte Carlo simulations. We leave these questions for future work.
In this appendix, we collect all analytic results for passive areas and mass areas of the hardest jet in the system with two particles of arbitrary transverse momenta. The calculations were done within the E recombination scheme and in the limit of small R. The latter corresponds to neglecting the difference between y and φ dimensions and was used consistently throughout the paper for both passive and active quantities. All the analytic results presented here were confirmed for a range of values of z by numerical study with FastJet.
As discussed in section 3, the area and the mass area of the hardest jet in the system with two particles depend on the interparticle distance in the (y, φ) plane. For each algorithm, one distinguishes several subranges of the range 0 < x < 2. In each such subrange the x-dependence of the area and mass area is described by a different function. Table 5 summarises the results for the passive area and mass area of the hardest jet from the four algorithms, normalised to the 1-particle result, i.e. $\pi R^2$ or $\pi R^4/2$, respectively. The critical x values, $x_{c1,\ldots,4}$, were defined in Eqs. (3.6), (3.7), (3.9) and (3.13).
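As a quick cross-check of the two normalisations, one can integrate over a disc of radius R centred on the particle. This is only a sketch under an assumption that is not spelled out in this appendix: that a ghost at distance r from the particle contributes to the mass area with weight $r^2$, which is our reading of the extra angular weight that distinguishes mass areas from areas.

```latex
\begin{align}
  \int_0^R 2\pi r \,\mathrm{d}r &= \pi R^2 , \\
  \int_0^R r^2 \, 2\pi r \,\mathrm{d}r &= \frac{\pi R^4}{2} .
\end{align}
```

Under the same assumption, a disc of radius R/2 would give $\pi (R/2)^4/2$, i.e. 1/16 of $\pi R^4/2$, which agrees with the 1-particle value quoted for SISCone in Fig. 12.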
The functions from table 5, different for areas and mass areas, are defined below. To make the notation more concise we define the auxiliary function
Figure 12: Active mass area of the harder jet from SISCone (R = 0.6) in a 2-particle system with strongly ordered transverse momenta, as a function of x. The curve corresponds to the analytic result (B.1). At x = 0 and x = 2 the 1-particle result, 1/16, is recovered.
B Active mass areas from SISCone for strongly ordered 2-particle system
The active mass area of the hardest jet in the system of two particles with strongly ordered transverse momenta (e.g. from QCD splitting) is given by with the following definitions