Pileup Per Particle Identification

We propose a new method for pileup mitigation by implementing "pileup per particle identification" (PUPPI). For each particle we first define a local shape $\alpha$ which probes the collinear versus soft diffuse structure in the neighborhood of the particle. The former is indicative of particles originating from the hard scatter and the latter of particles originating from pileup interactions. The distribution of $\alpha$ for charged pileup, assumed as a proxy for all pileup, is used on an event-by-event basis to calculate a weight for each particle. The weights describe the degree to which particles are pileup-like and are used to rescale their four-momenta, superseding the need for jet-based corrections. Furthermore, the algorithm flexibly allows combination with other, possibly experimental, probabilistic information associated with particles such as vertexing and timing performance. We demonstrate that the algorithm improves over existing methods by looking at jet $p_T$ and jet mass. We also find an improvement on non-jet quantities like missing transverse energy.


Introduction
Pileup, i.e. overlapping secondary proton-proton collisions on top of the primary interaction, will be a major challenge for the high luminosity LHC runs. Several methods for dealing with pileup are being successfully applied by ATLAS [1][2][3][4][5][6] and CMS [7][8][9] on present data. Current methods, however, will be strained both in upcoming LHC runs with expected pileup levels of $n_{PU} = 140$ or more, and at possible future hadron colliders. Recently, newer ideas for pileup mitigation have been proposed. A brief summary of the state of the art is given below:
• Four-vector area subtraction [10,11]: corrects the four-vector of a jet based on the characteristic pileup density of an event and on the jet area. It has been applied by ATLAS and CMS, but requires additional experimental tuning on top of area × pileup density subtraction [2,7].
• Shape subtraction [12]: a generalization of area subtraction from the jet four-vector to jet shapes, e.g. jet mass. Each shape is separately corrected using the same pileup density measure as area subtraction and using the susceptibility of a given shape to soft uniform contamination.

JHEP10(2014)059
• Pileup jet identification [2,4]: removes entire jets that are identified as being composed primarily of pileup using both charged particle information and jet shapes.
• Topological clustering [18][19][20]: calorimeter cell signals are required to be several standard deviations above the typical noise level in the cells. These high significance cells are used as seeds to form local cell clusters used as inputs to jet algorithms.
• Charged hadron subtraction (CHS) [21]: removes charged particles from pileup based on the vertex from which particles originate. Four-vector area subtraction is then applied using the remaining particles.
• Cleansing [22]: uses vertex information to remove charged pileup and rescales neutral pileup based on charged pileup composition in a subjet.
• Constituent subtraction [23,24]: extends four-vector area subtraction and shape subtraction to particle level by representing the event density ρ with an area assigned to each particle, correcting the particle's four-vector.
The methods listed above progressively move from a global approach towards a more local one. We note that, broadly speaking, the methods utilize three basic pieces of information to identify pileup: the event-wide pileup density, vertex information from charged tracks, and the local distribution of pileup with respect to particles from the leading vertex. As each technique has advantages and disadvantages it is unlikely that a single method alone will optimally remove pileup. It is therefore crucial to have a flexible framework to integrate the various pieces of information. We propose an algorithm to combine this information. This method uses both global information from the event, as well as local information to identify pileup at the particle level. As a shorthand, we refer to the method as PUPPI (PileUp Per Particle Identification). The algorithm is intended to remove pileup, rather than just correct jet quantities, to ultimately produce a consistent event interpretation.
It has been shown [22] that individually rescaling the four-momenta of particles in a jet (i.e. the jet's constituents) is useful not only for correcting kinematic variables, but also for correcting jet shapes in an observable-independent way. Following the "jets without jets" paradigm [25], we propose a local approach, in which no clustering is performed and a weight is assigned to each individual particle. We then choose to use the weight to rescale the particle four-momentum. Ideally, particles coming from pileup would get a weight of zero and particles coming from the hard scatter would get a weight of one. This leads to a pileup-corrected event, where one can then proceed with jet finding without the need for further pileup correction. In fact, given the pileup-corrected event, event shapes can be measured with a reduced sensitivity to pileup. We show results for jet p T , jet mass, and missing transverse energy and demonstrate that our algorithm improves over existing methods. We find the improvements on missing transverse energy reconstruction particularly relevant, as disentangling pileup contamination from missing energy is typically harder than for jet-based observables.

We anticipate that the PUPPI algorithm could potentially improve pileup mitigation for other jet and event shapes, as well as the identification of isolated photons and leptons. More generally, such a per particle approach may contribute valuable input into the design of future detectors by highlighting the complementary information measured by several detector subsystems.
The rest of the paper is organized as follows: in section 2 we describe the algorithm, in section 3 we describe the setup we used for generating Monte Carlo data, and in section 4 we present our results. Finally, we conclude and discuss future work in section 5.

The algorithm
Before discussing details, we describe qualitatively how the algorithm works. First we select a shape $\alpha$, which attempts to locally distinguish parton shower-like radiation from pileup-like radiation, then compute it for each particle in an event. A basic handle to distinguish pileup and leading vertex particles is given by the $p_T$ spectrum, with the pileup spectrum falling much faster. While we make use of this feature in the algorithm, the shape $\alpha$ itself attempts to exploit additional and complementary information with respect to the $p_T$ of a single particle, as discussed in sections 2.1 and 2.4. Where tracking is available, we know whether charged particles are from the leading vertex (LV) or from a pileup vertex (PU). We can use the median and the RMS of the $\alpha$ values for charged pileup as an event-level characterization of the pileup distribution.
Next we assign a weight to each particle by comparing its α value to the median of the charged pileup distribution. The weight takes values between zero and one and indicates how much a particle is allowed to contribute to an event. Ideally, particles from the hard scatter would get a weight of one and pileup particles would get a weight of zero. Almost all pileup particles have α values within a few standard deviations of the median and are assigned small weights. On the other hand, α values that deviate far from the charged pileup are very uncharacteristic of pileup, and these particles are assigned large weights. As discussed in section 2.3, our method for computing weights allows for experimental information, such as vertexing and timing performance, to be smoothly incorporated.
Finally, we choose to apply the weights to rescale the particle's four-momentum. Particles with a very small weight or with a very small rescaled p T are discarded. The final set of pileup-corrected particles can now be used as input to a jet algorithm or directly in the calculation of missing energy.
The rest of this section goes through the algorithm in detail. We use a pp → dijet sample at √ s = 14 TeV generated with Pythia 8.176 [26] to show various distributions. The spectrum is generated such that the p T of the 2 → 2 process is roughly flat across the range 15 − 500 GeV, in order to maintain reasonable statistics across different kinematic ranges. Pileup events are generated as zero-bias soft QCD events and overlaid onto the hard scatter event. Further details of the simulation are discussed in section 3.

The local shape
For each particle $i$ we define a shape
$$\alpha_i = \log \sum_{j \in \mathrm{event}} \xi_{ij} \times \Theta(R_{\min} \le \Delta R_{ij} \le R_0), \qquad \xi_{ij} = \frac{p_{Tj}}{\Delta R_{ij}}. \qquad (2.1)$$
Throughout the paper we use $\Theta(R_{\min} \le \Delta R_{ij} \le R_0)$ as a shorthand notation for $\Theta(\Delta R_{ij} - R_{\min}) \times \Theta(R_0 - \Delta R_{ij})$, where $\Theta$ is the Heaviside step function. $\Delta R_{ij}$ is the distance between particles $i$ and $j$ in $\eta\phi$-space and $p_{Tj}$ is the transverse momentum of particle $j$ measured in units of GeV. $R_0$ defines a cone around each particle $i$, so that only particles within the cone enter the calculation of $\alpha_i$. In addition, particles closer to $i$ than $R_{\min}$ are discarded from the sum, with $R_{\min}$ effectively serving as a regulator for collinear splittings of particle $i$. Here we use $R_0 = 0.3$ and $R_{\min} = 0.02$. Note that the logarithm is outside of the sum, so it plays no role in the infrared-collinear behavior of the variable and just serves to rescale the range. The choice of $\xi_{ij}$ is discussed in more detail in section 2.4.
Figure 1 (left) shows a sample distribution of $\alpha$ for particles from the leading vertex and pileup. Due to the collinear singularity of the parton shower, a particle $i$ from a hard physics process is likely to be near other particles from the same parent process, so that $\alpha_i$ tends to be larger. On the other hand, we expect pileup particles to have no shower-like structure and to be uncorrelated with particles from the leading vertex, and so only to be spatially near them by chance. So $\alpha_i$ tends to be smaller if $i$ is a pileup particle. In fact, this implies that the ideal version of eq. (2.1) would sum over particles from the leading vertex and ignore those from pileup. While we obviously do not know a priori which particles are from the leading vertex, we do have a handle on charged particles in the central ($|\eta| \lesssim 2.5$ for ATLAS and CMS) region of the detector. In that region, tracking information provides the ability to distinguish charged tracks originating from the leading vertex and charged tracks originating from pileup.
Associating these tracks to particles can be done with the particle flow algorithm [21], which combines measurements from various detector subsystems to define individual candidates.$^4$ Using particle flow, identified candidates can be sorted into three classes: neutral particles, charged hadrons from the leading vertex, and charged hadrons from pileup. Thus we can use charged particles from the leading vertex as a proxy for all particles from the leading vertex. To be explicit, in the central region the sum in eq. (2.1) can be decomposed as
$$\sum_{j \in \mathrm{event}} = \sum_{j \in \mathrm{Ch,PU}} + \sum_{j \in \mathrm{Ch,LV}} + \sum_{j \in \mathrm{Neutral}}, \qquad (2.2)$$
where Ch,PU refers to charged pileup, Ch,LV refers to charged particles from the leading vertex, and Neutral refers to all neutral particles both from pileup and the leading vertex. This leads to defining two versions of $\alpha$, for when tracking information is and is not available:
$$\alpha_i^C = \log \sum_{j \in \mathrm{Ch,LV}} \xi_{ij} \times \Theta(R_{\min} \le \Delta R_{ij} \le R_0), \qquad (2.3)$$
$$\alpha_i^F = \log \sum_{j \in \mathrm{event}} \xi_{ij} \times \Theta(R_{\min} \le \Delta R_{ij} \le R_0). \qquad (2.4)$$
Notice that $\alpha_i^F \equiv \alpha_i$ in eq. (2.1). Here it is renamed to stress the fact that we use this version of $\alpha_i$ in the forward region of the detector, as opposed to $\alpha_i^C$, which is used in the central region. Effectively, when tracking information is not available, we assume all particles in the sum are from the leading vertex. While there are noise contributions from pileup, these are suppressed relative to contributions from leading vertex particles by the $p_{Tj}$ in the numerator. Thus the algorithm can still assign weights in regions where there is no tracking. Figure 1 (right) shows the distributions of $\alpha^C$. When there are no particles from the leading vertex around particle $i$ to sum over, formally $\alpha_i \to -\infty$. In these cases the particle is assumed to be pileup and discarded from the event.$^5$ The variable $\alpha^C$ has more discrimination power than $\alpha^F$ and is used in the central region of the detector.
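As an illustration of the two shape definitions, the following is a minimal Python sketch, not the authors' implementation. The particle layout (a list of dicts with keys `pt`, `eta`, `phi`, `charged`, `from_lv`) and the function names are assumptions made for this example; the defaults $R_0 = 0.3$ and $R_{\min} = 0.02$ are the values quoted in the text.

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Distance in eta-phi space, with the phi difference wrapped to [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)

def alpha(i, particles, r0=0.3, r_min=0.02, lv_charged_only=False):
    """Return log(sum_j pT_j / dR_ij) over neighbours with r_min <= dR_ij <= r0.

    particles: list of dicts with keys 'pt', 'eta', 'phi', 'charged',
    'from_lv' (hypothetical layout; 'from_lv' is only consulted for
    charged particles). With lv_charged_only=True the sum runs over
    charged leading-vertex particles only (central variant); otherwise
    it runs over all particles (forward variant).
    Returns -inf when no neighbour contributes to the sum.
    """
    p_i = particles[i]
    total = 0.0
    for j, p_j in enumerate(particles):
        if j == i:
            continue
        if lv_charged_only and not (p_j["charged"] and p_j["from_lv"]):
            continue
        dr = delta_r(p_i["eta"], p_i["phi"], p_j["eta"], p_j["phi"])
        if r_min <= dr <= r0:
            total += p_j["pt"] / dr
    return math.log(total) if total > 0.0 else float("-inf")
```

The double loop is quadratic in the number of particles; a realistic implementation would likely use a spatial index, which is omitted here for clarity.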
There is a second advantage to be gained from tracking information. For central charged particles, we know whether a particle is from pileup or not. Using only these particles, for a given event we can compute the distribution of both $\alpha^C$ and $\alpha^F$ and then assume that the neutral particles, for which we do not know the answer, belong to a distribution with the same properties. This assumes the distribution of $\alpha^F$ and $\alpha^C$ is the same for charged and neutral particles, and for central and forward particles. Neither of these assumptions is exact, but they both can be corrected if necessary. As an example, in figure 1, we show the distribution of $\alpha$ for neutral and charged particles separately and we find good agreement overall. The quantities we use to characterize the distributions on an event-by-event basis are the median and the left-side RMS:
$$\bar{\alpha}^{X}_{PU} = \mathrm{median}\{\alpha^{X}_{i \in \mathrm{Ch,PU}}\}, \qquad \sigma^{X}_{PU} = \mathrm{RMS}\{\alpha^{X}_{i \in \mathrm{Ch,PU}}\}, \qquad (2.5)$$
where the superscript $X$ stands for C or F. The characterization of pileup contamination on an event basis is reminiscent of the area subtraction method, where average information over an entire event is used to correct individual jets within the event [10,11]. In the absence of vertex-based discrimination, the median of $\alpha$ can be computed by taking the median over all particles, as is done for the area subtraction method. Because the computation of $\bar{\alpha}^F_{PU}$ and $\sigma^F_{PU}$ is only done for charged pileup, it must be performed in the central region, even though these quantities are used to calculate the weights of particles in the forward region. Pileup density varies as a function of rapidity and the values of $\bar{\alpha}^F_{PU}$ and $\sigma^F_{PU}$ do not account for this variation. A proper extrapolation can be performed by estimating the rapidity dependence in a sample of minimum-bias events. The weights would then be calculated using the rapidity-corrected values of $\bar{\alpha}^F_{PU}$ and $\sigma^F_{PU}$.
$^4$ The use of particles is not strictly necessary. In principle the algorithm can be performed with calorimeter cells and charged tracks as inputs. We discuss this later in section 5.
$^5$ The fact that, in practice, the appearance of a single isolated particle occurs much more frequently in pileup (with a moderately low number of pileup interactions) than in hard interactions, supports this assignment scheme.
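The event-level characterization can be sketched in Python as follows. The precise definition of the left-side RMS is not spelled out in the text available here, so this sketch assumes it is the RMS of deviations from the median, computed only from values at or below the median. Dropping non-finite $\alpha$ values is also an assumption, and the function name is hypothetical.

```python
import math
import statistics

def pileup_median_and_left_rms(alphas_charged_pu):
    """Return (median, left-side RMS) of alpha for charged pileup particles.

    Assumption: the left-side RMS is the root-mean-square deviation from
    the median, using only values at or below the median. Non-finite
    values (e.g. -inf from empty neighbourhoods) are dropped first.
    """
    finite = [a for a in alphas_charged_pu if math.isfinite(a)]
    if not finite:
        raise ValueError("no finite alpha values for charged pileup")
    median = statistics.median(finite)
    left = [a - median for a in finite if a <= median]
    rms = math.sqrt(sum(d * d for d in left) / len(left))
    return median, rms
```

Using only the left side of the distribution is plausibly meant to keep any contamination at high $\alpha$ from inflating the width, but that motivation is an interpretation rather than a statement from the text.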

Particle weights
Having introduced a variable with some separation power between shower-like radiation and pileup-like particles, we will use it to compute a weight for each particle. The ideal weight is one for leading vertex particles and zero for pileup particles. Since we are trying to estimate whether a particle is pileup or not given the available information, one can imagine that the weight may not be restricted to one and zero and can be a fractional value. Furthermore, even if one insists on assigning integer weights, in a detector environment neutral particles that are closer than the detector granularity will be treated as a single particle, leading to possible fractional weights.
In order to define weights, we first introduce the following quantity
$$\chi_i^2 = \Theta(\alpha_i - \bar{\alpha}_{PU}) \times \frac{(\alpha_i - \bar{\alpha}_{PU})^2}{\sigma_{PU}^2}, \qquad (2.7)$$
where $\Theta$ is the Heaviside step function. With eq. (2.7), particles with $\alpha_i$ at or below the pileup median get $\chi_i^2 = 0$, while large fluctuations above the median are very uncharacteristic of pileup and appropriately receive a weight close to 1. Any intermediate fluctuation above the median is assigned a fractional weight between zero and one. Whenever possible, the C variants of the quantities are used, and everywhere else the F variants are used. As seen in figure 1, the distribution of $\alpha$ for pileup looks roughly Gaussian. For this reason the quantity in eq. (2.7) resembles a $\chi^2$ variable with one degree of freedom (NDF = 1), as the notation suggests. In fact, interpreting this distribution as a $\chi^2$ distribution lends itself nicely to incorporating additional information, as is discussed in section 2.3. Particles are then assigned a weight given by
$$w_i = F_{\chi^2,\mathrm{NDF}=1}(\chi_i^2), \qquad (2.8)$$
where $F_{\chi^2}$ is the cumulative distribution function of the $\chi^2$ distribution. As anticipated, particles with $\chi_i^2 = 0$ receive a weight $w_i = 0$. Figure 2 shows the weight distributions for particles using both $\alpha^F$ (left) and $\alpha^C$ (right). As expected, the weights are closer to their true value when computed from $\alpha^C$.
Figure 2. The distribution of weights from eq. (2.8), over many events, for neutral particles $i$ with $p_T > 1$ GeV from the leading vertex (gray) and particles from pileup (blue) in a dijet sample. The weights are calculated using $\alpha_i^F$ (left) and $\alpha_i^C$ (right).
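The weight assignment can be sketched in a few lines of Python. This is an illustrative sketch under the one-sided $\chi^2$ reading described above, not the authors' code. It uses the standard closed form of the $\chi^2$ cumulative distribution function for one degree of freedom, $F(x) = \mathrm{erf}(\sqrt{x/2})$; the function names are hypothetical.

```python
import math

def chi2_ndf1_cdf(x):
    """CDF of the chi-square distribution with one degree of freedom."""
    return math.erf(math.sqrt(x / 2.0))

def puppi_weight(alpha_i, alpha_median, sigma):
    """Weight in [0, 1] from a particle's alpha and the pileup median/RMS.

    Particles at or below the pileup median (including alpha = -inf)
    get weight 0; the weight approaches 1 far above the median.
    sigma must be positive.
    """
    if alpha_i <= alpha_median:
        return 0.0
    chi2 = (alpha_i - alpha_median) ** 2 / sigma ** 2
    return chi2_ndf1_cdf(chi2)
```

For instance, a particle one $\sigma_{PU}$ above the median receives a weight of about 0.68, and one five $\sigma_{PU}$ above it receives a weight indistinguishable from 1 for practical purposes.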
At this point we could cut on the weight to decide whether or not to identify a particle as pileup and discard it from the event. In [22] it was found that rescaling the particles in subjets was able to correct kinematics and shapes. In light of this, we choose to use the weight in eq. (2.8) to rescale the particle's four-momentum. The complete algorithm proceeds as follows:
1. The values of $\alpha_i^F$ and $\alpha_i^C$ are computed for the particles in the event.
2. The median and left-side RMS of the $\alpha$ distributions are computed from the charged pileup particles in the event.
3. A weight $w_i$ is computed for each particle from eq. (2.8), using the C variants where tracking is available and the F variants elsewhere.
4. The four-momentum of each particle is rescaled by its weight, $p_i^\mu \to w_i \times p_i^\mu$.
5. Particles with small weights $w_i < w_{\mathrm{cut}}$ or with low (rescaled) transverse momentum $p_{Ti} < p_{T,\mathrm{cut}}$ are discarded.
6. The remaining set of rescaled particles is considered the pileup-corrected event.
Figure 3. The mean weight, over many events, of neutral particles from the leading vertex (red) and pileup (blue) as a function of the particle's $p_T$ in a dijet sample.
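The rescaling and cut steps at the end of the algorithm can be sketched as below, assuming the weights have already been computed. The four-momentum layout `(px, py, pz, e)` and the function name are assumptions for this example. The default $w_{\mathrm{cut}} = 0.1$ is the value quoted in the text, while the default $p_{T,\mathrm{cut}} = 0.1$ GeV is just the lower end of the quoted range.

```python
import math

def apply_weights(four_momenta, weights, w_cut=0.1, pt_cut=0.1):
    """Rescale four-momenta by their weights and drop pileup-like particles.

    four_momenta: list of (px, py, pz, e) tuples in GeV (hypothetical layout).
    weights: per-particle weights in [0, 1], same length as four_momenta.
    Particles with weight below w_cut, or with rescaled pT below pt_cut,
    are discarded; the rest form the pileup-corrected event.
    """
    corrected = []
    for (px, py, pz, e), w in zip(four_momenta, weights):
        if w < w_cut:
            continue
        px, py, pz, e = w * px, w * py, w * pz, w * e
        if math.hypot(px, py) < pt_cut:
            continue
        corrected.append((px, py, pz, e))
    return corrected
```

The returned list can then be passed to a jet algorithm or summed directly for event-level quantities, with no further pileup correction applied.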
Let us summarize the parameters of the algorithm. First, we have the cone size $R_0$, which specifies which particles are considered local. Neighboring particles inside a cone are the ones used to calculate $\alpha$. We also have an $R_{\min}$ cutoff, such that neighboring particles with $\Delta R < R_{\min}$ are not included in the computation of $\alpha$. In our studies we use $R_0 = 0.3$ and $R_{\min} = 0.02$. The choice of $R_{\min}$ is related to typical detector resolutions, as is discussed in section 3. Then we have a weight cut, $w_{\mathrm{cut}}$, below which particles are deemed pileup, and a $p_T$ cut, $p_{T,\mathrm{cut}}$. The precise choice of $w_{\mathrm{cut}}$ and $p_{T,\mathrm{cut}}$ depends mildly both on the expected amount of pileup that will be encountered and on detector parameters, such as calorimeter granularity. They can also, in general, be different for the central and forward regions. In our studies we use $w_{\mathrm{cut}} = 0.1$ and $p_{T,\mathrm{cut}} \simeq 0.1 - 1.0$ GeV (the exact value will be described in section 4). We have checked that the performance of the PUPPI algorithm depends weakly on the exact choice of these parameters, with a more significant degradation for much larger values of $R_0$.
One may note that information from the distribution of particles from the leading vertex is primarily ignored. This is in contrast to matrix-element-like methods like shower deconstruction [27][28][29], which aim to optimize discrimination power by using as much signal and background information as possible. The specifics of the distributions for leading vertex particles depend on the sample, so we choose not to use the information from the distribution. In this way, the algorithm is not optimized for any specific signal, but rather looks for general features like a parton shower-like structure, and we expect it to behave consistently across a range of signal topologies.

Incorporating additional information
Many pileup removal algorithms are designed assuming a perfect detector and in many cases it is not straightforward to fold in information related to detector efficiencies or limitations. Using the PUPPI algorithm, experimental information can be used to directly modify the weight that is assigned to a particle. If one interprets the weight as a probability the particle is from the leading vertex (this will be discussed further in section 5), then vertex reconstruction efficiencies, for example, may affect this probability.
One advantage of the $\chi^2$ approximation presented above is that it provides a scheme for calculating the weight based on experimental input. We further make the assumptions that the experimental information is Gaussian-distributed and independent both from the computation of $\alpha$ and from other experimental information. Under these assumptions we can extend the $\chi^2_{\mathrm{NDF}=1}$ approximation to a $\chi^2_{\mathrm{NDF}=N}$ approximation by adding the independent $\chi^2$ contributions. The weight is then appropriately adjusted by evaluating the cumulative distribution function of the $\chi^2$ distribution with $N$ degrees of freedom. Experimental information that may be useful includes tracking information, calorimeter depth information, and timing information.
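A minimal Python sketch of this extension is given below, assuming each independent input has already been converted into its own $\chi^2$ contribution; how each contribution is built is left open here. The $\chi^2$ cumulative distribution function for a general number of degrees of freedom is evaluated with the standard power series for the regularized lower incomplete gamma function, so only the standard library is needed. Function names are hypothetical.

```python
import math

def chi2_cdf(x, ndf):
    """CDF of the chi-square distribution with ndf degrees of freedom.

    Evaluates the regularized lower incomplete gamma function P(ndf/2, x/2)
    by its power series, which is adequate for moderate x.
    """
    if x <= 0.0:
        return 0.0
    a, z = 0.5 * ndf, 0.5 * x
    term, total, n = 1.0, 1.0, 0
    while term > 1e-16 * total and n < 10000:
        n += 1
        term *= z / (a + n)
        total += term
    prefactor = math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))
    return min(1.0, total * prefactor)

def combined_weight(chi2_terms):
    """Weight from N independent chi-square contributions (one per input)."""
    if not chi2_terms:
        raise ValueError("need at least one chi-square contribution")
    return chi2_cdf(sum(chi2_terms), len(chi2_terms))
```

With a single contribution this reduces to the NDF = 1 weight; each additional independent input adds one term to the sum and one degree of freedom.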

Choice of metric
In separating pileup from leading vertex particles, it is necessary to identify features that distinguish between them. Here we consider leading vertex particles to originate from a parton shower. While the detailed jet structure will depend on the hard process, in the soft and collinear limit the parton shower is universal. In particular, it includes a soft and collinear singularity leading to the observed collinear structure of jets. Pileup, on the other hand, contains no hard scale and has no perturbative collinear structure. This motivates the use of a metric
$$\xi_{ij} = \frac{p_{Tj}}{(\Delta R_{ij})^{\beta}}, \qquad (2.11)$$
where this work uses $\beta = 1$. Particles from a parton shower are expected to have a small $\Delta R$ in relation to other particles from the shower, while pileup has no perturbative preference for small $\Delta R$. The inclusion of $p_{Tj}$ in the numerator is useful for the case where one sums over all particles. Here, the leading vertex contribution will dominate because the $p_T$ spectrum of pileup falls much faster than that of leading vertex particles, resulting in the pileup contribution to $\alpha$ being suppressed by $p_T$. In eq. (2.11), $\beta$ allows one to tune the relative importance of $p_{Tj}$ vs. $\Delta R_{ij}$. We have tried many metrics, including those not in the form of eq. (2.11), and find the one used here to perform best among them.

One obvious question that may still arise in the choice of the metric is why $p_{Ti}$ is not used. After all, we have already stated that the $p_T$ spectrum of pileup falls much faster than the $p_T$ spectrum of leading vertex particles. The reason for its exclusion from the metric is twofold. Firstly, it is already used in the algorithm: after weights are assigned and particles are rescaled, a cut on $p_{Ti}$ is made. Secondly, one of the reasons we find the weights useful, as opposed to just a $p_T$ cut, is that the weights tend not to be strongly correlated with $p_T$ in pileup, as shown in figure 3. In this way, $\alpha$ uses information complementary to a $p_{Ti}$ cut alone. In particular, in trying different metrics, we did try $\alpha_i = p_{Ti}$. We found its performance to be decent; however, it degraded more quickly than the $\Delta R$-based metric when calorimeter cell discretization was introduced.

Simulation details
In order to study the performance of our algorithm and compare it to existing methods, we use a sample of $pp \to$ dijet at $\sqrt{s} = 14$ TeV, unless specified otherwise. Events are generated with Pythia 8.176 [26], tune 4C [30,31]. The spectrum is generated such that the $p_T$ of the $2 \to 2$ process is roughly flat across the range $15 - 500$ GeV. This is done in order to maintain reasonable statistics across a range of jet $p_T$ values and to demonstrate the method's utility across different kinematic regimes. Pileup events are generated as zero-bias soft QCD events using Pythia and overlaid onto the hard scatter event. The number $n_{PU}$ specifies the exact number of pileup interactions. We take as a baseline scenario $n_{PU} = 80$ pileup interactions overlaid, and several results in section 4 consider this scenario. We also consider performance versus $n_{PU}$. Sections 4.1 and 4.2 show results on jet kinematics and shapes. In section 4.3 we show the algorithm's performance on missing transverse energy ($E_T^{\mathrm{miss}}$). In that section the sample used is $pp \to Z$ + jets at $\sqrt{s} = 14$ TeV, where the $Z$ decays invisibly to neutrinos. In order to focus the performance study on pileup mitigation, underlying event is not included in the simulation.
We reconstruct particles in a naive detector simulation. The detector extends to |η| < 5.0 and includes a perfect tracker for |η| < 2.5. The perfect tracker exactly identifies if a charged hadron is from the leading vertex or from a pileup vertex (in contrast to a real tracker where misidentifications are possible) and has perfect spatial resolution. Neutral particles are discretized into calorimeter cells of size 0.1 × 0.1 in the ηφ-plane.
Selecting an appropriate value for $R_{\min}$ is closely tied to the properties of the detector. The detector itself restricts cells from being closer than approximately $r_{\mathrm{cell}} = 0.1$ to each other. Similarly, in a real detector the tracking efficiency degrades for tracks closer than $r_{\mathrm{track}} \simeq 0.02$ to each other, because for such close pairs it becomes possible that one of the tracks is lost. The distance between cells and tracks, on the other hand, is not necessarily regulated by the detector and could be as small as zero. Thus $R_{\min}$ directly regulates the cell-track distances and should be chosen as $R_{\min} = \min(r_{\mathrm{track}}, r_{\mathrm{cell}}) = 0.02$ to be consistent with resolutions. We use $R_{\min} = 0.02$ in our simulation for consistency with all inter-object distances and to mock up the effect of track resolution.
Where particles are clustered into jets, we use Fastjet 3.0.5 [32] and the anti-$k_T$ algorithm [33] with a radius of $R = 0.7$ (AK7). While smaller jet radii are more common in phenomenological studies, larger jets receive more pileup contamination and are commonplace in substructure studies where correcting more than only a jet's $p_T$ becomes important [34][35][36]. We choose $R = 0.7$ as a compromise between these applications.
We define four particle collections from which we can derive algorithmic performance. They are:
• LV: only particles from the leading vertex.
• PFlow: all particles in the event including those from the leading vertex and pileup. These are the inputs that would be used in particle flow.
• PFlowCHS: all particles in the event except for charged particles from pileup (within the tracker volume). This corresponds to particle flow with charged hadron subtraction.
• PUPPI: the resulting rescaled particles from the algorithm described in section 2.
The PFlowCHS particle collection can be considered the current experimental state of the art. Wherever jet quantities are shown, we also apply four-vector subtraction to PFlow and PFlowCHS inputs, as will be described in the following section.
In figure 4 we show a sample of an event display with n PU = 80 for the four particle collections we consider. Particles from the leading vertex are drawn with filled squares and colored according to their p T . Particles from pileup are drawn with unfilled, uncolored squares with their size logarithmically proportional to their p T . The unfilled colored circles show anti-k T R = 0.7 jets where the colors denote the p T bin. The bins 25 − 50 GeV, 50−200 GeV, and > 200 GeV correspond to colors of magenta, cyan, and blue respectively.
The LV plot (top left) shows the original uncontaminated event. The PFlow plot (top right) shows the effect of all pileup particles being added to the event. The PFlowCHS plot (bottom left) shows a reduced pileup density in central region where charged pileup has been removed. The PUPPI plot (bottom right) is an event display that reproduces not only the hard jets from the LV collection, but also manages to capture features outside of the jets and remove a large portion of the pileup completely. The p T of the jets from PFlow and PFlowCHS are area subtracted.

Results
In this section we study the performance of the PUPPI algorithm on several jet and event observables. Where jets are clustered using the PFlow collection, they are corrected using the "safe" modification of four-vector subtraction [37].$^{7,8}$ Subtraction is also applied to jets clustered from the PFlowCHS collection. In the tracking region for PFlowCHS, charged pileup is already removed, so ρ is calculated only from neutral particles. In the forward region, ρ is computed using all particles. The jet clustering procedure is run separately on each particle collection.
$^7$ The results can differ based on the variant of four-vector subtraction used; however, the qualitative conclusions remain unchanged. In this work we use a modified version of four-vector subtraction presented in [37] which forbids negative masses by setting the mass of (sub)jets to zero in certain cases.
$^8$ We include corrections due to hadron masses following the method proposed in [12].
For PUPPI we make the following parameter choices: In particular the p T,cut value has a weak dependence on the amount of pileup in the event and will depend on the granularity and energy resolution of a particular detector. We tune the values of this cut for our mock detector to minimize the offset between reconstructed observables and LV observables (see e.g. missing transverse energy in figure 10).

Jet kinematics
We start by looking at the jet multiplicity as a function of pseudorapidity shown in figure 5 for n PU = 80. Here all jets with p T > 25 GeV after the pileup correction techniques are applied are considered. We see that in pseudorapidity regions where pileup correction is solely from subtraction the jet multiplicity tends to be too high. This is primarily from high density regions of pileup resulting in pileup jets. For PFlow this occurs across the full rapidity range, while for PFlowCHS this only occurs in the forward region where charged hadrons cannot be removed. The PUPPI jet distribution matches the LV distribution well across pseudorapidity. Next we compare the jet p T resolution across the methods. We define the resolution of an observable O from the particle collection P to be Additionally, in plots where the resolution is cited as fitted σ, we adopt the common practice of fitting the distribution to a Gaussian and using the standard deviation as the resolution. To compare jets from different collections, one needs a scheme to match jets. We consider jets from two collections matched if they are within ∆R = 0.3 of each other. Figure 6 (left) shows the p T resolution for central jets with p T between 100 and 200 GeV. The p T resolution of PUPPI is roughly 1.5 times better than PFlowCHS and 2.5 times better than PFlow. Figure 6 (right) shows the p T resolution for forward jets with p T between 25 and 50 GeV. In both cases the p T response also tends to be more symmetric than PFlow and PFlowCHS. Despite the fact that there is no tracking information in the forward region, the PUPPI algorithm is able to maintain an improvement over subtraction even in the forward region. We also note that the improvement in PFlowCHS over PFlow in the forward region is due to the partial tracking information that PFlowCHS jets have near the tracker boundary.
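The jet matching used to compare collections can be sketched in Python as follows. The text only states the $\Delta R = 0.3$ criterion, so matching each jet to its closest reference jet, with no attempt to enforce one-to-one matches, is an assumption of this sketch. The `(pt, eta, phi)` tuple layout and the function names are hypothetical.

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Distance in eta-phi space, with the phi difference wrapped to [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)

def match_jets(jets, ref_jets, max_dr=0.3):
    """Return (i, j) index pairs matching jets[i] to its closest ref_jets[j].

    jets, ref_jets: lists of (pt, eta, phi) tuples (hypothetical layout).
    A jet is left unmatched if no reference jet lies within max_dr.
    Assumption: closest-match, without enforcing one-to-one pairing.
    """
    pairs = []
    for i, (_, eta, phi) in enumerate(jets):
        best_j, best_dr = None, max_dr
        for j, (_, ref_eta, ref_phi) in enumerate(ref_jets):
            dr = delta_r(eta, phi, ref_eta, ref_phi)
            if dr < best_dr:
                best_j, best_dr = j, dr
        if best_j is not None:
            pairs.append((i, best_j))
    return pairs
```

Given the matched pairs, the per-jet difference between a collection and the LV reference can be histogrammed to study the response.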
Next we show the $p_T$ resolution as a function of $p_T$ for central jets in figure 7 (left). We show that the improvements found above hold across a wide kinematic range. In figure 7 (right) we show the $p_T$ resolution as a function of the number of pileup interactions. For low levels of pileup we see that the PUPPI algorithm does not offer much of an improvement over existing methods. This is for two reasons. Firstly, at low levels of pileup there is not much improvement to make. Secondly, in low pileup environments, there is less information available locally, simply due to the lack of pileup. This means the $\alpha$ distribution is not as well populated and the uncertainty on $\sigma_{PU}$ is larger.

Jet shapes
Similar to our study of p T distributions, we can study resolution and its pileup dependence for jet shapes. Here we show results for jet mass which is considered a reasonable proxy for generic jet shapes and is used in many applications such as boosted object tagging (see [34][35][36] and references therein). First we look at jet mass for central jets with 100 GeV < p T < 200 GeV. The distribution is shown in figure 8 (left). Here we see that PUPPI is not only able to correct the mean of the distribution, but also the distribution itself. Figure 8 (right) shows the results of PUPPI on trimmed mass. Trimming is performed on jets from all collections, including LV, using r sub = 0.2 and f cut = 0.05. For jets from PFlow and PFlowCHS subtraction is applied to the trimmed jet. Even with the application of grooming, PUPPI distributions do a consistent job of restoring distributions near to their LV distributions. We regard this as a positive indication that PUPPI is returning a consistent event interpretation.
In figure 9 (left) we show the mass resolution for jets with $p_T$ between 100 GeV and 200 GeV at $n_{\mathrm{PU}} = 80$. We find that the PUPPI jet mass resolution is improved with respect to the other inputs. Figure 9 (right) plots the mass resolution as a function of the number of pileup interactions, where the mass resolution from PUPPI is relatively stable as a function of $n_{\mathrm{PU}}$.

Missing transverse energy
We now look at an event quantity, the missing transverse energy ($E_T^{\mathrm{miss}}$), which is interesting from both a theoretical and an experimental point of view. From the theoretical perspective, missing transverse energy is one of the main signatures of new physics. For example, in R-parity conserving supersymmetry, in every event in which superpartners are pair-produced, the two lightest supersymmetric particles in the final state appear as missing transverse energy. Additionally, for standard model measurements, $E_T^{\mathrm{miss}}$ plays a role in many analyses, such as the $W$ mass measurement [38], the Higgs to $WW$ discovery [39,40] and the Higgs to $\tau\tau$ evidence [41,42]. On the experimental side, $E_T^{\mathrm{miss}}$ is challenging because it compounds errors from the measurement of all objects in the event, pileup and non-pileup alike. In the presence of pileup, the $E_T^{\mathrm{miss}}$ resolution rapidly degrades because the full energy of the additional pileup events is incorporated into the event [6,9]. Reducing the impact of pileup on the $E_T^{\mathrm{miss}}$ resolution is typically more difficult than for jets, because traditional approaches that work on jets break down. The pileup component of events has a natural tendency to have near-zero $E_T^{\mathrm{miss}}$. Applying a method that reconstructs $E_T^{\mathrm{miss}}$ from a fraction of the particles in the event, e.g. charged hadron subtraction, breaks the cancellation between neutral and charged pileup particles, resulting in large distortions in $E_T^{\mathrm{miss}}$ measurements. In order to mitigate the effects of pileup in $E_T^{\mathrm{miss}}$, both ATLAS and CMS have resorted to approaches that rely on combinations of various methods of calculating $E_T^{\mathrm{miss}}$ [6,9,43]. Such methods, either through a linear combination of different $E_T^{\mathrm{miss}}$ variables or through a boosted decision tree regression, can reduce the pileup dependence of the $E_T^{\mathrm{miss}}$ resolution by a factor of roughly three.
These calculations are typically quite elaborate and rely on the commissioning of 10 to 20 additional $E_T^{\mathrm{miss}}$-related variables.
To compare the performance of $E_T^{\mathrm{miss}}$ observables, we use a sample of events with large hadronic recoil and well-defined $E_T^{\mathrm{miss}}$, in this case $pp \to Zj$ where the $Z$ decays to neutrinos and has transverse momentum in the center-of-mass frame $p_T(Z) > 350$ GeV. The missing transverse energy is constructed from the negative vector sum of the particle transverse momenta,
$$\vec{E}_T^{\mathrm{miss}} = -\sum_i \vec{p}_{T,i},$$
where the length of this vector is denoted $E_T^{\mathrm{miss}}$. Another related variable is the scalar sum of transverse energies,
$$\sum E_T = \sum_i E_{T,i}. \tag{4.5}$$
We show the $\sum E_T$ resolution in figure 10 (left), where we see that the PUPPI algorithm noticeably improves the $\sum E_T$ resolution over PFlow and PFlowCHS. While the fact that neither PFlow nor PFlowCHS is centered around zero is not in itself an issue, the fact that PUPPI is centered around zero supports the claim that applying PUPPI produces a consistent event interpretation without the need for further pileup correction.
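The two event-level quantities just defined can be computed directly from a particle list, as in the sketch below. It assumes particles are hypothetical `(pt, phi)` tuples, approximates each particle's transverse energy by its $p_T$ (i.e. treats particles as massless), and accepts an optional list of per-particle weights so that the same functions can be applied to a reweighted collection. The function names are illustrative.

```python
import math

def missing_et(particles, weights=None):
    """Return (x, y, magnitude) of the negative vector sum of transverse momenta.

    particles: list of (pt, phi) tuples; weights: optional per-particle scale
    factors (all 1.0 if omitted).
    """
    if weights is None:
        weights = [1.0] * len(particles)
    met_x = -sum(w * pt * math.cos(phi) for (pt, phi), w in zip(particles, weights))
    met_y = -sum(w * pt * math.sin(phi) for (pt, phi), w in zip(particles, weights))
    return met_x, met_y, math.hypot(met_x, met_y)

def sum_et(particles, weights=None):
    """Scalar sum of transverse energies, using pT as a massless approximation."""
    if weights is None:
        weights = [1.0] * len(particles)
    return sum(w * pt for (pt, _), w in zip(particles, weights))
```

Two back-to-back particles of equal $p_T$ give a missing transverse energy of zero but a non-zero scalar sum, which is why the two observables respond differently to residual pileup.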
To compare the $E_T^{\mathrm{miss}}$ resolution, we look at the resolution of the x-component of $E_T^{\mathrm{miss}}$, shown in figure 10 (right). The relevance of this quantity to phenomenology is more direct, as it is one component of the $E_T^{\mathrm{miss}}$ vector. Both the length and direction of the $E_T^{\mathrm{miss}}$ vector are important discriminating variables in many new physics searches, so it is plausible that a small signal could be washed out by poor $E_T^{\mathrm{miss}}$ resolution or non-unity $E_T^{\mathrm{miss}}$ response. We find that in our simplified set-up PUPPI displays improvements over PFlow and PFlowCHS. In fact, the PFlowCHS resolution is degraded with respect to PFlow. This effect is due to the observation above that the partial removal of pileup interactions can lead to a larger $E_T^{\mathrm{miss}}$ resolution. For the pileup-reduced $E_T^{\mathrm{miss}}$ computations in CMS [9], it was found that the key component in reducing the pileup dependence of the $E_T^{\mathrm{miss}}$ resolution was the identification and (indirect) removal of pileup jets from the $E_T^{\mathrm{miss}}$ calculation. With PUPPI, pileup jet removal is naturally built into the algorithm, thereby allowing for a simplified approach to pileup mitigation in $E_T^{\mathrm{miss}}$-related quantities. We expect that, given the algorithm's flexibility in using experimental information, the improvement will persist in the full detector environment.
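The effect of partial pileup removal on the missing transverse energy can be illustrated with a toy example. The sketch below is not a physics simulation: it assumes pileup consists of exactly back-to-back particle pairs (an idealization of the near-zero pileup $E_T^{\mathrm{miss}}$), labels each particle as charged with a hypothetical probability of 0.6, and then removes the charged ones. All names and numerical values are illustrative.

```python
import math
import random

def met(particles):
    """Length of the negative vector sum of transverse momenta.

    particles: list of (pt, phi, is_charged) tuples.
    """
    px = -sum(pt * math.cos(phi) for pt, phi, _ in particles)
    py = -sum(pt * math.sin(phi) for pt, phi, _ in particles)
    return math.hypot(px, py)

def toy_pileup(n_pairs, charged_fraction=0.6, seed=1):
    """Generate balanced back-to-back pairs, each particle randomly charged."""
    rng = random.Random(seed)
    particles = []
    for _ in range(n_pairs):
        pt = rng.uniform(0.5, 2.0)
        phi = rng.uniform(-math.pi, math.pi)
        for direction in (phi, phi + math.pi):
            is_charged = rng.random() < charged_fraction
            particles.append((pt, direction, is_charged))
    return particles

pileup = toy_pileup(200)
neutral_only = [p for p in pileup if not p[2]]

met_all = met(pileup)            # balanced by construction
met_neutral = met(neutral_only)  # cancellation broken by removing charged particles
```

In this toy, the full collection balances by construction, while the neutral-only subset generally does not, since many pairs lose only one of their two members.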

Summary and outlook
In this paper, we have introduced a new algorithm, PUPPI, for removing pileup contamination. This method employs a per-particle approach and improves the reconstruction of not only jet quantities, but also of event-wide observables like missing transverse energy. PUPPI operates by using charged pileup to characterize the pileup in an event and then uses that knowledge to assign a weight to particles of unknown origin, like neutral hadrons or any particle in the forward region. The weight is used to rescale the particle's four-momentum. The parameters of the algorithm are the size of the cone used to define neighboring particles $R_0$, the minimum distance cutoff $R_{\mathrm{min}}$, the cut on the weights $w_{\mathrm{cut}}$, and the cut on the rescaled transverse momentum $p_{T,\mathrm{cut}}$.
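The last stage of the procedure summarized above, rescaling four-momenta by the weights and applying the two cuts, can be sketched as follows. The sketch assumes the per-particle weights have already been computed elsewhere, that particles are hypothetical `(px, py, pz, e)` tuples, and that a particle is discarded when its weight is below $w_{\mathrm{cut}}$ or its rescaled $p_T$ is below $p_{T,\mathrm{cut}}$. The numerical cut values in the usage lines are placeholders, not the values used in this paper.

```python
import math

def apply_weights(particles, weights, w_cut, pt_cut):
    """Rescale each four-momentum by its weight and drop particles failing the cuts.

    particles: list of (px, py, pz, e); weights: list of floats in [0, 1].
    """
    rescaled = []
    for (px, py, pz, e), w in zip(particles, weights):
        if w < w_cut:
            continue  # too pileup-like: discard before rescaling
        scaled = (w * px, w * py, w * pz, w * e)
        if math.hypot(scaled[0], scaled[1]) < pt_cut:
            continue  # rescaled transverse momentum too small
        rescaled.append(scaled)
    return rescaled

# Placeholder inputs and cut values, for illustration only.
example_particles = [(10.0, 0.0, 0.0, 10.0), (8.0, 0.0, 0.0, 8.0), (1.0, 0.0, 0.0, 1.0)]
example_weights = [0.5, 0.05, 0.5]
kept = apply_weights(example_particles, example_weights, w_cut=0.1, pt_cut=1.0)
```

The surviving, rescaled particles can then be passed to jet clustering or to event-level sums without any further jet-based correction.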
By applying corrections at the particle level, before jet clustering, we can simultaneously perform pileup jet mitigation, and jet four-vector and shape corrections. We have shown the improvement of PUPPI over existing methods by studying jet $p_T$, mass, and missing transverse energy over a wide range of jet $p_T$ and number of pileup interactions. Also, our method can be applied both in the central region of the detector (where tracking information is available) and in the forward region.
In this work we have introduced the simplest form of the algorithm. However, many modifications and extensions are worth further exploration. In particular, we have shown results for a single choice of metric, a particular weighting function, and a choice of how to use the weights. Further modifications considered for the metric can include a combination of discrimination power from a selection of metrics into a common multivariate discriminant. Preliminary studies with a boosted decision tree show modest improvements, although we leave it to future work to fully explore this avenue.
With regard to the particle weights, we have elected to allow fractional weights and chosen to use them to rescale four-vectors. It is not obvious that a four-vector rescaling is the optimal procedure to implement. As a simplification, one could restrict weights to zero or one, in which case no rescaling is performed and particles are either kept or discarded. Taking a step in the opposite direction, one could interpret the weights as probabilities that a particle should be kept in the event. Given a probabilistic interpretation of weights, a natural approach would follow along the lines of Qjets [44,45], where a given event would yield many event interpretations, with particles either kept or discarded according to their probabilistic weight. All observables for a single event would then become distributions. We leave this study for future work.
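The two alternative uses of the weights mentioned above can be sketched in a few lines. This is only an illustration of the idea, not a studied implementation: the threshold of 0.5 for the binary choice, the function names, and the use of independent per-particle draws are all assumptions of the sketch.

```python
import random

def binary_selection(particles, weights, threshold=0.5):
    """Keep-or-discard variant: no rescaling, particles pass if weight >= threshold."""
    return [p for p, w in zip(particles, weights) if w >= threshold]

def sample_interpretations(particles, weights, n_samples, seed=0):
    """Probabilistic variant: each interpretation keeps particle i with probability w_i."""
    rng = random.Random(seed)
    return [
        [p for p, w in zip(particles, weights) if rng.random() < w]
        for _ in range(n_samples)
    ]

def observable_distribution(interpretations, observable):
    """Evaluate an observable on every interpretation, giving a per-event distribution."""
    return [observable(event) for event in interpretations]
```

For example, passing `len` as the observable yields the distribution of particle multiplicities across interpretations of a single event; any function of the kept particles, such as a jet mass, could be used in the same way.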
Though we frame our studies within a "particle flow"-like set of inputs, the method is not restricted to inputs of this type. If we consider calorimeter clusters rather than particles as inputs, we can similarly compute the distance of tracks to a given calorimeter cluster $i$ within the tracking volume. For the forward region, we can then consider nearby calorimeter clusters. The challenge is to identify pure pileup clusters; we expect this can be achieved using tracks from the non-leading vertices.
This method may also be applied to heavy ion events, where energy densities of the underlying event are similar to the levels of pileup studied. In this case, however, all particles originate from the leading vertex. One can use a modified version of PUPPI in which the leading vertex constraint is not applied in the algorithm.
Finally, given the performance of PUPPI on jet mass and $E_T^{\mathrm{miss}}$, we are optimistic that the PUPPI algorithm will be useful in improving pileup mitigation of other jet and event shapes, and more generally in the identification of other physics objects. For instance, given a pileup-corrected event, it is reasonable to expect that identifying isolated photons or leptons will be improved using a per-particle weighting scheme.
While PUPPI was being developed, another particle-level pileup removal method called SoftKiller [46] was proposed. Preliminary comparisons indicate comparable performance [47].