FastJet user manual

FastJet is a C++ package that provides a broad range of jet finding and analysis tools. It includes efficient native implementations of all widely used 2-to-1 sequential recombination jet algorithms for pp and e+e- collisions, as well as access to 3rd party jet algorithms through a plugin mechanism, including all currently used cone algorithms. FastJet also provides means to facilitate the manipulation of jet substructure, including some common boosted heavy-object taggers, as well as tools for estimation of pileup and underlying-event noise levels, determination of jet areas and subtraction or suppression of noise in jets.


Introduction
Jets are the collimated sprays of hadrons that result from the fragmentation of a high-energy quark or gluon. They tend to be visually obvious structures when one looks at an experimental event display, and by measuring their energy and direction one can approach the idea of the original "parton" that produced them. Consequently jets are both an intuitive and quantitatively essential part of collider experiments, used in a vast array of analyses, from new physics searches to studies of Quantum Chromodynamics (QCD). For any tool to be so widely used, its behaviour must be well defined and reproducible: it is not sufficient that one be able to visually identify jets, but rather one should have rules that project a set of particles onto a set of jets. Such a set of rules is referred to as a jet algorithm. Usually a jet algorithm involves one or more parameters that govern its detailed behaviour. The combination of a jet algorithm and its parameters is known as a jet definition. Suitable jet definitions can be applied to particles, calorimeter towers, or even to the partonic events of perturbative QCD calculations, with the feature that the jets resulting from these different kinds of input are not just physically close to the concept of partons, but can be meaningfully be compared to each other.
Jet finding dates back to seminal work by Sterman and Weinberg [1] and several reviews have been written describing the various kinds of jet finders, their different uses and properties, and even the history of the field, for example [2,3,4,5,6].
It is possible to classify most jet algorithms into one of two broad classes: sequential recombination algorithms and cone algorithms.
Sequential recombination algorithms usually identify the pair of particles that are closest in some distance measure, recombine them, and then repeat the procedure over and again, until some stopping criterion is reached. The distance measure is usually related to the structure of divergences in perturbative QCD. The various sequential recombination algorithms differ mainly in their particular choices of distance measure and stopping criterion.
Cone algorithms put together particles within specific conical angular regions, notably such that the momentum sum of the particles contained in a given cone coincides with the cone axis (a "stable cone"). Because QCD radiation and hadronisation leaves the direction of a parton's energy flow essentially unchanged, the stable cones are physically close in direction and energy to the original partons. Differences between various cone algorithms are essentially to do with the strategy taken to search for the stable cones (e.g. whether iterative or exhaustive) and the procedure used to deal with cases where the same particle is found in multiple stable cones (e.g. split-merge procedures).
One of the aims of the FastJet C++ library is to provide straightforward, efficient implementations for the majority of widely used sequential-recombination algorithms, both for hadron-hadron and e + e − colliders, and easy access also to cone-type jet algorithms. It is distributed under the terms of the GNU General Public License (GPL), either version 2 of the License [7] or (at your option) any later version.
To help introduce the terminology used throughout FastJet and this manual, let us consider the longitudinally-invariant k t algorithm for hadron colliders [8,9]. This was the first jet algorithm to be implemented in FastJet [10] and its structure, together with that of other sequential recombination algorithms, has played a key role in the design of FastJet's interface. The k t algorithm involves a (symmetric) distance measure, d ij , between all pairs of particles i and j, d ij = d ji = min(p 2 ti , p 2 tj ) where p ti is the transverse momentum of particle i with respect to the beam (z) direction and ∆R 2 ij = (y i − y j ) 2 + (φ i − φ j ) 2 , with y i = 1 2 ln E i +p zi E i −p zi and φ i respectively i's rapidity and azimuth. The k t algorithm also involves a distance measure between every particle i and the beam R in eq. (1), usually called the jet radius, is a parameter of the algorithm that determines its angular reach. In the original, so-called "exclusive" formulation of the k t algorithm [8] (proposed with R ≡ 1), one identifies the smallest of the d ij and d iB . If it is a d ij , one replaces i and j with a single new object whose momentum is p i + p j -often this object is called a "pseudojet", since it is neither a particle, nor yet a full jet. 1 If instead the smallest distance is a d iB , then one removes i from the list of particles/pseudojets and declares it to be part of the "beam" jet. One repeats this procedure until the smallest d ij or d iB is above some threshold d cut ; all particles/pseudojets that are left are then that event's (non-beam) jets. 2 In the "inclusive" formulation of the k t algorithm [9], the d ij and d iB distances are the same as above. The only difference is that when a d iB is smallest, then i is removed from the list of particles/pseudojets and added to the list of final "inclusive" jets (this is instead of being incorporated into a beam jet). There is no d cut threshold and the clustering continues until no particles/pseudojets remain. Of the final jets, generally only those above some transverse momentum are actually used. 3 Because the distance measures are the same in the inclusive and exclusive algorithms, the clustering sequence is common to both formulations (at least up to d cut ), a property that will be reflected in FastJet's common interface to both formulations.
Having seen these descriptions, the reader may wonder why a special library is needed for sequential-recombination jet finding. Indeed, the k t algorithm can be easily implemented in just a few dozen lines of code. The difficulty that arises, however, is that at hadron colliders, clustering is often performed with several hundreds or even thousands of particles. Given N particles, there are N (N − 1)/2 d ij distances to calculate, and since one must identify the smallest of these O (N 2 ) distances at each of O (N ) iterations of the algorithm, original implementations of the k t algorithm [11,12] involved O (N 3 ) operations to perform the clustering. In practice this translates to about 1 s for N = 1000. Given that events with pileup can have multiplicities significantly in excess of 1000 and that it can be necessary to cluster hundreds of millions of events, N 3 timing quickly becomes prohibitive, all the more so in time-critical contexts such as online triggers. To alleviate this problem, FastJet makes use of the observation [10] that the smallest pairwise distance remains the same if one uses the following alternative (non-symmetric) d ij distance measure: For a given i, the smallest of the d ij is simply found by choosing the j that minimises the ∆R ij , i.e. by identifying i's geometrical nearest neighbour on the y − φ cylinder. Geometry adds many constraints to closest pair and nearest neighbour type problems, e.g. if i is geometrically close to k and j is geometrically close to k, then i and j are also geometrically close; such a property is not true for the d ij . The factorisation of the problem into momentum and geometrical parts makes it possible to calculate and search for minima among a much smaller set of distances. This is sufficiently powerful that with the help of the external Computational Geometry Algorithms Library (CGAL) [13] (specifically, its Delaunay triangulation modules), FastJet achieves expected N ln N timing for many sequential recombination algorithms. This N ln N strategy is supplemented in FastJet with several other implementations, also partially based on geometry, which help optimise clustering speed up to moderately large multiplicities, N 30000. The timing for N = 1000 is thus reduced to a few milliseconds. The same techniques apply to a range of sequential recombination algorithms, described in section 4.
At the time of writing, sequential recombination jet algorithms are the main kind of algorithm in use at CERN's Large Hadron Collider (LHC), notably the anti-k t algorithm [14], which simply replaces p 2 t with p −2 t in eqs. (1,2). Sequential recombination algorithms were also widely used at HERA and LEP. However at Fermilab's Tevatron, and in much preparatory LHC work, cone algorithms were used for nearly all studies. For theoretical and phenomenological comparisons with these results, it is therefore useful to have straightforward access also to cone algorithm codes. The main challenge that would be faced by someone wishing to write their own implementation of a given cone algorithm comes from the large number of details that enter into a full specification of such algorithms, e.g. the precise manner in which stable cones are found, or in which the split-merge step is carried out. The complexity is such that in many cases the written descriptions that exist of specific cone algorithms are simply insufficient to allow a user to correctly implement them. Fortunately, in most cases, the authors of cone algorithms have provided public implementations and these serve as a reference for the algorithm. While each tends to involve a different interface, a different 4-momentum type, etc., FastJet has a "plugin" mechanism, which makes it possible to provide a uniform interface to these different third party jet algorithms. Many plugins (and the corresponding third party code) are distributed with FastJet. Together with the natively-implemented sequential-recombination algorithms, they ensure easy access to all jet algorithms used at colliders in the past decade (section 5). Our distribution of this codebase is complemented with some limited curatorial activity, e.g. solving bugs that become apparent when updating compiler versions, providing a common build infrastructure, etc.
In the past few years, research into jets has evolved significantly beyond the question of just "finding" jets. This has been spurred by two characteristics of CERN's LHC experimental programme. The first is that the LHC starts to probe momentum scales that are far above the the electroweak scale, M EW , e.g. in the search for new particles or the study of high-energy W W scattering. However, even in events with transverse momenta M EW , there can simultaneously be hadronic physics occurring on the electroweak scale (e.g. hadronic W decays). Jet finding then becomes a multi-scale problem, one manifestation of which is that hadronic decays of W's, Z's and top quarks may be so collimated that they are entirely contained within a single jet. The study of this kind of problem has led to the development of a wide array of jet substructure tools for identifying "boosted" object decays, as reviewed in [15]. As was the situation with cone algorithms a few years ago, there is considerable fragmentation among these different tools, with some public code available from a range of different sources, but interfaces that differ from one tool to the next. Furthermore, the facilities provided with version 2 of FastJet did not always easily accommodate tools to manipulate and transform jets. Version 3 of FastJet aims to improve this situation, providing implementations of the most common jet substructure tools 4 and a framework to help guide third party authors who wish to write further such tools using a standard interface (section 9). In the near future we also envisage the creation of a FastJet "contrib" space, to provide a common location for access to these new tools as they are developed.
The second characteristic of the LHC that motivates facilities beyond simple jet finding in FastJet is the need to use jets in high-noise environments. This is the case for proton-proton (pp) collisions, where in addition to the pp collision of interest there are many additional soft "pileup" pp collisions, which contaminate jets with a high density of low-momentum particles. A similar problem of "background contamination" arises also for heavy-ion collisions (also at RHIC) where the underlying event in the nuclear collision can generate over a TeV of transverse momentum per unit rapidity, part of which contaminates any hard jets that are present. One way of correcting for this involves the use of jet "areas", which provide a measure of a given jet's susceptibility to soft contamination. Jet areas can be determined for example by examining the clustering of a large number of additional, infinitesimally soft "ghost" particles [17]. Together with a determination of the level of pileup or underlying-event noise in a specific event, one can then perform event-by-event and jet-by-jet subtraction of the contamination [18,19]. FastJet allows jet clustering to be performed in such a way that the jet areas are determined at the same time as the jets are identified, simply by providing an "area definition" in addition to the jet definition (section 7). Furthermore it provides the tools needed to estimate the density of noise contamination in an event and to subtract the appropriate amount of noise from each jet (section 8). The interface here shares a number of characteristics with the substructure tools, some of which also serve to remove noise contamination. Both the substructure and pileup removal make use also of a "selectors" framework for specifying and combining simple cuts (section 6).
While FastJet provides a broad range of facilities, usage for basic jet finding is straightforward. To illustrate this, a quick-start guide is provided in section 2, while the core classes (PseudoJet, JetDefinition and ClusterSequence) are described in section 3. For more advanced usage, one of the design considerations in FastJet has been to favour user extensibility, for example through plugins, selectors, tools, etc. This is one of the topics covered in the appendices. Further information is also available from the extensive "doxygen" documentation, available online at http://fastjet.fr.

Quick-start guide
For the impatient, the FastJet package can be set up and run as follows.
• Download the code and the unpack it curl -O http://fastjet.fr/repo/fastjet-X.Y.Z.tar.gz tar zxvf fastjet-X.Y.Z.tar.gz cd fastjet-X.Y.Z/ replacing X.Y.Z with the appropriate version number. On some systems you may need to replace "curl -O" with "wget".
• Compile and install (choose your own preferred prefix), and when you're done go back to the original directory ./configure --prefix='pwd'/../fastjet-install make make check make install cd ..
If you copy and paste the above lines from one very widespread PDF viewer, you should note that the first line contains back-quotes not forward quotes but that your PDF viewer may nevertheless paste forward quotes, causing problems down the line (the issue arises again below).
• Now paste the following piece of code into a file called short-example.cc • Then compile and run it with g++ short-example.cc -o short-example \ 'fastjet-install/bin/fastjet-config --cxxflags --libs --plugins' ./short-example (watch out, once again, for the back-quotes if you cut and paste from the PDF).
The output will consist of a banner, followed by the lines Clustering with Longitudinally invariant anti-kt algorithm with R = 0.7 and E scheme recombination pt y phi jet 0: 103 0 0 constituent 0's pt: 99.0001 constituent 1's pt: 4.00125 jet 1: 99 0 3.14159 constituent 0's pt: 99 More evolved example programs, illustrating many of the capabilities of FastJet, are available in the example/ subdirectory of the FastJet distribution.

Core classes
All classes are contained in the fastjet namespace. For brevity this namespace will usually not be explicitly written below, with the possible exception of the first appearance of a FastJet class, and code excerpts will assume that a "using namespace fastjet;" statement is present in the user code. For basic usage, the user is exposed to three main classes: class fastjet::PseudoJet; class fastjet::JetDefinition; class fastjet::ClusterSequence; PseudoJet provides a jet object with a four-momentum and some internal indices to situate it in the context of a jet-clustering sequence. The class JetDefinition contains a specification of how jet clustering is to be performed. ClusterSequence is the class that carries out jet-clustering and provides access to the final jets.

fastjet::PseudoJet
All jets, as well as input particles to the clustering (optionally) are PseudoJet objects. They can be created using one of the following constructors where the second form allows the initialisation to be obtained from any class T that allows subscripting to return the components of the momentum (running from 0 . . . 3 in the order p x , p y , p z , E). The default constructor for a PseudoJet sets the momentum components to zero.
The PseudoJet class includes the following member functions for accessing the components The reader may have observed that in some cases more than one name can be used to access the same quantity. This is intended to reflect the diversity of common usages within the community. 5 There are two ways of associating user information with a PseudoJet. The simpler method is through an integer called the user index /// set the user_index, intended to allow the user to label the object (default is -1) void set_user_index(const int index); /// return the user_index int user_index() const ; A more powerful method, new in FastJet 3, involves passing a pointer to a derived class of PseudoJet::UserInfoBase. The two essential calls are /// set the pointer to user information (the PseudoJet will then own it) void set_user_info(UserInfoBase * user_info); /// retrieve a reference to a dynamic cast of type L of the user info template<class L> const L & user_info() const; Further details are to be found in appendix B and in example/09-user info.cc.
A PseudoJet can be reset with and similarly taking as argument a templated some_lorentz_vector or a PseudoJet (in the latter case, or when some_lorentz_vector is of a type derived from PseudoJet, reset also copies the user and internal indices and user-info).
Additionally, the +, -, * and / operators are defined, with +, -acting on pairs of PseudoJets and *, / acting on a PseudoJet and a double coefficient. The analogous +=, etc., operators, are also defined. 6 There are also equality testing operators: (jet1 == jet2) returns true if the two jets have identical 4-momenta, structural information and user information; the (jet == 0.0) test returns true if all the components of the 4-momentum are zero. The != operator works analogously.
Finally, we also provide routines for taking an unsorted vector of PseudoJets and returning a sorted vector, /// return a vector of jets sorted into decreasing transverse momentum vector<PseudoJet> sorted_by_pt(const vector<PseudoJet> & jets); /// return a vector of jets sorted into increasing rapidity vector<PseudoJet> sorted_by_rapidity(const vector<PseudoJet> & jets); /// return a vector of jets sorted into decreasing energy vector<PseudoJet> sorted_by_E(const vector<PseudoJet> & jets); These will typically be used on the jets returned by ClusterSequence.
A number of further PseudoJet member functions provide access to information on a jet's structure. They are documented below in sections 3.5 and 3.6.

fastjet::JetDefinition
The class JetDefinition contains a full specification of how to carry out the clustering. According to the Les Houches convention detailed in [20], a 'jet definition' should include the jet algorithm name, its parameters (often the radius R) and the recombination scheme. Its constructor is JetDefinition(fastjet::JetAlgorithm jet_algorithm, double R, fastjet::RecombinationScheme recomb_scheme = E_scheme, fastjet::Strategy strategy = Best); The jet algorithm is one of the entries of the JetAlgorithm enum: 7 enum JetAlgorithm {kt_algorithm, cambridge_algorithm, antikt_algorithm, genkt_algorithm, ee_kt_algorithm, ee_genkt_algorithm, ...}; Each algorithm is described in detail in section 4. The . . . represent additional values that are present for internal or testing purposes. They include plugin algorithm, automatically set when plugins are used (section 5) and undefined jet algorithm, which is the value set in JetDefinition's default constructor.
The parameter R specifies the value of R that appears in eq. (1) and in the various definitions of section 4. For one algorithm, ee_kt_algorithm, there is no R parameter, so the constructor is to be called without the R argument. For the generalised k t algorithm and its e + e − version, one requires R and (immediately after R) an extra parameter p. Details are to be found in sections 4.4-4.6. If the user calls a constructor with the incorrect number of arguments for the requested jet algorithm, a fastjet::Error() exception will be thrown with an explanatory message.
The recombination scheme is set by an enum of type RecombinationScheme, and it is related to the choice of how to recombine the 4-momenta of PseudoJets during the clustering procedure. The default in FastJet is the E-scheme, where the four components of two 4-vectors are simply added. This scheme is used when no explicit choice is made in the constructor. Further recombination schemes are described below in section 3.4.
The strategy selects the algorithmic strategy to use while clustering and is an enum of type Strategy. The default option of Best automatically determines and selects a strategy that should be close to optimal in speed for each event, based on its multiplicity. A discussion of the main available strategies together with their performance is given in appendix A. Different strategies give identical clustering results, except potentially in the presence of distance degeneracies, where different strategies may resolve those degeneracies differently.
A textual description of the jet definition can be obtained by a call to the member function std::string description().
As of FastJet version 3.1, the JetDefinition class allows one to directly perform the clustering and access the inclusive jets through a () operator: returns the list of all the inclusive jets given by clustering with the anti-k t algorithm with radius parameter R = 0.4. Jets from the anti-k t algorithm and other hadron-collider algorithms are returned sorted in decreasing transverse momentum, while those from e + e − algorithms are returned sorted in decreasing energy. This behaviour differs from that of ClusterSequence::inclusive jets(), where the ordering of the jets is not a priori determined (i.e. the user must manually sort the jets themselves).

fastjet::ClusterSequence
To run the jet clustering, create a ClusterSequence object through the following constructor where input particles is the vector of initial particles of any type (PseudoJet, HepLorentzVector, etc.) that can be used to initialise a PseudoJet and jet def contains the full specification of the clustering (see Section 3.2).

Accessing inclusive jets
Inclusive jets correspond to all objects that have undergone a "beam" clustering (i.e. d iB recombination) in the description following Eq. (2). For nearly all hadron-collider algorithms, the "inclusive" jets above some given transverse momentum cut are the ones usually just referred to as the "jets". To access inclusive jets, the following member function should be used /// return a vector of all jets (in the sense of the inclusive /// algorithm) with pt >= ptmin. vector<PseudoJet> inclusive_jets (const double ptmin = 0.0) const; where ptmin may be omitted, then implicitly taking the value zero. Note that the order in which the inclusive jets are provided depends on the jet algorithm. To obtain a specific ordering, such as decreasing p t , the user should perform a sort themselves, e.g. with the sorted by pt(...) function, described in section 3.1.
With a zero transverse momentum cut, the number of jets found in the event is not an infrared safe quantity (adding a single soft particle can lead to one extra soft jet). However it can still be useful to talk of all the objects returned by inclusive jets() as being "jets", e.g. in the context of the estimation underlying-event and pileup densities, cf. section 8.

Accessing exclusive jets
There are two ways of accessing exclusive jets, one where one specifies d cut , the other where one specifies that the clustering is taken to be stopped once it reaches the specified number of jets.
/// return a vector of all jets (in the sense of the exclusive algorithm) that would /// be obtained when running the algorithm with the given dcut. vector<PseudoJet> exclusive_jets (const double dcut) const; /// return a vector of all jets when the event is clustered (in the exclusive sense) /// to exactly njets. Throws an error if the event has fewer than njets particles. vector<PseudoJet> exclusive_jets (const int njets) const; /// return a vector of all jets when the event is clustered (in the exclusive sense) /// to exactly njets. If the event has fewer than njets particles, it returns all /// available particles. vector<PseudoJet> exclusive_jets_up_to (const int njets) const; The user may also wish just to obtain information about the number of jets in the exclusive algorithm: /// return the number of jets (in the sense of the exclusive algorithm) that would /// be obtained when running the algorithm with the given dcut. int n_exclusive_jets (const double dcut) const; Another common query is to establish the d min value associated with merging from n + 1 to n jets. Two member functions are available for determining this: /// return the dmin corresponding to the recombination that went from n+1 to n jets /// (sometimes known as d n,n+1 ). double exclusive_dmerge (const int n) const; /// return the maximum of the dmin encountered during all recombinations up to the one /// that led to an n-jet final state; identical to exclusive_dmerge, except in cases /// where the dmin do not increase monotonically. double exclusive_dmerge_max (const int n) const; The first returns the d min in going from n + 1 to n jets. Occasionally however the d min value does not increase monotonically during successive mergings and using a d cut smaller than the return value from exclusive dmerge does not guarantee an event with more than n jets. For this reason the second function exclusive dmerge max is provided -using a d cut below its return value is guaranteed to provide a final state with more than n jets, while using a larger value will return a final state with n or fewer jets.
For e + e − collisions, where it is usual to refer to y ij = d ij /Q 2 (Q is the total (visible) energy) FastJet provides the following methods: double exclusive_ymerge (int njets); double exclusive_ymerge_max (int njets); int n_exclusive_jets_ycut (double ycut); std::vector<PseudoJet> exclusive_jets_ycut (double ycut); which are relevant for use with the e + e − k t algorithm and with the Jade plugin (section 5.4.2).

Other functionality
Unclustered particles. Some jet algorithms (e.g. a number of the plugins in section 5) have the property that not all particles necessarily participate in the clustering. In other cases, particles may take part in the clustering, but not end up in any final inclusive jet. Two member functions are provided to obtain the list of these particles. One is vector<PseudoJet> unclustered = clust_seq.unclustered_particles(); which returns the list of particles that never took part in the clustering. The other additionally returns objects that are the result of clustering but that never made it into a inclusive jet (i.e. into a "beam" recombination): vector<PseudoJet> childless = clust_seq.childless_pseudojets(); A practical example where this is relevant is with plugins that perform pruning [21], since they include a condition for vetoing certain recombinations. 8 Copying and transforming a ClusterSequence. A standard copy constructor is available for ClusterSequences. Additionally it is possible to copy the clustering history of a ClusterSequence while modifying the momenta of the PseudoJets at all (

Recombination schemes
When merging particles (i.e. PseudoJets) during the clustering procedure, one must specify how to combine the momenta. The simplest procedure (E-scheme) simply adds the four-vectors. This has been advocated as a standard in [3], was the main scheme in use during Run II of the Tevatron, is currently used by the LHC experiments, and is the default option in FastJet. Other choices are listed in table 1, and are described below.
Other schemes for pp collisions. Other schemes provided by earlier k t -clustering implementations [11,12] are the p t , p 2 t , E t and E 2 t schemes. They all incorporate a 'preprocessing' stage to make the initial momenta massless (rescaling the energy to be equal to the 3-momentum for the p t and p 2 t schemes, rescaling to the 3-momentum to be equal to the energy in the E t and E 2 t schemes). Then for all schemes the recombination p r of p i and p j is a massless 4-vector satisfying where w i is p t,i for the p t and E t schemes, and is p 2 t,i for the p 2 t and E 2 t schemes. Note that for massive particles the schemes defined in the previous paragraph are not invariant under longitudinal boosts. As a workaround for this issue, we propose boost-invariant p t and p 2 t schemes, which are identical to the normal p t and p 2 t schemes, except that they omit the preprocessing stage. They are available as BIpt scheme and BIpt2 scheme.
A generalisation of the p t and p 2 t schemes is to take a weighting by w i = p n t . An interesting limit is then n → ∞, referred to as the winner-takes-all scheme (WTA), because the new rapidity and azimuth in Eq. (4) coincide with those of the harder of the two particles [22]. This gives the WTA pt scheme. The WTA pt scheme does not perform any preprocessing, but differs from the BIpt... schemes in that the result of a recombination acquires the mass of the higher-p t of the two particles (instead of setting the mass to zero). This treatment ensures, for example, that recombination of a massive particle with an infinitesimally soft particle leaves the mass unchanged.
Other schemes for e + e − collisions. The default E-scheme is a sensible choice also for e + e − collisions. The only non-standard e + e − -specific scheme currently implemented is the WTA modp scheme, in which the recombination of p i ad p j points in the direction and has the mass of the particle with larger 3-vector modulus (| p|), and has a 3-vector modulus | p r | = | p i | + | p i |. 9 On request, we may in the future provide further dedicated schemes for e + e − collisions.
User-defined schemes. The user may define their own, custom recombination schemes, as described in Appendix E.1.

Constituents and substructure of jets
For any PseudoJet that results from a clustering, the user can obtain information about its constituents, internal substructure, etc., through member functions of the PseudoJet class. 10 Jet constituents. The constituents of a given PseudoJet jet can be obtained through vector<PseudoJet> constituents = jet.constituents(); Note that if the user wishes to identify these constituents with the original particles provided to ClusterSequence, she or he should have set a unique index for each of the original particles with the PseudoJet::set user index function. Alternatively more detailed information can also be set through the user_info facilities of PseudoJet, as discussed in appendix B.
Subjet properties. To obtain the set of subjets at a specific d cut scale inside a given jet, one may use the following PseudoJet member function: /// Returns a vector of all subjets of the current jet (in the sense of the exclusive /// algorithm) that would be obtained when running the algorithm with the given dcut std::vector<PseudoJet> exclusive_subjets (const double dcut) const; 9 It is tempting to define also a WTA E scheme in which the recombination would point in the direction of the particle with the largest energy. This is however dangerous in situations where the most energetic particle is at rest.
As a consequence, this scheme is not provided in FastJet. A similar consideration applies for a WTA mt scheme.
If m jets are found, this takes a time O (m ln m) (owing to the internal use of a priority queue). Alternatively one may obtain the jet's constituents, cluster them separately and then carry out an exclusive jets analysis on the resulting ClusterSequence. The results should be identical. This second method is mandatory if one wishes to identify subjets with an algorithm that differs from the one used to find the original jets.
One can also make use of the following methods, which allow one to follow the merging sequence (and walk back through it): /// If the jet has parents in the clustering, returns true and sets parent1 and parent2 /// equal to them. If it has no parents returns false and sets parent1 and parent2 to 0 bool has_parents(PseudoJet & parent1, PseudoJet & parent2) const; /// If the jet has a child then returns true and sets the child jet otherwise returns /// false and sets the child to 0 bool has_child(PseudoJet & child) const; /// If this jet has a child (and so a partner), returns true and sets the partner, /// otherwise returns false and sets the partner to 0 bool has_partner(PseudoJet & partner) const; Accessibility of structural information. If any of the above functions are used with a PseudoJet that is not associated with a ClusterSequence, an error will be thrown. Since the information about a jet's constituents is actually stored in the ClusterSequence and not in the jet itself, these methods will also throw an error if the ClusterSequence associated with the jet has gone out of scope, been deleted, or in any other way become invalid. One can establish the status of a PseudoJet's associated cluster sequence with the following enquiry functions: // returns true if this PseudoJet has an associated (and valid) ClusterSequence. bool has_valid_cluster_sequence() const; // returns a (const) pointer to the parent ClusterSequence (throws if inexistent // or no longer valid) const ClusterSequence* validated_cluster_sequence() const; There are also has associated cluster sequence() and associated cluster sequence() member functions. The first returns true even when the cluster sequence is not valid, and the second returns a null pointer in that case. Further information is to be found in appendix C.
There are contexts in which, within some function, one might create a ClusterSequence, obtain a jet from it and then return that jet from the function. For the user to be able to access the information about the jet's internal structure, the ClusterSequence must not have gone out of scope and/or been deleted. To aid with potential memory management issues in this case, as long the ClusterSequence was created via a new operation, then one can tell the ClusterSequence that it should be automatically deleted after the last external object (jet, etc.) associated with it has disappeared. The call to do this is ClusterSequence::delete self when unused(). There must be at least one external object already associated with the ClusterSequence at the time of the call (e.g. a jet, subjet or jet constituent). Note that ClusterSequence tends to be a large object, so this should be used with care.

Composite jets, general considerations on jet structure
There are a number of cases where it is useful to be able to take two separate jets and create a single object that is the sum of the two, not just from the point of view of its 4-momentum, but also as concerns its structure. For example, in a search for a dijet resonance, some user code may identify two jets, jet1 and jet2, that are thought to come from a resonance decay and then wish to return a single object that combines both jet1 and jet2. This can be accomplished with the function join: PseudoJet resonance = join(jet1,jet2); The 4-momenta are added, 11 and in addition the resonance remembers that it came from jet1 and jet2. So, for example, a call to resonance.constituents() will return the constituents of both jet1 and jet2. It is possible to join 1, 2, 3 or 4 jets or a vector of jets. If the jets being joined had areas (section 7) then the joined jet will also have an area.
For a jet formed with join, one can find out which pieces it has been composed from with the function vector<PseudoJet> pieces = resonance.pieces(); In the above example, this would simply return a vector of size 2 containing jet1 and jet2. The pieces() function also works for jets that come from a ClusterSequence, returning two pieces if the jet has parents, zero otherwise.
Enquiries as to available structural information. Whether or not a given jet has constituents, recursive substructure or pieces depends on how it was formed. Generally a user will know how a given jet was formed, and so know if it makes sense, say, to call pieces(). However if a jet is returned from some third-party code, it may not always be clear what kinds of structural information it has. Accordingly a number of enquiry functions are available: bool has_structure(); // true if the jet has some kind of structural info bool has_constituents(); // true if the jet has constituents bool has_exclusive_subjets(); // true if there is cluster-sequence style subjet info bool has_pieces(); // true if the jet can be broken up into pieces bool has_area(); // true if the jet has jet-area information string description(); // returns a textual description of the type // of structural info associated with the jet Asking (say) for the pieces() of a jet for which has pieces() returns false will cause an error to be thrown. The structural information available for different kinds of jets is summarised in appendix C.

Version information
Information on the version of FastJet that is being run can be obtained by making a call to std::string fastjet_version_string(); (defined in fastjet/JetDefinition.hh). In line with recommendations for other programs in high-energy physics, the user should include this information in publications and plots so as to facilitate reproducibility of the jet-finding. 12 We recommend also that the main elements of the jet def.description() be provided, together with citations to the original article that defines the algorithm, as well as to this manual and, optionally, the original FastJet paper [10]. As of version 3.1, it is also possible to conditionally compile code based on the the FastJet version number, which is encoded in the FASTJET VERSION NUMBER preprocessor symbol as a single integer Mmmpp: M is the major version number, mm the minor version number and pp the patch level. Thus version 3.1.12 would be represented as 30112. Code requiring at least version 3.1.0 could be included as follows: #include "fastjet/config.h" #if FASTJET_VERSION_NUMBER >= 30100 // code that needs version 3.1.0 or higher # else // alternative code that works also with version 3.0.x #endif In versions prior to 3.1.0 the FASTJET VERSION NUMBER symbol is undefined and is accordingly treated by the preprocessor as if it were zero.

.1 Longitudinally invariant k t jet algorithm
The longitudinally invariant k t jet algorithm [8,9] comes in inclusive and exclusive variants. The inclusive variant (corresponding to [9], modulo small changes of notation) is formulated as follows: 1. For each pair of particles i, j work out the k t distance 13 where p ti , y i and φ i are the transverse momentum (with respect to the beam direction), rapidity and azimuth of particle i. R is a jet-radius parameter usually taken of order 1. For each parton i also work out the beam distance d iB = p 2 ti .
2. Find the minimum d min of all the d ij , d iB . If d min is a d ij merge particles i and j into a single particle, summing their four-momenta (this is E-scheme recombination); if it is a d iB then declare particle i to be a final jet and remove it from the list.
3. Repeat from step 1 until no particles are left. 12 We devote significant effort to ensuring that all versions of the FastJet program give identical, correct clustering results, and that any other changes from one version to the next are clearly indicated. However, as per the terms of the GNU General Public License (v2), under which FastJet is released, we are not able to provide a warranty that FastJet is free of bugs that might affect your use of the program. Accordingly it is important for physics publications to include a mention of the FastJet version number used, in order to help trace the impact of any bugs that might be discovered in the future. 13 In the soft, small angle limit for i, the k t distance is the (squared) transverse momentum of i relative to j.
The exclusive variant of the longitudinally invariant k t jet algorithm [8] is similar, except that (a) when a d iB is the smallest value, that particle is considered to become part of the beam jet (i.e. is discarded) and (b) clustering is stopped when all d ij and d iB are above some d cut . In the exclusive mode R is commonly set to 1.

Cambridge/Aachen jet algorithm
The pp Cambridge/Aachen (C/A) jet algorithm [23,24] is provided in the form proposed in Ref. [24]. Its formulation is identical to that of the (inclusive) pp k t jet algorithm, except as regards the distance measures, which are: To use this algorithm, define JetDefinition jet_def(cambridge_algorithm, R); and then extract inclusive jets from the cluster sequence. Attempting to extract exclusive jets from the Cambridge/Aachen algorithm with a d cut parameter simply provides the set of jets obtained up to the point where all d ij , d iB > d cut . Having clustered with some given R, this can actually be an effective way of viewing the event at a smaller radius, R eff = √ d cut R, thus allowing a single event to be viewed at a continuous range of R eff within a single clustering.
We note that the original formulation of the Cambridge algorithm [23] (in e + e − ) instead makes use of an auxiliary (k t ) distance measure and 'freezes' pseudojets whose recombination would involve too large a value of the auxiliary distance measure. Details are given in section 5.4.1.

Anti-k t jet algorithm
This algorithm, introduced and studied in [14], is defined exactly like the standard k t algorithm, except for the distance measures which are now given by While it is a sequential recombination algorithm like k t and Cambridge/Aachen, the anti-k t algorithm behaves in some sense like a 'perfect' cone algorithm, in that its hard jets are exactly circular on the y-φ cylinder [14]. To use this algorithm, define JetDefinition jet_def(antikt_algorithm, R); and then extract inclusive jets from the cluster sequence. We advise against the use of exclusive jets in the context of the anti-k t algorithm, because of the lack of physically meaningful hierarchy in the clustering sequence.

Generalised k t jet algorithm
The "generalised k t " algorithm again follows the same procedure, but depends on an additional continuous parameter p, and has the following distance measure: For specific values of p, it reduces to one or other of the algorithms list above, k t (p = 1), Cambridge/Aachen (p = 0) and anti-k t (p = −1). To use this algorithm, define JetDefinition jet_def(genkt_algorithm, R, p); and then extract inclusive jets from the cluster sequence (or, for p ≥ 0, also the exclusive jets).

Generalised k t algorithm for e + e − collisions
FastJet also provides native implementations of clustering algorithms in spherical coordinates (specifically for e + e − collisions) along the lines of the original k t algorithms [25], but extended following the generalised pp algorithm of [14] and section 4.4. We define the two following distances: for a general value of p and R. At a given stage of the clustering sequence, if a d ij is smallest then i and j are recombined, while if a d iB is smallest then i is called an "inclusive jet". For values of R ≤ π in eq. (9), the generalised e + e − k t algorithm behaves in analogy with the pp algorithms: when an object is at an angle θ iX > R from all other objects X then it forms an inclusive jet. With the choice p = −1 this provides a simple, infrared and collinear safe way of obtaining a cone-like algorithm for e + e − collisions, since hard well-separated jets have a circular profile on the 3D sphere, with opening half-angle R. To use this form of the algorithm, define JetDefinition jet_def(ee_genkt_algorithm, R, p); and then extract inclusive jets from the cluster sequence.
For values of R > π, FastJet replaces the factor (1 − cos R) in the denominator of eq. (9a) with (3 + cos R). With this choice (as long as R < 3π), the only time a d iB will be relevant is when there is just a single particle in the event. The inclusive jets(...) will then always return a single jet consisting of all the particles in the event. In such a context it is only the exclusive jets(...) call that provides non-trivial information.

k t algorithm for e + e − collisions
The e + e − k t algorithm [25], often referred to also as the Durham algorithm, has a single distance: Note the difference in normalisation between the d ij in eqs. (9) and (10), and the fact that in neither case have we normalised to the total energy Q in the event, contrary to the convention adopted originally in [25] (where the distance measure was called y ij ). To use the e + e − k t algorithm, define JetDefinition jet_def(ee_kt_algorithm); and then extract exclusive jets from the cluster sequence. Note that the ee genkt algorithm with π < R < 3π and p = 1 gives a clustering sequence that is identical to that of the ee kt algorithm. The normalisation of the d ij 's will however be different.

Plugin jet algorithms
It can be useful to have a common interface for a range of jet algorithms beyond the native (k t , anti-k t and Cambridge/Aachen) algorithms, notably for the many cone algorithms that are in existence. It can also be useful to be able to use FastJet features such as area-measurement tools for these other jet algorithms. In order to facilitate this, the FastJet package provides a plugin facility, allowing almost any other jet algorithm 14 to be used within the same framework.
Generic plugin use is described in the next subsection. The plugins distributed with FastJet are described afterwards in sections 5. 2-5.4. They are all accessible within the fastjet namespace and their code is to be found in FastJet's plugins/ directory. New user-defined plugins can also be implemented, as described in section E.2. Some third-party plugins are linked to from the tools page at http://fastjet.fr/ .
Not all plugins are enabled by default in FastJet. At configuration time ./configure --help will indicate which ones get enabled by default. To enable all plugins, run configure with the argument --enable-allplugins, while to enable all but PxCone (which requires a Fortran compiler, and can introduce link-time issues) use --enable-allcxxplugins.

Generic plugin use
Plugins are classes derived from the abstract base class fastjet::JetDefinition::Plugin. A JetDefinition can be constructed by providing a pointer to a JetDefinition::Plugin; the resulting JetDefinition object can then be used identically to the normal JetDefinition objects used in the previous sections. We illustrate this with an example based on the SISCone plugin: #include "fastjet/SISConePlugin.hh" // allocate a new plugin for SISCone (for other plugins, simply // replace the appropriate parameters and plugin name) double cone_radius = 0.7; double overlap_threshold = 0.75; JetDefinition::Plugin * plugin = new SISConePlugin(cone_radius, overlap_threshold); // create a jet definition based on the plugin JetDefinition jet_def(plugin); // run the jet algorithm and extract the jets ClusterSequence clust_seq(particles, jet_def); vector<PseudoJet> inclusive_jets = clust_seq.inclusive_jets(); // then analyse the jets as for native fastjet algorithms ... // only when you will no longer be using the jet definition, or // ClusterSequence objects that involve it, may you delete the plugin delete plugin; In cases such as this where the plugin has been created with a new statement and the user does not wish to manage the deletion of the corresponding memory when the JetDefinition (and any copies) using the plugin goes out of scope, then the user may wish to call the JetDefinition's delete plugin when unused() function, which tells the JetDefinition to acquire ownership of the pointer to the plugin and delete it when it is no longer needed.

SISCone Plugin
SISCone provides infrared and collinear-safe implementations of the two most common kinds of cone algorithm. Cone algorithm execution generally involves two phases. The first is the search for stable cones (SC). Then, because a given particle can appear in more than one stable cone, there are two options for the second phase: the standard approach is to apply a 'split-merge' (see e.g. Ref. [3]) procedure, which ensures that no particle ends up in more than one jet. Alternatively, one can identify the highest-p t stable cone, call it a jet, remove all particles that it contained and then repeat the stable-cone search and removal over and over until no particles remain (we call this progressive removal).
The stable cones are identified using an O (N 2 ln N ) seedless approach [27]. This (and some care in the 'split-merge' procedure, if used) ensures that the jets it produces are insensitive to additional soft particles and collinear splittings, i.e. the algorithm is infrared and collinear safe.
By default SISCone runs with a split-merge step. The implementation with progressive removal was made public in version 3.0 of SISCone, first released with version 3.1 of FastJet.

Default mode: SISCone with a split-merge step (SISCone-SM)
The original implementation of SISCone [27] corresponded to a stable-cone jet algorithm with a splitmerge step (SC-SM in the notation of [5]). This remains SISCone's default and, unless explicitly mentioned otherwise, it corresponds to the version of SISCone used in the literature.
The plugin library and include files are to be be found in the plugins/SISCone directory, and the main header file is fastjet/SISConePlugin.hh. The SISConePlugin class has a constructor with the following structure #include "fastjet/SISConePlugin.hh" SISConePlugin (double cone_radius, double overlap_threshold, int n_pass_max = 0, double protojet_ptmin = 0.0, bool caching = false, SISConePlugin::SplitMergeScale split_merge_scale = SISConePlugin::SM_pttilde); A cone centred at y c , φ c is stable if the sum of momenta of all particles i satisfying ∆y 2 ic + ∆φ 2 ic < cone_radius 2 has rapidity y c , φ c . The overlap_threshold is the fraction of overlapping momentum above which two protojets are merged in a Tevatron Run II type [3] split-merge procedure. The radius and overlap parameters are a common feature of most modern cone algorithms. Because some event particles are not to be found in any stable cone [28], SISCone can carry out multiple stable-cone search passes (as advocated in [29]): in each pass one searches for stable cones using just the subset of particles not present in any stable cone in earlier passes. Up to n_pass_max passes are carried out, and the algorithm automatically stops at the highest pass that gives no new stable cones. The default of n_pass_max = 0 is equivalent to setting it to ∞.
Concern had at some point been expressed that an excessive number of stable cones might complicate cone jets in events with high noise [3], and in particular lead to large "monster" jets. The protojet_ptmin parameter allows one to use only protojets with p t ≥ protojet_ptmin in the splitmerge phase (all others are thrown away), so as to limit this issue. A large value of the split-merge overlap threshold, e.g. 0.75, also helps to disfavour the production of these monster jets through repeated merge operations.
In many cases SISCone's most time-consuming step is the search for stable cones. If one has multiple SISConePlugin-based jet definitions, each with caching=true, a check will be carried out whether the previously clustered event had the same set of particles and the same cone radius and number of passes. If it did, the stable cones are simply reused from the previous event, rather than being recalculated, and only the split-merge step is repeated, often leading to considerable speed gains.
A final comment concerns the split_merge_scale parameter. This controls both the scale used for ordering the protojets during the split-merge step during the split-merge step, and also the scale used to measure the degree of overlap between protojets. While various options have been implemented, enum SplitMergeScale {SM_pt, SM_Et, SM_mt, SM_pttilde}; we recommend using only the last of themp t = i∈jet |p t,i |, which is also the default scale. The other scales are included only for historical comparison purposes: p t (used in several other codes) is IR unsafe for events whose hadronic component conserves momentum, E t (advocated in [3]) is not boost-invariant, and m t = m 2 + p 2 t is IR unsafe for events whose hadronic component conserves momentum and stems from the decay of two identical particles.
An example of the use of the SISCone plugin was given in section 5.1. As can be seen there, SISCone jets are to be obtained by requesting inclusive jets from the cluster sequence. Note that it makes no sense to ask for exclusive jets from a SISCone based ClusterSequence.
Some cone algorithms provide information beyond that simply about the jets. Such additional information, where available, can be accessed with the help of the ClusterSequence::extras() function. In the case of SISCone, one can access that information as follows: const fastjet::SISConeExtras * extras = dynamic_cast<const fastjet::SISConeExtras *>(clust_seq.extras()); To determine the pass at which a given jet was found, one then uses 15 int pass = extras->pass(jet); One may also obtain a list of the positions of original stable cones as follows: const vector<PseudoJet> & stable_cones = extras->stable_cones(); The stable cones are represented as PseudoJets, for which only the rapidity and azimuth are meaningful. The user_index() indicates the pass at which a given stable cone was found.
SISCone uses E-scheme recombination internally and also for constructing the final jets from the list of constituents. For the latter task, the user may instead instruct SISCone to use the jet-definition's own recombiner, with the command plugin->set_use_jet_def_recombiner(true); For this to work, plugin must explicitly be of type SISConePlugin * rather than JetDefinition::Plugin *.
Since SISCone is infrared safe, it may meaningfully be used also with the ClusterSequenceArea class.
Note however that in that case ones loses the cone-specific information from the jets, because of the way FastJet filters out the information relating to ghosts in the clustering. If the user needs both areas and cone-specific information, she/he may use the ClusterSequenceActiveAreaExplicitGhosts class (for usage information, see the corresponding .hh file).
A final remark concerns speed and memory requirements: as mentioned above, SISCone takes O (N 2 ln N ) time to find jets, and the memory use is O (N 2 ); taking N = 10 3 as a reference point, it runs in a few tenths of a second, making it about 100 times slower than native FastJet algorithms. These are 'expected' results, i.e. valid for a suitably random set of particles. 16 Note that the underlying implementation of SISCone is optimised for large N . An alternative implementation that is faster for N 10 was presented in [30].

SISCone with progressive removal (SISCone-PR)
After creating an instance of SISConePlugin, the progressive-removal option can be enabled by calling plugin->set_progressive_removal(true); The value of the split-merge overlap threshold is then ignored and caching is not available. The SplitMergeScale argument in the constructor, instead of being interpreted as the choice of scale to use in the split-merge step, is instead interpreted as the scale used to order the stable cones (the default is SISConePlugin::SM pttilde, i.e. the scalar sum of p t 's of the constituents) and it is the hardest stable cone according to that choice that is removed. The stable cone finding is then repeated on the remaining set of particles, the new hardest cone is removed, and so forth. As for other progressive-removal cone algorithms, SISCone-PR guarantees that the first jet that is found has area πR 2 . 17 There can be interest in trying out other choices of scale variable for the ordering of stable cones. This can be achieved by calling where user scale is an instance of any class that derives from the SISConePlugin::UserScaleBase class (actually defined in SISConeBasePlugin).
In discussing the speed of SISCone with progressive removal, it is useful to distinguish N , the total number of particles in the event from n, the number inside a typical circle of radius R. SISCone with split-merge actually takes a time N n ln n. SISCone with progressive removal, in its current implementation, repeats the full stable cone finding O (N/n) times. Thus it takes a time ∼ N 2 ln n. This could be reduced by repeating the stable-cone finding only for particles in the neighbourhood of the last stable-cone that was removed: that more limited stable-cone finding should take a time n 2 ln n at each iteration, so that the total time would then be N n ln n, albeit with a larger coefficient than for the split-merge variant. We leave such an improvement to future work, should there be demand for it.

Other plugins for hadron colliders
Most of the algorithms listed below are cone algorithms. They are usually either infrared (IR) or collinear unsafe. The details are indicated for each algorithm following the notation of Ref. [5]: IR n+1 means that the hard jets may be modified if, to an ensemble of n hard particles in a common neighbourhood, one adds a single soft particle; Coll n+1 means that for n hard particles in a common neighbourhood, the collinear splitting of one of them may modify the hard jets. The FastJet authors (and numerous theory-experiment accords) advise against the use of IR and/or collinear unsafe jet algorithms. Interfaces to these algorithms have been provided mainly for legacy comparison purposes.
Except where stated, the usual way to access jets from these plugins is through ClusterSequence::inclusive jets().

CDF Midpoint
This is one of the two cone algorithms used by CDF in Run II of the Tevatron, based on [3]. It is a midpoint-type iterative cone with a split-merge step. The overlap threshold (f ) used by CDF is usually 0.5, the seed threshold is 1 GeV. With a cone area fraction α < 1, the search for stable cones is performed with a radius that is R × √ α, i.e. it becomes the searchcone algorithm of [28]. CDF has used both α = 0.25 and α = 1.0. It is our understanding that the particular choice of α is not always clearly documented in the corresponding publications.
Further control over the plugin can be obtained by consulting the header file. The original underlying code for this algorithm was provided on a webpage belonging to Joey Huston [31] (with minor modifications to ensure reasonable run times with optimising compilers for 32-bit Intel processors -these modifications do not affect the final jets).
This algorithm is IR 3+1 unsafe in the limit of zero seed threshold [27] (with α = 1 it becomes IR 2+1 unsafe [29]). With a non-zero seed threshold (and no preclustering into calorimeter towers) it is collinear unsafe. It is to be deprecated for new experimental or theoretical analyses.

CDF JetClu
JetClu is the other cone algorithm used by CDF during Run II, as well as their main algorithm during Run I [32]. It is an iterative cone with split-merge and optional "ratcheting" if iratch == 1 (particles that appear in one iteration of a cone are retained in future iterations). It can be obtained as follows: The overlap threshold is usually set to 0.75 in CDF analyses. Further control over the plugin can be obtained by consulting the header file. The original underlying code for this algorithm was provided on a webpage belonging to Joey Huston [31].
This algorithm is IR 2+1 unsafe (for zero seed threshold; some IR unsafety persists with non-zero seed threshold). It is to be deprecated for new experimental or theoretical analyses. Note also that the underlying implementation groups particles together into calorimeter towers, with CDF-type geometry, before running the jet algorithm.

DØ Run I cone
This is the main algorithm used by DØ in Run I of the Tevatron [33]. It is an iterative cone algorithm with a split-merge step. It comes in two versions corresponding to versions of the algorithm used respectively before and after 1996. They differ only in the recombination scheme used to determine the jet momenta once each jet's constituents have been identified. In the pre-1996 version, a hybrid between an E-like scheme and an E t scheme recombination is used, while in the post-1996 version it is just the E t scheme [33].
The algorithm places a cut on the minimum E t of the cones during iteration (related to min_jet_Et). The split_ratio is the same as the overlap threshold in other split-merge based algorithms (DØ usually use 0.5). It is the FastJet authors' understanding that the value used for min_jet_Et was 8 GeV, corresponding to a cut of 4 GeV on cones. The publication that describes this algorithm [33] mentions the use of a 1 GeV seed threshold applied to preclustered calorimeter towers in order to obtain the seeds for the stable cone finding. Such a threshold and the pre-clustering appear not to be included in the code distributed with FastJet.
The underlying code for this algorithm was provided by Lars Sonnenschein. Permission to redistribute this code with FastJet has been granted by the DØ collaboration under the terms of the GPL license.
Note: this algorithm is IR 2+1 unsafe. It is recommended that it be used only for the purpose of comparison with Run I data from DØ. It is to be deprecated for new experimental or theoretical analyse

DØ Run II cone
This is the main algorithm used by DØ in Run II of the Tevatron. It is a midpoint type iterative cone with split-merge step. The DØ collaboration usually refers to Ref. [3] when introducing the algorithm in its articles. That generic Tevatron document does not reflect all details of the actual DØ algorithm, and for a complementary description the reader is referred to Ref. [34].
The parameters have the same meaning as in the DØ Run I cone. There is a cut on the minimum E t of the cones during iteration, which by default is half of min_jet_Et. It is the FastJet authors' understanding that two values have been used for min_jet_Et, 8 GeV (in earlier publications) and 6 GeV (in more recent publications). As concerns seed thresholds and preclustering, DØ describes them as being applied to calorimeter towers in data and in Monte Carlo studies that include detector simulation [34]. However, for NLO calculations and Monte Carlo studies based on stable particles, no seed threshold is applied. The code distributed with FastJet does not allow for seed thresholds.
The underlying code for this algorithm was provided by Lars Sonnenschein. Permission to redistribute this code with FastJet has been granted by the DØ collaboration under the terms of the GPL license.
Note: this algorithm is IR 3+1 unsafe (IR 2+1 for jets with energy too close to min_jet_Et). It is to be deprecated for new experimental or theoretical analyses.

ATLAS iterative cone
This iterative cone algorithm, with a split-merge step, was used by ATLAS during the preparation for the LHC. f is the overlap threshold The underlying code for this algorithm was extracted from an early version of SpartyJet [16] (which itself was distributed under the GPL license). Since version 3.0 of FastJet it is a slightly modified version that we distribute, where an internal sort function has been replaced with a stable sort, to ensure reproducibility of results across compilers and architectures (results are unchanged when the results of the sort are unambiguous).
Note: this algorithm is IR 2+1 unsafe (in the limit of zero seed threshold). It is to be deprecated for new experimental or theoretical analyses.

CMS iterative cone
This iterative cone algorithm with progressive removal was used by CMS during the preparation for the LHC. The underlying code for this algorithm was extracted from the CMSSW web site, with certain small service routines having been rewritten by the FastJet authors. Permission to redistribute the resulting code with FastJet has been granted by CMS under the terms of the GPL license. The code was validated by clustering 1000 events with the original version of the CMS software and comparing the output to the clustering performed with the FastJet plugin. The jet contents were identical in all cases. However the jet momenta differed at a relative precision level of 10 −7 , related to the use of single-precision arithmetic at some internal stage of the CMS software (while the FastJet version is in double precision).
Note: this algorithm is Coll 3+1 unsafe [14]. It is to be deprecated for new experimental or theoretical analyses.

PxCone
The PxCone algorithm is an iterative cone with midpoints and a split-drop procedure: It includes a threshold on the minimum transverse energy for a cone (jet) to be included in the splitdrop stage. If E_scheme_jets is true then the plugin applies an E-scheme recombination to provide the momenta of the final jets (by default an E t type recombination scheme is used). The base code for this plugin is written in Fortran and, on some systems, problems have been reported at the link stage due to mixing Fortran and C++. The Fortran code has been modified by the FastJet authors to provide the same jets regardless of the order of the input particles. This involved a small modification of the midpoint procedure, which can have a minor effect on the final jets and should make the algorithm correspond to the description of [35].
The underlying code for this algorithm was written by Luis del Pozo and Michael Seymour with input also from David Ward [36] and they have granted permission for their code to be distributed with FastJet under the terms of the GPL license.
This algorithm is IR 3+1 unsafe. It is to be deprecated for new experimental or theoretical analyses.

TrackJet
This algorithm has been used at the Tevatron for identifying jets from charged-particle tracks in underlying-event studies [37]. Two recombination schemes are involved: the first one indicates how momenta are recombined to provide the final jets (once particle-jet assignments are known), the second one indicates how momenta are combined in the procedure that constructs the jets. The underlying code for this algorithm was written by the FastJet authors, based on code extracts from the (GPL) Rivet implementation, written by Andy Buckley with input from Manuel Bahr and Rick Field. Since version 3.0 of FastJet it is a slightly modified version that we distribute, where an internal sort function has been replaced with a stable sort, to ensure reproducibility of results across compilers and architectures (results are unchanged when the results of the sort are unambiguous, which is the usual case).
Note: this algorithm is believed to be Coll 3+1 unsafe. It is to be deprecated for new experimental or theoretical analyses.

GridJet
GridJet allows you to define a grid and then cluster particles such that all particles in a common grid cell combine to form a jet. Its main interest is in providing fast clustering for high multiplicities (the clustering time scales linearly with the number of particles). The jets that it forms are not always physically meaningful: for example, a genuine physical jet may lie at the corner of 4 grid cells and so be split up somewhat arbitrarily into 4 pieces. It is therefore not intended to be used for standard jet finding. However for some purposes (such as background estimation) this drawback is offset by the greater uniformity of the area of the jets. Its interface is as follows #include "fastjet/GridJetPlugin.hh" // ... GridJetPlugin (double ymax, double requested_grid_spacing); creating a grid that covers |y| <ymax with a grid spacing close to the requested grid spacing: the spacings chosen in φ and y are those that are closest to the requested spacing while also giving an integer number of grid cells that fit exactly into the rapidity and 0 < φ < 2π ranges.
Note that for background estimation purposes the GridMedianBackgroundEstimator is much faster than using the GridJetPlugin with ghosts and a JetMedianBackgroundEstimator.
The underlying code for this algorithm was written by the FastJet authors.

5.4
Plugins for e + e − collisions

EECambridgePlugin (double ycut);
This algorithms performs sequential recombination of the pair of particles that is closest in angle, except when y ij = (1 − cos θ) > y cut , in which case the less energetic of i and j is labelled a jet, and the other member of the pair remains free to cluster.
To access the jets, the user should use the inclusive_jets(), i.e. as they would for the majority of the pp algorithms.
The underlying code for this algorithm was written by the FastJet authors.
Note: the JADE algorithm has been used with various recombination schemes. The current plugin will use whatever recombination scheme the user specifies with for the jet definition. The default Escheme is what was used in the original JADE publication [38]. To modify the recombination scheme, the user may first construct the jet definition and then use either of void JetDefinition::set_recombination_scheme(RecombinationScheme recomb_scheme); void JetDefinition::set_recombiner(const Recombiner * recomb) (cf. sections 3.4, E.1) to modify the recombination scheme.
The underlying code for this algorithm was written by the FastJet authors.

Spherical SISCone algorithm
The spherical SISCone algorithm is an extension [40] to spherical coordinates of the hadron-collider SISCone algorithm [27]. The functionality follows directly that of SISConePlugin, including options for using a split-merge step (the default) or progressive removal.
Note that the underlying implementation of spherical SISCone is optimised for large N . An alternative implementation that is faster for N 10 was presented in [30]. That reference also contains a nice description of the algorithm.

Selectors
Analyses often place constraints (cuts) on jets' transverse momenta, rapidity, maybe consider only some N hardest jets, etc. There are situations in which it is convenient to be able to define a basic set of jet cuts in one part of a program and then have it used elsewhere. To allow for this, we provide a fastjet::Selector class, available through #include "fastjet/Selector.hh"

Essential usage
As an example of how Selectors are used, suppose that we have a vector of jets, jets, and wish to select those that have rapidities |y| < 2.5 and transverse momenta above 20 GeV. We might write the following: Here, Selector is a class, while SelectorAbsRapMax and SelectorPtMin are functions that return an instance of the Selector class containing the internal information needed to carry out the selection. Selector::operator(const vector<PseudoJet> & jets) takes the jets given as input and returns a vector containing those that pass the selection cuts. The logical operations &&, || and ! enable different selectors to be combined. Nearly all selectors, like those above, apply jet by jet (the function Selector::applies jet by jet() returns true). For these, one can query whether a single jet passes the selection with the help of the function bool Selector::pass(const PseudoJet &).
There are also selectors that only make sense applied to an ensemble of jets. This is the case specifically for SelectorNHardest(unsigned int n), which, acting on an ensemble of jets, selects the n jets with largest transverse momenta. If there are fewer than n jets, then all jets pass.
When a selector is applied to an ensemble of jets one can also use to obtain the vectors of PseudoJets that pass or fail the selection. For selectors that apply jet-by-jet, the selectors on either side of the logical operators && and || naturally commute. For operators that act only on the ensemble of jets the behaviour needs specifying. The design choice that we have made is that SelectorNHardest (2) && SelectorAbsRapMax(2.5) SelectorAbsRapMax(2.5) && SelectorNHardest (2) give identical results: in logical combinations of selectors, each constituent selector is applied independently to the ensemble of jets, and then a decision whether a jet passes is determined from the corresponding logical combination of each of the selectors' results. Thus, here only jets that are among the 2 hardest of the whole ensemble and that have |y| < 2.5 will be selected. If one wishes to first apply a rapidity cut, and then find the 2 hardest among those jets that pass the rapidity cut, then one should instead use the * operator: SelectorNHardest(2) * SelectorAbsRapMax (2.5) In this combined selector, the right-hand selector is applied first, and then the left-hand selector is applied to the results of the right-hand selection.
A complementary selector can also be defined using the negation operator. For instance Selector sel_allbut2hardest = !SelectorNHardest (2); Note that, if directly applying (as opposed to first defining) a similar negated selector to a collection of jets, one should write vector<PseudoJet> allbut2hardest = (!SelectorNHardest(2))(jets); with the brackets around the selector definition being now necessary due to () having higher precedence in C++ than Boolean operators. A user can obtain a string describing a given Selector's action by calling its description() member function. This behaves sensibly also for compound selectors.
New selectors can be implemented as described in section E.3.

Other information about selectors
Selectors contain a certain amount of additional information that can provide useful hints to the functions using them.
One such piece of information is a selector's rapidity extent, accessible through a get rapidity extent(rapmin,rapmax) call, which is useful in the context of background estimation (section 8). If it is not sensible to talk about a rapidity extent for a given selector (e.g. for SelectorPtMin) the rapidity limits that are returned are the largest (negative and positive) numbers that can be represented as doubles. The function is geometric() returns true if the selector places constraints only on rapidity and azimuth.
Selectors may also have areas associated with them (in analogy with jet areas, section 7). The has finite area() member function returns true if a selector has a meaningful finite area. The area() function returns this area. In some cases the area may be computed using ghosts (by default with ghosts of area 0.01; the user can specify a different ghost area as an argument to the area function). Following a similar naming convention, there are also SelectorPhiRange(φ min , φ max ) and SelectorRapPhiRange(y min , y max , φ min , φ max ).

Relative kinematical cuts
Some selectors take a reference jet. They have been developed because it is can be useful for a selector to make its decision based on information about some other jet. For example one might wish to select all jets within some distance of a given reference jet; or all jets whose transverse momentum is at least some fraction of a reference jet's. That reference jet may change from event to event, or even from one invocation of the Selector to the next, even though the Selector is fundamentally performing the same underlying type of action.
The available selectors of this kind are: // a circle of radius R around the reference jet SelectorDoughnut(R in , R out ) // a doughnut between R in and R out SelectorStrip(half_width) // a rapidity strip 2*half_width large SelectorRectangle(half_rap_width, half_phi_width) // a rectangle in rapidity and phi SelectorPtFractionMin(f ) // p t larger than f p ref t One example of selectors taking a reference jet is the following. First, one constructs the selector, Then if one is interested in the subset of jets near jet1, and then those near jet2, one performs the following operations: sel.set_reference(jet1); vector<PseudoJet> jets_near_jet1 = sel(jets); sel.set_reference(jet2); vector<PseudoJet> jets_near_jet2 = sel(jets); If one uses a selector that takes a reference without the reference having been actually set, an exception will be thrown. If one sets a reference for a compound selector, the reference is automatically set for all components that take a reference. One can verify whether a given selector takes a reference by calling the member function bool Selector::takes_reference() const; Attempting to set a reference for a Selector that returns false here will cause an exception to be thrown.

Other selectors
The following selectors are also available: SelectorNHardest(n) // selects the n hardest jets SelectorIsPureGhost() // selects jets that are made exclusively of ghost particles SelectorIsZero() // selects jets with zero momentum SelectorIdentity() // selects everything. Included for completeness

Jet areas
Jet areas provide a measure of the surface in the y-φ plane over which a jet extends, or, equivalently, a measure of a jet's susceptibility to soft contamination. Since a jet is made up of only a finite number of particles, one needs a specific definition in order to make its area an unambiguous concept. Three definitions of area have been proposed in [17] and implemented in FastJet: • Active areas add a uniform background of extremely soft massless 'ghost' particles to the event and allow them to participate in the clustering. The area of a given jet is proportional to the number of ghosts it contains. Because the ghosts are extremely soft (and sensible jet algorithms are infrared safe), the presence of the ghosts does not affect the set of user particles that end up in a given jet. Active areas give a measure of a jet's sensitivity to diffuse background noise.
• Passive areas are defined as follows: one adds a single randomly placed ghost at a time to the event. One examines which jet (if any) the ghost ends up in. One repeats the procedure many times and the passive area of a jet is then proportional to the probability of it containing the ghost. Passive areas give a measure of a jet's sensitivity to point-like background noise.
• The Voronoi area of a jet is the sum of the Voronoi areas of its constituent particles. The Voronoi area of a particle is obtained by determining the Voronoi diagram for the event as a whole, and intersecting the Voronoi cell of the particle with a circle of radius R centred on the particle. Note that for the k t algorithm (but not in general for other algorithms) the Voronoi area of a jet coincides with its passive area.
In the limit of very densely populated events, all area definitions lead to the same jet-area results [17]. 18 The area of a jet can be calculated either as a scalar, or as a 4-vector. Essentially the scalar case corresponds to counting the number of ghosts in the jet; the 4-vector case corresponds to summing their 4-vectors, normalised such that for a narrow jet, the transverse component of the 4-vector is equal to the scalar area.
To access jet areas, the user is exposed to two main classes: Details are to be found below and an example program is given as example/06-area.cc. When jet areas are to be used to establish the level of a diffuse noise that might be present in the event (e.g. from underlying event particles or pileup), and maybe subtract it from jets, further classes such as fastjet::JetMedianBackgroundEstimator and fastjet::Subtractor are useful. This topic is discussed in Section 8 and an example program is given in example/07-subtraction.cc. for the Voronoi area. A default constructor exists, and provides an active area with a ghost spec that is acceptable for a majority of area measurements with clustering algorithms and typical Tevatron and LHC rapidity coverage.

AreaDefinition
Information about the current AreaDefinition can be retrieved with the help of description(), area type(), ghost spec() and voronoi spec() member functions.

Ghosted Areas (active and passive)
There are two variants each of the active and passive areas, as defined by the AreaType enum: The two active variants give identical results for the areas of hard jets. The second one explicitly includes the ghosts when the user requests the constituents of a jet and also leads to the presence of "pure ghost" jets. The first of the passive variants explicitly runs through the procedure mentioned above, i.e. it clusters the events with one ghost at a time, and repeats this for very many ghosts. This can be quite slow, so we also provide the passive area option, which makes use of information specific to the jet algorithm in order to speed up the passive-area determination. 19 In order to carry out a clustering with a ghosted area determination, the user should also create an object that specifies how to distribute the ghosts. 20 This is done via the class GhostedAreaSpec whose constructor is The ghosts are distributed on a uniform grid in y and φ, with small random fluctuations to avoid clustering degeneracies.
The ghost maxrap variable defines the maximum rapidity up to which ghosts are generated. If one places ghosts well beyond the particle acceptance (at least R beyond it), then jet areas also stretch beyond the acceptance, giving a measure of the jet's full extent in rapidity and azimuth. If ghosts are placed only up to the particle acceptance, then the jet areas are clipped at that acceptance and correspond more closely to a measure of the jet's susceptibility to contamination from accepted soft particles. This is relevant in particular for jets within a distance R of the particle acceptance boundary. The two choices are illustrated in fig. 1. To define more complicated ghost acceptances it is possible to replace ghost maxrap with a Selector, which must be purely geometrical and have finite rapidity extent.
The ghost area parameter in the GhostedAreaSpec constructor is the area associated with a single ghost. The number of ghosts is inversely proportional to the ghost area, and so a smaller area leads to a longer CPU-time for clustering. However small ghost areas give more accurate results. We have found the default ghost area given above to be adequate for most applications. Smaller ghost areas are beneficial mainly for high-precision determinations of areas of jets with small R. 19 This ability is provided for k t , Cambridge/Aachen, anti-k t and the SISCone plugin. In the case of k t it is actually a Voronoi area that is used, since this can be shown to be equivalent to the passive area [17] (though some approximations are made for 4-vector areas). For other algorithms it defaults back to the one ghost passive area approach. 20 Or accept a default -which uses the default values listed in the explicit constructor and ghost maxrap = 6  Figure 1: Two choices for ghost placement. The grey area in each plot indicates the region containing ghosts, while the dots are particles, which here are accepted up to |y| < 3. The circular regions indicate the areas that will be found for two particular jets. In the left-hand case, with ghosts that extend well beyond the acceptance for particles, jet areas are unaffected by the particle acceptance; in the right-hand case, with ghosts placed only up to the acceptance limit, jet areas are clipped at the edge of the acceptance.
By default, one set of ghosts is generated for each event that is clustered. The small random fluctuations in ghost positions and p t 's, introduced to break clustering degeneracies, mean that for repeated clustering of the same event a given hard jet's area will be different after each clustering. This is especially true for sparse events, where a jet's particle content fails to accurately delineate the boundaries of the jet. For the active area choice (and certain passive areas), specifying repeat > 1 causes FastJet to directly cluster the same hard event with multiple ghost sets. This results in a pseudo-Monte Carlo type evaluation of the jet areas. A statistical uncertainty on the area is available for each jet, through the jet.area error() call. It is calculated as the standard deviation of areas obtained for that jet, divided by √ repeat − 1. While there are situations in which this facility is useful, for most applications of jet areas it is sufficient to use repeat = 1. 21 After initialisation, the parameters can be modified and retrieved respectively with calls such as set ghost area(...) and ghost maxrap() (similarly for the other parameters 22 ). A textual description of the GhostedAreaSpec can be obtained, as usual, with the description() member function.

Voronoi Areas
The Voronoi areas of jets are evaluated by summing the corresponding Voronoi areas of the jets' constituents. The latter are obtained by considering the intersection between the Voronoi cell of each particle and a circle of radius R centred on the particle's position in the rapidity-azimuth plane. 21 Several parameters are available to control the properties and randomness of the ghosts: each ghost's position differs from an exact grid vertex by a random amount distributed uniformly in the range ± 1 2 grid scatter times the grid spacing in both the rapidity and azimuth directions. Each ghost's p t is distributed randomly in the range (1 ± 1 2 pt scatter) times mean ghost pt. For nearly all applications, it makes sense to use the default values. Facilities are also available to set and retrieve the seeds for the random-number generator, notably through the set random status(...) and get random status(...) members of GhostedAreaSpec. 22 In versions of FastJet prior to 3.0.1, the names mean ghost kt and kt scatter should be used rather than mean ghost pt and pt scatter. The former names will be maintained for the foreseeable future.
The jets' Voronoi areas can be obtained from ClusterSequenceArea by passing the proper VoronoiAreaSpec specification to AreaDefinition. Its constructors are /// default constructor (effective_Rfact = 1) VoronoiAreaSpec() ; /// constructor that allows you to set effective_Rfact VoronoiAreaSpec(double effective_Rfact) ; The second constructor allows one to modify (by a multiplicative factor effective Rfact) the radius of the circle which is intersected with the Voronoi cells. With effective Rfact = 1, for the k t algorithm, the Voronoi area is equivalent to the passive area. Information about the specification in use is returned by effective Rfact() and description() member functions.
The Voronoi areas are calculated with the help of Fortune's (N ln N ) Voronoi diagram generator for planar static point sets [41].
One use for the Voronoi area is in background determination with the k t algorithm (see below, section 8): with the choice effective Rfact 0.9 it provides an acceptable approximation to the k t algorithm's active area and is often significantly faster to compute than the active area. Note that it is not currently possible to clip Voronoi areas with a given particle acceptance. As a result, given particles up to |y| < y max , only jets with |y| y max − R will have areas that reflect the jets' sensitivity to accepted particle contamination. It is only these jets that should then be used for background determinations.

ClusterSequenceArea
This is the class 23 used for producing a cluster sequence that also calculates jet areas. Its constructor is As long as an instance of this class is in scope, a user can access information about the area of its jets using the following methods of PseudoJet: 23 ClusterSequenceArea is derived from ClusterSequenceAreaBase (itself derived from ClusterSequence) and makes use of one among ClusterSequenceActiveAreaExplicitGhosts, ClusterSequenceActiveArea, ClusterSequencePassiveArea, ClusterSequence1GhostPassiveArea or ClusterSequenceVoronoiArea (all of them in the fastjet namespace of course), according to the choice made with AreaDefinition. The user can also use these classes directly. ClusterSequenceActiveAreaExplicitGhosts is particularly useful in that it allows the user to specify their own set of ghost particles. This is exploited to provide area support in a number of the transformers of section 9. /// Returns the scalar area associated with the given jet double area = jet.area(); /// Returns the error (uncertainty) associated with the determination of the /// scalar area of the jet; gives zero when the repeat=1 and for passive/Voronoi areas double area_error = jet.area_error(); /// Returns a PseudoJet whose 4-vector is defined by the following integral /// /// dydφ PseudoJet(y,φ,p t = 1) * Θ("y, φ inside jet boundary") /// /// where PseudoJet(y,φ,p t = 1) is a 4-vector with the given rapidity (y), /// azimuth (φ) and p t = 1, while Θ("y, φ inside jet boundary") is 1 /// when y, φ define a direction inside the jet boundary and 0 otherwise. PseudoJet area_4vector = jet.area_4vector(); /// When using active_area_explicit_ghosts, returns true for jets made /// exclusively of ghosts and for ghost constituents. bool is_pure_ghost = jet.is_pure_ghost(); In the limit of small-R jets, the transverse component of the 4-vector area is close to the scalar area; for moderate R values, the transverse component of the 4-vector area is smaller by a factor that reads 1 − R 2 /8 + R 4 /192 + O (R 6 ) for the case of circular jets [42]. Note that 4-vector areas are not currently computed exactly for Voronoi areas of jets in sparse events, insofar as the 4-vector area of each particle's Voronoi cell is approximated as massless and pointing in the direction of the particle.

Background estimation and subtraction
Events with hard jets are often accompanied by a more diffuse "background" of relatively soft particles, for example from the underlying event (in pp or PbPb collisions) or from pileup (in pp collisions). For many physical applications, it is useful to be able to estimate characteristics of the background on an event-by-event basis, for example the p t per unit area (ρ), or fluctuations from point to point (σ). One use of this information is to correct the hard jets for the soft contamination, as discussed below in section 8.1.2.
One of the issues in characterising the background is that it is difficult to introduce a robust criterion to distinguish "background" jets from hard jets. The main method that is available in FastJet involves the determination of the distribution of p t /A for the jets in a given event (or region of the event) and then taking the median of the distribution as an estimate of ρ, as proposed in [18] and studied in detail also in [43,44]. This is largely insensitive to the presence of a handful of hard jets, and avoids any need for introducing a p t scale to distinguish hard and background jets.
The original form of this method used the k t or Cambridge/Aachen jet algorithms to find the jets. These algorithms have the advantage that the resulting jets tend to have reasonably uniform areas 24 In the meantime a variant of the approach that has emerged is to cluster the particles into rectangular grid cells in y and φ and determine their median p t /A. This has the advantage of simplicity and much greater speed. One may worry that a hard jet will sometimes lie at a corner of multiple grid cells, inducing larger biases in the median than with a normal jet finder jets, however we have found this not to be an issue in practice [44]. where the selector is used to indicate which jets are used for background estimation (for simple use cases, one just specifies a rapidity range, e.g. SelectorAbsRapMax (4.5) to use all jets with |y| < 4.5), together with a jet definition and an area definition. We have found that the k t or Cambridge/Aachen jet algorithms with R = 0.4 − 0.6 generally provide adequate background estimates, with the lower range of R values to be preferred if the events are likely to be busy [43,44]. An active area with explicit ghosts is generally recommended. 25 For the grid based method one creates a fastjet::GridMedianBackgroundEstimator, #include "fastjet/tools/GridMedianBackgroundEstimator.hh" // ... GridMedianBackgroundEstimator bge(double max_rapidity, double requested_grid_spacing); We have found grid spacings in the range 0.5 − 0.7 to be adequate [44], with lower values preferred for events that are likely to have high multiplicities. Both of the above background estimators derive from a fastjet::BackgroundEstimatorBase class and the remaining functionality is common to both. In particular, for each event, one tells the background estimator about the event particles,

bge.set_particles(event_particles);
where event particles is a vector of PseudoJet, and then extracts the background density and a measure of its fluctuations with the two following calls // the median of (p t /area) for grid cells, or for jets that pass the selection cut, // making use also of information on empty area in the event (in the jets case) rho = bge.rho(); // an estimate of the fluctuations in the p t density per unit √ A, 25 With the k t algorithm one may also use a Voronoi area (effective Rfact = 0.9 is recommended), which has the advantage of being deterministic and faster than ghosted areas. In this case however one must use a selector that is geometrical and selects only jets well within the range of event particles. E.g. if particles are present up to |y| = y max one should only use jets with |y| y max − R. When using ghosts, the selector can instead go right up to the edge of the acceptance, if the ghosts also only go right up to the edge, as in the right-hand plot of fig. 1. // which is obtained from the 1-sigma half-width of the distribution of pt/A. // To be precise it is defined such that a fraction (1-0.6827)/2 of the jets // (including empty jets) have p t /A < ρ − σ/ A sigma = bge.sigma(); Note that ρ and σ determinations count empty area within the relevant region as consisting of jets of zero p t . Thus (roughly speaking), if more that half of the area covered by the jets selector or grid rapidity range is empty, the median estimate for ρ will be zero, as expected and appropriate for quiet events.

Background subtraction
A common use of an estimation of the background is to subtract its contamination from the transverse momentum of hard jets, in the form p sub t,jet = p raw t,jet − ρA jet (11) or its 4-vector version p sub µ,jet = p raw µ,jet − ρA µ,jet , as first proposed in [18].
If jet.pt() < bge.rho(jet)*jet.area 4vector().pt(), then the subtractor instead returns a jet with zero 4-momentum (so that (subtracted jet==0) returns true). In both cases, the returned jet retains the user and structural information of the original jet.
An example program is given in example/07-subtraction.cc. Note that Subtractor derives from the Transformer class (see section 9) and hence from FunctionOfPseudoJet<PseudoJet> (cf. appendix D).

Positional dependence of background
The background density in pp and heavy-ion collisions usually has some non-negligible dependence on rapidity (and sometimes azimuth). This dependence is not accounted for in the standard estimate of ρ based on all jets or grid cells from (say) |y| < 4.5. Two techniques are described below to help alleviate this problem. In each case the properties of the background are to be obtained through the methods (available for both JetMedianBackgroundEstimator and GridMedianBackgroundEstimator) double rho (const PseudoJet & jet); // p t density per unit area A near jet double sigma(const PseudoJet & jet); // fluctuations in the p t density near jet

Local estimation (jet based)
The first technique, "local estimation", available for now only in the case of the jet-based estimator, involves the use of a more local range for the determination of ρ, with the help of a Selector that is able to take a reference jet, e.g. SelectorStrip(∆y), a strip of half-width ∆y (which might be of order 1) centred on whichever jet is set as its reference. With this kind of selector, when the user calls either rho(jet) or sigma(jet) a selector.set reference(jet) call is made to centre the selector on the specified jet. Then only the jets in the event that pass the cut specified by this newly positioned selector are used to estimate ρ or σ. 26 This method is adequate if the number of jets that pass the selector is much larger than the number of hard jets in the range (otherwise the median p t /A will be noticeably biased by the hard jets). It therefore tends to be suitable for dijet events in pp or PbPb collisions, but may fare less well in event samples such as hadronically decaying tt which may have many central hard jets. One can attempt to remove some given number of hard jets before carrying out the median estimation, e.g. with a selector such as selector = SelectorStrip(∆y) * (!SelectorNHardest (2)) which removes the 2 hardest jets globally and then, of the remainder, takes the ones within the strip. 27 This is however not always very effective, because one may not know how many hard jets to remove.

Estimation in regions (grid based)
The grid-based estimator does not currently provide for local estimates, in the sense that the set of tiles used for a given instance of the grid-based estimator is always fixed. However, as of FastJet 3.1, it is possible to obtain relatively fine control over which fixed set of tiles a grid-based estimator uses. This is done with the help of the RectangularGrid class RectangularGrid grid(rap_min, rap_max, rap_spacing, phi_spacing, selector); GridMedianBackgroundEstimator bge(grid); A given grid tile will be used only if a massless PseudoJet placed at the centre of the tile passes the selector. So, for example, to obtain an estimate for ρ based on the activity in the two forward regions of a detector, one might set rap min and rap min to cover the whole detector and then supply a SelectorAbsRapRange(rap central max, rap max) to select just the two forward regions.

Rescaling method
A second technique to account for positional dependence of the background is "rescaling". First one parametrises the average shape of the rapidity dependence from some number of pileup events. Then for subsequent event-by-event background determinations, one carries out a global ρ determination and then applies the previously determined average rescaling function to that global determination to obtain an estimate for ρ in the neighbourhood of a specific jet.
The rescaling approach approach is available for both grid and jet-based methods. To encode the background shape, one defines an object such as // gives rescaling(y) = 1.16 + 0 · y − 0.023 · y 2 + 0 · y 3 + 0.000041 · y 4 fastjet::BackgroundRescalingYPolynomial rescaling(1.16, 0, -0.023, 0, 0.000041); (for other shapes, e.g. parametrisation of elliptic flow in heavy ion collisions, with both rapidity and azimuth dependence, derive a class from FunctionOfPseudoJet<double> -see appendix D). Then one tells the background estimator (whether jet or grid based) about the rescaling with the call // tell JetMedianBackgroundEstimator or GridMedianBackgroundEstimator about the rescaling bge.set_rescaling_class(&rescaling); Subsequent calls to rho() will return the median of the distribution p t /A/rescaling(y) (rather than p t /A). Any calls to rho(jet) and sigma(jet) will include an additional factor of rescaling(y jet ). Note that any overall factor in the rescaling function cancels out for rho(jet) and sigma(jet), but not for calls to rho() and sigma() (which are in any case less meaningful when a rapidity dependence is being assumed for the background).
In ongoing studies [44], we have found that despite its use of an average background shape, the rescaling method generally performs comparably to local estimation in terms of its residual p t dispersion after subtraction. Additionally, it has the advantage of reduced sensitivity to biases in events with high multiplicities of hard jets.

Handling masses
There are several subtleties in handling masses in subtraction. The first is related to the fact that hadrons have masses. In some contexts those masses are ignored and set to zero, e.g. because they are experimentally unconstrained: detectors do not systematically give information on the mass for every particle. However, if particle masses are kept non-zero, then Eq. (12) must be extended [45] where ρ m accounts the massive component of the energy flow. It can be approximately determined as the median across patches of a quantity m δ,patch /A patch , where By default, as of FastJet 3.1, the grid and jet-median background estimators automatically determine ρ m . It is returned from a call to rho m(), with fluctuations accessible through sigma m().
The determination of ρ m involves a small speed penalty and can be disabled with a call to set compute rho m(false) for either of the background estimators. To avoid changes in results relative to version 3.0, by default FastJet 3.1 does not use ρ m in the Subtractor, i.e. it uses Eq. (12). However, for a given subtractor, a call to set use rho m(true), will cause it to instead use Eq. (13) for all subsequent subtractions. We strongly recommend switching this on if your input particles have masses, and in future versions of FastJet we may change the default so that it is automatically switched on. 28 An alternative is to make the input particles massless.
A second issue, relevant both for Eqs. (12) and (13), is that sometimes the resulting squared jet mass is negative. 29 This is obviously unphysical. By default the 4-vector returned by the subtractor is left in that unphysical state, so that the user can decide what to do with it. For most applications a sensible behaviour is to adjust the 4-vector so as to maintain its p t and azimuth, while setting the mass to zero. This behaviour can be enabled for a given subtractor by a call to its set safe mass(true) function (available since v.3.1). In this case the rapidity is then taken to be that of the original unsubtracted jet. In future versions of FastJet, the default behaviour may be changed so that "safe" subtraction is automatically enabled.
A general note with regards to jet masses is that a number of techniques have been proposed as alternatives to 4-vector subtraction for jet masses. They include Jet Cleansing [46] (which requires charged-track information; see also the discussion in Ref. [47]), Constituent Subtraction [48], the SoftKiller [49] method and PUPPI [50]. Some of these can give significant improvements in the resolution of the subtraction for the jet masses, though potentially at the expense of some bias. SoftKiller and PUPPI appear to give improved resolution also for the jet p t (again at the potential expense of modest biases). Implementations of most of these techniques are, or will soon become available in fjcontrib. There one will also find the GenericSubtractor contrib, with code for areabased subtraction of jet shapes and other generic observables as discussed in Refs. [45,51].

Other facilities
The JetMedianBackgroundEstimator has a number of enquiry functions to access information used internally within the median ρ and σ determination. For area definitions with explicit ghosts the last two functions return 0. For active areas without explicit ghosts the results are calculated based on the observed number of internally recorded pure ghost jets (and unclustered ghosts) that pass the selector; for Voronoi and passive areas, they are calculated using the difference between the total range area and the area of the jets contained in the 28 It is also possible to construct a Subtractor with explicit ρ and ρ m values, Subtractor subtractor(rho, rho m); if this is done, then ρ m use is enabled by default. 29 Recall that if m 2 < 0, m() returns − √ −m 2 , to avoid having to introduce complex numbers just for this special case. range, with the number of empty jets then being calculated based on the average jet area for ghost jets (0.55πR 2 [17]). All four functions above return a result corresponding to the last call to rho or sigma (as long as the particles, cluster sequence or selector have not changed in the meantime).
The Subtractor  The idea here is that there are contexts where it is possible, for some of a jet's constituents, to identify which vertex they come from. In that case it is possible to provide a user-defined a selector (section 6) that indicates whether a particle comes from an identified vertex or not and a second user-defined selector that indicates whether that vertex was the leading vertex. The 4-momentum from the non-leading vertex is then discarded, that from the leading vertex is kept, and subtraction is applied to component that is not from identified vertices. It follows that ρ must correspond only to the momentum flow from non-identified vertices. With this setup, the jet p t is bounded to be at least equal to that from the leading vertex, as is the mass if the "safe" subtraction option is enabled.

Alternative workflows
To allow flexibility in the user's workflow, alternative constructors to JetMedianBackgroundEstimator are provided. These can come in useful if, for example, the user wishes to carry out multiple background estimations with the same particles but different selectors, or wishes to take care of the jet clustering themselves, e.g. because the results of that same jet clustering will be used in multiple contexts and it is more efficient to perform it just once. These constructors are: // create an estimator that uses the inclusive jets from the supplied cluster sequence JetMedianBackgroundEstimator(const Selector & rho_range, const ClusterSequenceAreaBase & csa); // a default constructor that requires all information to be set later JetMedianBackgroundEstimator(); In the first case, the background estimator already has all the information it needs. Instead, if the default constructor has been used, one can then employ // (re)set the selector to be used for future calls to rho() etc. void set_selector(const Selector & rho_range_selector); // (re)set the cluster sequence to be used by future calls to rho() etc. // (as with the cluster-sequence based constructor, its inclusive jets are used) void set_cluster_sequence(const ClusterSequenceAreaBase & csa); to set the rest of the necessary information. If a list of jets is already available, they can be submitted to the background estimator in place of a cluster sequence: // (re)set the jets to be used by future calls to rho() etc. void set_jets(const std::vector<PseudoJet> & jets); Note that the jets passed via the set jets() call above must all originate from a common ClusterSequenceAreaBase type class.

Recommendations
While the basic idea of pileup subtraction is rather simple, in practice a few details are relevant to obtaining the best possible performance. Here we summarise some useful recommendations:

The
GridMedianBackgroundEstimator is significantly faster than the JetMedianBackgroundEstimator and performs equally well in nearly all cases.
2. Pileup usually has non-negligible rapidity dependence (and in the case of heavy-ion collisions, the underlying event also has azimuthal dependence). It is often important to account for this dependence, which can be done either with the rescaling method (section 8.2.3) or via a local background estimator (section 8.2.1) or the use of regions (section 8.2.2). For their own work in pileup subtraction the FastJet authors tend to prefer the rescaling method, with local estimation also a useful option e.g. in the case of heavy-ion collisions.
3. If you are interested in jet masses, and wish to use non-zero input particle masses, make sure you use the ρ m component in Eq. (13)  Performing post-clustering actions on jets has in recent years become quite widespread: for example, numerous techniques have been introduced to tag boosted hadronically decaying objects, and various methods also exist for suppressing the underlying event and pileup in jets, beyond the subtraction approach discussed in section 8. FastJet 3 provides a common interface for such tools, intended to help simplify their usage and to guide authors of new ones. Below, we first discuss generic considerations about these tools, which we call fastjet::Transformers. We then describe some that have already been implemented. New user-defined transformers can be implemented as described in section E.4. A transformer derived from Transformer, e.g. the class MyTransformer, will generally be used as follows: MyTransformer transformer; PseudoJet transformed_jet = transformer(jet); Often, transformers provide new structural information that is to be associated with the returned result. For a given transformer, say MyTransformer, the new information that is not already directly accessible from PseudoJet (like its constituents, pieces or area when they are relevant), can be accessed through transformed_jet.structure_of<MyTransformer>() which returns a reference to an object of type MyTransformer::StructureType. This is illustrated below on a case-by-case basis for each of the transformers that we discuss. Using the Boolean function transformed jet.has structure of<MyTransformer>() it is possible to check if transformed jet is compatible with the structure provided by MyTransformer.
A number of the transformers that we discuss below are "taggers" for boosted objects. In some cases they will determine that a given jet does not satisfy the tagging conditions (e.g., for a top tagger, because it seems not to be a top jet). We will adopt the convention that in such cases the result of the transformer is a jet whose 4-momentum is zero, i.e. one that satisfies jet == 0. Such a jet may still have structural information however (e.g. to indicate why the jet was not tagged).

Noise-removal transformers
In section 8.1.2 we already saw one transformer for noise removal, i.e. Subtractor. Others have emerged in the context of jet substructure studies and are described here.

Jet Filtering and Trimming using Filter
Filtering was first introduced in [52] to reduce the sensitivity of a boosted Higgs-candidate jet's mass to the underlying event. Generally speaking, filtering clusters a jet's constituents with a smaller-thanoriginal jet radius R filt . It then keeps just the n filt hardest of the resulting subjets, rejecting the others. Trimming [53] is similar, but selects the subjets to be kept based on a p t cut. The use of filtering and trimming has been advocated in number of contexts, beyond just the realm of boosted object reconstruction.
The fastjet::Filter class derives from Transformer, and can be constructed using a JetDefinition, a Selector and (optionally) a value for the background density, #include "fastjet/tools/Filter.hh" // ... Filter filter(subjet_def, selector, rho); This reclusters the jet's constituents with the jet definition subjet def 30 and then applies selector on the inclusive jets resulting from the clustering to decide which of these (sub)jets have to be kept. If rho is non-zero, each of the subjets is subtracted (using the specified value for the background density) prior to the selection of the kept subjets. Alternatively, the user can set a Subtractor (see section 8.1.2), e.g.
GridMedianBackgroundEstimator bge(...); Subtractor sub(&bge); filter.set_subtractor(sub); When this is done, the subtraction operation is performed using the Subtractor, independently of whether a value had been set for rho.
If the jet definition to be used to recluster the jet's constituents is the Cambridge/Aachen algorithm, two additional constructors are available: Filter(double Rfilt, Selector selector, double rho = 0.0); Filter(FunctionOfPseudoJet<double> * Rfilt_dyn, Selector selector, double rho = 0.0); 30 When the input jet was obtained with the Cambridge/Aachen algorithm and the subjet definition also involves the Cambridge/Aachen algorithm, the Filter uses the exclusive subjets of the input jet to avoid having to recluster its constituents.
In the first one, only the radius parameter is specified instead of the full subjet definition. In the second, one has to provide a (pointer to) a class derived from FunctionOfPseudoJet<double> which dynamically computes the filtering radius as a function of the jet being filtered (as was originally used in [52] where R filt = min(0.3, R bb/2 ), with R bb the distance between the parents of the jet).
As an example, a simple filter, giving the subjets obtained clustering with the Cambridge/Aachen algorithm with radius R filt and keeping the n filt hardest subjets found, can be set up and applied using Filter filter(Rfilt, SelectorNHardest(nfilt)); PseudoJet filtered_jet = filter(jet); The pieces() of the resulting filtered/trimmed jet correspond to the subjets that were kept: vector<PseudoJet> kept = filtered_jet.pieces(); Additional structural information is available as follows: // the subjets (on the scale Rfilt) not kept by the filtering vector<PseudoJet> rejected = filtered_jet.structure_of<Filter>().rejected(); Trimming, which keeps the subjets with a p t larger than a fixed fraction of the input jet, can be obtained defining Filter trimmer(Rfilt, SelectorPtFractionMin(pt_fraction_min)); and then applying trimmer similarly to filter above.
Note that the jet being filtered must have constituents. Furthermore, if rho is non-zero or if a Subtractor is set, the input jet must come from a cluster sequence with area support and explicit ghosts. If any of these requirements fail, an exception is thrown. In cases where the filter/trimmer has been defined with just a jet radius, the reclustering of the jet is performed with the same recombination scheme as was used in producing the original jet (assuming it can be uniquely determined).

Jet pruning
Pruning was introduced in [21]. It works by reclustering a jet's constituents with some given sequential recombination algorithm, but vetoing soft and large-angle recombinations between pseudojets i and j, specifically when the two following conditions are met 1. the geometric distance between i and j is larger than a parameter Rcut, with Rcut = Rcut factor×2m/p t , where m and p t are the mass and transverse momentum of the original jet being pruned; When the veto condition occurs, the softer of i and j is discarded, while the harder one continues to participate in the clustering.
Pruning bears similarity to filtering in that it reduces the contamination of soft noise in a jet while aiming to retain hard perturbative radiation within the jet. However, because by default the parameters for the noise removal depend on the original mass of the jet, the type of radiation that is discarded depends significantly on the initial jet structure. As a result pruning, in its default form, is better thought of as a noise-removing boosted-object tagger (to be used in conjunction with a pruned-jet mass cut) rather than a generic noise-removal procedure.
When using the constructor given above, the jet radius used by the pruning clustering sequence is set internally to the functional equivalent of infinity. Alternatively, a pruner transformer can be constructed with a JetDefinition instead of just a JetAlgorithm: JetDefinition pruner_jetdef(jet_algorithm, Rpruner); Pruner pruner(pruner_jetdef, zcut, Rcut_factor); In this situation, the jet definition pruner jetdef should normally have a radius Rpruner large enough to ensure that all the constituents of the jet being pruned are reclustered into a single jet. If this is not the case, pruning is applied to the entire reclustering and it is the hardest resulting pruned jet that is returned; the others can be retrieved using vector<PseudoJet> extra_jets = pruned_jet.structure_of<Pruner>().extra_jets(); Finally, note that a third constructor for Pruner exists, that allows one to construct the pruner using functions that dynamically compute zcut and Rcut for the jet being pruned: Pruner (const JetDefinition &jet_def, const FunctionOfPseudoJet< double > *zcut_dyn, const FunctionOfPseudoJet< double > *Rcut_dyn); The pruned jet.structure of<Pruner>() object also provides Rcut() and zcut() members, in order to retrieve the actual R cut and z cut values used for that jet.

Boosted-object taggers
A number of the taggers developed to distinguish 2-or 3-pronged decays of massive objects from plain QCD jets (see the review [15]) naturally fall into the category of transformers. Typically they search for one or more hard branchings within the jet and then return the part of the jet that has been identified as associated with those hard branchings. They share the convention that if they were not able to identify suitable substructure, they return a jet with zero momentum, i.e. one that has the property jet == 0. At the time of writing, we provide only a small set of taggers. These include one main twobody tagger, the fastjet::MassDropTagger introduced in [52] and one main boosted top tagger, fastjet::JHTopTagger from [54] (JHTopTagger derives from the fastjet::TopTaggerBase class, expressly included to provide a common framework for all top taggers capable of also returning a W ).
In addition, to help provide a more complete set of examples of coding methods to which users may refer when writing their own taggers, we have also included the fastjet::CASubJetTagger introduced in [55], which illustrates the use of a WrappedStructure (cf. appendix E.4) and the rest-frame fastjet::RestFrameNSubjettinessTagger from Ref. [56], which makes use of facilities to boost a cluster sequence.
We refer the reader to the original papers for a more extensive description of the physics use of these taggers.
More taggers may be provided in the future, either through native implementations or, potentially, through a "contrib" type area. Users are invited to contact the FastJet authors for further information in this regard.

The mass-drop tagger
Introduced in [52] for the purpose of identifying a boosted Higgs decaying into a bb pair, this is a general 2-pronged tagger. It starts with a fat jet obtained with a Cambridge/Aachen algorithm (originally, R = 1.2 was suggested for boosted Higgs tagging). Tagging then proceeds as follows: 1. the last step of the clustering is undone: j → j 1 , j 2 , with m j 1 > m j 2 ; 2. if there is a significant mass drop, µ ≡ m j 1 /m j < µ cut , and the splitting is sufficiently symmetric, y ≡ min(p 2 tj 1 , p 2 tj 2 )∆R 2 j 1 j 2 /m 2 j > y cut , then j is the resulting heavy particle candidate with j 1 and j 2 its subjets; 3. otherwise, redefine j to be equal to j 1 and go back to step 1.
The tagger can be constructed with #include "fastjet/tools/MassDropTagger.hh" // ... MassDropTagger mdtagger(double µ cut , double y cut ); and applied using PseudoJet tagged_jet = mdtagger(jet); This tagger will run with any jet that comes from a ClusterSequence. A warning will be issued if the ClusterSequence is not based on the C/A algorithm. If the JetDefinition used in the ClusterSequence involved a non-default recombiner, that same recombiner will be used when joining the final two prongs to form the boosted particle candidate.
For a jet that is returned by the tagger and has the property that tagged jet != 0, two enquiry functions can be used to return the actual value of µ and y for the clustering that corresponds to the tagged structure: tagged_jet.structure_of<MassDropTagger>.mu(); tagged_jet.structure_of<MassDropTagger>.y(); Note that in [52] the mass-drop element of the tagging was followed by a filtering stage using min(0.3, R jj /2) as the reclustering radius and selecting the three hardest subjects. That can be achieved with vector<PseudoJet> tagged_pieces = tagged_jet.pieces(); double Rfilt = min(0.3, 0.5 * pieces[0].delta_R(pieces[1])); PseudoJet filtered_tagged_jet = Filter(Rfilt, SelectorNHardest(3))(tagged_jet); (It is also possible to use the Rfilt dyn option to the filter discussed in section 9.1.1).
A significantly updated version of the mass-drop tagger, modified as in Ref. [57], is available as part of the RecursiveTools fjcontrib module (see section 13).

The Johns-Hopkins top tagger
The Johns Hopkins top tagger [54] is a 3-pronged tagger specifically designed to identify top quarks. It recursively breaks a jet into pieces, finding up to 3 or 4 subjets and then looking for a W candidate among them. The parameters used to identify the relevant subjets include a momentum fraction cut and a minimal separation in Manhattan distance (|∆y| + |∆φ|) between subjets obtained from a declustering.
The tagger will run with any jet that comes from a ClusterSequence, however to conform with the original formulation of [54], the ClusterSequence should be based on the C/A algorithm. A warning will be issued if this is not the case. If the JetDefinition used in the ClusterSequence involves a non-default recombiner, that same recombiner will be used when joining the final two prongs to form the boosted particle candidate. The tagger can be used as follows: #include "fastjet/tools/JHTopTagger.hh" // ... double delta_p = 0.10; // subjets must carry at least this fraction of original jet's p t double delta_r = 0.19; // subjets must be separated by at least this Manhattan distance double cos_theta_W_max = 0.7; // the maximal allowed value of the W helicity angle JHTopTagger top_tagger(delta_p, delta_r, cos_theta_W_max); // indicate the acceptable range of top, W masses top_tagger.set_top_selector (SelectorMassRange(150,200)); top_tagger.set_W_selector (SelectorMassRange( 65, 95)); // now try and tag a jet PseudoJet top_candidate = top_tagger(jet); // jet should come from a C/A clustering if (top_candidate != 0) { // successful tagging double top_mass = top_candidate.m(); double W_mass = top_candidate.structure_of<JHTopTagger>().W().m(); } Other information available through the structure of<JHTopTagger>() call includes: W1() and W2(), the harder and softer of the two W subjets; non W(), the part of the top that has not been identified with a W (i.e. the candidate for the b); and cos theta W(). The top candidate.pieces() call will return 2 pieces, where the first is the W candidate (identical to structure of<JHTopTagger>().W()), while the second is the remainder of the top jet (i.e. non W).
Note the above calls to set top selector() and set W selector(). If these calls are not made, then the tagger places no cuts on the top or W candidate masses and it is then the user's responsibility to verify that they are in a suitable range.
Note further that JHTopTagger does not derive directly from Transformer, but from the fastjet::TopTaggerBase class instead. This class (which itself derives from Transformer) has been included to provide a proposed common interface for all the top taggers. In particular, TopTaggerBase provides (via the associated structure) top_candidate.structure_of<TopTaggerBase>().W() top_candidate.structure_of<TopTaggerBase>().non_W() and standardises the fact that the resulting top candidate is a PseudoJet made of these two pieces.
The benefits of the base class for top taggers will of course be more evident once more than a single top tagger has been implemented.

The Cambridge/Aachen subjet tagger
The Cambridge/Aachen subjet tagger [55], originally implemented in a 3-pronged context, is really a generic 2-body tagger, which can also be used in a nested fashion to obtained multi-pronged tagging. It can be obtained through the include #include "fastjet/tools/CASubjetTagger.hh" As it is less widely used than the taggers mentioned above, we refer the user to the online doxygen documentation for further details.

The rest-frame N -subjettiness tagger
The rest-frame N -subjettiness tagger [56], meant to identify a highly boosted colour singlet particle decaying to 2 partons, can be obtained through the include #include "fastjet/tools/RestFrameNSubjettinessTagger.hh" As it is less widely used than the taggers mentioned above, we refer the user to the online doxygen documentation for further details.

Compilation notes
Compilation and installation make use of the standard % ./configure % make % make check % make install procedure. Explanations of available options (e.g. to select which plugins are built or enable/disable python support) are given in the INSTALL file in the top directory, and a list can also be obtained running ./configure --help.
In order to access the NlnN strategy for the k t algorithm, the FastJet library needs to be compiled with support for the Computational Geometry Algorithms Library CGAL [13]. This same strategy gives N ln N performance for Cambridge/Aachen and N 3/2 performance for anti-k t (whose sequence for jet clustering triggers a worst-case scenario for the underlying computational geometry methods.) CGAL can be enabled with the --enable-cgal at the configure stage. 31 CGAL may be obtained 31 Since version 4.9, CGAL can be used as a header-only library (this is the default from version 5.0 of CGAL). In this case, FastJet requires an installed version of CGAL as well as the --enable-cgal-header-only configure argument, see e.g. section 4 of the CGAL manual. in source form from http://www.cgal.org/ and is also available in binary form for many common Linux distributions. For CGAL versions 3.4 and higher, the user can specify --with-cgaldir=... if the CGAL files are not installed in a standard location. 32 The NlnNCam strategy does not require CGAL, since it is based on a considerably simpler computational-geometry structure [58].

Python interface
As of version 3.3, FastJet ships with a Python interface. In version 3.3 this interface is to be considered at a beta-release stage. While we expect it to remain largely stable, some details may evolve in future releases.
Its compilation can be enabled at configure time with the option --enable-pyext. It includes access to much of the functionality in FastJet, except plugins and certain features that are difficult to map to Python. In particular templated types, nested classes and user-derived classes are not currently supported (with a few exceptions, see below). The Python interface was largely created automatically, using SWIG [59]. The Python documentation was generated automatically using Doxygen and the doxy2swig program [60].

Essential usage
The following points are useful to keep in mind: • You can pass a Python list such as [PseudoJet0, PseudoJet1, ...] to any FastJet call that expects a vector of PseudoJets.
• Any FastJet call that in C++ returns a vector of PseudoJets will in Python return a tuple of PseudoJets • For many objects that provide definitions of some kind, a pythonic str call maps to description().
• Any python variable var (which may be a simple variable or a more complicated object) can be associated with a PseudoJet pj using the pj.set python info(var) call; it can be retrieved with pj.python info(). This is based on the C++ PseudoJet user info functionality described in Appendix B.
• Python functions can be used to create FastJet Selector objects, using SelectorPython(some python function). The function must accept a PseudoJet as its single argument and return True or False. You can also supply a class, in which case it should implement a call (self,particle) member to indicate whether a particle passes the selector and a str (self) member for the description (see example/python/06-selector.py).
• A Python class can also be used to construct a recombiner, which must have preprocess(self,pa), recombine(self,pa,pb) and str (self) members, in analogy with the C++ structure, Appendix E.1 (with str providing the functionality of the C++ description member).
preprocess(self,pa) should (optionally) modify pa directly, while recombine(self,pa,pb) should return a new PseudoJet, resulting from the recombination of pa and pb.
You can then assign the python-based recombiner using a call such as jet def.set python recombiner(python recombiner), cf. example example/python/07-recombiner.py.
• When a FastJetError exception is thrown from C++, a corresponding fastjet.Error (...) exception is raised in Python. This allows for graceful handling of exceptions, notably in an interactive Python session.
• Remember that Python uses references, e.g. a = b means that a is a reference to b. If you need to copy a PseudoJet pj with a view to altering it, do "pjcopy = PseudoJet(pj)".

Known issues as of version 3.3.x
• C++ enums are replaced by integers in Python. This causes complications for some overloaded interfaces, notably that of JetDefinition, which takes a jet algorithm specification, followed by zero, one or two C++ double arguments (e.g. jet radius R and the power p of the generalisedk t algorithm), then specifications of the recombination scheme and clustering strategy (both enums). With Python the arguments corresponding to the C++ doubles must be Python floats. If instead they are integers, then they are interpreted as specifying the recombination scheme and clustering strategy. This issue is relevant also if you write code for other users that takes the parameters as an argument and then passes them on to FastJet: you may want to either explicitly call the NParam versions, or explicitly convert the user parameters to float in your own code.
• For the Selector::sift(jets, jets that pass, jets that fail) C++ member, the last two arguments are references to vector<PseudoJet> objects. In this case, it is not possible to use a Python list in place of those arguments. Instead you should pass variables initialised to be vectorPJ() (which corresponds to the C++ vector<PseudoJet>).
• Plugins and fjcontrib code do not yet have a Python interface, nor does fjcore. Further examples are to be found in the example/python/ directory.

FJcore
The fjcore package provides a simple, compact way of accessing the main FastJet functionality. It consists of just two files fjcore.hh and fjcore.cc. A program can access the fjcore functionality by linking with fjcore.cc instead of the full FastJet library, g++ main.cc fjcore.cc where main.cc includes fjcore.hh rather than the usual FastJet headers. Clustering results are identical to those obtained by linking to the full FastJet distribution. One of the main intended uses of fjcore is to provide jet-finding code that can be easily distributed with 3rd party code (e.g. it is currently distributed with Pythia 8 [61] and MadGraph5 aMC@NLO). We emphasize that the FastJet licensing conditions must be respected when incorporating fjcore. This includes a mention of fjcore's GPL license in the third-party package's own LICENSE or COPYING file. Furthermore for interactive usage, the fjcore banner may not be removed. This is because it contains crucial information, such as the fjcore version number, that users are expected to quote in their scientific publications. In order to make it possible for fjcore and the full FastJet (e.g. used in a separate user program) to coexist, the fjcore package uses the fjcore namespace instead of fastjet.
In particular, fjcore provides: • access to all native hadron-collider and e + e − algorithms, k t , anti-k t , C/A. For C/A, the N ln N method is available, while anti-k t and k t are limited to the N 2 one (still the fastest for N < 100k particles) • access to selectors, for implementing cuts and selections • access to all functionalities related to PseudoJets (e.g. a jet's structure or user-defined information) Instead, it does not provide: • jet-area functionality • background estimation • access to other algorithms via plugins • interface to CGAL • fastjet tools, e.g. filters, taggers If these functionalities are needed, the full FastJet installation must be used. The code will be fully compatible, with the sole replacement of the header files and of the fjcore namespace with the fastjet one. Note that the fjcore package comes with a basic example 01-basic.cc as well as a fortran interface and an associated example.
The fjcore package has been available since release 3.0.4 of FastJet, and tracks FastJet versions. It is available for download from the http://fastjet.fr/ web site.

FastJet Contrib
The FastJet update schedule is geared towards providing stability. Typically, a minor (e.g. 3.0 → 3.1) release is made every two or three years, and a new minor or major release may only become widely adopted (notably by the experiments) a further year or two down the line. In comparison, the field of jet physics, especially jet substructure, and the development of associated tools proceeds at a much faster pace.
To accommodate this, part way through the 3.1 development cycle, in 2013, FastJet Contrib (fjcontrib) was introduced. This provides a home for a variety of contributed tools. New tools from 3rd-party authors can usually be added in the space of days or weeks, after some basic review of the interface, so far usually carried out by the FastJet authors. Contents, instructions and information for contributing are provided on the fjcontrib web pages [62], http://fastjet.hepforge.org/ contrib/.

N2Plain
Plain N 2 algorithm N2Tiled Tiled N 2 algorithm N2MinHeapTiled Tiled N 2 algorithm with a heap for tracking the min. of d ij

N2MHTLazy9
Variant of N2MinHeapTiled, with "lazy" (see text) evaluation of distances to particles in the set of 9 = 3 × 3 nearby tiles N2MHTLazy25 Like N2MHTLazy9, but uses tiles of size R/2 and a set of 25 = 5 × 5 neighbours N2MHTLazy9AntiKtSeparateGhosts Similar to N2MHTLazy9, but neglects ghost-ghost clusterings (to be considered preliminary in FJ3.1)

NlnN
Voronoi-based N ln N algorithm

NlnNCam
Based on Chan's N ln N closest pairs algorithm, suitable only for the Cambridge jet algorithm Best Automatic selection of the best of these based on N and R

BestFJ30
Automatic strategy selection as was done in FastJet 3.0 For jet algorithms with spherical distance measures (those whose name starts with "ee "), only the N2Plain strategy is available. The N range in which a given strategy is optimal depends on R and on the rapidity extent of the particles. See figure 2 for details.

A Clustering strategies and performance
The constructor for a JetDefinition can take a strategy argument (cf. section 3.2), which selects the algorithmic "strategy" to use while clustering. It is an enum of type Strategy with relevant values listed in table 2. Nearly all strategies are based on the factorisation of energy and geometrical distance components of the d ij measure [10]. In particular they involve the dynamic maintenance of a nearestneighbour graph for the geometrical distances. They apply equally well to any of the internally implemented hadron-collider jet algorithms. The two exceptions are: NlnNCam, which is based on a computational geometry algorithm for dynamic maintenance of closest pairs [58] (rather than the more involved nearest neighbour graph), and is suitable only for the Cambridge algorithm, whose distance measure is purely geometrical; and N2MHTLazy9AntiKtSeparateGhosts, intended specifically for area calculations with the anti-k t algorithm (to be considered preliminary in v3.1; it can also be used for passive area calculations for other algorithms). The N2Plain strategy uses a "nearest-neighbour heuristic" [63] approach to maintaining the geometrical nearest-neighbour graph; N2Tiled tiles the y − φ cylinder to limit the set of points over which nearest-neighbours are searched for, 33 and N2MinHeapTiled differs only in that it uses an N ln N (rather than N 2 ) data structure for maintaining in order the subset of the d ij that involves nearest neighbours. Both use tiles of size at least R × R, and search for a particle's nearest neighbours in the 3 × 3 set of tiles centred on that particle's own tile. The N2MHTLazy9 strategy is similar, however before considering a neighbouring tile it establishes whether the edge of the tile is closer than the particle's nearest same-tile neighbour. If not, it skips the tile, hence the name "lazy". The extra checks and bookkeeping introduce a speed penalty for moderate N , but at large N that is more than compensated for by the fact that one searches for neighbours in a smaller set of tiles. The N2MHTLazy25 strategy is almost identical except that it uses a 5 × 5 set of tiles of size at least R/2. All the lazy algorithms are new in FastJet 3.1. The NlnN strategy uses CGAL's Delaunay Triangulation [13] for the maintenance of the nearest-neighbour graph. Note that N ln N performance is an expected result, and it holds in practice for the k t and Cambridge algorithms, while for anti-k t and generalised-k t with p < 0, hub-and-spoke (or bicycle-wheel) type configurations emerge dynamically during the clustering and these break the conditions needed for the expected result to hold (this however has a significant impact only for N 10 5 ; we believe that it leads to N 3/2 asymptotic time for clustering). 34 Given the many strategies, it is not entirely trivial to select the fastest one for any given event. The optimal choice depends on the jet algorithm, its radius parameter, the event multiplicity, and the way in which the particles are distributed across the event (e.g. all concentrated at small rapidities, or covering a large rapidity range). Figure 2 shows our determination of the optimal strategy for different R and N for events covering rapidities up to |y| < 5. 35 The solid black lines indicate the transitions encoded in the "Best" meta-strategy, which automatically selects one of the actual strategies for execution. Note that the choice of optimal strategy can to some extent depend also on the processor, compiler version and optimisation flags. Accordingly, the choices made in the Best strategy may not be perfectly optimal on all systems.
Illustrative timings for the Best strategy are shown as a function of N in figure 3 for the antik t , k t and the Cambridge/Aachen algorithms, with R = 0.4 on the left and R = 1.2 on the right.
For comparison the figure also shows the timings for the anti-k t algorithm in Large N values were obtained by taking a single hard dijet event and adding simulated minimum-bias events to it. Particles are limited to rapidities |y| < 5. The results include the time to extract the inclusive jets with p t > 5 GeV and sort them into decreasing p t . Note that timings observed in practice can differ from those given here for events with substantially different distributions of particles. the transition to the N ln N strategies occurring at somewhat larger N than in version 3.0, typically at or beyond N = 10 5 . Taking N = 10 4 particles as a reference (e.g. corresponding to moderate pp pileup plus ghosts for area subtraction 36 ), the improvement in speed is about a factor of 2 for R = 0.4 and 7 for R = 1.2. We note that there are a few places where there remains scope for timing improvements. In particular at low N the overheads related to copying and sorting a vector of PseudoJet objects are a relevant fraction of the total time, and could be reduced. Additionally, for the Cambridge/Aachen algorithm at moderate to large N , the use of multiple grid sizes could bring some benefit. The improvements contained in the N2MHTLazy9AntiKtSeparateGhosts strategy could be further extended to the Lazy25 case and made available automatically. Finally the Best strategy selection could try to better take into account the rapidity extent of the event, and special cases like single-jet reclustering. Should users have applications where such improvements are critical, they are encouraged to contact the FastJet authors.

B User Info in PseudoJets
One method for associating extra user information with a PseudoJet is via its user index (section 3.1). This is adequate for encoding simple information. such as an input particle's barcode in a HepMC event. However, it can quickly show its limitations; for example, when simulating pileup one might have several HepMC events and it is then useful for each particle to additionally store information about which HepMC event it comes from.
A second method for supplementing a PseudoJet with extra user information is for the user to derive a class from PseudoJet::UserInfoBase and associate the PseudoJet with a pointer to an instance of that class: void set_user_info(UserInfoBase * user_info); const UserInfoBase* user_info_ptr() const; The function set user info(...) transfers ownership of the pointer to the PseudoJet. This is achieved internally with the help of a shared pointer. Copies of the PseudoJet then point to the same user info. When the PseudoJet and all its copies go out of scope, the user info is automatically deleted. Since nearly all practical uses of user info require it to be cast to the relevant derived class of UserInfoBase, we also provide the following member function for convenience: which explicitly performs the cast of the extra info to type L. If the cast fails, or the user info has not been set, an error will be thrown. 37 The user may wonder why we have used shared pointers internally (i.e. have ownership transferred to the PseudoJet) rather than normal pointers. An example use case where the difference is important is if, for example, one wishes to write a Recombiner that sets the user info in the recombined PseudoJet. Since this is likely to be new information, the Recombiner will have to allocate some 36 For the specific case of the anti-k t algorithm with ghosts, the N2MHTLazy9AntiKtSeparateGhosts can bring further benefits in this region. 37 For clustering with explicit ghosts, even if the particles being clustered have user information, the ghosts will not. The user should take care therefore not to ask for user information about the ghosts, e.g. with the help of the PseudoJet::is pure ghost() or PseudoJet::has user info<L>() calls. The SelectorIsPureGhost() can also be used for this purpose. memory for it. With a normal pointer, there is then no easy way to clean up that memory when the PseudoJet is no longer relevant (e.g. because the ClusterSequence that contains it has gone out of scope). In contrast, with a shared pointer the memory is handled automatically. 38 The shared pointer type in FastJet is a template class called SharedPtr, available through #include "fastjet/SharedPtr.hh" It behaves almost identically to the C++0x shared ptr. 39 The end-user should not usually need to manipulate the SharedPtr, though the SharedPtr to user info is accessible through PseudoJet's user info shared ptr() member. An example of the usage might be the following. First you define a class MyInfo, derived from Then you might set the info as follows PseudoJet particle(...); particle.set_user_info(new MyInfo(its_pdg_id)); and later access the PDG id through the function particle.user_info<MyInfo>().pdg_id(); More advanced examples can be provided on request, including code that helps handle particle classes from third party tools such as Pythia 8 [61].

C Structural information for various kinds of PseudoJet
Starting with FastJet version 3.0, a PseudoJet can access information about its structure, for example its constituents if it came from a ClusterSequence, or its pieces if it was the result of a join(...) operation. In this appendix, we summarise what the various structural access methods will return for different types of PseudoJets: input particles, jets resulting from a clustering, etc. Table 3 provides the information for the most commonly-used methods.
Additionally, all the methods that access information related to the clustering (has partner(), is inside(), has exclusive subjets(), exclusive subjets(), n exclusive subjets(), exclusive subdmerge(), and exclusive subdmerge max) require the presence of an associated cluster sequence and throw an error if none is available (except for has exclusive subjets() which just returns false). For area-related calls, has area() will be false unless the jet is obtained from a ClusterSequenceAreaBase or is a composite jet made from such jets. All other area calls 38 The user may also wonder why we didn't simply write a templated version of PseudoJet in order to contain extra information. The answer here is that to introduce a templated PseudoJet would imply that every other class in FastJet should then also be templated. 39 Internally it has been designed somewhat differently, in order to limit the memory footprint of the PseudoJet that contains it. One consequence of this is that dynamic casts of SharedPtr's are not supported. throws from CS throws from CS throws throws Table 3: summary of the behaviour obtained when requesting structural information from different kinds of PseudoJet. A particle (also p 1 , p 2 ) is a PseudoJet constructed by the user, without structural information; a "jet" (also j 1 , j 2 ) is the output from a ClusterSequence; "from CS" means that the information is obtained from the associated ClusterSequence. A "jet (no CS)" is one whose ClusterSequence has gone out of scope. All other entries should be self-explanatory.
(validated csab(), area(), area error(), area 4vector(), is pure ghost()) will return the information from the ClusterSequenceAreaBase, or from the pieces in case of a composite jet. An error will be thrown if the jet does not have area information.
Internal storage of structural information. The means by which information about a jet's structure is stored is generally transparent to the user. The main exception that arises is when the user wishes to create jets with a new kind of structure, for example when writing boosted-object taggers. Here, we simply outline the approach adopted. For concrete usage examples one can consult section 9 and appendix E.4, where we discuss transformers and taggers.
To be able to efficiently access structural information, each PseudoJet has a shared pointer to a class of type fastjet::PseudoJetStructureBase. For plain PseudoJets the pointer is null. For PseudoJets obtained from a ClusterSequence the pointer is to a class fastjet::ClusterSequenceStructure, which derives from PseudoJetStructureBase. For PseudoJets obtained from a join(...) operation, the pointer is to a class fastjet::CompositeJetStructure, again derived from PseudoJetStructureBase. It is these classes that are responsible for answering structural queries about the jet, such as returning its constituents, or indicating whether it has pieces(). Several calls are available for direct access to the internal structure storage, among them where the first two return simply the structure pointer, while the last two cast the pointer to the desired derived structure type.

D Functions of a PseudoJet
A concept that is new to FastJet 3 is that of a fastjet::FunctionOfPseudoJet. Functions of PseudoJets arise in many contexts: many boosted-object taggers take a jet and return a modified version of a jet; background subtraction does the same; so does a simple Lorentz boost. Other functions return a floating-point number associated with the jet: for example jet shapes, but also the rescaling functions used to provide local background estimates in section 8.2.
To help provide a uniform interface for functions of a PseudoJet, FastJet has the following template base class: The FunctionOfPseudoJet framework makes it straightforward to pass functions of PseudoJets as arguments. This is, e.g., used for the background rescalings in section 8.2, which are just derived from FunctionOfPseudoJet<double>. It is also used for the Transformers of section 9, which all derive from FunctionOfPseudoJet<PseudoJet>. The use of a class for these purposes, rather than a pointer to a function, provides the advantage that the class can be initialised with additional arguments.

E.1 External Recombination Schemes
A user who wishes to introduce a new recombination scheme may do so by writing a class derived from JetDefinition::Recombiner: A jet definition can then be constructed by providing a pointer to an object derived from The derived class JetDefinition::DefaultRecombiner is what is used internally to implement the various recombination schemes if an external Recombiner is not provided. It provides a useful example of how to implement a new Recombiner class.
The recombiner can also be set with a set recombiner(...) call. If the recombiner has been created with a new statement and the user does not wish to manage the deletion of the corresponding memory when the JetDefinition (and any copies) using the recombiner goes out of scope, then the user may wish to call the delete recombiner when unused() function, which tells the JetDefinition to acquire ownership of the pointer to the recombiner and delete it when it is no longer needed.

E.2 Implementation of a plugin jet algorithm
The base class from which plugins derive has the following structure: Certain (cone) jet algorithms do not perform pairwise clustering -in these cases the plugin must invent a fictitious series of pairwise recombinations that leads to the same final jets. Such jet algorithms may also produce extra information that cannot be encoded in this way (for example a list of stable cones), but to which one may still want access. For this purpose, during run_clustering(...), the plugin may call the ClusterSequence member function: inline void plugin_associate_extras(std::auto_ptr<ClusterSequence::Extras> extras); where ClusterSequence::Extras is an abstract base class, which the plugin should derive from so as to provide the relevant information: A method of ClusterSequence then provides the user with access to the extra information: /// returns a pointer to the extras object (may be null) const ClusterSequence::Extras * extras() const; The user should carry out a dynamic cast so as to convert the extras back to the specific plugin extras class, as illustrated for SISCone in section 5.2.
Note that when calculating areas, the extras() member works only for active areas with explicit ghosts and for Voronoi areas. The reason for this is that other types of area first carry out a clustering with explicit ghosts and then edit the ClusterSequence to remove the information related to pure ghost jets. This modifies the internal indices of the jets, making it very challenging, for example, for the Extras class to handle queries about individual jets.

E.2.1 Building new sequential recombination algorithms
To enable users to more easily build plugins for new sequential recombination algorithms, FastJet also provides several classes that give access to implementations of dynamic nearest neighbour finding: • A class NNH, which provides users with access to an implementation of the nearest-neighbour heuristic for establishing and maintaining information about the closest pair of objects in a dynamic set of objects with generic distance measures (see [66] for an introduction to this and other generic algorithms). In good cases (e.g. C/A-like distances) this allows one to construct clustering that runs in N 2 time, though its worst case can be as bad as N 3 (e.g. anti-k t ).
• A class NNFJN2Plain, for distances that can be written in the form d ij = min(x i , x j )D ij , d iB = x i D iB , making use of the FastJet lemma and maintaining a nearest neighbour table for the D ij part of the distance measure. Closely related code forms the basis of the internal N2Plain strategy for the generalised-k t algorithms, where x i ≡ p 2p ti and D ij ≡ ∆R 2 ij /R 2 , D iB = 1. If D ij is geometrical, then the nearest-neighbour maintenance for the D ij can be done in N 2 time, ensuring an overall N 2 algorithm.
• A class NNFJN2Tiled. This is intended for problems with a cylindrical geometry. The user specifies the tile size and then the algorithm facilitates clustering with distance measures of the form d ij = min(x i , x j )D ij (like NNFJN2Plain), with the additional assumption that for a pair of particles i and j on non-identical, non-neighbouring tiles, D iB , D jB < D ij , so that i and j will never cluster with each other. The code is closely related to that used for the internal N2Tiled strategy for generalised-k t algorithms. The asymptotic scaling is also N 2 , but at high multiplicities the coefficient is usually significantly smaller than for NNFJN2Tiled.
All three classes derive from NNBase. 41 They are templated with an underlying ("BriefJet") class, which stores the minimal information for each particles so as to be able to calculate interparticle and particle-beam distances. The e + e − Cambridge plugin relies on the NNH class, the JadePlugin can use both NNH and NNFJN2Plain, while all three NN*** classes are used in the VariableR contrib's plugin. The interested user should consult those plugins for examples, and the headers of the NN*** classes for more detailed information.

E.3 Implementing new selectors
Technically a Selector contains a shared pointer to a SelectorWorker. Classes derived from SelectorWorker actually do the work.
So, for example, the call to the function SelectorAbsRapMax(2.5) first causes a new instance of the internal SW AbsRapMax class to be constructed with the information that the limit on |y| is 2.5 (SW AbsRapMax derives from SelectorWorker). Then a Selector is constructed with a pointer to the SW AbsRapMax object, and it is this Selector that is returned to the user: Selector SelectorAbsRapMax(double absrapmax) { return Selector(new SW_AbsRapMax(absrapmax)); } Since Selector is really nothing more than a shared pointer to the SW AbsRapMax object, it is a lightweight object. The fact that it's a shared pointer also means that it looks after the memory management issues associated with the SW AbsRapMax object.
If a user wishes to implement a new selector, they should write a class derived from SelectorWorker. The base is defined with sensible defaults, so for simple usage, only two SelectorWorker functions need to be overloaded: /// returns true if a given object passes the selection criterion. pass(const PseudoJet & jet) const = 0; /// returns a description of the worker virtual std::string description() const {return "missing description";} For information on how to implement more advanced workers (for example workers that do not apply jet-by-jet, or that take a reference), users may wish to examine the extensive in-code documentation of SelectorWorker, the implementation of the existing workers and/or consult the authors. A point to be aware of in the case of constructors that take a reference is the need to implement the SelectorWorker::copy() function.

E.4 User-defined transformers
All transformers are derived from the Transformer base class, declared in the fastjet/tools/Transformer.hh header: Relative to the FunctionOfPseudoJet<PseudoJet> (cf. appendix D) from which it derives, the Transformer's main additional feature is that the jets resulting from the transformation are generally expected to have standard structural information, e.g. constituents, and will often have supplemental structural information, which the StructureType typedef helps access. As for a FunctionOfPseudoJet<PseudoJet>, the action of a Transformer is to be implemented in the result(...) member function, though typically it will be used through the operator() function, as discussed in appendix D.
To help understand how to create user-defined transformers, it is perhaps easiest to consider the example of a filtering/trimming class. The simplest form of such a class is the following: 42 /// a simple class to carry out filtering and/or trimming The function that does the work in this class is result(...): 42 The actual Filter class is somewhat more elaborate than this, since it also handles areas, pileup subtraction and avoids reclustering when the jet and subjet definitions are C/A based. as with the fully fledged Filter of section 9.1.1.
A second way of extending the structural information of an existing jet is to "wrap" it. This can be done with the help of the WrappedStructure class.
#include "fastjet/WrappedStructure.hh" /// a class to wrap and extend existing jet structures with information about /// "rejected" pieces The WrappedStructure's constructor takes a SharedPtr to an existing structure and simply redirects all standard structural queries to that existing structure. A class derived from it can then reimplement some of the standard queries, or implement non-standard ones, as done above with the rejected() call. To use the wrapped class one might proceed as in the following lines: // create a jet with some existing structure PseudoJet joined = join(selected_subjets, *_subjet_def.recombiner()); // create a new structure that wraps the existing one and supplements it with new info SharedPtr<PseudoJetStructureBase> structure(new SimpleFilterWrappedStructure(joined.structure_shared_ptr(), rejected_subjets)); // assign the new structure to the original jet joined.set_structure_shared_ptr(structure); The SharedPtrs ensure that memory allocated for the structural information is released when no jet remains that refers to it. For the above piece of code to be used in the SimpleFilter it would then suffice to include a typedef SimpleFilterWrappedStructure StructureType; line in the SimpleFilter class definition.
In choosing between the templated join<...> and WrappedStructure approaches to providing advanced structural information, two elements are worth considering: on one hand, the WrappedStructure can be used to extend arbitrary structural information; on the other, while join<...> is more limited in its scope, it involves fewer pointer indirections when accessing structural information and so may be marginally more efficient.

F Error handling
FastJet provides warning and error messages through the classes fastjet::LimitedWarning and fastjet::Error respectively. A user does not normally need to interact with them, however, they do provide some customisation facilities, especially to redirect and summarise their output.
Each different kind of warning is written out a maximum number of times (the current default is 5) before its output is suppressed. The program is allowed to continue. At the end of the run (or at any other stage) it is possible to obtain a summary of all warnings encountered, both explicit or suppressed, through the following static member function of the LimitedWarning class: #include "fastjet/LimitedWarning.hh" // ... cout << LimitedWarning::summary() << endl; The throwing of an Error aborts the program. One can use /// controls whether the error message (and the backtrace, if its printing is enabled) /// is printed out or not static void Error::set_print_errors(bool print_errors); /// controls whether the backtrace is printed out with the error message or not. /// The default is "false". static void Error::set_print_backtrace(bool enabled); to control whether an error message is printed (default = true) and whether a full backtrace is also given (default = false). Switching off the printing of error messages can be useful, for example, if the user expects to repeatedly catch FastJet errors. The message() member function can then be used to access the specific error message.
With a suitable design of the output stream, the output redirection facility can also be used by the user to record additional information when an error or warning occurs, for example the event number. One only stream << string type operation is performed for each warning or error, so as to help with formatting in such cases.
As well as performing output of warnings and errors, FastJet also outputs a banner the first time that clustering is performed. If the user wishes to have the banner appear before the first clustering (e.g. during the initialisation phase of their program), they may call the static ClusterSequence::print banner() function.

G Evolution of FastJet across versions G.1 History
Version 1 of FastJet provided the first fast implementation of the longitudinally invariant k t clustering [8,9], based on the factorisation of momentum and geometry in that algorithm's distance measure [10].
Version 2.0 brought the implementation of the inclusive Cambridge/Aachen algorithm [23,24] and of jet areas and background estimation [18,17]; other changes include a new interface, 43 and new algorithmic strategies that could provide a factor of two improvement in speed for events whose number N of particles was ∼ 10 4 . Choices of recombination schemes and plugins for external jet algorithms were new features of version 2.1. The initial set of plugins included SISCone [27], the CDF midpoint [3] and JetClu [32] cones and PxCone [36,35]. The plugins helped provide a uniform interface to a range of different jet algorithms and made it possible to overlay FastJet features such as areas onto the external jet algorithms. Version 2.2 never made it beyond the beta-release stage, but introduced a number of the features that eventually were released in 2.3. The final 2.3 release included the anti-k t algorithm [14], a broader set of area measures, improved access to background estimation, means to navigate the ClusterSequence and a new build system (GNU autotools). Version 2.4 included the new version 2.0 of SISCone (including the spherical variant), as well as plugins to the DØ Run II cone, the ATLAS cone, the CMS cone, TrackJet and a range of e + e − algorithms, and also further tools to help investigate jet substructure. It also added a wrapper to FastJet allowing one to run SISCone and some of the sequential recombination algorithms from Fortran programs.
A major practical change in version 3.0 was that PseudoJet acquired knowledge (where relevant) about its underlying ClusterSequence, allowing one to write e.g. jet.constituents() It also became possible to associate extra information with a PseudoJet beyond just a user index. It brought the first of a series of FastJet tools to help with advanced jet analyses, namely the Selector class, filters, pruners, taggers and new background estimation classes. Version 3.0 also added the D0-Run I cone [33] plugin and support for native jet algorithms to be run with R > π/2. Version 3.1 gave significant speed improvements for multiplicities from a few thousand up to a few hundred thousand and various other small additions. The period in between versions 3.0.0 and 3.1.0 also saw the first releases of the fjcore compact subset of FastJet (section 12) and of the fjcontrib project (section 13). Version 3.2 exposed more of FastJet's core clustering functionality to third-party code (NNFJN2Plain and NNFJN2Tiled classes). Version 3.3 brought a Python interface to FastJet.

G.2 Deprecated and removed features
While we generally aim to maintain backwards compatibility for software written with old versions of FastJet, there are occasions where old interfaces or functionality no longer meet the standards that 43 The old one was retained through v2  Table 4: Summary of interfaces and features of earlier versions that have been deprecated and/or removed. For brevity we have used the following abbreviations: Dep. = version since which a feature has been deprecated, Rem. = version where removed, CS = ClusterSequence, JD = JetDefinition, CSAB = ClusterSequenceAreaBase, GAS = GhostedAreaSpec.