Confronting Experimental Data with Heavy-Ion Models: Rivet for Heavy Ions

The Rivet library is an important toolkit in particle physics, serving as a repository for analysis data and code. It allows for direct comparisons between experimental data and theoretical calculations of the final state of collision events. This paper outlines several recent additions and improvements to the framework that add support for the analysis of simulated heavy-ion collision data. The paper also presents examples of these recent developments and their applicability to concrete physics analyses.


I. INTRODUCTION
High energy collisions of hadrons have, with access to data from colliders at the energy frontier over the past 20 years, been the main avenue for precision studies of the phenomenology of the strong nuclear force (QCD). Since such collisions involve a vast number of different phenomena (some calculable from first principles, others relying on models), development has tended towards large calculation packages, which attempt to include as many effects as possible, with the aim of simulating full "events" resembling the collisions observed by experiments. While such event generators are very useful, their individual model components are difficult to validate in a systematic way, because adding a new model component may compromise existing agreement with some results while improving agreement with others.
To overcome this challenge, mass comparison with current and past data is needed.
Such comparisons are, for the most widespread proton-proton (pp) event generators, facilitated by the software package Rivet [1].
Heavy-ion physics has entered the precision era, and quantitative and comprehensive model comparisons have become crucial [2]. Technical difficulties, similar to those faced earlier for collisions of protons, now also surface in this area of research. The inclusion of several new model components on the theoretical side, for example in the Jetscape generator package [3], introduces similar difficulties of validation, well known from hadron collisions. Furthermore, the discovery of QGP-like behaviour in collisions of small systems, i.e. pp and proton-ion (pA) [4,5], suggests that similar physics is at play in these collisions. These points strengthen the need for a generalized and common method of performing systematic comparisons of theoretical models to data for proton-proton, nucleon-nucleus, and ion-ion (AA) collisions. Such a method is not only practically useful but also enables new avenues for discovering similarities between two previously quite detached fields of subatomic physics [6]. This paper introduces the new features of the Rivet framework that allow one to obtain a direct comparison between Monte-Carlo event generators and experimental results from pA and AA collisions. In Section II a brief introduction to the framework is given, along with additions which are motivated by requirements of heavy-ion analyses but are of a more general nature. In Section III two further additions, more specialized to specific heavy-ion analyses, are introduced: 1) estimation of centrality in Section III A and 2) the Generic Framework for flow observables in Section III B.
In Section IV, a simple analysis example is given, with guidelines for the user to run the analysis and obtain a simple comparison. For the two latter additions, Section V presents use-cases where the need for calculating final-state observables in the same way as the experiment is obvious. For centrality estimation (in Section V A) we show how different centrality estimators, on final-state and initial-state level, affect results on basic quantities like the multiplicity at mid-rapidity, in particular in pA, but to some degree also in AA. For flow observables (in Section V B) we show how non-flow contributions are captured using the Generic Framework implementation, as opposed to a simple calculation based on the initial-state event plane.

II. THE RIVET FRAMEWORK AND NEW FEATURES OF GENERAL NATURE
Rivet is a computational framework [7], written as a C++ shared library supplying core functionality, together with a large number (949 at the time of writing) of C++ plugin libraries implementing collider analysis routines.
In Figure 1, the Rivet framework's connection to experiment and theory is outlined. The figure illustrates how the comparison of experimental results to results from event generators provides a feedback loop to the phenomenological models and the concrete event generator development. This type of feedback loop is widely used for the development of Monte-Carlo event generators [8][9][10], as well as their validation and tuning [11][12][13].
Rivet analyses are performed as comparisons to data unfolded to "particle level", meaning that the experimental data should be processed to a level where it is directly comparable to Monte-Carlo output, as a list of stable or unstable particles.
After a short technical introduction in Section II A, the following sections present additions to Rivet motivated by heavy ion development, but of a general nature. In Section II C an example of implementing experimental particle definitions is shown.
In Section II D the ability to generate cross-system observables, by running the same analysis on Monte-Carlo for both AA/pA and pp, is presented, followed by the ability to run analyses in different "modes" specified by analysis options in Section II E. Finally, the possibility to perform event mixing is presented in Section II F.

A. Technical introduction
The core Rivet library provides tools to process simulated events in the HepMC format [14] with the aim of calculating observable quantities. The Projection paradigm is one of the key Rivet concepts. Each projection calculates a specific observable from the full generated event. This can, for example, be all charged particles in a certain fiducial volume, all possible reconstructed Z bosons decaying leptonically, event shapes, or, as introduced in Section III, event centralities or Q-vectors. There are two key advantages to projections. The first is computational.
Since a projection can be reused across analyses, several analyses can be run simultaneously without the need to repeat a "particle loop" for each analysis. Secondly, and most importantly, projections serve as convenient shorthands when implementing an experimental analysis: there is no need to re-implement the selection of stable particles, or a certain experimental trigger definition, if it already exists as a pre-defined projection.
A Rivet analysis is split into three parts, implemented as separate member functions of an analysis class inheriting from the Analysis base class. Initialization: The init() method is called once at the beginning of the analysis, and is intended for the user to initialize (or "book") objects such as projections or histograms used throughout the analysis.
Analysis: The analyze(const Rivet::Event&) method is called once per event, and is intended for the user to implement the calculation of single-event quantities and fill histograms.
Finalization: The finalize() method is called once at the end of the analysis, and is intended for the user to apply the correct normalization of histograms, compute all-event averages and possibly construct ratios of aggregated quantities. In Section II D the possibility to run this method on a merged sample is outlined.

B. The HepMC heavy ion container
A Rivet analysis has access to heavy-ion specific generator information through the heavy-ion container of a HepMC input event, provided that it is filled by the event generator. The heavy-ion container holds several data members which are in general model-dependent generation information, in most cases linked to the Glauber model [15,16], such as the number of wounded nucleons, the number of hard sub-collisions and the number of spectator neutrons/protons. The heavy-ion record also holds more general information, such as the impact parameter and the event plane angle, and from HepMC version 2.6.10 onward it is possible to set a value for generator-level centrality. Detailed information about the data members available in Rivet, and about implementing a HepMC heavy-ion container in an event generator, is provided in the projects' respective online documentation. It is important to note the following for analyses implemented using information from the HepMC heavy-ion container:
• There is no guarantee that an event generator implements the heavy-ion container, and one should perform appropriate tests to avoid code failure.
• There is no guarantee that the generator definition of a certain model-dependent quantity is equivalent to the one used in the experimental analysis.

Figure 3. Visualization of the experimental primary particle definition by Alice [17], with primary particles in red, particles not considered primary in black, and resonances which can possibly be added in addition to primary particles in blue.
C. Implementing experimental primary-particle definitions

The full event record of a Monte-Carlo event generator cannot be compared directly to experimental results, as these are often corrected to a certain definition of primary particles, illustrated in Figure 3. In the context of the heavy-ion developments, pre-defined Projections have been implemented for the primary-particle definitions of various experiments. As an example, the definition by the ALICE experiment [17] has been implemented in the projection ALICE::PrimaryParticles, where a particle is considered primary if it:
1. has a proper lifetime of τ ≥ 1 cm/c, and
2. is either produced directly in the interaction,
3. or through the decay of a particle with mean proper lifetime τ < 1 cm/c.
This provides a reusable selection of primary particles for the analysis of a given experiment. In addition, this approach allows one to clearly identify cases where a requirement has been relaxed for a specific analysis. For example, in Ref. [18], the φ as well as its kaon decay daughters have been considered primary.
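Rivet implements this selection in C++ inside ALICE::PrimaryParticles; as a minimal illustrative sketch (the function and its arguments are made up for this example, not the Rivet API), the criteria above can be expressed as a simple predicate:

```python
# Toy version of the ALICE primary-particle criteria described above.
# is_primary() and its arguments are illustrative, not the Rivet API.
CTAU_CUT_CM = 1.0  # lifetime threshold, expressed as c*tau in cm

def is_primary(ctau_cm, produced_directly, parent_ctau_cm=None):
    """Primary: lives at least 1 cm/c, and is either produced directly in
    the interaction or stems from the decay of a short-lived particle."""
    if ctau_cm < CTAU_CUT_CM:
        return False
    if produced_directly:
        return True
    return parent_ctau_cm is not None and parent_ctau_cm < CTAU_CUT_CM

# A directly produced Lambda (c*tau ~ 7.9 cm) is primary; the same Lambda
# from a Xi decay (c*tau ~ 4.9 cm, itself long-lived) is not.
```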

D. Support for multiple runs through reentrant finalize
In order to produce statistically significant results with Rivet analyses, it is often desirable to parallelize the Monte-Carlo generation step over multiple computational units and merge the individual results into a whole. If the final result includes anything but simple objects, such as distributions represented as histograms or mean values, a simple merging will not yield the correct result if the finalize method (see Section II A) has already been executed. Furthermore, a large number of heavy-ion analyses require cross-system merging, to construct, for example, nuclear-modification factors (see Section IV), which are ratios of distributions obtained in AA to the same distributions in pp, scaled by an appropriate factor.
For this version of Rivet, both of these requirements have been addressed. This solution is called re-entrant finalize, and is implemented in the script rivet-merge.
The procedure for re-entrant finalization is as follows: 1. Before finalize, all output objects are saved in raw (un-normalized) form [19].
2. Finalize is run as always, and both finalized and raw objects are saved in the normal output. Note that raw objects are written before finalize is executed, and any modifications done in finalize will thus not affect the raw objects.
3. By running rivet-merge on one or more produced output files, all used analyses are loaded, all raw objects are initialized with the summed content of all runs to be merged, and finalize is run again.
For this procedure to work, it is important that all information needed for finalizing the analysis is properly booked in the analysis setup. This implies that only Yoda objects booked in the analysis' init method can be used in finalize, thus excluding plain C++ member objects (e.g., int or std::string), as these cannot be made persistent by the framework.
It is worth mentioning that rivet-merge can be run in two different modes. If the files to be merged contain different processes (e.g. one with pp events and another one with AA events for an R AA analysis) the program is run in default mode. If, on the other hand, the files contain the result from several identical runs, the weight handling is different and the flag -e or --equiv should be given to indicate this.
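The need for re-entrant finalize can be illustrated with a toy calculation (plain Python for illustration, not the actual YODA machinery): averaging already-normalized outputs of runs with unequal sizes gives the wrong answer, while summing the raw counts and normalizing once, as rivet-merge does, gives the right one.

```python
def finalize(raw_counts, n_events):
    # Stand-in for a finalize() step that normalizes per event.
    return [c / n_events for c in raw_counts]

run1 = {"counts": [10, 30], "nev": 100}   # small run
run2 = {"counts": [90, 70], "nev": 300}   # three times larger

# Correct: merge the raw objects, then run finalize once on the sum.
merged_raw = [a + b for a, b in zip(run1["counts"], run2["counts"])]
correct = finalize(merged_raw, run1["nev"] + run2["nev"])

# Naive: average the finalized outputs; the small run is over-weighted.
naive = [(x + y) / 2
         for x, y in zip(finalize(run1["counts"], run1["nev"]),
                         finalize(run2["counts"], run2["nev"]))]
```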

E. Analysis options
Normally, the implementation of a Rivet analysis is done such that the user is left with no possibility to influence or modify the analysis logic. In some cases, for example when selecting centrality (see Section III A), this dogma would severely restrict the usability of the framework: a specific combination of analysis and Monte-Carlo event generator may require the user to inform the analysis about the nature of the generated events. For that purpose, the concept of analysis options has been introduced. The value of an analysis option is specified for each individual analysis at run-time, and is then used throughout the analysis pass. Analysis options can have any type that can be constructed from a string.
In practice, the options are given as one or more suffixes to the analysis name, as in rivet -a AnalysisName:option1=value1:option2=value2, and the corresponding value for a given option is then available in the analysis class by calling the std::string getOption(std::string) function. It is possible to load the same analysis several times with different option values, and each instance will then be executed with the corresponding option values. If analysis options are given to an analysis, the output histogram title(s) will be suffixed with a string [Opt=Val] to allow the user to distinguish between histograms generated with different settings.
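The option syntax can be illustrated with a small parser sketch (the real parsing happens inside Rivet's C++ analysis loader; the function name here is invented for the example):

```python
def parse_analysis_spec(spec):
    """Split 'AnalysisName:opt1=val1:opt2=val2' into a name and an
    option dictionary, mirroring the suffix syntax described above."""
    name, *opts = spec.split(":")
    options = {}
    for opt in opts:
        # split at the first '=' into option key and value
        key, _, value = opt.partition("=")
        options[key] = value
    return name, options
```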

F. Event mixing
The technique of event mixing is often used when studying correlations of two (or more) particles. By mixing data from distinct events, one can correct for acceptance and efficiency effects in a data-driven fashion. Data compared to by Rivet should already be efficiency corrected. However, the effect of limited acceptance on observables needs to be estimated, and it depends on the distributions of particles and therefore on the Monte-Carlo generator used. A simple example is a rectangular acceptance cut (e.g. |η| < 1). Sampling two particles from a uniform distribution in this region and calculating their pseudorapidity difference leads to a triangular shape, which has no relation to the underlying physics. While this particular case can be calculated analytically, the distribution has a non-trivial shape as soon as particle production as a function of η is not uniform. Nature, and all state-of-the-art MC generators, fall into the latter category.
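The triangular shape is easy to reproduce numerically. The following sketch samples particle pairs uniformly in |η| < 1 and compares the population near Δη = 0 with the population near the edge of the acceptance:

```python
import random

random.seed(1)  # fixed seed for reproducibility
n_pairs = 200_000
near, far = 0, 0
for _ in range(n_pairs):
    deta = abs(random.uniform(-1, 1) - random.uniform(-1, 1))
    if deta < 0.2:    # bin at small |delta eta|
        near += 1
    elif deta > 1.8:  # bin at the acceptance edge, |delta eta| -> 2
        far += 1

# For the triangular density f(d) = (2 - d)/2 on [0, 2], the expected
# fractions are 0.19 and 0.01, i.e. a factor of 19 between the two bins,
# purely from the acceptance cut.
```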
The implementation of event mixing in Rivet is based on the usual strategy at Lhc experiments [20]. In this approach the studied observable is not only built from pairs of particles from a given event, but also from pairs where the particles are from different events. In these mixed events all physical correlations are presumably not present while those originating from the detector effects remain and are thus estimated in a data-driven way.
One way to illustrate event-mixing analyses is to consider some n-particle correlation observable, measured in single events E i. The signal is then defined as the event-averaged measurement, while the background is defined as the same average taken over mixed events E i ∩ E j, where E i ∩ E j denotes the random mixture of events E i and E j. In the final result the random correlations are factored out.
The Rivet implementation allows for booking a projection which supplies the mixed-event sample by storing events in memory. The mixed-event sample can be divided into bins based on any event property, such that the mixed-event sample is taken from events which are similar to the signal event in question. Common use cases are centrality and multiplicity, but it is possible to use other properties, for example event shapes (such as transverse spherocity), should the need arise.
Finally, the possibility that the Monte-Carlo event generator supplies weighted events is considered. If a sample contains weighted events, the particle content of the mixed sample will be weighted as well, with particles selected for the mixed sample with a probability given by w e / Σ i w i, where w e is the weight of the event containing the signal, and w i are the weights of all events in the ensemble that forms the mixed event.
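A toy version of such a mixing buffer can be sketched as follows (illustrative only; the Rivet projection additionally handles the event weights described above):

```python
from collections import defaultdict, deque

class MixingBuffer:
    """Store recent events in bins of an event property (here the
    multiplicity) and serve particles from stored, similar events."""
    def __init__(self, depth=5, bin_width=10):
        self.bin_width = bin_width
        # each bin keeps only the most recent `depth` events
        self.bins = defaultdict(lambda: deque(maxlen=depth))

    def mixed_particles(self, multiplicity):
        # All particles from buffered events in the same multiplicity bin.
        key = multiplicity // self.bin_width
        return [p for event in self.bins[key] for p in event]

    def add(self, particles):
        self.bins[len(particles) // self.bin_width].append(particles)
```

A signal event would first query mixed_particles() to build background pairs against similar stored events, then register itself with add().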

III. SPECIALIZED FEATURES FOR HEAVY-ION COLLISIONS
Heavy-ion analyses commonly make use of concepts often not considered relevant for high energy particle physics, which have thus not been available in Rivet. In this section two new features motivated by physics requirements are outlined. In Section III A the centrality framework is described, and in Section III B the Generic Framework for flow measurements is briefly introduced and its implementation outlined.
A. Centrality framework

The impact parameter (b) is the distance between the centers of the colliding nuclei. Its value determines, on average, the size and the transverse shape of the interaction region of a heavy-ion collision. The dependence of observables on the geometry of the interaction region provides insight into the effect of size and geometry on the underlying physical mechanisms.
Experimentally, centrality classifies collisions on the basis of some measurement (or measurements) which is assumed to depend monotonically [21] on the impact parameter b of a given collision. For a single measurement X, assumed large for small b and small for large b, we can define

c exp = (1/σ vis) ∫ X^∞ (dσ vis /dX′) dX′, (5)

where dσ vis /dX is the full measured distribution of X over the visible cross-section σ vis (typically defined in terms of minimum-bias trigger conditions) of the experiment, and X is the single-collision measurement. From a theoretical point of view, the centrality of a single collision can be defined from its impact parameter as

c b = (1/σ inel) ∫ 0^b (dσ inel /db′) db′, (6)

and c exp ≈ c b holds if σ vis ≈ σ inel and X(b) is reasonably well behaved (smooth, positive, monotonically decreasing, with appropriate limiting behaviour) [22]. The concrete choice of X differs between experiments. The Alice collaboration uses the signal amplitudes of scintillator systems covering −3.7 < η < −1.7 and 2.8 < η < 5.1. The Star collaboration [24] uses the number of charged tracks in the central pseudorapidity interval −0.5 < η < 0.5, while Brahms used the charged-particle multiplicity at mid-rapidity [25,26]. The Atlas collaboration [27] uses the transverse energy deposited in the forward calorimeters located at 3.2 < |η| < 4.9.
In central heavy-ion collisions c b = c exp to good precision [22]. In peripheral collisions, or in pA and pp collisions, the correspondence is less obvious [28]. Moreover, biases introduced by experimental trigger conditions for measurements of X are by construction not included in equation (6). When possible, it should therefore always be preferred to use equation (5) (with appropriate trigger conditions) for the estimation of centrality in a simulated event, essentially comparing the experimentally obtained X directly to the X obtained in the Monte-Carlo. By construction, such an approach also includes biases not related to the detector used, but to the definition of X [29].
In some cases, neither equation (6) nor equation (5) is an option. For example, the event generator Jewel [30] does not calculate the underlying event, but considers only a jet quenched in a medium at a certain impact parameter. Other generators, for example Herwig+Pista [9,31], attempt to generate the forward-going particle multiplicity but do not obtain good agreement with experimental results. In other cases still, it is not possible to compare to the experimentally measured X without taking into account the detector response. This is for example the case for Alice, where X is defined as a detector response (affected by secondary particles) not unfolded to particle level. This latter case is illustrated in Figure 4, where three soft inclusive approaches (Herwig+Pista, Epos-lhc and Pythia 8.2/Angantyr) are shown together with centrality data from Alice. In consequence, the centrality slicing from data cannot be naively applied to Monte-Carlo, making a calibration step necessary. In total, four possibilities for centrality estimation can be used: 1. Direct comparison to, and binning in, the measured X. This, to some extent, circumvents the notion of centrality altogether, but requires that the simulated and measured X are roughly equally distributed. In Rivet, this is selected by the analysis option cent=REF.
2. Comparison to a user-implemented X, which should be a particle level, final state observable to get generator calibrated centrality. As noted above, it is important to impose the same requirements as used by the experiment to determine σ vis [32]. In Rivet, this is selected by the analysis option cent=GEN.
3. Comparison to the generated impact parameter b, using equation (6), to obtain an impact-parameter calibrated centrality. In Rivet, this is selected by the analysis option cent=IMP.
4. Use a centrality percentile provided directly in the HepMC heavy-ion record (available from HepMC 3.0). In Rivet, this is selected by the analysis option cent=RAW.
Possibilities 2 and 3 require the user to perform a calibration run, to generate the distribution of either X or b and then evaluate the integrals in equation (5) or (6), respectively.
The result of the calibration pass can then be used in subsequent passes, with the analysis calculating the desired centrality-dependent observable. Concrete examples of how to perform such tasks using Rivet are given in Section IV.
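The percentile evaluation itself is simple arithmetic. A Python sketch of the lookup (the actual implementation works on binned YODA histograms rather than raw values; the function name is invented here):

```python
def centrality_percentile(x, calib_values):
    """Fraction (in percent) of the calibration sample with an estimator
    value above x: a large X maps to a small (central) percentile, matching
    the convention of equation (5)."""
    above = sum(1 for v in calib_values if v > x)
    return 100.0 * above / len(calib_values)
```

A generator-calibrated centrality (cent=GEN) corresponds to filling calib_values with the user-implemented X in a calibration run; for cent=IMP one would fill it with the impact parameter b and reverse the ordering, since b grows toward peripheral events.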

B. Flow measurements
In heavy-ion collisions, the study of azimuthal anisotropy of particle production is used to understand the transport properties of the produced medium, as well as the geometry of the initial state [33]. This is quantified in flow coefficients v n, obtained by a Fourier expansion of the particle yield with respect to the reaction plane angle Ψ RP:

E d³N/d³p = (1/2π) d²N/(p T dp T dy) (1 + 2 Σ n v n (y, p T) cos[n(φ − Ψ RP)]), (7)

where E is the energy of the particle, p T is the transverse momentum, φ is the azimuthal angle and y is the rapidity. As indicated, the coefficients are in general dependent on y and p T. The present Rivet implementation allows one to perform a simple calculation of coefficients differentially with respect to p T, as well as integrated coefficients.
Since the reaction plane cannot be determined experimentally, flow coefficients are estimated from two- or multi-particle correlations. For example, using two-particle correlations:

v n² ≈ ⟨cos(n(φ 1 − φ 2))⟩, (8)

where φ 1 = φ 2 is excluded from the average. A more advanced technique, with several benefits which are introduced in the following, calculates multi-particle cumulants from Q-vectors [34] (Q n = Σ k=1..M w k exp(inφ k) for an event with M particles) [35], which are combined into m-particle correlators denoted ⟨m⟩ n1,n2,...,nm.
This formulation allows one to easily calculate any correlator, since N ⟨m⟩ n1,n2,...,nm can be calculated from Q-vectors, with the concrete expressions obtained using a recursion relation [36], for any values of m and n i. As an example, for m = 2 one obtains:

N ⟨2⟩ n1,n2 = Q n1 Q n2 − Q n1+n2, with N = M(M − 1).

For the special case of n 1 = −n 2 = n, the two-particle correlator directly becomes:

⟨2⟩ n,−n = (|Q n|² − M) / (M(M − 1)).

Evaluation of the product in the numerator, |Q n|² = Σ k,l exp(in(φ k − φ l)), reveals that the correlator reduces to equation (8). However, when evaluating using Q-vectors, the process requires only a single pass over the data, as opposed to nested passes.
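As a numerical check (sketched in Python for illustration; Rivet's implementation is in C++), the Q-vector form of ⟨2⟩ n,−n with unit weights can be compared against the explicit nested-pair average:

```python
import cmath, math

def qvector(phis, n):
    # Q_n = sum_k exp(i n phi_k), unit weights
    return sum(cmath.exp(1j * n * phi) for phi in phis)

def two_particle(phis, n):
    # <2>_{n,-n} = (|Q_n|^2 - M) / (M (M - 1)); a single pass over the event
    M = len(phis)
    return (abs(qvector(phis, n)) ** 2 - M) / (M * (M - 1))

def two_particle_nested(phis, n):
    # Brute force: average cos(n (phi_k - phi_l)) over ordered pairs k != l
    M = len(phis)
    s = sum(math.cos(n * (phis[k] - phis[l]))
            for k in range(M) for l in range(M) if k != l)
    return s / (M * (M - 1))
```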
Without the use of Q-vectors, the computational complexity of an m-particle correlator is obviously O(M^m), whereas the Q-vector formulation requires only a single loop over the particles. From the single-event correlators, multi-particle cumulants are constructed as

c n{2} = ⟨⟨2⟩⟩ n,−n, c n{4} = ⟨⟨4⟩⟩ n,n,−n,−n − 2 ⟨⟨2⟩⟩² n,−n, (9)

where the double averages denote an all-event average of event averages. Even higher-order cumulants, or other combinations of correlators, such as Symmetric Cumulants [38], can easily be added by the user, as the framework gives the user full access to the correlators. The cumulants are finally transformed into flow coefficients as v n{m} = (k m c n{m})^(1/m), where the coefficient k m is, written here for convenience for m ≤ 4, k 2 = 1 and k 4 = −1. Differential flow observables are constructed using an additional correlator, limited to a certain phase space. In the case of the implementation in Rivet, this is restricted to observables differential in p T. The differential correlator can be constructed in a similar fashion [36] as the integrated ones in equation (9), but the transformation to differential flow coefficients differs. Denoting the differential m-particle correlator ⟨m′⟩, the differential two-particle cumulant and the corresponding differential flow coefficient are given by:

d n{2} = ⟨⟨2′⟩⟩, v′ n{2} = d n{2} / √(c n{2}).

In the same way, the differential four-particle cumulant and the corresponding differential flow coefficient are:

d n{4} = ⟨⟨4′⟩⟩ − 2 ⟨⟨2′⟩⟩⟨⟨2⟩⟩, v′ n{4} = −d n{4} (−c n{4})^(−3/4).

The statistical uncertainty on cumulants and flow coefficients is calculated by a simple variance method, often denoted the bootstrap [39] method in the experimental literature. Instead of propagating the sampling uncertainty on the individual correlators through the (long) expressions for cumulants and flow coefficients, the calculation is internally split into sub-samples [40]. The default behaviour is to display the square root of the sample variance as the statistical uncertainty, but the user can choose to use the envelope instead.
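The same single-pass idea extends to higher orders. The sketch below (Python for illustration) uses the standard Q-vector expression for ⟨4⟩ n,n,−n,−n with unit weights, as derived in the generic-framework literature cited above, and verifies it against the O(M⁴) loop over distinct quadruplets:

```python
import cmath, itertools

def qv(phis, n):
    return sum(cmath.exp(1j * n * p) for p in phis)

def four_particle(phis, n):
    # <4>_{n,n,-n,-n} from Q-vectors, avoiding the O(M^4) nested loops.
    M = len(phis)
    Qn, Q2n = qv(phis, n), qv(phis, 2 * n)
    num = (abs(Qn) ** 4 + abs(Q2n) ** 2
           - 2 * (Q2n * Qn.conjugate() ** 2).real
           - 4 * (M - 2) * abs(Qn) ** 2
           + 2 * M * (M - 3))
    return num / (M * (M - 1) * (M - 2) * (M - 3))

def four_particle_nested(phis, n):
    # Brute force over ordered quadruplets of distinct particles.
    M = len(phis)
    tot = sum(cmath.exp(1j * n * (phis[i] + phis[j] - phis[k] - phis[l])).real
              for i, j, k, l in itertools.permutations(range(M), 4))
    return tot / (M * (M - 1) * (M - 2) * (M - 3))
```

Event averages of these correlators then give the cumulants above, e.g. c n{4} = ⟨⟨4⟩⟩ − 2⟨⟨2⟩⟩².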

Suppressing non-flow with an η-gap
The approximate equality in equation (8) holds only in the absence of so-called non-flow correlations. These arise from sources such as jet production or resonance decays, and are few-particle correlations, as opposed to flow, which is the correlation of all particles to a common symmetry plane. The idea behind the η-gap or sub-event method is to suppress sources of non-flow which are local in (pseudo)rapidity. The non-flow is suppressed by requiring that particle measurements contributing to a correlator come from different parts of the detector acceptance, usually separated in pseudorapidity. Using equation (8), an η-gap can be implemented simply by requiring that the two particles come from two different sub-events, here labeled A and B:

v n² ≈ ⟨cos(n(φ 1^A − φ 2^B))⟩. (11)

Translating to Q-vectors, it is first noted that φ 1 = φ 2 does not need to be excluded from the average in this case, as it is excluded by construction. Equation (11) reduces to N ⟨2⟩ n1,n2 = Q^A n1 Q^B n2 with N = M A M B, where the superscripts A and B indicate that the Q-vectors are calculated in that specific sub-event only, as Q^X n = Σ k=1..M X exp(inφ^X k). This procedure is generalized to multi-particle correlations by removing terms from the correlator where cross-terms do not share a common sub-event.
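The sub-event correlator is again a single-pass computation. A small Python check (unit weights, invented function names) against the explicit pair loop across the two sub-events:

```python
import cmath, math

def qv(phis, n):
    return sum(cmath.exp(1j * n * p) for p in phis)

def two_particle_gap(phis_a, phis_b, n):
    # <2> with an eta gap: Re(Q_n^A conj(Q_n^B)) / (M_A M_B).
    # No self-correlation term is subtracted: the sub-events are disjoint.
    qa, qb = qv(phis_a, n), qv(phis_b, n)
    return (qa * qb.conjugate()).real / (len(phis_a) * len(phis_b))

def two_particle_gap_nested(phis_a, phis_b, n):
    # Brute force: average cos(n (phi_a - phi_b)) over all A-B pairs.
    s = sum(math.cos(n * (pa - pb)) for pa in phis_a for pb in phis_b)
    return s / (len(phis_a) * len(phis_b))
```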
In Rivet, correlators with two sub-events are implemented, with the possibility of calculating both integrated and differential flow.

IV. EXAMPLES OF IMPLEMENTED HEAVY-ION ANALYSES
In this section, some of the implemented Rivet analyses using new features are presented, along with detailed instructions for a user to run the analyses and plot the resulting figures.
The centrality framework introduced in Section III A is used by many analyses, each one implementing the centrality estimator of the respective experiment.
In these cases the analysis implementation relies on an analysis for centrality calibration, called e.g., ALICE_2015_PBPBCentrality, BRAHMS_2004_AUAUCentrality, and STAR_BES_CALIB. These can be reused by many analyses sharing the same centrality estimator.
Executing the centrality calibration analysis fills histograms for each defined centrality estimator, for example the impact parameter and the number of charged particles or their energy in the rapidity region where the relevant detector is situated. Figure 5 exemplifies the calibration for Brahms, for which an unfolded measure does not exist, and the options 2 or 3 (GEN and IMP) must be used.
In order to use the generated calibration histograms in the actual analyses, they must be pre-loaded when running the analysis code. The analysis will then calculate the percentile of the calibration curve for the value determined in the newly generated event and thus allow the user to retrieve the centrality of the current event (cf. Figure 4).
To reproduce those results based on a HepMC file containing AA collisions generated by one of the considered generators, the following steps need to be followed for the example of the Brahms analysis:
1. Run the calibration analysis (BRAHMS_2004_AUAUCentrality) on the event sample, saving the calibration output to a file.
2. Run the Brahms analysis on the same sample, pre-loading the calibration file and selecting the desired centrality option, saving the output to result.yoda.
3. Plot the result with rivet-mkhtml result.yoda.
Note that in this case (and in fact in all currently implemented heavy-ion analyses) the calibration and analysis steps are carried out on the same event sample. This would not be the case for an analysis considering a signal process (e.g. Z production). There, the calibration step would be carried out on a minimum-bias sample, and the signal step on a signal sample.
The real power of Rivet is, however, not the comparison to a single generator at a time, but the ability to compare several generators to the same experimental data at once. This use case is shown in Figure 6 (middle), where both Epos-lhc and Pythia 8.2/Angantyr are compared to the same data.

The re-entrant finalize feature introduced in Section II D is for example used in the analysis ALICE_2012_I1127497. The main observable is the nuclear-modification factor R AA, defined as the ratio of the yield in AA collisions to that in pp collisions, scaled by an appropriate factor N coll to account for the trivial scaling with the number of binary collisions [16]:

R AA = (1/⟨N coll⟩) (dN AA /dp T) / (dN pp /dp T).

In order to compute this quantity, Rivet is executed three times: first for a generator with AA settings, second with pp settings, and a third and final time to compute the ratio. A comparison of this observable to a calculation with the Jewel generator is shown in Figure 7, and can be performed with the following steps, given generated HepMC files of pp and AA events from Jewel:
1. First, the centrality calibration analysis is run on the AA sample, producing the file calibration.yoda.
2. The Rivet analysis is then run on the pp sample, producing the file jewel-pp.yoda.
3. Next, the Rivet analysis is run on the AA sample. Note the beam option signifying that this is an AA run. Since the Jewel event generator does not supply this information in the HepMC file, but denotes all beams as pp beams, the analysis requires this additional information: rivet aa-sample.hepmc --ignore-beams -p calibration.yoda -a ALICE_2012_I1127497:beam=HI:cent=IMP -o jewel-aa.yoda
4. The two samples must be merged, and the ratio plots constructed, with rivet-merge, which loads the histograms from the two runs above and runs finalize on the preloaded histograms: rivet-merge --merge-option beam --merge-option cent jewel-pp.yoda jewel-aa.yoda -o jewel-merged.yoda
The optional argument merge-option removes the analysis options beam and cent from the heavy-ion sample.
5. The merged sample can be plotted. There is no need to plot the unmerged histograms, as all histograms are contained in the merged file.
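The final ratio step amounts to simple per-bin arithmetic, sketched below in Python (illustrative only; the actual division is performed on YODA histograms inside the analysis' finalize):

```python
def r_aa(yield_aa, yield_pp, n_coll):
    """Nuclear-modification factor per pT bin:
    R_AA = (dN_AA/dpT) / (<N_coll> * dN_pp/dpT)."""
    return [aa / (n_coll * pp) for aa, pp in zip(yield_aa, yield_pp)]

# R_AA = 1 indicates simple binary-collision scaling; values below 1 at
# high pT are the suppression signature quantified in Figure 7.
```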

V. UNDERSTANDING BIASES IN SIMULATION AND DATA
Apart from direct validation of a given Monte-Carlo event generator, the analysis capabilities of Rivet are useful as tools to reveal inherent biases in the representation of data. In this section the use of the physics-motivated extensions presented in Section III is exemplified by short analyses of centrality biases introduced by using different centrality observables for theory and data (Section V A) and of the inclusion of non-flow effects in flow measurements (Section V B). These two effects are well known, and are chosen to show the capabilities of the framework rather than to make an independent physics point.

A. Centrality observables
In pA collisions, such as the p-Pb collisions at √s NN = 5.02 TeV performed at the Lhc and recorded by the Atlas experiment [49], the notion of centrality is used as in AA collisions, shown in Section IV, but without the clear relation to the physical impact parameter, even though the impact parameter remains well defined and useful from a theoretical point of view. The comparison of the charged-particle pseudorapidity density is shown for peripheral (60-90%, left) and central (0-1%, right) events [49]. We see a small, but systematic, overestimate of the charged-particle pseudorapidity density when using either the generator or the experimentally calibrated centrality estimator. The impact-parameter based centrality estimator has overall larger deviations from the experimental results (also in centrality classes not shown here, see Ref. [28] for more comparisons), but without any clear systematics in the deviations.
In Figure 10 the charged-particle multiplicity in Pb-Pb, differential in p T (left) and in η for 1.7 GeV/c < p T < 2.0 GeV/c (right), is shown for the 5-10% centrality bin [50], again for the three different centrality estimators. Here, the results from the generated and impact-parameter centrality estimators fully coincide, in agreement with the assumption that the measured centrality (equation (5)) coincides with the theoretical centrality (equation (6)) in AA collisions. It is also seen that the comparison using the measured experimental centrality deviates from the two others, as expected.

B. Non-flow effects
The v2 and v3 measured by Alice in Pb-Pb collisions at √sNN = 2.76 TeV are shown as a function of centrality in Figure 11, together with comparisons to Pythia 8.2/Angantyr simulations [51]. The simulations were obtained using two different approaches. The first is the event plane method, where equation (7) is evaluated directly, giving vn = ⟨cos(n(φ − Ψ))⟩. The event plane angle Ψ [52] can be evaluated in simulations, but not in data. The second uses the Generic Framework (GF) implementation in Rivet, described in Section III B, to estimate v2 and v3 from two-particle correlations, as is done in the experiment.
In the former case, the Pythia 8.2 results are consistent with zero. This is expected, as the typical mechanisms associated with non-flow (jets, resonance decays) are not correlated with the event plane, so any coincidental correlations average out. The same is not true for the simplest case of two-particle correlations. In particular, back-to-back jets give rise to strong correlations that are unrelated to the hydrodynamical evolution of the system. As a result, Pythia 8.2 predicts non-zero v2 and v3 values when calculated with two-particle correlations, as seen in Figure 11. The non-flow effects are more pronounced for v2 and increase towards peripheral Pb-Pb collisions, where dNch/dη is lower and the different non-flow sources are therefore averaged out to a lesser extent. Note that the non-flow contribution is sizeable for v2{2} even with a required minimum separation in η between the two correlated particles.
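The difference between the two estimators can be made concrete with a toy model, sketched here in plain Python rather than the Rivet GF code. Events carry a constant true v2 and no non-flow, so both methods should recover the input: the event-plane method averages cos 2(φ − Ψ) using the known Ψ, while the two-particle correlation builds ⟨cos 2(φi − φj)⟩ = v2² from Q-vectors without any reference to Ψ. The event sizes and the sampling scheme are illustrative choices.

```python
import math, random

random.seed(7)

def toy_event(n_particles, v2, psi):
    """Sample azimuthal angles from dN/dphi ~ 1 + 2 v2 cos(2(phi - psi))
    by accept-reject."""
    phis = []
    while len(phis) < n_particles:
        phi = random.uniform(0.0, 2.0 * math.pi)
        if random.random() * (1.0 + 2.0 * v2) < \
                1.0 + 2.0 * v2 * math.cos(2.0 * (phi - psi)):
            phis.append(phi)
    return phis

V2_TRUE, N_EV, MULT = 0.10, 400, 100
ep_sum, pair_sum, pair_cnt = 0.0, 0.0, 0
for _ in range(N_EV):
    psi = random.uniform(0.0, 2.0 * math.pi)  # event plane: known in the model
    phis = toy_event(MULT, V2_TRUE, psi)
    # event-plane method: v2 = <cos 2(phi - Psi)>, requires Psi
    ep_sum += sum(math.cos(2.0 * (p - psi)) for p in phis) / MULT
    # two-particle method: <cos 2(phi_i - phi_j)> = v2^2, Psi drops out
    qx = sum(math.cos(2.0 * p) for p in phis)
    qy = sum(math.sin(2.0 * p) for p in phis)
    pair_sum += qx * qx + qy * qy - MULT  # |Q|^2 minus self-correlations
    pair_cnt += MULT * (MULT - 1)

v2_ep = ep_sum / N_EV
v2_2pc = math.sqrt(pair_sum / pair_cnt)
print(round(v2_ep, 3), round(v2_2pc, 3))  # both close to 0.10
```

Adding a non-flow source to the toy (e.g. inserting back-to-back particle pairs at random azimuth) would shift v2_2pc upwards while leaving v2_ep untouched, which is exactly the behaviour seen for Pythia 8.2 in Figure 11.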

VI. COLLECTIVE EFFECTS IN PP COLLISIONS
The discovery of heavy-ion-like effects in pp collisions has spurred considerable interest among proton-proton Monte-Carlo authors in introducing models for such effects [53][54][55][56][57][58][59][60]. The inclusion of heavy-ion features in Rivet will aid this development, as such features can be used directly in new analyses for pp as well. As an example, the seminal results from the Cms experiment on two- and multi-particle correlations [61] and from the Alice experiment on strangeness enhancement [5] have been implemented in the Rivet analyses CMS_2017_I1471287 and ALICE_2016_I1471838.
Results from the former, implemented with the flow framework presented in Section III B, are shown in Figure 12: the cumulant c2{2} as a function of charged-particle multiplicity (left, equation (13)) and the corresponding flow coefficient as a function of transverse momentum (right).

VII. CONCLUSION AND FUTURE WORK
Extensions to the Rivet framework to facilitate comparisons between Monte-Carlo event generators and data have been presented. The Rivet framework is a standard tool for the preservation of Lhc analyses of proton-proton collisions, and with the presented extensions it is ready to start taking a similar role for heavy-ion physics.
There are, however, areas where the updated framework is foreseen to be extended further, in order to allow for better comparisons in jet physics studies, reproduction of global fits of QGP properties, and comparisons of simulation tools developed specifically for heavy-ion physics.
One avenue of further development is jet observables. Rivet includes an interface to the popular FastJet package [62] for jet reconstruction, which already includes some functionality for sub-jet observables geared towards heavy-ion collisions [63]. A more substantial problem for the preservation of jet analyses is the handling of the large "underlying event" of a heavy-ion collision. A preliminary Rivet implementation to perform background subtraction and QGP back-reactions for specific models is available [64], but no truly model-independent prescription exists. Due to large differences between models, it may even be that the experimental background subtraction cannot be fully reproduced for a model unless the model also attempts to describe the full underlying event. In that case, a better option might be to publish data without background subtraction for model comparisons.
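To make the background-subtraction problem concrete, the following toy sketch illustrates the median-based area subtraction in the spirit of FastJet's background estimators: the underlying-event density ρ is taken as the median pT-per-area over many patches (so a single hard jet barely shifts it), and the jet is corrected as pT_sub = pT_raw − ρ·A. This is plain Python with invented numbers, not the FastJet or Rivet API.

```python
import random

random.seed(3)

def rho_median(patches):
    """Median of pT/area over patches; robust against one hard-jet patch."""
    dens = sorted(pt / area for pt, area in patches)
    mid = len(dens) // 2
    return dens[mid] if len(dens) % 2 else 0.5 * (dens[mid - 1] + dens[mid])

# toy event: soft background of ~60 GeV per unit-area patch; one patch
# additionally contains a 100 GeV jet
AREA = 1.0
patches = [(random.gauss(60.0, 5.0), AREA) for _ in range(20)]
jet_raw = patches[0][0] + 100.0
patches[0] = (jet_raw, AREA)

rho = rho_median(patches)
jet_sub = jet_raw - rho * AREA  # pT_sub = pT_raw - rho * A
print(round(rho, 1), round(jet_sub, 1))  # rho near 60, jet near 100
```

The model-dependence discussed above enters precisely here: if a generator does not simulate the soft background at all, ρ is meaningless for its events, and the subtracted observable cannot be compared like-for-like with data.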
Another future direction involves expanding the possibilities for post-processing beyond what re-entrant finalization allows. Specifically, to add the possibility of performing fits used in experimental analyses, for example reinterpretation in terms of simple thermal models such as the Boltzmann-Gibbs blast wave for AA [65] or Lévy-Tsallis for pp [24,66]. A technically similar development in high energy physics is realized in the Contur package [67], which constrains BSM physics using Standard Model measurements implemented in Rivet. Reinterpretation of heavy-ion analyses could be performed in a technically similar way, possibly also as an additional software package. Such an extension could also cater to theoretical models not suitable for direct event-by-event analysis.
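As a minimal sketch of what such a post-processing fit might involve, the snippet below evaluates one common (unnormalised) convention of the Lévy-Tsallis shape, ~ pT (1 + (mT − m)/(nC))^(−n), and recovers the parameters of a toy spectrum by a crude grid search. The functional form and the grid-search fit are illustrative assumptions; a real analysis would use the published normalised form and a proper χ² minimiser.

```python
import math

M_PI = 0.13957  # charged-pion mass in GeV, assumed particle species

def levy_tsallis(pt, n, C):
    """Unnormalised Levy-Tsallis shape ~ pT (1 + (mT - m)/(n C))^(-n)."""
    mt = math.sqrt(pt * pt + M_PI * M_PI)
    return pt * (1.0 + (mt - M_PI) / (n * C)) ** (-n)

# toy "data": the shape itself, evaluated at known parameters
N_TRUE, C_TRUE = 7.0, 0.15
pts = [0.2 + 0.2 * i for i in range(25)]
data = [levy_tsallis(pt, N_TRUE, C_TRUE) for pt in pts]

# crude grid-search fit of (n, C) by least squares
best = min(
    ((n, c) for n in [5.0, 6.0, 7.0, 8.0] for c in [0.10, 0.15, 0.20]),
    key=lambda p: sum((levy_tsallis(pt, *p) - d) ** 2
                      for pt, d in zip(pts, data)),
)
print(best)
```

Running such fits inside a post-processing step, rather than in each analysis, is what would allow derived quantities (e.g. fitted temperatures or ⟨pT⟩) to be compared between models and data in a uniform way.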

VIII. ACKNOWLEDGEMENTS
We thank the Discovery Center at the Niels Bohr Institute for hosting a successful workshop in August 2018, consolidating the efforts described in this paper, and in