Measurement of multi-particle azimuthal correlations in pp, p + Pb and low-multiplicity Pb + Pb collisions with the ATLAS detector

Multi-particle cumulants and corresponding Fourier harmonics are measured for azimuthal angle distributions of charged particles in pp\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$pp$$\end{document} collisions at s\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\sqrt{s}$$\end{document} = 5.02 and 13 TeV and in p\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$p$$\end{document} + Pb collisions at sNN\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\sqrt{s_{_\text {NN}}}$$\end{document} = 5.02 TeV, and compared to the results obtained for low-multiplicity Pb+Pb\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mathrm{Pb}~+~\mathrm{Pb}$$\end{document} collisions at sNN\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\sqrt{s_{_\text {NN}}}$$\end{document} = 2.76 TeV. These measurements aim to assess the collective nature of particle production. The measurements of multi-particle cumulants confirm the evidence for collective phenomena in p\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$p$$\end{document} + Pb and low-multiplicity Pb+Pb\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mathrm{Pb}~+~\mathrm{Pb}$$\end{document} collisions. On the other hand, the pp\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$pp$$\end{document} results for four-particle cumulants do not demonstrate collective behaviour, indicating that they may be biased by contributions from non-flow correlations. A comparison of multi-particle cumulants and derived Fourier harmonics across different collision systems is presented as a function of the charged-particle multiplicity. For a given multiplicity, the measured Fourier harmonics are largest in Pb+Pb\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mathrm{Pb}~+~\mathrm{Pb}$$\end{document}, smaller in p\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$p$$\end{document} + Pb and smallest in pp\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$pp$$\end{document} collisions. The pp\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$pp$$\end{document} results show no dependence on the collision energy, nor on the multiplicity.


Introduction
One of the signatures of the collective behaviour of the hot, dense medium produced in heavy-ion collisions is the azimuthal anisotropy of produced particles. This anisotropy results from spatial asymmetry in the initial interaction region in off-centre ion-ion collisions. The initial asymmetry activates strong pressure gradients along the shorter axis of the overlap region, leading to increased production of particles within the reaction plane, defined by the impact parameter vector (the vector separation of the barycentres of the two nuclei) and the beam axis. The azimuthal anisotropy is commonly characterized by Fourier harmonics v n , referred to as single-particle harmonic flow coefficients: v n = cos[n(φ − R )] , where φ is the azimuthal angle of a produced particle e-mail: atlas.publications@cern.ch and R is the azimuthal angle of the reaction plane [1]. This anisotropic, collective enhancement of particle production is a global long-range phenomenon extending over a wide pseudorapidity range.
The anisotropy of charged-particle azimuthal angle distributions in A + A collisions has been a subject of extensive experimental studies at RHIC [2][3][4][5][6][7] and at the LHC [8][9][10][11][12][13][14][15][16][17][18][19][20][21][22]. In non-central heavy-ion collisions, the large and dominating v 2 coefficient is mainly associated with the elliptic shape of the nuclear overlap. The v 2 coefficient in ultra-central collisions and other v n coefficients in all collisions are related to various geometric configurations arising from fluctuations of the nucleon positions in the overlap region [23,24]. The reported results are consistent with model calculations based on a hydrodynamic description of the system evolution and provide conclusive evidence that the hot and dense matter produced in A + A collisions behaves collectively in accordance with a hydrodynamic flow and has properties resembling those of a nearly perfect fluid [25][26][27][28].
The study of p + A collisions was thought to provide information on cold nuclear matter effects, relevant for understanding the hot and dense system produced in A + A collisions. In p + A collisions, the size of the produced system is small compared to the mean free path of its constituents. Therefore, it might be expected that the collective flow, if any, generated in p + A collisions is much weaker than in heavyion interactions. Contrary to these expectations, significant v n coefficients, only about 40% smaller in magnitude than those obtained in Pb + Pb collisions, have been measured in p + Pb collisions at the LHC energy of √ s NN = 5.02 TeV [29][30][31][32][33][34][35][36][37][38]. Observations of azimuthal anisotropies were also reported recently for d + Au [39,40] and 3 He+Au [41] collisions at the RHIC energy of √ s NN = 200 GeV.
Interestingly, long-range two-particle azimuthal correlations have also been observed in high-multiplicity pp collisions at the LHC energies [42][43][44][45][46]. It was found that the measured azimuthal correlations, which extend over a wide range in pseudorapidity, can be explained by the cos(nφ) modulation of the single-particle azimuthal angle distribution. The extracted Fourier harmonics v n for n = 2-4 [46] are generally much smaller than those measured in p + Pb and Pb + Pb collisions, and show no dependency on the chargedparticle multiplicity. On the other hand, they display a similar dependence on particle transverse momenta, suggesting that the same underlying mechanism may be responsible for the long-range azimuthal correlations. These observations in pp collisions, together with the results from the p + A system described above, are among the most challenging and pressing problems in the domain of soft quantum chromodynamics. Various models have been proposed to explain the source of the observed long-range correlations in small collision systems [47][48][49][50][51][52][53][54][55][56][57][58][59][60][61][62][63], but the origin of the effect is still under intense debate. It is not yet known whether the mechanism responsible for the observed collective behaviour in A + A collisions is also relevant for the smaller systems. The main purpose of this paper is to contribute to this debate by providing new experimental results.
Several differing analysis methods are applied to measure Fourier harmonics in high-energy collisions. They differ principally in their sensitivity to correlations not related to the initial collision geometry (referred to as non-flow correlations), which can result from resonance decays, jet production, Bose-Einstein correlations or energy-momentum conservation. For small collision systems and low-multiplicity final states, the most common method uses two-particle correlation functions [29][30][31]33,[35][36][37][38][42][43][44][45][46]64]. In this method, the non-flow correlations are suppressed by requiring a large pseudorapidity separation, | η|, between particles forming a pair. This requirement eliminates most of the shortrange correlations including intra-jet correlations. The jet-jet correlations are subtracted from the two-particle correlation function using the correlations measured in low-multiplicity events (see e.g. [43,46]).
The multi-particle cumulant method [65][66][67] was proposed to suppress the non-flow correlations. The method aims to measure correlations between a large number of particles, from which the correlations between a small number of particles are subtracted. Since non-flow correlations typically involve a low number of particles, they are suppressed in many-particle cumulants. The drawback of the method is the statistical limitation in calculating the cumulants of more than two particles. Furthermore, the multi-particle cumulants in small collision systems, derived from correlations between low number of particles, can be biased by non-flow jet and dijet correlations, which dominate the azimuthal correlation signal. The cumulant method has been applied to measure global correlations and Fourier harmonics in Pb + Pb and p + Pb collisions [18,20,32,33,36]. Recently, the four-and six-particle cumulants were also measured by the CMS Collaboration in pp collisions at 5, 7 and 13 TeV [45].
In this paper, the ATLAS measurements of multi-particle cumulants are presented for pp collisions at 5.02 and 13 TeV and for p + Pb collisions at √ s NN = 5.02 TeV. For comparison, the results for low-multiplicity (peripheral) Pb + Pb collisions at √ s NN = 2. 76 TeV are also shown. The results are averaged over large ranges in p T and pseudorapidity. Results obtained from different collision systems are compared as a function of the charged-particle multiplicity. The paper is organized as follows. The analysis method is described in the next section, followed by the description of the detector (Sect. 3) and presentation of the analysed data samples and event and track selections in Sects. 4 and 5. The analysis details are given in Sect. 6 while Sect. 7 contains a discussion of systematic uncertainties and cross-checks. The results for cumulants and the corresponding Fourier harmonics are shown in Sect. 8. A summary and concluding remarks are given in Sect. 9.

Multi-particle cumulants
The multi-particle cumulant method is useful in studying the global nature of correlations observed in azimuthal angles of particles produced in high-energy collisions. The cumulant method involves the calculation of 2k-particle azimuthal correlations, corr n {2k}, and cumulants, c n {2k}, for nth Fourier harmonics, where n = 2, 3, 4 and k = 1, 2, 3, 4 for the analysis presented in this paper. The corr n {2k} are defined as [65,67]: where the brackets " " denote double averaging, performed first over particles in an event and then over all events within a given event class. For every event, the average is taken over all possible of the combinations of the azimuthal angles φ i (i = 1, . . . , 8) of the 2k particles.
With the calculated multi-particle azimuthal correlations, the cumulants c n {2k} are obtained after subtracting the correlations between 2(k − 1) particles according to the following formulae [65,67]: The Q-cumulant method [67], used in this analysis, relies on the idea of expressing the multi-particle correlations in terms of powers of the flow vector Q n . This approach allows multi-particle correlations and cumulants to be calculated in a single pass over data events. The flow vector is defined for each collision event with multiplicity M as: where the subscript n denotes the order of the flow harmonic, j is the power of the flow vector, and the sum runs over all particles in an event with w i being the weight of the ith particle. The weight accounts for detector effects including the tracking efficiency and is defined in Sect. 6. If the measured c n {2k} cumulants are free of non-flow correlations, they can be used to estimate Fourier harmonics v n . Furthermore, assuming that the event-by-event fluctuations of v n are negligibly small, the Fourier harmonics denoted by v n {2k} can be determined [65]: From the above definitions it is evident that determination of real values of Fourier harmonics requires negative (positive) c n {4} and c n {8} (c n {2} and c n {6}) values.

ATLAS detector
The data were collected with the ATLAS detector [68]. 1 The detector consists of three main systems: an inner tracking detector (ID) surrounded by a thin superconducting solenoid, electromagnetic and hadronic calorimeters, and a muon spectrometer. The ID is immersed in a 2T axial magnetic field and provides charged-particle tracking in the range |η| < 2.5. It consists of silicon pixel, silicon microstrip (SCT), and strawtube transition radiation tracking detectors. Since 2015 the pixel detector includes an additional layer at smaller radius, the "insertable B-layer" (IBL) [69,70]. The calorimeter system covers the pseudorapidity range up to |η| = 4.9. The muon spectrometer surrounds the calorimeters and is based on three large air-core toroid superconducting magnets with 1 ATLAS uses a right-handed coordinate system with its origin at the nominal interaction point (IP) in the centre of the detector and the z-axis along the beam pipe. The x-axis points from the IP to the centre of the LHC ring, and the y-axis points upwards. Cylindrical coordinates (r, φ) are used in the transverse plane, φ being the azimuthal angle around the z-axis. The pseudorapidity is defined in terms of the polar angle θ as η = − ln tan(θ/2). eight coils each. The field integral of the toroids ranges between 2 to 6 T m across most of the detector. Measurements presented in this document use signals from the ID while other components are used for triggering. Events are selected with a trigger system [71]. The firstlevel (L1) trigger is implemented in hardware and uses a subset of the detector information. For this analysis the information from calorimeters, minimum bias trigger scintillator (MBTS) counters (covering the range 2.1 < |η| < 3.8) and zero degree calorimeters (ZDCs) with the range |η| > 8.3 is used at L1. The L1 trigger is followed by two software-based trigger levels: level-2 (L2) and Event Filter (EF). In pp datataking in 2015, the L2 and EF trigger levels are combined in a common high-level trigger (HLT) framework.

Data sets
The √ s = 5.02 TeV pp data were recorded in November 2015 and correspond to an integrated luminosity of about 28 pb −1 . The average number of additional interactions in the same bunch crossing, μ, ranges from 0.4 to 1.3. For the low-multiplicity event selections, three minimum-bias triggers were used: the first required a hit in at least one MBTS counter, the second required a hit in at least one MBTS counter on each side, and the third required at least one reconstructed track at the HLT seeded by a random trigger at L1. In order to enhance the number of high-multiplicity events, dedicated high-multiplicity triggers (HMTs) were implemented. Three HMTs required at L1 more than 5, 10 and 20 GeV in the total transverse energy ( E T ) recorded in the calorimeters, and at the HLT more than 60, 90 and 90 reconstructed charged-particle tracks with p T > 0.4 GeV and |η| < 2.5, respectively.
The √ s = 13 TeV pp data were taken over two running periods in June and August of 2015. For the first running period, μ varied between 0.002 and 0.03, while for the second μ ranged from 0.05 to 0.6. The total integrated luminosity collected over these two periods is approximately 0.075 pb −1 . In addition to the minimum-bias event trigger, HMTs were implemented seeded by a L1 requirement of E T > 10 GeV. For the low-μ running period, the requirement of more than 60 reconstructed charged-particle tracks at the HLT was imposed. For the moderate-μ data (the second data-taking period), two requirements on the number of online reconstructed charged-particle tracks at the HLT, of more than 60 and 90, were employed.
The p + Pb data were collected during the LHC run at the beginning of 2013. The LHC operated in two configurations during this running period, by reversing the directions of the proton and lead beams. The proton beam with the energy of 4 TeV collided with a Pb beam of energy 1.57 TeV per nucleon. This leads to √ s NN = 5.02 TeV in the nucleon-nucleon centre-of-mass frame, which is shifted by 0.465 in rapidity in the proton direction. The total integrated luminosity corresponds to approximately 0.028 pb −1 . The data were recorded with the minimum-bias trigger and several HMTs, seeded by L1 thresholds on the total transverse energy recorded in the forward calorimeters ( E FCal T , 3.1 < |η| < 4.9) and HLT thresholds on the number of online reconstructed charged-particle tracks, N online ch [72]. Six different combinations of the L1 and HLT thresholds were implemented: ( E FCal T [GeV] >, N online ch >) = (10,100), (10,130), (50,150), (50,180), (65,200) and (65,225). More details can be found in Ref. [35]. For the p + Pb data, μ ≈ 0.03.
The √ s NN = 2.76 TeV Pb + Pb data set used in this analysis consists of the data collected in 2010 and then reprocessed in 2014 with the same reconstruction software as used for p + Pb data. The number of additional interactions per bunch crossing is negligibly small, of the order of 10 −4 . Monte Carlo (MC) simulated event samples are used to determine the track reconstruction efficiency (Sect. 5) and to perform closure tests, as described in Sect. 7. For the 13 and 5.02 TeV pp data the baseline MC event generator used is Pythia 8 [73] with parameter values set according to the ATLAS A2 tune [74] and with MSTW2008LO parton distribution functions [75]. The Hijing event generator [76] is used to produce p + Pb and Pb + Pb collisions with the same energy as in the data. The detector response is simulated [77] with Geant 4 [78] and with detector conditions matching those during the data-taking. The simulated events are reconstructed with the same algorithms as data events, including track reconstruction.

Event and track selections
Additional event selections are implemented in the offline analysis. Events are required to have a reconstructed vertex. For the p + Pb and Pb + Pb data, only events with a reconstructed vertex for which |z vtx | < 150 mm are selected while for pp data sets this requirement is not applied.
In order to suppress additional interactions per bunch crossing (referred to as pile-up) in pp data sets, only tracks associated with the vertex for which the p 2 T is the largest are used. In addition, all events with a second vertex reconstructed from at least four tracks are disregarded. For the p + Pb data, even though the average number of interactions per bunch crossing is small (∼0.03), it can be significantly larger in events with a high multiplicity. Therefore, events containing more than one interaction per bunch crossing are rejected if they contain more than one good reconstructed vertex, where a good vertex is defined as that with the scalar sum of the tracks transverse momenta p T > 5 GeV. The remaining pile-up events are further suppressed using the ZDC signal on the Pb-fragmentation side, calibrated to the number of recorded neutrons [35]. In order to suppress beam backgrounds in p + Pb and Pb + Pb data, a requirement on the time difference between signals from MBTS counters on opposite sides of the interaction region is imposed, | t| < 10 and <3 ns, respectively.
For the pp data, charged-particle tracks are reconstructed in the ID with the tracking algorithm optimized for Run-2 data [79]. The tracks are required to have |η| < 2.5 and p T > 0.1 GeV. At least one pixel hit is required and a hit in the IBL is also required if the track passes through the active region of the IBL. If a track passes through an inactive area of the IBL, then a hit is required in the next pixel layer if one is expected. The requirement on the minimum number of SCT hits depends on p T : ≥ 2 for 0.1 < p T < 0.3 GeV, ≥ 4 for 0.3 < p T < 0.4 GeV and ≥ 6 for p T > 0.4 GeV. Additional selection requirements are imposed on the transverse, |d 0 |, and longitudinal, |z 0 sin θ |, impact parameters. The transverse impact parameter is measured with respect to the beam line, and z 0 is the difference between the longitudinal position (along the beam line) of the track at the point where d 0 is measured and the primary vertex. Both must be smaller than 1.5 mm. In order to reject tracks with incorrectly measured p T due to interactions with the detector material, the track-fit probability must be larger than 0.01 for tracks with p T > 10 GeV.
For the reconstruction of p + Pb and Pb + Pb data, the same tracking algorithms are used. The track selection requirements are modified slightly from those applied in the pp reconstruction. Specifically, the same requirements are imposed on the impact parameters, although |d 0 | is determined with respect to the primary vertex. To suppress falsely reconstructed charged-particle tracks, additional requirements are imposed on the significance of the transverse and longitudinal impact parameters: |d 0 |/σ d 0 < 3 and |z 0 sin θ |/σ z 0 < 3, where σ d 0 and σ z 0 are the uncertainties in the transverse and longitudinal impact parameter values, respectively, as obtained from the covariance matrix of the track fit.
The tracking efficiencies are estimated using the MC samples reconstructed with the same tracking algorithms and the same track selection requirements. Efficiencies, (η, p T ), are evaluated as a function of track η, p T and the number of reconstructed charged-particle tracks, but averaged over the full range in azimuth. For all collision systems, the efficiency increases by about 4% with p T increasing from 0.3 to 0.6 GeV. Above 0.6 GeV, the efficiency is independent of p T and reaches 86% (72%) at η ≈ 0 (|η| > 2), 83 (70%) and 83% (70%) for pp, p + Pb and peripheral Pb + Pb collisions, respectively. The efficiency is independent of the event multiplicity for N ch > 40. For lower-multiplicity events the efficiency is smaller by a few percent. The rate of falsely reconstructed charged-particle tracks, f ( p T , η), is also esti-mated and found to be small; even at the lowest transverse momenta it stays below 1% (3%) at η ≈ 0 (|η| > 2).
Residual detector defects (not accounted for by tracking efficiencies), which may arise on a run-by-run basis and could lead to a non-uniformity of the azimuthal angle distribution, are corrected for by a data-driven approach, the so-called flattening procedure described in Sect. 6.
The analysis is performed as a function of the chargedparticle multiplicity. Three measures of the event multiplicity are defined based on counting the number of particles observed in different transverse momentum ranges: 0.3 < p T < 3 GeV, 0.5 < p T < 5 GeV and p T > 0.4 GeV (see next section for details). For each multiplicity definition, only events with multiplicity ≥10 are used to allow a robust calculation of the multi-particle cumulants. Furthermore, in order to avoid potential biases due to HMT inefficiencies, events selected by the HMTs are accepted only if the trigger efficiency for each multiplicity definition exceeds 90%. The only exception is the pp 13 TeV data collected in August 2015 with the HMT requiring more than 90 particles reconstructed at the HLT, for which the 90% efficiency is not reached. It was carefully checked that inclusion of this data set does not generate any bias in the calculation of multiparticle cumulants.

Overview of the analysis
For each collision system, the multi-particle cumulants are calculated using the so-called reference particles. Two selections of reference particles are considered, for which the multiplicity M ref in a given event is the number of reconstructed charged particles with |η| < 2.5 and with corresponding p T ranges: 0.3 < p T < 3 GeV or 0.5 < p T < 5 GeV. Figure 1 shows the uncorrected M ref multiplicity distributions for the reconstructed charged-particle tracks with 0.3 < p T < 3 GeV for all collision systems. The observed discontinuities reflect the offline selection requirement of at least 90% efficiency for the HMT thresholds. Event weights are introduced to account for the trigger efficiency and the trigger prescale factors [35].
Particle weights (see Eq. (1)) are applied to account for detector effects via w φ (η, φ), the tracking efficiency (η, p T ) and the rate of fake tracks f (η, p T ), and are defined as: The tracking efficiencies and fake rates are determined as described in Sect. 5. The weights w φ (η, φ) are determined from the data by the procedure of azimuthal-angle flatten- ing in order to correct for non-uniformity of the azimuthal acceptance of the detector. The flattening procedure uses the η-φ map of all reconstructed charged-particle tracks. For each small interval (δη, δφ), a "flattening" weight is calculated as w φ (η, φ) = N (δη) /N (δη, δφ) where N (δη) is the event-averaged number of tracks in the δη slice, averaged over the full range in φ, while N (δη, δφ) is the number of tracks within this interval. The cumulants and corresponding Fourier harmonics are studied as a function of the charged-particle multiplicity. Two ways of selecting events according to the event multiplicity are considered. The first one is to select events with a given M ref , which is referred to as EvSel_M ref . An alternative way (EvSel_N ch ) is to apply the event-selection on the basis of the number of reconstructed charged particles with p T > 0.4 GeV, N rec ch , and then for such selected events calculate the cumulants using reference particles. For both event selections, the cumulants are calculated in unit-size bins in either M ref or N rec ch , which are then combined into broader, statistically significant multiplicity intervals by averaging the cumulants, c n {2k}.
For the purpose of a direct comparison of results obtained with different event selections, the standard multiplicity variable measuring the event activity is used. The N ch ( p T > 0.4 GeV) multiplicity, corrected for tracking efficiency and the rate of falsely reconstructed charged-particle tracks as well as for trigger efficiencies, is used to present the results. When selecting events according to M ref multiplicity, the correlation between M ref and the N ch ( p T > 0.4 GeV) is employed. Figure 2 shows mean N ch ( p T > 0.4 GeV) multiplicities calculated in M ref intervals, which are used in the analysis. The correlation is shown for each collision system and for two p T ranges of reference particles. In the case of EvSel_N ch , a similar mapping of N rec ch intervals into The two event selections differ in their sensitivity to eventby-event multiplicity fluctuations and are biased in a different manner by contributions from non-flow correlations. In the selection based on M ref , by construction, multiplicity fluctuations are eliminated. This is not the case for the selection using Even when using an event selection free of multiplicity fluctuations, the cumulants calculated with a small number of particles can be contaminated by non-flow correlations.
For two-particle cumulants, c n {2}, the non-flow correlations can be reduced by requiring a large separation in pseudorapidity between particles forming a pair. As in the analysis of two-particle correlations [31,35,43,46], the requirement of | η| > 2 is implemented in calculating the cumulants Fig. 4 for all collision systems. A strong reduction of the cumulant values can be seen after requiring | η| > 2, which is the most significant at low multiplicities and for pp collisions, where the short-range two-particle non-flow correlations dominate. Unfortunately, such a requirement on | η| cannot be applied in the calculation of cumulants of more than two particles in the standard cumulant approach applied in this analysis. This has to be taken into account when interpreting the results obtained for c n {4}. It is known (from Pythia [80] and Hijing simulations) that jet and dijet  production can generate correlations between four particles, especially in collision systems (e.g. pp) where collective flow effects are expected to be small. Measurements of multi-particle cumulants and the corresponding flow harmonics require very large event samples, especially when considering cumulants and correlations between more than two particles. This analysis uses the twoparticle cumulants with a rapidity gap of | η| > 2 to determine c n {2, | η| > 2} for n = 2, 3 and 4 for all collision systems. Four-particle cumulants can be reliably determined for all collision systems only for c 2 {4}. A statistically significant measurement of higher-order cumulants and harmonics, n = 3, 4, with more than two-particle correlations is not possible with the current data sets. Statistical limitations are particularly severe for six-and eight-particle cumulants measured in pp collisions. The statistical uncertainty of the pp data sets used in this analysis is significantly larger than the expected magnitude of the six-and eight-particle cumulants, preventing reliable measurements of these observables. Therefore, the measurements of c 2 {6} and c 2 {8} and the corresponding Fourier harmonics are reported only for p + Pb and Pb + Pb collisions.

Systematic uncertainties and cross-checks
The systematic uncertainties are estimated for c n {2, | η| > 2} (n= 2, 3 and 4) and c 2 {4}, for all collision systems, and for c 2 {6} and c 2 {8} only for p + Pb and Pb + Pb data. The two ranges in p T of reference particles are considered: 0.3 < p T < 3 GeV and 0.5 < p T < 5 GeV. The c n uncertainties are then propagated to the corresponding v n . Details on the contributions to systematic uncertainties from different sources are collected in tables included in the Appendix.
The following systematic uncertainties are considered: Track-quality selections The systematic uncertainties resulting from different track selection requirements are estimated as differences between the nominal results and the results obtained with modified track selection criteria. For pp data, the requirements on the impact parameters are varied from the nominal value of |d 0 | < 1.5 mm and |z 0 sin θ | < 1.5 mm, to the tight selection, |d 0 | < 1 mm and |z 0 sin θ | < 1 mm, and to the loose selection, |d 0 | < 2 mm and |z 0 sin θ | < 2 mm. For p + Pb and Pb + Pb collisions the nominal selection requirements defined by the cuts on the impact parameters and the cuts on the significance of impact parameters (|d 0 | < 1.5 mm, |z 0 sin θ | < 1.5 mm, |d 0 /σ d 0 | < 3 and |z 0 sin(θ )/σ z | < 3) are changed to the loose ones: The tight selection requirements are: Tracking efficiency Systematic uncertainty in the track reconstruction efficiency results from an imperfect detector geometry description in the simulations. It affects the particle weights determined using the MC-derived tracking efficiency, (η, p T ). For pp collisions, the efficiency uncertainty depends on η and p T , as derived from the studies with the varied detector material budget [81]. It is found to vary between 1 and 4%, depending on η and p T . For p + Pb and Pb + Pb collisions, the efficiency uncertainty is assumed to vary with p T up to 4%, independently of η. The systematic uncertainty of the multi-particle cumulants is estimated by repeating the analysis with the tracking efficiency varied up and down by its corresponding uncertainty. The systematic uncertainty is taken as the largest deviation of the nominal result from the result obtained assuming a higher or lower efficiency. It is estimated for each bin in the charged-particle multiplicity.

Pile-up
The pile-up effects may be important for the analysis of pp data. The pile-up is significantly reduced by removing events with a second vertex reconstructed from at least four tracks. Furthermore, in the analysis the M ref and cumulants are always calculated using the tracks associated with the primary vertex. As a result the pile-up effects should not play a significant role. The exception might be due to events where the pile-up vertex is so close to the primary vertex that the two are merged. To assess the pile-up effect on the cumulants calculated for 13 TeV pp data, the results for the low-μ June data (μ < 0.03) and the moderate-μ August data (μ ∼ 0.6) are compared and the differences are found to be negligible.
However, such pile-up studies for pp collisions are strongly affected by statistical fluctuations, which arise due to the small number of data events with low or high pile-up as well as to the smallness of the measured signal. This is particularly true for four-particle cumulants as well as higher-order cumulants c 3 {2, | η| > 2} and c 4 {2, | η| > 2}, for pp collisions. Therefore, an alternative approach is also considered, where different criteria are used to reduce the pile-up. In the nominal approach, all events with a second vertex containing at least four tracks are removed. Here, the removal of events with a second vertex reconstructed from at least two or six tracks is also considered and the results for these two selections of events are compared to the nominal results. The maximum difference between the nominal measurement and the cumulants obtained from the data set with higher pile-up or lower pile-up is taken as a systematic uncertainty.
For p + Pb results, the pile-up effects are studied by comparing the nominal results, for which events with the second vertex with p T > 5 GeV are removed, to the results obtained without removing the pile-up events. The maximum difference between the nominal measurement and the cumulants obtained without removing the pile-up events is taken as a systematic uncertainty.
For low-multiplicity Pb + Pb collisions the pile-up is negligibly small (μ ≈ 10 −4 ) and not considered to contribute to the systematic uncertainty. Comparison of results for p + Pb and Pb + p For p + Pb data the comparison is made between the results obtained during two running configurations with reversed beams directions, p + Pb and Pb + p. The results obtained from two running periods are consistent and give a negligible contribution to the systematic uncertainty.
The systematic uncertainty of the measured cumulants across all systems and the two p T ranges of reference particles is not dominated by a single source. However, in most cases the largest contribution is from the track selection uncertainty, which mostly dominates uncertainties for higher-order harmonic cumulants. A sizeable contribution to the total uncertainty is also due to the tracking efficiency uncertainty, and this uncertainty is the largest for low multiplicities. The pile-up effects also give sizeable contributions to uncertainties in 5.02 TeV pp cumulants. The total systematic uncertainty is obtained by adding all individual contributions in quadrature. Table 1 lists the total systematic uncertainties of the measured cumulants in different collision systems for reference particles with 0.3 < p T < 3 GeV. The listed systematic uncertainties are averaged over the N ch range. For reference particles in the higher transverse momentum range, 0.5 < p T < 5 GeV, the total systematic uncertainties are included in Table 2. The total systematic uncertainty of the cumulants is then propagated to the sys- (2)-(5). Several cross-checks are also performed to validate the analysis method, but are not included in the systematic uncertainty. To account for the detector imperfections and to make the analysed azimuthal angle distribution uniform, data-determined weights w φ (η, φ) are used, as described in Sect. 6. To verify the robustness of the weighting procedure, the nominal results for cumulants are compared with those obtained with all weights w φ (η, φ) set to 1. The difference between the two measurements relative to the nominal results is found to be negligibly small.
Changing the trigger efficiency from 90% to 95% is also found to have negligible impact on the measured cumulants.
The global correlation effects should be independent of the charge sign of the produced particles. However, in reality the non-flow contributions may differ for same-sign and opposite-sign charged particles. To verify whether the results reported here depend on the charge of particles, the analysis is performed separately for same-sign charged particles only and compared to the results for all charged particles. In all cases, no systematic difference is observed when comparing the cumulants for all charged particles with those obtained using only same-sign charged particles.

Second-order multi-particle cumulants and Fourier harmonics
The comparison between different collision systems is made for the cumulants calculated in M ref -bins, where the p T range of reference particles is 0.3 < p T < 3.0 GeV and 0.5 < p T < 5.0 GeV. A direct comparison of c 2 {2, | η| > 2} for different collision systems is shown in Fig. 5 as a function of N ch ( p T > 0.4 GeV) . An ordering in the magnitude of cumulants, with the largest for Pb + Pb, and then decreasing for smaller collision systems, is observed. Interestingly, for the three systems the N ch -dependence changes from a clear  There is no dependence on the collision energy for pp data.
Four-particle cumulants, as shown in Fig. 6, follow the ordering |c 2 {4}| p+Pb < |c 2 {4}| Pb+Pb for N ch ( p T > 0.4 GeV) >100. The magnitude of v 2 {4} derived from c 2 {4} is larger for Pb + Pb collisions than for p + Pb events with the same N ch ( p T > 0.4 GeV). For pp collisions, the cumulants depend weakly on the collision energy, although systematically larger cumulant values are measured at 13 TeV than at 5.02 TeV at low N ch ( p T > 0.4 GeV). At higher multiplicities, this systematic dependence is reversed. Over the full range of particle multiplicities, the cumulants are positive or consistent with zero at 5.02 TeV for both p T ranges and at 13 TeV for 0.5 < p T < 5.0 GeV. For the 13 TeV pp data, the cumulants for 0.3 < p T < 3.0 GeV also have positive values over the large range of multiplicities, with the exception of N ch from 130 to 150, where c 2 {4} is negative but less than 1-2 standard deviations from zero. Therefore, these measurements of c 2 {4} cumulants in pp collisions, based on the event selection that suppresses the event-by-event fluctuations in the number of reference particles, do not allow determination of the Fourier harmonics. This indicates that the c 2 {4} obtained with the standard cumulant method used in this paper, even though free of multiplicity fluctuations, may still be biased by non-flow correlations.
A comparison of results for c 2 {4} obtained with two p T ranges for reference tracks is shown in Fig. 7. For p + Pb and Pb + Pb collisions, in the region where c 2 {4} < 0, the |c 2 {4}| is larger for higherp T reference particles, as expected due to the rise of v 2 with p T . For all collision systems, it is observed that for c 2 {4} > 0, c 2 {4} is larger for higherp T reference particles. This indicates the influence of non-flow, jet-like correlations.
The six-and eight-particle c 2 cumulants are compared for p + Pb and Pb + Pb collision systems in Fig. 8 (4)). For Pb + Pb data, the c 2 {8}, obtained for both p T ranges of reference particles have negative values, and as such permit the evaluation of v 2 {8}; however, for p + Pb this requirement is only satisfied for a limited range of very high multiplicities.
The second-order Fourier harmonics, v 2 , is obtained from c 2 , following Eqs.  Fig. 10 for p + Pb and low-multiplicity Pb + Pb collisions for the two p T ranges of reference particles. All derived v 2 harmonics in Pb + Pb collisions have magnitudes larger than those in p + Pb collisions with the same multiplicity. For both systems, v 2 {2k} are similar for k = 2, 3 and 4 while v 2 {2, | η| > 2} are systematically larger. However, compared to almost degenerate values of v 2 {2k}, k > 1, a larger v 2 derived from two-particle cumulants is also predicted by models assuming fluctuation-driven initial-state anisotropies in small collision systems, either in the context of hydrodynamics as in Ref. [59] or in the effective theory of quantum chromodynamics in the regime of weak coupling [82,83]. Figure 11 shows the ratio v 2 {2k}/v 2 {2k − 2} for p + Pb and low-multiplicity Pb + Pb collisions as a function of charged-particle multiplicity. Interestingly, for Pb + Pb collisions all three ratios are independent of N ch ( p T > 0.4 GeV) beyond 120, independent of the p T range of reference particles. These observations are qualitatively consistent with the predictions of the model of fluctuating initial geometry from Ref. [59], and provide further constraints on the initial state.

Higher-order multi-particle cumulants and Fourier harmonics
Calculations of c 3 and c 4 multi-particle cumulants are statistics-limited and statistically significant results can only be obtained using two-particle cumulants with the superimposed | η| > 2 gap. Figure 12 shows a comparison   For pp data, the c 3 {2, | η| > 2} values are either negative or consistent with zero over the whole range of N ch ( p T > 0.4 GeV), except for the two highest multiplicities measured for pp collisions at 13 TeV. Therefore, for N ch ( p T > 0.4 GeV) < 100, the v 3 signal in pp collisions is undefined or zero within the errors. A positive c 3 signal is obtained from p + Pb and Pb + Pb data, except for the charged-particle multiplicities below about 50. The magnitude of c 3 is comparable for Pb + Pb and p + Pb collisions when reference particles with 0.3 < p T < 3.0 GeV are selected, and systematically slightly larger for Pb + Pb than for p + Pb for reference particles with 0.5 < p T < 5.0 GeV. The fourth-order cumulants, c 4 , have positive values of c 4 {2, | η| > 2} even for the pp data, and their magnitude is comparable to that for p + Pb and Pb + Pb collisions in the overlapping range of N ch . For N ch ( p T > 0.4 GeV) > 120, where only the measurements for p + Pb and Pb + Pb are accessible, the c 4 cumulants measured at the same charged-particle multiplicity are larger for Pb + Pb than for p + Pb.
The third-and fourth-order flow harmonics, v 3 and v 4 , calculated with two-particle cumulants with the | η| > 2 requirement are shown in Fig. 13. For p + Pb and Pb + Pb collisions the v 3 {2, | η| > 2} values are similar for reference particles with 0.3 < p T < 3.0 GeV, and much larger than for the 13 TeV pp data. For higherp T reference particles, the Pb + Pb v 3 is systematically larger than v 3 in p + Pb collisions with the same multiplicity. The v 3 increases with increasing multiplicity. A weaker increase is seen for v 4 {2, | η| > 2}, but at high multiplicities the values observed in Pb + Pb collisions are systematically larger than in p + Pb, for two p T ranges of reference particles. For multiplicities below 100, where the v 4 {2, | η| > 2} can also be obtained from pp collisions, no system dependence is seen.  For p + Pb and Pb + Pb collisions, the results for v 2 harmonics obtained with multi-particle cumulants agree very well with the CMS data [36], as shown in Fig. 15. Figure 16 shows a similar compability of ATLAS and ALICE [18] measurements of v 2 {4} in p + Pb collisions. For Pb + Pb collisions, the ALICE results on v 2 {4} are slightly above those measured by ATLAS.
A comparison of flow harmonics measured with distinct analysis methods, which primarily differ in their sensitivity to non-flow correlations, is also of interest. The method, commonly used to measure flow harmonics in small collision systems, is the two-particle correlation function method (2PC). This method was used by ATLAS to measure v n har-monics in pp and p + Pb collisions [46]. In that measurement, the non-flow correlations were suppressed by requiring the | η| > 2 gap, as in this analysis. However, additional procedures were undertaken in Ref. [46] to also suppress the jet-jet correlations. The ATLAS results for flow harmonics obtained using the two-particle correlation function method, v n {2PC}, are compared with the results reported here, obtained with two-particle cumulants, in Fig. 17. For the v 2 harmonic the two measurements give consistent results for p + Pb collisions. For pp collisions the cumulant method gives v 2 values larger than those obtained with the 2PC method, suggesting that the former are contaminated by the non-flow correlations not removed by the | η| > 2 requirement. The fact that in contrast to v 2 {2PC} the v 2 harmonic cannot be determined from the measurement of the c 2 {4} cumulant reported here for pp collisions clearly indicates that this cumulant is biased by non-flow correlations. In the case of the third-order flow harmonic, v 3   for the cumulant measurements denote statistical and systematic uncertainties, respectively. For the two-particle correlation function results, the error bars indicate statistical and systematic uncertainties added in quadrature elimination of the jet-jet non-flow correlations by the additional procedure supplementing the | η| > 2 gap in the 2PC method. The two methods give consistent results for the v 4 harmonic measured in p + Pb collisions at 5.02 TeV as well as in pp collisions at 13 TeV, indicating that the aforementioned differences between the two analysis methods have a negligible impact on v 4 .

Summary and conclusions
Multi-particle cumulants and corresponding Fourier harmonics are measured by the ATLAS experiment at the LHC for azimuthal angle distributions of charged particles in pp collisions at √ s = 5.02 and 13 TeV and in p + Pb collisions at √ s NN = 5.02 TeV, and compared to the results obtained from low-multiplicity Pb + Pb collisions at √ s NN = 2.76 TeV. The results are presented as a function of charged-particle multiplicity for two ranges of the particles' transverse momenta: 0.3 < p T < 3 GeV and 0.5 < p T < 5 GeV. For the same charged-particle multiplicity the second-order cumulants and harmonics (c 2 {2, | η| > 2} and v 2 {2, | η| > 2}), derived from two-particle correlations with the | η| > 2 gap, have larger magnitudes in Pb + Pb collisions than in p + Pb collisions. The smallest signal is observed in pp collisions. The latter show no energy or multiplicity dependence while the cumulants and the second-order harmonic increase with increasing multiplicity in p + Pb and Pb + Pb collisions.
Four-particle cumulants, c 2 {4}, show that |c 2 {4}| in p + Pb collisions is less than |c 2 {4}| measured for Pb + Pb data. For charged-particle multiplicities above 100, the c 2 {4} cumulants have negative values in p + Pb and Pb + Pb collisions, confirming the collective nature of multi-particle correlations in these collision systems. The derived magnitude of the v 2 {4} harmonic is larger in Pb + Pb collisions than in p + Pb collisions with the same multiplicity. In pp collisions, over the full range of particle multiplicities, the cumulants are positive or consistent with zero at 5.02 TeV for both p T ranges. In the 13 TeV pp data, the cumulants measured for charged particles with lower p T (0.3 < p T < 3 GeV) also have positive values over the large range of multiplicities. Therefore, these measurements of four-particle cumulants in pp collisions, based on a method that suppresses the nonflow correlations associated with event-by-event fluctuations in the number of reference particles, generally do not satisfy the requirement of being negative. This indicates that c 2 {4} cumulants obtained from the standard procedure used in this paper may still be biased by residual non-flow correlations. The c 2 {4} cumulant in 13 TeV pp collisions measured by CMS is smaller over the full range of collision multiplicities than the c 2 {4} cumulant obtained by ATLAS with the nominal event selection (EvSel_M ref ) while it is consistent with the ATLAS measurement obtained with the EvSel_N ch event selection.
Six-and eight-particle c 2 cumulants can be obtained with sufficient statistical precision only for p + Pb and Pb + Pb collisions. All derived v 2 harmonics have larger magnitudes for Pb + Pb collisions than for p + Pb collisions with the same multiplicity. For both systems, v 2 {2k} are similar for k = 2, 3 and 4 while v 2 {2, | η| > 2} is systematically larger than the second-order Fourier component calculated with cumulants of more than two-particles. Compared to the almost degenerate values of v 2 {2k}, k > 1, a larger v 2 derived from two-particle cumulants is also predicted by models assuming fluctuation-driven initial-state anisotropies in small collision systems. Interestingly, the ratios v 2 {2k}/v 2 {2k−2} for p + Pb and low-multiplicity Pb + Pb collisions are independent of the charged-particle multiplicity for N ch > 120, regardless of the p T range of particles used to calculate the cumulants and Fourier harmonics.
Higher-order cumulants, c 3 and c 4 , are measured only using two-particle cumulants with an imposed | η| > 2 gap. For pp data c 3 {2, | η| > 2} values are either negative or consistent with zero over almost the full range of N ch multiplicities, except at the highest multiplicities measured in pp collisions at 13 TeV. Therefore, the v 3 signal for pp collisions is undefined or zero within the errors. A positive c 3 signal is obtained for p + Pb and Pb + Pb data, except for the charged-particle multiplicities below ∼120. The magnitude of c 3 and the corresponding v 3 {2, | η| > 2} harmonic are comparable for Pb + Pb and p + Pb collisions when particles with 0.3 < p T < 3.0 GeV are considered, and systematically slightly larger for Pb + Pb than for p + Pb for particles with 0.5 < p T < 5.0 GeV. The fourth-order cumulants, c 4 , have positive values of c 4 {2, | η| > 2} even for the pp data, and their magnitude is comparable to that for p + Pb and Pb + Pb collisions in the overlapping range of N ch . For N ch > 120, where only the measurements for p + Pb and Pb + Pb are accessible, the c 4 cumulants measured at the same charged-particle multiplicity are larger for Pb + Pb than for p + Pb.
The ATLAS results are compared to measurements reported by CMS and ALICE. An agreement across the experiments is observed for p + Pb and Pb + Pb collisions. The comparison with the ATLAS results obtained for pp and p + Pb collisions with the two-particle correlation method shows some differences, which can be explained by the additional requirements applied in the two-particle correlation method in order to reduce the jet-jet correlations.
The comprehensive data on multi-particle cumulants presented in this paper provide insights into the origin of azimuthal angle anisotropies in small collision systems, and as such can be used to constrain the theoretical modelling of the underlying mechanism.
The following tables list individual contributions to the systematic uncertainty for the following collision systems: pp at 5.02 TeV, pp at 13 TeV, p + Pb at 5.02 TeV and Pb + Pb at 2.76 TeV. Table 3 lists contributions to the systematic uncertainty from the track selection requirements, taken as the maximum difference between the base measurement and the results obtained with loose or tight track selections in a given range of N ch , for reference particles with 0.3 < p T < 3 GeV. Table 4 includes systematic uncertainties for reference particles with 0.5 < p T < 5 GeV. Table 5 lists contributions to the systematic uncertainty due to the uncertainty in the tracking efficiency, taken as the largest difference between the base measurement and the results obtained with the efficiency varied up or down for reference particles with 0.3 < p T < 3 GeV. The maximum deviations in a given range of N ch are listed. Table 6 includes systematic uncertainties for reference particles with 0.5 < p T < 5 GeV. Table 7 lists contributions to the systematic uncertainty from the pile-up effects, taken as the maximum difference between the base measurement and the results obtained with the higher or lower pile-up in pp collisions and with the higher pile-up in the case of p + Pb collisions in a given range of N ch , for reference particles with 0.3 < p T < 3 GeV. Table 8 includes systematic uncertainties for reference particles with 0.5 < p T < 5 GeV.
Tables 9 and 10 list contributions to the systematic uncertainty related to the two running periods for p + Pb collisions at 5.02 TeV, taken as the maximum difference between the base measurement and the results obtained for the two running periods in a given range of N ch , for reference particles with 0.3 < p T < 3 GeV and 0.5 < p T < 5 GeV, respectively.