A Large-$N$ Expansion for Minimum Bias

Despite being the overwhelming majority of events produced in hadron or heavy ion collisions, minimum bias events do not enjoy a robust first-principles theoretical description as their dynamics are dominated by low-energy quantum chromodynamics. In this paper, we present a novel expansion scheme of the cross section for minimum bias events that exploits an ergodic hypothesis for particles in the events and events in an ensemble of data. We identify power counting rules and symmetries of minimum bias from which the form of the squared matrix element can be expanded in symmetric polynomials of the phase space coordinates. This expansion is entirely defined in terms of observable quantities, in contrast to models of heavy ion collisions that rely on unmeasurable quantities like the number of nucleons participating in the collision, or tunes of parton shower parameters to describe the underlying event in proton collisions. The expansion parameter that we identify from our power counting is the number of detected particles $N$ and as $N\to\infty$ the variance of the squared matrix element about its mean, constant value on phase space vanishes. With this expansion, we show that the transverse momentum distribution of particles takes a universal form that only depends on a single parameter, has a fractional dispersion relation, and agrees with data in its realm of validity. We show that the constraint of positivity of the squared matrix element requires that all azimuthal correlations vanish in the $N\to\infty$ limit at fixed center-of-mass energy, as observed in data. The approach we follow allows for a unified treatment of small and large system collective behavior, being equally applicable to describe, e.g., elliptic flow in PbPb collisions and the"ridge"in pp collisions.


Introduction
At a high energy hadron or ionized nucleus collider, the dynamics of the collision is overwhelmingly dominated by quantum chromodynamics (QCD). Because of the Landau pole and confinement of QCD at energies comparable to hadron masses, the momentum exchanged in these hadronic collisions is typically small compared to center-of-mass energies, leading to a production of a large number of observed particles with momenta not far above the QCD scale. With the high luminosities of the experiments at the Large Hadron Collider (LHC) or the Relativistic Heavy Ion Collider (RHIC), it is not currently technically feasible to record all collision events for later analysis so triggers are employed that flag an event as interesting. While passing triggers does introduce some bias in the events that are recorded, consideration of all events that pass the various triggers provides an ensemble of events that, as close as possible, probes low-energy QCD in a high-energy collision environment. Developing an understanding of these so-called minimum bias events has been an active area of research for the past several decades [1][2][3][4][5]. Despite significant research efforts, a first-principles understanding of minimum bias events is lacking precisely because it exists at the scale for which QCD becomes strongly interacting. Thus, minimum bias events are typically described by models with many parameters, like that used in modern general-purpose event simulation programs like Pythia [6] or Herwig [7]. A hydrodynamics limit is often used to model and interpret collisions of heavy ions which result in extremely high particle multiplicities [8][9][10], providing evidence for production of the quark-gluon plasma (QGP) [11][12][13][14]. However, with a model, experimental results can only be interpreted within the context of that model and without a more general framework may obscure other consistent descriptions of data. What is more, such a description relies on unphysical or unmeasurable parameters (like temperature or chemical potential), and properties of collisions are often described in terms of the unmeasurable impact parameter or overlap of the nuclei. In this paper our philosophy is to propose and explore a framework in which certain collider observables may be captured in a model independent fashion, in terms of purely physical/measurable quantities.
One instance where not being tied to a particular model(s) may be particularly useful is in the study of the relatively newly discovered phenomena of collective behavior in small systems such as pp collisions at the LHC and pA collisions at RHIC (see, e.g., Ref. [15] for a review). For instance, in order to disentangle evidence of the production of small 'droplets' of QGP from other physics that leads to particle correlations, it would be desirable to have a framework that can interpolate between different system sizes, and collision energies.
The basic idea of the model independent approach we explore is to work directly with the measured particles, and constrain the S-matrix of the minimum bias events. This philosophy aligns with a bootstrap approach to QCD which is actively being pursued for low-multiplicity processes in quantum field theory [16][17][18][19][20][21][22][23][24][25][26][27][28]. The nature and high-multiplicity of minimum bias events offers some simplifications by appealing to the principle of ergodicity: a given particle is representative of any particle in the event. Further, events with N and N + 1 particles have approximately equal descriptions, and thus we can expect to be able to expand in the multiplicity parameter N . In the absence of a first-principles understanding, we look for a detector-level effective description of minimum bias for which there exists concrete power counting and symmetries that guide the description and has parameters that can be fit to data that are then ultimately described in QCD.
At a hadron collider, there are always particles that are unobservable as they go straight down the beampipe. We show that integrating out these particles from the description of events has the effect of smearing the phase space of the observed particles in the detector in a universal fashion, reproducing the flat-in-(pseudo)rapidity distribution argued by Feynman [29]. The expansion of the matrix element of the observed particles then provides fluctuations about this flat distribution.
Importantly, the approach is applicable to observables that are binned in N . That is, it does not provide a first-principles description of fluctuations in multiplicity. Thus the kind of statements that can be made are, for example, how normalized distributions, binned in N , may change as a function of N . We also assume a fixed collider energy, Q, when taking the large N limit, in the sense that we do not consider a 't Hooft coupling-like limit of taking Q, N → ∞ at fixed Q/N ; statements about the scaling with Q, however, are obtained.
Our aim is to explore concrete, quantitative predictions that must follow from consistency of the above-described approach to minimum bias. We first define the power counting scheme for minimum bias events and identify its symmetries exclusively in terms of measurable quantities, just like the starting point of any effective field theory (EFT). Through this procedure, the relevant expansion parameter we identify is novel and unlike familiar EFTs. We formally take the number of observed particles N 1, which is similar to a hydrodynamics limit. From the general expression for the cross section for production of N final state particles in minimum bias events, we expand the squared matrix element in symmetric polynomials of the phase space coordinates, ordered in powers of 1/N . Among predictions that we validate in collider data include: • In the N → ∞ limit the symmetries of minimum bias events and the central limit theorem require that the squared matrix element is exclusively a function of the squared total energy of the observed final state particles.
• Inspired by the soft and collinear singularities of perturbative QCD, we show that the distribution of particle transverse momentum is universal and depends on a single parameter in the large-N limit. Further, the transverse momentum distribution implies that particles in minimum bias events have a fractional dispersion relation ω ∝ k 2/3 ⊥ .
• By positivity of the squared matrix element, we demonstrate that all pairwise particle azimuthal correlations necessarily vanish as N → ∞ at fixed center-of-mass collision energy.
• Long-distance pairwise particle pseudorapidity correlations are a consequence of factorization of the squared matrix element as N → ∞.
• Non-trivial pairwise particle azimuthal correlations are highly suppressed in minimum bias events in e + e − collisions because of the full Lorentz invariance of the observed final state particles.
These results and others suggest that development of an effective theory for minimum bias would describe a wide range of phenomena and could be matched to perturbative QCD or hydrodynamics. This paper is organized as follows. In Sec. 2, we establish the power counting for minimum bias in identical hadron or heavy ion collisions and the corresponding symmetries that those events enjoy. We show how to construct the squared matrix element in terms of symmetric polynomials of the phase space coordinates. In Sec. 3, we use this effective form of the minimum bias cross section to interpret collider data. We show that this large-N expansion can provide quantitative and precise predictions that agree with data in its realm of applicability. In Sec. 4, we describe the power counting and symmetries for different collider environments. Because minimum bias events in e + e − collisions enjoy extensive symmetry, we show that these symmetries strongly suppress pairwise particle azimuthal correlations in the large-N limit. In Sec. 5, we summarize our findings and discuss future avenues to further develop an effective theory of minimum bias.

Power Counting and Symmetries of pp/AA Collision Min-Bias
We first establish the power counting and symmetries we employ to describe minimum bias events. This will correspondingly precisely define what we theoretically mean by "minimum bias", which is distinct from the experimental definition. We will focus on minimum bias events ⌘ ! 1 < l a t e x i t s h a 1 _ b a s e 6 4 = " 5 s D 6 P 2 6 w d 6 B u z 0 3 2 t C E C n 4 U n I S k = " > A A A B 9 X i c b V B N S 8 N A E N 3 U r 1 q / q h 6 9 L B b B U 0 l E 0 W P R i 8 c K 9 g O a W D b b T b t 0 s x t 2 J 0 o I / R 9 e P C j i 1 f / i z X / j t s 1 B W x 8 M P N 6 b Y W Z e m A h u w H W / n d L K 6 t r 6 R n m z s r W 9 s 7 t X 3 T 9 o G 5 V q y l p U C a W 7 I T F M c M l a w E G w b q I Z i U P B O u H 4 Z u p 3 H p k 2 X M l 7 y B I W x G Q o e c Q p A S s 9 + A w I 9 k H 5 X E a Q 9 a s 1 t + 7 O g J e J V 5 A a K t D s V 7 / 8 g a J p z C R Q Q Y z p e W 4 C Q U 4 0 c C r Y p O K n h i W E j s m Q 9 S y V J G Y m y G d X T / C J V Q Y 4 U t q W B D x T f 0 / k J D Y m i 0 P b G R M Y m U V v K v 7 n 9 V K I r o K c y y Q F J u l 8 U Z Q K D A p P I 8 A D r h k F k V l C q O b 2 V k x H R B M K N q i K D c F b f H m Z t M / q 3 n n 9 4 u 6 8 1 r g u 4 i i j I 3 S M T p G H L l E D 3 a I m a i G K N H p G r + j N e X J e n H f n Y 9 5 a c o q Z Q / Q H z u c P h B G S i Q = = < / l a t e x i t > ⌘ ! 1 < l a t e x i t s h a 1 _ b a s e 6 4 = " 6 J 8 4 w c S N X D h Z F N j e T q d + 1 l 3 i n Y g = " / Y 3 C p v V 3 Z 2 9 / a r 9 s F h R 8 e p o q x N Y x G r X k A 0 E 1 y y N n A Q r J c o R q J A s G 4 w u Z 3 5 3 U e m N I / l A 2 Q J 8 y M y k j z k l I C R B n b V Y 0 C w B / G 5 x 2 U I 2 c C u O X V n D r x K 3 I L U U I H W w P 7 y h j F N I y a B C q J 1 3 3 U S 8 H O i g F P B p h U v 1 S w h d E J G r G + o J B H T f j 4 / f I p P j T L E Y a x M S c B z 9 f d E T i K t s y g w n R G B s V 7 2 Z u J / X j + F 8 N r P u U x S Y J I u F o W p w B D j W Q p 4 y B W j I D J D C F X c 3 I r p m C h C w W R V M S G 4 y y + v k s 5 F 3 W 3 U L + 8 b t e Z N E U c Z H a M T d I Z c d I W a 6 A 6 1 U B t R l K J n 9 I r e r C f r x X q 3 P h a t J a u Y O U J / Y H 3 + A G k K k v E = < / l a t e x i t > ⌘ = ⌘ max < l a t e x i t s h a 1 _ b a s e 6 4 = " P Z 4 M Q 0 m p s x v l z c r W 9 s 7 u n l 3 d 7 6 g 4 l Z i 0 c c x i 2 Q u R I o w K 0 t Z U M 9 J L J E E 8 Z K Q b j q 9 z v 3 t P p K K x u N O T h P g c D Q W N K E b a S I F d 9 Y h G 8 D L v Q e Z x 9 D g N 7 J p T d 2 a A y 8 Q t S A 0 U a A X 2 l z e I c c q J 0 J g h p f q u k 2 g / Q 1 J T z M i 0 4 q W K J A i P 0 Z D 0 D R W I E + V n s 9 O n 8 N g o A x j F 0 p T Q c K b + 3 s g Q V 2 r C Q z P J k R 6 p R S 8 X / / P 6 q Y 4 u / I y K J N V E 4 P l D U c q g j m G e A < l a t e x i t s h a 1 _ b a s e 6 4 = " r T 9 C U M d k O A e 0 i + N P V + h q w j s 2 c y g = " > A A A B + 3 i c b V D L S g M x F M 3 U V 6 2 v s S 7 d B I v g x j I j F d 0 I R T c u K 9 g H d I Y h k 2 b a 0 C Q z J B l p G e Z X 3 L h Q x K 0 / 4 s 6 / M W 1 n o a 0 H L v d w z r 3 k 5 o Q J o 0 o 7 z r d V W l v f 2 N w q b 1 d 2 d v f 2 D + z D a k f F q c S k j W M W y 1 6 I F G F U k L a m m p F e I g n i I S P d c H w 3 8 7 t P R C o a i 0 c 9 T Y j P 0 V D Q i G K k j R T Y V Y 9 o B G / O Z y 3 I P I 4 m e W D X n L o z B 1 w l b k F q o E A r s L + 8 Q Y x T T o T G D C n V d 5 1 E + x m S m m J G 8 o q X K p I g P E Z D 0 j d U I E 6 U n 8 1 v z + G p U Q Y w i q U p o e F c / b 2 R I a 7 U l I d m k i M 9 U s v e T P z P 6 6 c 6 u v Y z K p J U E 4 E X D 0 U p g z q G s y D g g E q C N Z s a g r C k 5 l a I R 0 g i r E 1 c F R O C u / z l V d K 5 q L u N + u V D o 9 a 8 L e I o g 2 N w A s 6 A C 6 5 A E 9 y D F m g D D C b g G b y C N y u 3 X q x 3 6 2 M x W r K K n S P w B 9 b n D 3 Y l l B Y = < / l a t e x i t > p ? ⇠ Q/N < l a t e x i t s h a 1 _ b a s e 6 4 = " k r 5 1 k G B 5 R f R N / 0 t R 2 0 9 H u 8 y R E r E = " > A A A B + X i c b V D L S g M x F M 3 U V 6 2 v U Z d u g k V w V W e k o s u i G 1 f S g n 1 A Z x g y a a Y N T T I h y R T K 0 D 9 x 4 0 I R t / 6 J O / / G t J 2 F V g 9 c O J x z L / f e E 0 t G t f G 8 L 6 e 0 t r 6 x u V X e r u z s 7 u 0 f u I d H H Z 1 m C p M 2 T l m q e j H S h F F B 2 o Y a R n p S E c R j R r r x + G 7 u d y d E a Z q K R z O V J O R o K G h C M T J W i l x X R o E k S s J A U w 5 b F w + R W / V q 3 g L w L / E L U g U F m p H 7 G Q x S n H E i D G Z I 6 7 7 v S R P m S B m K G Z l V g k w T i f A Y D U n f U o E 4 0 W G + u H w G z 6 w y g E m q b A k D F + r P i R x x r a c 8 t p 0 c m Z F e 9 e b i f 1 4 / M 8 l N m F M h M 0 M E X i 5 K M g Z N C u c x w A F V B B s 2 t Q R h R e 2 t E I + Q Q t j Y s C o 2 B H / 1 5 b + k c 1 n z 6 7 W r V r 3 a u C 3 i K I M T c A r O g Q + u Q Q P c g y Z o A w w m 4 A m 8 g F c n d 5 6 d N + d 9 2 V p y i p l j 8 A v O x z d 4 / Z L o < / l a t e x i t > Figure 1: Schematic illustration of detector (cylinder), its pseudorapidity range η max , the beam regions around η = ±∞, and the characteristic transverse momentum of detected particles.
at a hadron collider, in particular, proton-proton (pp) or identical heavy ion (AA) collisions. We assume that the lab frame is also the center-of-mass frame of the collisions. The center-ofmass collision energy Q is assumed to be much larger than the masses of the initial, colliding particles. Throughout this paper, we assume that Q is an arbitrary, but fixed, energy, but will occasionally comment on the dependence of distributions of observables as a function of Q. Additionally, we assume that the experimental detector cannot measure all particles and that there is an unmeasureable beam region. Concretely, we assume that there is a maximum pseudorapidity η max of the detector and particles with pseudorapidity |η| > η max are lost down the beampipe. Finally, we assume that our detector can only measure particle momenta, but has perfect angular and energy resolution.
With these assumptions, we can establish power counting that defines minimum bias events and the corresponding symmetries of our collision experiment and of the detected final state particles. The power counting that we take is as follows.
1. The beam is a small angular region outside the detection apparatus and we restrict our description of the event to far from the beam region, where detected particle pseudorapidity satisfies |η| ∼ 1 η max .
2. We assume that the mass of the particles is irrelevant and so detected particle transverse momentum p ⊥ is parametrically larger than the QCD scale or pion mass, p ⊥ m π .
3. The momentum lost down the beam region is an order-1 fraction of the center-of-mass energy Q.
4. The number of detected particles N for which their pseudorapidity |η| η max is large: 5. We assume that the mean transverse momentum of the detected particles is representative of all particles' momenta and so the mean and the root mean square momenta are comparable: p ⊥ ∼ p 2 ⊥ . We will see below how these power counting assumptions then imply that our expansion parameter is 1/N ∼ p ⊥ /Q 1. 1 Additionally, these power counting assumptions imply that the minimum bias events we consider satisfy something like an ergodic hypothesis. That is, we assume that any individual particle in an event is representative of all particles in an event.
Thus, averages over particles in an event are equivalent to averages of events over an ensemble. This ergodic hypothesis will have important implications for the structure of the squared matrix element in the N → ∞ limit. An illustration of the physical configuration established by these power counting assumptions is provided in Fig. 1.
Hadron collision events that satisfy these power counting requirements then enjoy the following symmetries: Most of these symmetries should be apparent as a consequence of the familiar cylindrical shape of particle detectors. Because we assume only particle momenta are measured, no distinguishing information of particles (like electric charge) is collected, and hence there is complete permutation symmetry. By our power counting, we assume that the beam region is defined by η max 1. Taking the formal limit η max → ∞, a translation in η by a finite amount ∆η can never move particles out of the detected region into the beam region, or vice-versa.
We then define minimum bias events in hadron collisions as those that satisfy the power counting assumptions, and therefore inherit the symmetries. Experimentally, minimum bias events are typically defined as those that pass a minimal trigger threshold for, say, activity in a forward calorimeter. As the name suggests, passing a trigger does bias the event selection somewhat, but this bias can potentially be reduced by performing other cuts on these events. For example, to force an event into the regime described by our power counting, one could require that a fixed fraction of particles have a transverse momentum within some range about the mean transverse momentum. Or, one could require that an event does not have any identified jets with transverse momentum larger than some fraction of the center-of-mass energy. We will return to jets below, and explicitly demonstrate that the existence of hard jets in the final state violates our power counting.
Because of the high luminosity of modern hadron colliders, truly zero bias events can collected by exploiting pile-up, or secondary hadron collisions per bunch crossing. A hard hadron collision that produces jets or high-energy leptons, for example, would pass the triggers, but then secondary hadron collision events in the bunch crossing would have no requirements placed on them, and still be measured. The challenge with extracting these zero bias events is then pushed to the ability to distinguish the point of collision where different particles were produced. Any number of our power counting assumptions can be relaxed to provide a more realistic description of these events; however, we will find that even these strongly constraining assumptions will be able to explain and understand a wide breadth of data in various collider environments.
(Q/p ⊥cut ) 1 2 in this scaling relation, with p ⊥cut the experimental cut on transverse momentum. This factor is fixed over an ensemble of events and we absorb it into '∼' here and in the following section when evaluating the scaling in 1/N of terms in the expansion.

Effective Form of the Cross Section
We will use the power counting and symmetries to provide an effective description of the scattering cross section for these minimum bias events exclusively in terms of properties of the measured particles. To derive this description, we start from the completely general formula for the cross section, considering 2 → N + N B 1 + N B 2 scattering events, where N B 1 + N B 2 particles are beam remnants; i.e., N B 1 particles have pseudorapidity η > η max , N B 2 particles have −η > η max , and N particles have |η| < η max . The cross section σ for this scattering can be expressed in generality as In writing this expression, we have ignored overall factors of the center-of-mass energy Q and powers of 2π. In the first line, we implicitly write the cross section as the integral of the squared matrix element over N +N B 1 +N B 2 -body phase space, and then in the remaining lines, expand out the massless Lorentz-invariant integration measure and momentum-conserving δ-functions. k + i and k − i are the plus and minus components of particle i's lightcone momentum with respect to the beam axis (the z axis). The momentum-conserving δ-functions fix the lab frame to be the center-of-mass frame of the collisions. Now, with the assumption that η max 1, we will formally take η max → ∞, so the beam remnants live at ±∞ pseudorapidity. With that assumption, their transverse momenta are 0 and one component of their lightcone momenta is also 0, depending on whether the particles are in beam 1 or beam 2. These assumptions simplify the momentum conserving δ-functions and the cross section becomes It may seem like the assumption that the beam remnants carry no net transverse momentum is too constraining. However, assuming that the number of detected particles N is large and the correlations between the detected particles are relatively weak, the net transverse momentum of the detected particles is necessarily small. The detected particles in an individual event can be imagined to randomly walk in the transverse momentum plane and the average net transverse momentum in an ensemble of events will be 0. As a random walk, the standard deviation of the net transverse momentum in this ensemble of events scales like 1/ √ N at fixed collision energy Q. Then, to leading approximation in the large-N limit, we assume that the net transverse momentum of the detected particles is 0. It would be interesting to determine the consequences of relaxing this assumption, but we leave that to future work. Because we do not measure the particles in the beam region, we want to integrate over them. With the assumption of permutation symmetry of the particles and the fact that we assume that all beam remnants travel in the exact same direction down either pipe, the only quantity that we can be sensitive to in experiment is their total momentum. That is, the squared matrix element can only depend on a function of the total plus or minus lightcone momentum that goes down the beampipes: Here, f is some function of the total lightcone momentum that goes in either beam direction.
In the final line, we have expressed squared matrix element with dependence on the measured particles 1, . . . , N and dependence on the total lightcone momentum in the beam regions is left implicit. We will return to the explicit form of this matrix element later. The η → −η symmetry of the event requires that the function f is symmetric in its arguments: f (x, y) = f (y, x). Further, by the η → η +∆η translation symmetry, the argument of the function f must be the product of the total plus and minus lightcone momenta, because lightcone momentum components transform homogeneously under boosts along the z axis. Then, the squared matrix element can be expressed as By momentum conservation and choosing the collision frame to be the center-of-mass frame, we also know that the total lightcone momentum lost down the beam regions is related to the lightcone momentum of the detected particles. In particular, Thus, we can equivalently express the function f exclusively in terms of directly measurable quantities. We define so that the argument of the function f is just the product k + k − , by the same reasoning above.
That is, the matrix element can be expressed as for some function f (k + k − ) and our power counting assumes that the momentum lost down the beam regions (or, the z-axis boost of the detected particles), is of a comparable size to the center-of-mass energy, Q 2 . Then, with the power counting and symmetries enforced on the form of the matrix element, in the cross section we can integrate over the momentum lost down the beams. The cross section can then be expressed as In doing the integrals over the unobservable beam remnant particles, we have ignored overall factors of the center-of-mass energy Q, and additional dependence on the product k + k − has been absorbed in the definition of f (k + k − ). Because f (k + k − ) is a physical squared matrix element, we make the reasonable assumption that it is finite and analytic on its domain. It can therefore be expanded in an appropriate basis of orthonormal polynomials, and this sum can be truncated for approximation to fit data. In analysis that we will present later, it will be useful to know the volume of this N -body phase space smeared by a boost along the beam axis. For a center-of-mass energy Q, the volume of N -body phase space is Setting the function f (k + k − ) and the squared matrix element |M(1, 2, . . . , N )| 2 to unity, the volume of this smeared phase space is therefore We also note that the topology of the smeared phase space is that of a (3N − 2)-ball, found by integrating over two of the dimensions of the N -body phase space manifold [30].

Expansion of the Squared Matrix Element
Having established the form of the cross section for minimum bias events expressed exclusively in terms of observable particle momenta, we now use the power counting and symmetries to expand the squared matrix element |M(1, 2, . . . , N )| 2 . The natural expansion parameter in which to do this according to our power counting is the number of detected particles N 1. Namely, we would want to determine the momentum dependence of the squared matrix element systematically at every power of 1/N , with undetermined parameters that can be fit to data. We will do this in a few steps. First, we will present an expansion of the squared matrix element in inverse powers of the center-of-mass energy Q, for which terms can be established order-by-order through use of momentum conservation. From this expansion, we can identify the scaling with N for each of the terms and reorganize the expansion according to our power counting rules.
We first note that, with this construction, the leading approximation of the squared matrix element is that it is constant. As we can fix overall normalization later, the leading approximation of squared matrix element in the 1/Q expansion is A linearly-independent set of momentum-dependent terms at higher powers in 1/Q can be established using momentum conservation. The transverse and longitudinal momentum conservation equations of the detected particles are: On the right, we have expanded the equalities in terms of permutation-symmetric polynomials of the phase space coordinates. The symmetries of minimum bias events forbid terms at order 1/Q (and actually any odd power of this) and these fundamental momentum conservation equations define a linearly-independent set of terms at order 1/Q 2 . We can then write where c 1 is some dimensionless, numerical coefficient. We will discuss the potential dependence of c 1 on the number of particles N later. To construct linear relationships between terms at higher mass dimension, we can multiply the momentum conservation equations by symmetric polynomials of the minimum bias phase space coordinates. For example, two identities that can be used to establish terms at order 1/Q 4 are: Again, on the right, we have expanded the identities in terms of independent symmetric polynomials. Then, through the first few orders, the squared matrix element can be expanded as: The c (4) i are dimensionless numerical coefficients. Linear relationships from conservation of momentum have already been accounted for and (hyperbolic) trigonometric identities have been used. Note also that any terms that exclusively depend on the product k + k − can be absorbed into the function f (k + k − ) and are not included.
The numerical coefficients c in general depend on the number of detected particles N . For the most part, we will be agnostic as to what their precise scaling with N is, but we will make the following weak, but constraining, assumption. We assume that, in the N → ∞ limit, the dependence of the coefficients c (n) i on N is such that the entire corresponding term in the squared matrix element remains finite. For example, consider the term at O(Q −2 ). With our power counting assumption that p ⊥ ∼ Q/N and ergodicity, the kinematic dependence of this term scales like 1 because there are N terms in the sum and each term scales like 1/N 2 . Thus, the coefficient c 1 can scale at worst linear in N as N → ∞ and the whole term will still be finite in the squared matrix element. 2 Further, the assumption of ergodicity implies that in the limit that N → ∞, the entire matrix element becomes a constant, independent of the event's dynamics. Our ergodic assumption implies that the symmetric sums over terms that appear in the squared matrix element reduce to the corresponding mean values for N → ∞. For example, consider the term at O(Q −2 ) again. The sum of the squared transverse momentum of all particles in the event returns the mean square transverse momentum p 2 ⊥ times N , as N → ∞. Because the sum consists of N terms, the variance of the sum is also of order N and so the whole expression takes the form In the strict N → ∞ limit, the variance of the sum vanishes, reducing to a constant value on all of phase space. Every term in the squared matrix element reduces to its mean as N → ∞ by ergodicity, and so the whole squared matrix element itself becomes a constant on the N -body phase space as N → ∞. We will exploit this limit of the matrix element in our predictions of the following section. While constructive, this 1/Q expansion isn't exactly the expansion that is natural with our power counting. Terms at a fixed order in 1/Q do not in general have a homogeneous scaling with 1/N . For example, at O(Q −4 ), the term 19) as N → ∞. The particular form of higher-order terms in 1/N is subtle to determine because, in general, terms that arise in the 1/Q expansion don't have a homogeneous scaling in 1/N . For example, assuming that k + k − is homogeneous in powers of N with k + k − ∼ N 0 , then from Eq. (2.12) the term with hyperbolic cosine is not homogeneous in N : This follows because of the assumption that η i ∼ 1 and so cosh(η i − η j ) ∼ 1, the transverse momentum p ⊥i ∼ 1/N and that the sum consists of N 2 − N terms. We do not attempt to determine the general form of the 1/N expansion of the squared matrix element here, although a systematic approach could be pursued along the lines of Refs. [31][32][33][34][35][36][37]. However, what we will do in Sec. 3, when using the power counting and symmetries to understand collider data, 2 In matching this expansion to a short-distance description from QCD, the coefficients c (n) i can generically also depend on the ratio of collision energy Q to the QCD scale ΛQCD or pion mass mπ. Explicitly doing the matching or resumming large logarithms of this ratio requires an effective field theory for minimum bias, which we do not construct here and leave to future work. is identify the 1/N expansion for particular subsets of terms in the squared matrix element that are relevant for the specific measurements of interest that we consider in this work.
One important point to make at this stage is that we are considering fixed, but large, number of particles N . That is, our expansion can make predictions for observables that survive in the N → ∞ limit or for observables that are conditioned on N . We will see a number of examples of such observables in Sec. 3. However, with our present formulation of the cross section for minimum bias, we cannot make predictions for which the relative probability of different numbers of detected particles is important. Perhaps the simplest such observable is just the multiplicity N of detected particles itself. We leave an understanding of particle multiplicity within this framework to future work.

Where are Jets?
Jets, while the most ubiquitous phenomena of high-energy QCD, are not described within the framework of our minimum bias expansion of the cross section. From the perspective of our goals with this expansion, this is a good thing, as it implies that the expansion focuses on the physics of interest, namely the low-energy dynamics of QCD. Specifically, the presence of high-energy jets in the detected final state particles violates the power counting assumption that the mean particle transverse momentum, p ⊥ , is representative of all particles' transverse momentum. Jets are collimated streams of particles which carry an order-1 fraction of the jet's energy. Very low energy particles fill in the regions at wide angles from the jets. The distribution of particle transverse momentum then has two dominant populations corresponding to high energy or very low energy particles. Of course, for any distribution of N particles with total energy Q, the mean transverse momentum p ⊥ ∼ Q/N . However, when jets are present in the final state, the root mean square transverse momentum p 2 ⊥ will be dominated by the high energy particles of the jets, and so we find the hierarchy that p 2 ⊥ p ⊥ ∼ Q/N . This then means that jets are not described by any low-order truncation of either the 1/Q or 1/N expansion of the squared matrix element.
Of course, this is not surprising as jets arise because of the approximate scale invariance of QCD at high energies, in addition to the small coupling α s . Even approximate scale invariance is not a symmetry of our definition of minimum bias, and so is not exploited in the expansion of the squared matrix element. Further, the smallness of α s for controlling jet dynamics means that jets can be described by a perturbative effective field theory, known as soft-collinear effective theory [38][39][40][41]. For minimum bias, because the only relevant energy scale is the mean transverse momentum p ⊥ ∼ Q/N Q, we cannot assume that α s is small so as to formulate a perturbative effective field theory.
Nevertheless, even in the presence of collimated jets, the low energy particles that fill the detector away from the jets should exhibit many of the properties we have already discussed. Namely, as long as the number N of low energy particles not in the jets is large, we still expect their net transverse momentum to be approximately 0, up to the relative 1/ √ N standard deviation of a random walk. However, jets in the final state introduce new, relevant directions that spontaneously break the symmetries of minimum bias as well as introduce correlations that would manifest themselves in the form of the squared matrix element. As discussed, such correlations would not be captured to low orders in our expansion of the cross section.

Interpretation of Data
In this section, we exploit this expansion of the cross section for minimum bias events to understand and re-interpret collider data from this perspective. Because the expansion as we have developed it thus far can describe any identical hadron or nucleus scattering, we apply it to understand data from both pp and heavy ion collisions. Particularly in the case of heavy ion collisions, experimental data are very often interpreted or expressed in terms of strictly unobservable quantities as established in some model of the collision, like the centrality or the number of participating nucleons. By contrast, our expansion is expressed exclusively in terms of directly observable quantities, like the total number of detected particles or the total observed final state energy.

Pseudorapidity Distributions
The first observable that we consider is the pseudorapidity distribution of observed final state particles. This can be calculated directly from the form of the minimum bias cross section we established in Eq. (2.8). From the permutation symmetry of the particles, every particle has the same pseudorapidity distribution p(η), so we can just fix the particle of interest to be particle 1. Then, we have To proceed, we will make a number of simplifying assumptions. First, we will work in the N → ∞ limit in which the squared matrix element |M(1, 2, . . . , N )| 2 reduces to a constant on phase space, as discussed earlier. Because this constant is just an overall scaling, we can take it to be 1, for simplicity. With this assumption, we can then determine the pseudorapidity distribution on flat phase space p flat (η), in the large-N limit. We have To get this expression, we used the large-N limit of the volume of phase space from Eq. (2.9) and ignore overall factors. The normalized probability distribution of particle pseudorapidity on flat phase space is then In the large-N limit, this can also be directly derived from assuming that all particles have a distribution flat in cos θ, the polar angle from the beam. Note that this large-N limit ignores momentum conservation in transforming to the exponential integrand. We can then use this result to determine the observed pseudorapidity distribution p(η) smeared against the function f (k + k − ). We have In the final line, we have set xQ 2 = k + k − and integrated over the pseudorapidity of the system of final state particles in the lab frame. Written in this way, f (x) is itself a probability distribution whose normalization is inherited from its expression in terms of k + and k − : We also note that because f (x) ≥ 0 on x ∈ [0, 1], the pseudorapidity distribution p(η) monotonically decreases as |η| increases away from η = 0. The integral in the final line of Eq. (3.4) is also dominated by the region for 0 ≤ x 1/ cosh(2η).
To go further, we need an explicit form for the function f (x). As we noted earlier, f (x) should be analytic on x ∈ [0, 1] because it is a physical squared matrix element. Also, because of the collinear singularity of QCD, we expect a very flat distribution of pseudorapidity over a wide range. From our expression for the pseudorapidity distribution p(η) in Eq. (3.4), we note that if f (x) = δ(x), then p(η) is flat, but not normalizable. So, combining these observations suggests that f (x) should be highly-peaked at x = 0 and analytic. This motivates the following form: where n 1 ensures strong peaking at x = 0, H n is the harmonic number and γ E is the Euler-Mascheroni constant. 3 We discuss the independence of our results on the precise form of this function shortly. 3 We have currently described the function f (k + k − ) as a function that corresponds to smearing the flat phase space distribution. However, it can equivalently be interpreted as a squared matrix element |M| 2 for the N detected particles using the relationship from Eq. (2.12). With this interpretation, the squared matrix element is to leading order in 1/N in the exponent. Of course, regardless of the interpretation, we find the same predictions for the desired measured quantities.

Pseudorapidity Distribution
���+����� ���� ������� ����� ����� ���� ����� ����� To determine the value n, we must use data. Pseudorapidity distributions have been measured extensively in the ATLAS [42][43][44][45], CMS [46][47][48][49][50][51][52] and ALICE [53][54][55][56][57][58][59][60][61][62][63] experiments at the LHC, in both pp and heavy ion collisions. From those data, we can determine the parameter n by noting the following. The mean value of x from Eq. (3.6) is roughly 1/n, which is the region that dominates the integral. We call η 1/2 where the pseudorapidity distribution is approximately half of its value at η = 0 which occurs when 2x cosh(2η 1/2 ) ∼ 1; or when n = 2 cosh(2η 1/2 ) . (3.8) From the 8 TeV LHC pp experiment data that follows, for instance, the value of η 1/2 ∼ 6, and so n ∼ 1.6 × 10 5 . The parameter n has implicit Q dependence, at least by the connection to maximum value of η. We emphasize that in what follows that the fit above to the tail around |η| η 1/2 is not important for the observables that are captured in our approach; that is, any step-like function with a cutoff around η 1/2 would suffice. Kinematic observables binned in N and at fixed Q in the region of validity of this effective framework are insensitive as n → ∞; we will see this explicitly below with the single particle p ⊥ distribution, where we also give a quantitative estimate for the validity condition. In order to correctly capture scalings with Q, the value of n is important in that it captures the value of η 1/2 , which scales with Q. Any other step-like function, however, would capture the same scaling and, at least at large enough Q, this scaling should be independent of the precise form of the tail.
The above established value of n can then be used to compare our predicted pseudorapidity distribution to data. This is done in Fig. 2 where we compare our prediction to data from the CMS and TOTEM experiments [49] of the charged particle number density as a function of pseudorapidity, dN/dη, in 8 TeV pp collisions. Our prediction for the pseudorapidity distribution is just a normalized probability distribution, so we multiply by a factor to fit data. As mentioned earlier, our present formulation of the expansion of the minimum bias cross section does not enable us to predict the multiplicity distribution, so we can't predict the overall normalization here. In general, we find good agreement between our prediction and data, with a noticeable lack of a dip in our prediction near η = 0. Hadrons are of course massive particles and there is a distinction between their rapidity and pseudorapidity which is manifest as the dip in the pseudorapidity distribution. We assumed that the transverse momentum of particles is much larger than their mass, so we ignore the pseudo/rapidity distinction. However, the data include particles with transverse momentum above 40 MeV which includes hadrons with transverse momentum comparable to their mass.

Transverse Momentum Distributions
We now turn to understanding transverse momentum distributions in pp collisions. The setup for this analysis will be the same as that for pseudorapidity. We take the squared matrix element for the detected particles |M| 2 = 1 and take the large-N limit of phase space. The first steps are therefore very similar to that for pseudorapidity, so we won't repeat them here. From Eq. (3.2), the flat phase space distribution of the transverse momentum is where K 0 (z) is a modified Bessel function. The unit normalized distribution is With this result, the distribution smeared with the function f (k + k − ) is, in general, Unlike the pseudorapidity distribution, the transverse momentum distribution depends explicitly on the number of detected particles N . This number fluctuates event-by-event and we do not predict the multiplicity distribution. So, instead, we will re-write this distribution in terms of the mean transverse momentum, which we can calculate and is unique over an ensemble of collision events. This mean is Substituting this expression for N , the transverse momentum distribution is then With f (x) from Eq. (3.6), the mean transverse momentum is (3.13) in the n → ∞ limit. The p ⊥ distribution with this form of f (x) is (3.14) Data of the transverse momentum distribution are often displayed as a number density per phase space volume, and so what we will plot is actually Note that once we determine f (x) from pseudorapidity data, the prediction of the transverse momentum distribution only depends on a single parameter, p ⊥ . Additionally, a cut on the maximum pseudorapidity of particles that contribute to this distribution can be incorporated, as such a cut is always imposed in data. However, any such cut that is relevant experimentally has an exceedingly small effect on the transverse momentum distribution, so we will not include it in what follows.
The reason for effective independence on a pseudorapidity cut is as follows. In the large n limit, the transverse momentum distribution of Eq. (3.14) is itself independent of n. The Bessel function has an asymptotic form of for z 1. With this approximation, the distribution is ignoring overall constant factors. Now, with n 1, we can saddle-point approximate the exponential. The value of x for which the exponent factor is minimized is Just setting x in the integrand equal to this minimum value, taking n → ∞ and ignoring non-exponential factors, we have As the value of n in turn determines the maximum value of pseudorapidity according to Eq. (3.8), as long as n is large enough, any dependence on a pseudorapidity cut is eliminated. This probability distribution is like the Boltzmann factor for the "gas" of N detected particles. Hence, they have a dispersion relation of ω ∝ k 2/3 ⊥ . Fractional dispersion relations are very strange, but can arise from integrating out a gapless subsystem of a larger system [64]. This is essentially what we did here: the beam remnants were gapless, but we exclusively wanted an effective description of the detected particles. Integrating out gapless modes introduces non-locality in the effective theory, which is manifested as a non-analytic dispersion relation. In particular, note that the non-analyticity is not present in the flat phase space distribution of Eq. (3.10). The asymptotic form of the Bessel function implies that ignoring overall non-exponential factors. No gapless modes are integrated out on flat phase space, hence the dispersion relation ω ∝ k ⊥ is analytic. The particular form of the dispersion relation implied by Eq. (3.19) may provide a hint towards the construction of an effective field theory for minimum bias based on spontaneously broken symmetry breaking patterns, but we leave that to future work. We can also verify the consistency of our transverse momentum prediction, within the framework of our assumed power counting. The second moment of the transverse momentum is where the final equality is the large n limit of the second moment of the distribution of Eq. (3.14). Then, the root mean square and expectation value of the transverse momentum are related by satisfying our power counting of p 2 ⊥ ∼ p ⊥ and demonstrating consistency of our prediction.
With this prediction for the transverse momentum distribution, we can compare it to collider data. Fig. 3 shows the transverse momentum distribution of charged particles in minimum bias events in √ s = 7 TeV pp collisions at CMS from Ref. [47]. A pseudorapidity cut of |η| < 2.4 is also imposed, but, as mentioned earlier, that will not matter for our prediction. This figure also plots the asymptotic form of our prediction from Eq. (3.19) where we have set the parameter p ⊥ = 0.65 GeV, as well as the transverse momentum distribution on flat N -body phase space. Our prediction is in remarkably good agreement with the data over the plotted range of transverse momentum. We want to emphasize that once the form of the smearing function f (x) is determined from pseudorapidity data, our prediction for the transverse momentum distribution depends on only one parameter, the mean value of the transverse momentum. While we are unable to directly predict the multiplicity distribution from this framework, we can nevertheless demonstrate consistency between parameters in the expansion of the cross section with measured values of the mean multiplicity. From Eq. (3.13), the expected value of N with our choice for the form of the function f (x) is While this is a very simple and crude prediction, it is nevertheless in the same order-ofmagnitude as the number of observed charged particles from Fig. 2, for example. In that figure, the number density of charged particles with transverse momentum greater than 40 MeV is roughly 6.5 per unit pseudorapidity for |η| 5.5. So, there are roughly 72 charged particles in each event. Our prediction is about a factor of 3 smaller, which is likely accounted for by the transverse momentum cut. 40 MeV is less than the mass of the pion, and so violates an assumption of our power counting. Increasing this cut would correspondingly decrease the number of detected particles, while not affecting our fit value for p ⊥ .

Transverse Momentum Distribution ��� ���� ������� ����� ����� ���� ����� �����
Further, the expression for multiplicity N from Eq. (3.23) implies a non-trivial dependence on the center-of-mass collision energy. First, note that the value of n in the form of the function f (x) is related to the collision energy Q through Eq. (3.8): Here, p ⊥cut is the experimental lower bound on detected particle transverse momentum. Then, as long as the dependence of the mean transverse momentum p ⊥ on the collision energy Q is relatively weak, the multiplicity scales with a fractional power of Q: This particular fractional power scaling is a prediction of the Fermi-Landau model [65][66][67][68]. In the context of our analysis here, we note that it is a consequence of our particular choice of the function f (x) in the smeared cross section. Additionally, the inclusion of a squared matrix element with non-trivial dependence on the detected particles will affect this scaling. Data prefer a slightly smaller power-law scaling of the multiplicity, e.g., Ref. [61] in which a power law of s 0.11 fits the charged particle multiplicity over decades of collision energies. Nevertheless, this simple result within the context of our large-N expansion suggests that an appropriate squared matrix element could fit the data. We leave a more detailed analysis of the collision energy dependence of the multiplicity to future work. Data of transverse momentum distributions in minimum bias are often compared to Tsallis distributions [69][70][71] that assume there are fluctuations in the N particle final state that are quantified by a non-extensive form of entropy. This is an intriguing interpretation and the success of such models may point to fundamental fractal-like structure of particles produced in minimum bias collisions. While not inconsistent with this interpretation, our expansion of the minimum bias cross section has a more mundane understanding as a consequence of the symmetries of these collision events. Further, the Tsallis distribution reduces to a power law at large transverse momentum, while our smeared prediction is exponential, though dependent on a fractional power of transverse momentum. A detailed study to distinguish the consequences of these two (or other) models of minimum bias dynamics may reveal the microscopic description of these events and lead to an effective field theory in which precision calculations can be performed.

Limit of Large-N Expansion
With data that extends to higher values of transverse momentum, we can see the limit of our prediction, with the assumptions we have made thus far. In particular, we have assumed that the squared matrix element is just 1, which is its expression at lowest order in the 1/N expansion. However, at higher transverse momentum, higher order terms in the squared matrix element become more important and may be necessary to describe the distribution. In Fig. 4 for parameters n and T . In the plot, we have set the parameters in our prediction and the Tsallis distribution to be p ⊥ = 0.5 GeV, n = 6.6 and T = 0.12 GeV. At low transverse momentum, both our prediction and the Tsallis distribution follow the data extremely well, but around 5 GeV, our prediction exponentially drops, while the Tsallis distribution largely follows the power-law distribution at high transverse momentum. It would be interesting to study the effect of maintaining momentum conservation in our prediction and non-trivial terms in the matrix element to see if they can reproduce the high-p ⊥ tail, but we leave that to future work. However, with the approximations that we have made thus far, we can estimate where our description of the transverse momentum distribution should break down. In the derivation of our results, an important approximation that we made in the large-N limit was that momentum conservation could be ignored, which in turn resulted in an exponential appearing in the calculation of distributions on flat phase space in Eq. (3.2). By carefully analyzing this approximation, we can identify where it breaks down. First, the large-N flat phase space factor can be written as

Transverse Momentum Distribution
So far, this is exact, but higher-order terms in the exponent can be safely ignored only if (3.29) Recall that k + k − is the total squared energy of the detected particles. In coordinates that we used earlier, we expressed 30) and so the limit on transverse momentum is , (3.31) where x ∈ [0, 1] and To set an upper bound for the transverse momentum, we first maximize over x and ∆η, for which x = 1 and ∆η = 0: That is, the energy of an individual particle must be parametrically less than the collision energy. We had found that for pp collisions with energy of several TeV, the maximum pseudorapidity was about η max ∼ 6. For Q = 2.76 TeV from Fig. 4, the transverse momentum then must be smaller than This estimate of the limit agrees well with Fig. 4, in which our prediction diverges from data well before 14 GeV. Conversely, for a fixed transverse momentum p ⊥ , the maximum pseudorapidity is logarithmically related to the collision energy Q, which is well known [29,72].

Azimuthal Correlations
In this section, we discuss azimuthal correlations between pairs of particles produced in pp or heavy ion collisions. We will focus on the ellipticity and the long pseudorapidity-distance correlations or "ridge" phenomena later, but we will first determine the form of the probability distribution of the pairwise azimuthal angle difference within the context of our minimum-bias expansion. Non-trivial azimuthal correlations require a non-trivial squared matrix element. In general, the form of the terms in the squared matrix element relevant for azimuthal correlations are for some coefficient functions g n (k + k − , N ). On the second line, we have absorbed multiplicative factors in the expansion of ( p ⊥i · p ⊥j ) n into a redefinition of g n (k + k − , N ). All other terms in the squared matrix element that are independent of azimuthal differences would just contribute to the constant "1" term and would therefore just be an overall normalization. With our power counting assumption that p ⊥ ∼ p 2 ⊥ ∼ Q/N , we replace p ⊥ ∼ Q/N to establish the scaling with the number of observed particles N . Our squared matrix element then becomes The maximal scaling with N of the coefficient functions g n (k + k − , N ) can be established by demanding that the squared matrix element is non-negative. To do this, we note that, in the N → ∞ limit, transverse momentum conservation is trivially satisfied and so azimuthal angles are uncorrelated and uniformly distributed on flat phase space: Thus, on flat phase space, the mean of the sum over the cosine factors of the difference of azimuthal angles is 0: On the other hand, the central limit theorem states that the variance σ 2 of the sum of cosine factors is as there are N 2 terms in the sum, in the N → ∞ limit. To ensure that the squared matrix element is non-negative, the sum over cosine factors at each value of n must not be significantly negative to overwhelm the constant "1" term. Therefore, we must enforce that the bulk of the possible values of the sum over cosines is less than 1: That is, the coefficient functions g n (k + k − , N ) are required to scale with N no greater than This property will have important consequences for the large-N predictions of azimuthal correlations. 4 Continuing, we can integrate over the smeared phase space to establish the probability distribution for the pairwise azimuthal angle difference, ∆φ. We have Here, we have used the permutation symmetry of particles to just define ∆φ as the difference of the azimuthal angles of particles 1 and 2. The Fourier coefficients d n (N ) are defined to be Note that the maximal N scaling of the coefficients d n (N ) is inherited from the form of g n (k + k − , N ) established above by positivity; that is, This scaling means that the Fourier coefficients at each n necessarily vanish in the N → ∞ limit: The distribution on the final line of Eq. (3.42) is normalized on ∆φ ∈ [0, 2π). 5 The particular scaling of the coefficient d 1 (N ) with N can also be established by momentum conservation. Transverse momentum conservation states that Now, with p ⊥ ∼ Q/N , this relationship implies the scaling With our ergodic assumption, the term on the left is just the mean value of cos(∆φ) in the large-N limit and so This mean value can also be calculated from the probability distribution p(∆φ). We have from Eq. (3.42). If this is to scale like −1/N , the coefficient d 1 (N ) must scale like exactly as predicted from positivity. 6 Then, the distribution of the azimuthal angle difference can be expressed as 5 Recall that we are working at fixed center-of-mass energy Q. It is known that azimuthal correlations have a finite, non-zero value at fixed centrality as both Q and N increase (see, e.g., [75]). This is not inconsistent with this analysis because the Fourier coefficients defined in Eq. (3.43) have implicit dependence on the center-of-mass energy Q and other mass scales in the system, like the pion mass mπ. 6 Yet another way to prove this scaling with N follows from demanding that the squared matrix element is finite in the N → ∞ limit.
where now d 1 > 0 is some constant value in the N → ∞ limit. A similar relationship for the n = 1 coefficient of the Fourier expansion using momentum conservation was established in Ref. [76], but to the best of our knowledge, the scaling with the number of particles N is novel.

Ellipticity
Especially in collisions of heavy ions, azimuthal correlations amongst particles are used as evidence for collective flow phenomena and the production of exotic states of QCD matter. The first non-trivial azimuthal correlation is referred to as elliptic flow and quantifies particle correlations with respect to the reaction plane of the collision, the plane about which particle production is maximized in the plane and minimized orthogonal to it. As a proxy for the elliptic flow, a pairwise azimuthal correlation moment is often measured instead. Here, we will focus on one such moment, the average over all N particles in an event denoted as 2 and defined to be where φ j is the azimuthal angle of particle j. As above, the ergodic assumption establishes that the sum over cosine factors is just the mean value of cosine of the azimuthal angle difference in the large N limit, and so When further averaged over an ensemble of events, this is often denoted as 2 or c 2 (2). The notation c 2 (2) is common in the literature with the parenthetical 2 referencing the pairwise azimuthal correlation and the subscript 2 referencing the 2 in the argument of cosine in Eq. (3.53). Azimuthal correlations have been extensively measured in heavy ion collisions for the past few decades, in experiments at ATLAS [77][78][79], CMS [80][81][82][83][84][85][86][87][88][89][90], and ALICE [75,[91][92][93][94] at the LHC and at STAR [95][96][97][98][99][100][101][102][103][104][105] and PHENIX [106,107] at RHIC. However, outside of an explicit hydrodynamic model, measurements of azimuthal correlations are typically challenging to interpret. Azimuthal correlations are often binned in unphysical or unobservable quantities that are fit from models of the ion collisions, like the number of nucleons participating in the collision or the centrality. 7 Few of the listed references plot azimuthal correlations as a function of directly observable quantities, such as the number of charged particles in the event [77,82,83,91,93,102]. In what follows we will show how the features of the observables can in fact be elucidated purely in terms of observable quantities, by appealing to the expansion of the minimum bias cross section derived above.
As mentioned earlier, our expansion of the minimum bias cross section is defined for a fixed collision energy Q and fixed number N of observed particles. With our current formulation, we are not able to predict the dependence of the azimuthal correlations either as a function of Q or N . Thus, we will focus on predictions for fixed Q binned in the number of observed particles. Nevertheless, we will be able to make a number of concrete predictions. This form immediately identifies two distinct contributions. First, the explicit 1/N term is completely independent of any of the dynamics of the collision (i.e., the squared matrix element and smearing). Such a contribution is sometimes called "non-flow" in the literature. The second term, by contrast, is only non-zero if there are non-trivial azimuthal correlations; otherwise, the integral over phase space of the sinusoidal function vanishes. Such a contribution is sometimes called "flow". Whatever its short-distance interpretation, we can make concrete predictions of the flow contribution within our expansion. First, the azimuthal correlation c 2 (2) vanishes in the N → ∞ limit. This follows directly from the results established in Eq. (3.45). Additionally, this large-N limit is also the limit of small centrality (the quantile of highest multiplicity in an ensemble of collision events). Thus, azimuthal correlations should also vanish in the limit of small centrality. In the nucleusoverlap model of heavy ion collisions, it is also predicted that azimuthal correlations vanish in the low centrality limit because a head-on, perfectly overlapping nucleus collision is completely rotationally symmetric and has no preferred particle production axis. However, we stress that this prediction in our formulation makes no reference to the unobservable nucleus overlap and relies instead exclusively on our power counting of the relevant observable quantities.
This prediction is observed in data from heavy ion collisions. In Fig. 5, we plot data of the azimuthal correlation c 2 (2) from PbPb collisions at the ALICE experiment, as a function of the number of charged particles N ch in the pseudorapidity range |η| < 1.0 and transverse momentum in the range 0.2 < p ⊥ < 3.0 GeV [93]. Only charged particles are used to compute c 2 (2), but as discussed earlier, as long as the number of particles is large, we still expect our expansion to accurately describe these events. For significantly large N ch the azimuthal correlation decreases toward 0, as our expansion predicts. Though we do not show a plot here, data of the azimuthal correlation binned in centrality also vanishes as centrality decreases (see, e.g., Ref. [92]), corresponding to the highest multiplicity events in an ensemble.
The plot presented in Fig. 5 also illustrates another feature described in our expansion. The four data sets in that plot correspond to the azimuthal correlation of pairs of particles with a minimal pseudorapidity difference, |∆η|. As the pairwise pseudorapidity difference increases, the azimuthal correlation at small multiplicity N ch significantly decreases. At small multiplicities, momentum conservation induces significant correlations between pairs of particles that is imprinted on c 2 (2), especially when further restricted to long-range pseudorapidity correlations. However, at large values of N ch the data with different pseudorapidity differences converge onto a universal curve. This is expected within our expansion. At large N ch , correlations induced by momentum conservation are eliminated. Further, the pseudorapidity distribution of particles is quite flat over the acceptance region of the detector, and we argued that the transverse momentum distribution of particles effectively independent of a pseudorapidity cut. So, for sufficiently large number of particles, there is nothing special about the particular pseudorapidity of a particle and its correlation to other particles. Therefore, in the large N ch limit, neighboring particles and far away particles in pseudorapidity will exhibit the same azimuthal correlations.

Two-Particle Angular Correlations; the "Ridge"
A more general analysis of two-particle correlations can be studied through measurement of the cross section differential in both pairwise azimuthal differences ∆φ as well as pairwise pseudorapidity differences ∆η. Such double differential cross sections have been measured at RHIC [108,109] and other experiments before the LHC, but became exciting evidence of potential collective phenomena in pp collisions in measurements at CMS [110] in 2010. Since then, CMS [82,86,90,[111][112][113], ATLAS [77,[114][115][116], and ALICE [117,118] have measured pairwise particle correlations in pp and PbPb ion collisions. In heavy ion collisions, it is well-established that there are long range in ∆η correlations, which is interpreted as collective phenomena due to the high-density medium produced in collision. Extending out to ∆η ∼ 4, the distribution of azimuthal correlations in ∆φ is nearly independent of ∆η and exhibit an over-density in the distribution around ∆φ = 0, the so-called "ridge". A long distance factorization of the ∆η and ∆φ distributions and the ridge are challenging to observe in pp collisions because of the overwhelming dominance of short-distance correlations, or jet phenomena. However, with significantly restrictive cuts at high particle multiplicity, the CMS analysis from 2010 and later measurements identified a small, but non-negligible ridge in pp collisions suggesting that a dense QCD medium was produced.
In this section, we will analyze these angular correlations and the ridge from the perspective of our expansion of high multiplicity, minimum bias events. First, the quantity that is typically measured and plotted to identify the ridge is the cross section ratio . (3.55) Here, N pair is the number of pairs of particles from which their pseudorapidity difference ∆η and azimuthal angle difference ∆φ are measured. S(∆η, ∆φ) is the double differential cross section for "signal" pairs of particles drawn from the same event. To eliminate trivial correlations to some extent, this is divided by the "background" distribution B(∆η, ∆φ) which consists of pairs of particles drawn from different events. Thus, the paired particles that contribute to B(∆η, ∆φ) are truly uncorrelated. Finally, B(0, 0) is a convenient normalization factor. Starting with the signal distribution S(∆η, ∆φ), it is defined to be S(∆η, ∆φ) = d 2 N same d∆η d∆φ , (3.56) where N same is the number of pairs of particles drawn from the same event. In the limit that the number of detected particles in the event N → ∞, correlations between pairs of particles due to momentum conservation are eliminated. Further, we expect that the pseudorapidity and azimuthal angle differences are uncorrelated, or only very weakly correlated, because the squared matrix element approaches a constant in this limit. Any such correlations have to come from terms in the expansion in Eq. (2.15) of at least order Q −4 , and so the signal distribution factorizes to leading power in 1/N as Here, p(∆φ) is the probability distribution for azimuthal angle differences established in Eq. (3.51), and dN pair /d∆η is the pairwise ∆η distribution.
Continuing to the background distribution B(∆η, ∆φ), we expect no correlation between ∆η and ∆φ for pairs of particles from different events, so this distribution would naturally factorize. Further, the azimuthal difference should be flat because the experiment is symmetric about the colliding beams and each event spontaneously breaks this rotational symmetry, but independent of all other events. We also expect that the pseudorapidity difference distribution of the background is identical to that of signal because in the large N limit, correlations between pairs of particles in the same event vanish. Thus, the pseudorapidity difference distribution for same-event or distinct-event pairs is just determined by drawing from the single particle pseudorapidity distribution twice, to leading power in 1/N . That is, the background distribution factorizes to leading power in 1/N as 58) where N mix is the number of pairs of particles from distinct or "mixed" events. With these assumptions, our prediction for the double differential pairwise correlation distribution is independent of ∆η and just determined by the azimuthal difference distribution: The large-N expansion of the minimum bias cross section immediately implies that there are long range in ∆η correlations, through the interpretation that the ∆φ distribution at disparate values of ∆η are, to first order, the same. While these long-range correlations are observed in data and are a requirement for existence of a ridge, it doesn't by itself predict an over-density of pairs of particles with ∆φ = 0. That requires the distribution p(∆φ) to have a maximum at ∆φ = 0. One can perform an analysis of the coefficients of the Fourier expansion of p(∆φ), and compare to data. Recall that our large-N expansion of minimum bias predicts the form where d 1 > 0 and the d n (N ) coefficients scale at worst like N 2n−1 in the large N limit. Additionally, from ellipticity, we know that d 2 (N ) > 0, at least for minimum bias events in PbPb collisions. For existence of a ridge, we would additionally need that The vanishing of the first derivative at ∆φ = 0 is guaranteed by the symmetries of minimum bias collisions. The concave down requirement constrains the Fourier coefficients, but without a short-distance theory from which to predict the d n (N ) values, we can't say much more within the context of our minimum bias expansion. However, the fact that the distribution p(∆φ) is physical highly constrains the coefficients d n (N ). Assuming that a physical distribution is infinitely differentiable and the Fourier series of every derivative is absolutely convergent, the d n (N ) must vanish as n → ∞ faster than any power of n. For example, the Fourier series could actually just be finite for which there is a maximum n at which d n (N ) is non-zero, or d n (N ) might vanish exponentially fast, like d n (N ) ∼ e −n . Infinite differentiability implies that there are only a finite number of terms in the Fourier expansion that can have an appreciable effect on the first and second derivatives of p(∆φ) at ∆φ = 0. This seems to be borne out in data, e.g., figure 6 of Ref. [111], in which the d 2 (N ), d 3 (N ), and d 4 (N ) coefficients would be positive, but d 5 (N ) appears to be vanishingly small. Of course, with any finite dataset, measuring higher Fourier modes is problematic, but the trend is consistent with expectations from differentiability.

Min-Bias in Other Colliders
While our focus in this paper has been the application of the large-N expansion of minimum bias events to understand identical hadron or identical heavy ion collisions, the power counting and symmetries can be appropriately modified to describe nearly any collision environment.
In this section, we will just briefly describe extension to minimum bias at other colliders. Minimum bias in electron-positron collisions will be significantly different that we will provide more details about its symmetries and predictions.

pA Collisions and Electron-Ion Collisions
For different types of colliders, we must modify our power counting and symmetry assumptions as established in Sec. 2. Here, we will just discuss how the power counting and symmetries are modified for proton-ion (pA) and electron-ion collisions, leaving a detailed analysis of unique predictions from our minimum bias expansion in those environments to future work. First, for pA collisions, as the initial state still consists of hadronic matter, there are still unmeasurable beam regions in which all we can determine is the total momentum lost down the beam. Thus, all power counting assumptions for minimum bias in pA collisions remain identical to that of pp/AA collisions. The only change is to the symmetries, in which pA collisions lack the beam reflection η → −η symmetry of pp/AA collisions. As a consequence, the expansion of the corresponding matrix element is not reflection symmetric about the beams and so can have terms that consist of the hyperbolic sine of pseudorapidity differences, as well as hyperbolic cosine. For example, the squared matrix element for the N detected particles of pA collisions can now have terms of the form For electron-ion colliders, the power counting and symmetries are even more different. Now, because the electron is not hadronic, there is only one unmeasurable beam region, in the direction of the momentum of the initial ion. This isn't necessarily to say that there is perfect experimental resolution in the direction of the electron's momentum, it is instead that because the electron is not hadronic, its direction of momentum is not special for establishing beam remnants that take away an order-1 energy fraction at very high pseudorapidity. As there is only one unmeasurable beam region, we only smear over one component of lightcone momentum, k + , say. This also means that the pseudorapidity translation symmetry η → η+∆η is broken because we can effectively measure all particles at arbitrary pseudorapidities in the direction of the electron's momentum. Additionally, as the colliding particles are not identical, there is no beam reflection symmetry, η → −η. Thus in this case, the cross section for minimum bias events takes the form: Here, f (k + ) is some function purely of the total k + momentum lost down the beampipe in the direction of the initial ion's momentum and |M| 2 is the squared matrix element of the N detected particles. Decreasing the constraining symmetries makes the expansion of the squared matrix element significantly more involved, so we leave predictions for electron-ion collisions to future work.

Electron-Positron Collisions
At high energies in electron-positron collisions, the total hadronic cross section is dominantly due to collisions of emitted, collinear photons. In these events, the electron and positron are generally only scattered at low angle and are therefore lost down the beam. Because of the soft singularity of QED, the emitted photons typically have a small energy compared to the center-of-mass collision energy, and so the observed energy in the detector is small. This configuration of minimum bias e + e − collision events therefore is very similar in experimental signature as minimum bias events in hadron collisions. Specifically, in experimental analyses of these secondary γγ → hadrons events, they are identified by an anti-tag of energetic electron and positrons [119,120], requiring that the colliding particles are lost down the beam, and a limit on the total energy to remove annihilation events. With these considerations, the minimum bias events in e + e − collisions would enjoy the same power counting and symmetries as described in Sec. 2 and the expansion of the cross section would also follow from that presented there.
Here, instead, we will analyze the structure of events from e + e − collisions at or near a resonance, like the Z pole, m Z . At a center-of-mass energy near m Z , the cross section is dominated by the Z boson resonance. This configuration of final state particles is very different than minimum bias in hadron collisions, or the effective γγ collisions away from the resonance. As such, the power counting and symmetries of these resonance events are distinct, and produce different predictions for the structure of events and the correlations between particles. We will take these conditions to define events that we term 'resonance-decay minimum bias'.
The description of resonance-deay, hadronic minimum bias events in electron-positron (e + e − ) collisions is sufficiently constraining and simple that we will explicitly construct the large-N expansion and make some predictions. First and foremost, because electrons are color-singlet, fundamental particles, there are no unmeasurable beam regions whatsoever, so (in principle) all final state particles can be detected. Consequently, because the initial state is the hadronic vacuum, we assume full Lorentz invariance of the final state, so natural phase space coordinates are the energies of particles and their location on the celestial sphere. Further, as with our pp/AA collision analysis, we assume that only momenta are measured; no information on electric charge, etc., is recorded for particles.
With these assumptions, the power counting for resonance-decay hadronic minimum bias events in e + e − collisions are: 1. All N particles produced can in principle be detected. Only their four-momenta are measured; no charge information is determined.
With these symmetries, the cross section for minimum bias events can just be expressed in the textbook form of Fermi's Golden Rule with four-momentum conservation in the centerof-mass frame: where |M| 2 is the Lorentz-invariant squared matrix element for N final state particles, E i is the energy of particle i, and θ i and φ i are its polar and azimuthal angle coordinates on the celestial sphere. As it is fully Lorentz invariant, the squared matrix element can be expanded in symmetric polynomials of the Mandelstam invariants s ij = 2p i · p j , for two particles' fourmomenta p i and p j . Up through mass dimension 4 terms, the squared matrix element can be expanded in symmetric polynomials as: The c (n) i factors are constants that may in general have dependence on the number of particles N . The only constraint we impose on them is that in the N → ∞ limit, their scaling with N produces a finite contribution to the squared matrix element and that they are not sufficiently negative to make the squared matrix element negative. For example, assuming that the energy of particle i scales like E i ∼ Q/N and so the second non-trivial term scales with N as As discussed in the expansion of the squared matrix element for pp/AA collisions, in the N → ∞ limit, the various terms in the squared matrix element relax to their ensemble mean with 0 variance, with our assumption of ergodicity and the central limit theorem. So, in the N → ∞ limit, our minimum bias power counting implies that the matrix element is just a constant, corresponding to N free final state particles only constrained by momentum conservation.

Suppression of a Ridge in Resonance Decay Min Bias
Within this hadronic resonance decay minimum bias expansion for e + e − collision events, we will concretely demonstrate that this predicts a strong suppression of possible ridge phenomena in azimuthal correlations of pairs of particles. A ridge in two-particle correlations was searched for in e + e − collisions at the Z pole from archived ALEPH data recently [121], but no evidence for such a ridge was found. Lack of azimuthal correlations over a long distance in pseudorapidity in e + e − collisions is perhaps not surprising because the initial state is the QCD vacuum and there are no special directions in which particles are produced at arbitrarily high pseudorapidities. Nevertheless, it is satisfying to see that the power counting and symmetries of e + e − collisions immediately predict that such correlations, if they do exist, are very suppressed in the large-N limit.
To demonstrate this suppression of azimuthal correlations, we will focus on terms in the squared matrix element that directly correlate pairs of particles: Here, the c n are some constants that possibly depend on N but are constrained by finiteness and on the right, we have expanded the Mandelstam invariant s ij in terms of cylindrical detector coordinates. Now, we would like to simplify this expression, focusing on the scaling with N and the azimuthal correlations explicitly, so we will make some replacements. First, the transverse momentum p ⊥ = E sin θ and on flat phase space, most particles are located around the "equator" of the celestial sphere, where θ = π/2. So, as E ∼ Q/N , so too does p ⊥ ∼ Q/N in e + e − collisions. Further, because most particles lie near the equator, the hyperbolic cosine of the pairwise pseudorapidity difference is close to 1, so we can take the scaling cosh ∆η ∼ 1.
With these scaling assumptions, the squared matrix element takes the form: Now, with this squared matrix element, we can determine the probability distribution of the pairwise azimuthal angle difference ∆φ, in the large-N limit. As in our analysis of pp/AA collisions, correlations from momentum conservation become trivial in the N → ∞ limit, and the probability distribution can be calculated from p(∆φ) ∝ We have used permutation symmetry to define ∆φ = φ 1 − φ 2 , and the result in the final line is the exact integral over azimuthal angles of the second line. In this final form, it is obvious that any azimuthal dependence is suppressed by a relative factor of N 2 in the large-N limit. This explicitly follows from Lorentz invariance of the final state, enforcing that the transverse and longitudinal momentum components to the beam must appear in the expansion of the squared matrix element with the same coefficient.
On the other hand, the minimum bias expansion for the hadronic events arising from secondary γγ collisions discussed at the beginning of this section predicts no such suppression of a ridge, so it would be of interest to redo the analysis of [121] away from the Z pole.

Outlook
In this paper, we elucidated a power counting scheme to describe minimum bias events at colliders. The main tenets of this approach are a principle of particle ergodicity-each particle is representative of any other particle and averages over particles in an event are equivalent to averages of events over an ensemble-and the assumption of large particle multiplicity, N , in an event which provides the expansion parameter 1/N . Under these conditions, the variance of the QCD squared matrix element vanishes as 1/N , and as N → ∞, this effectively reduces all physical observables to their flat phase space limits, with a smearing factor in p or A collisions to account for beam losses.
We showed that the above assumptions do indeed lead to a remarkably good description of minimum bias within a realm of validity that we quantified, performing an explicit comparison to minimum bias collider data. Notably, the (smeared) flat phase space values reproduce kinematic distributions well. We also studied observable features that can be described by an expansion in higher order kinematic harmonics about flat phase space, focusing in particular on azimuthal correlations, specifically elliptic flow in heavy ion collisions and the ridge phenomena in pp collisions. We showed that the conditions of positivity and momentum conservation fix the sign and scaling with N of the leading harmonic of two particle azimuthal difference ∆φ beyond the flat limit, and argued that the different symmetries of e + e − collisions alone imply additional suppression 1/N 2 of any ridge that could be observed there.
The results of the above study are promising, in that they provide confidence that the S-matrix framework to describe minimum bias can be developed. Benefits of this approach are that it equally applies to small and large collision system sizes at high and low energies, and it does not depend on any underlying model or unphysical parameters. This could be utilized to elucidate the nature of small scale collective phenomena of QCD that were newly discovered at the LHC, and which are the subject of some debate. In particular, it would be interesting to study the emergence of jets in our picture, and see whether this can address the question of jet quenching in small (potentially QGP phase) systems. Another avenue would be exploring the use of this formalism at low-energy heavy ion beam scans that aim to hit the QCD critical point.
Theoretically, there are a number of interesting directions to pursue. A systematic study of the harmonic expansion of the phase space manifold could be undertaken; power spectra of minimum bias collisions can be envisaged from this, and could be an efficient way of determining flow vs. non-flow physics. It would be interesting and potentially very natural to incorporate detector resolution into this language via, e.g., a maximum harmonic sensitivity.
We formally took a limit N → ∞ while working at fixed Q, while ignoring the mass of the QCD pions. This approximation agrees with data well, giving confidence that it is a good approximation to low-energy QCD dynamics in the minimum bias regime. However, strictly this limit does not exist, a maximum multiplicity being ensured by the explicit breaking of chiral symmetry and the pion mass, N max = Q/m π . The obvious connection between the limit we worked in and broader studies of strongly-coupled conformal field theories at colliders [122] would be interesting to pursue. It might be possible to address whether the Q and N dependence of the coefficients the appeared in our expansion be fixed by, e.g., consistency conditions that arise from a non-trivial limit of a distribution as N → ∞.
Finally, by working purely with the S-matrix of final particles, the hydrodynamic description of the unobservable QGP was sidestepped. Nevertheless one might wonder how hallmarks of transport coefficients and, e.g., the theoretical lower bound on specific shear viscosity may show up in our coarse-grained approach.