Non-Bessel-Gaussianity and Flow Harmonic Fine-Splitting

Both collision geometry and event-by-event fluctuations are encoded in the experimentally observed flow harmonic distribution $p(v_n)$ and $2k$-particle cumulants $c_n\{2k\}$. In the present study, we systematically connect these observables to each other by employing Gram-Charlier A series. We quantify the deviation of $p(v_n)$ from Bessel-Gaussianity in terms of flow harmonic fine-splitting. Subsequently, we show that the corrected Bessel-Gaussian distribution can fit the simulated data better than the Bessel-Gaussian distribution in the more peripheral collisions. Inspired by Gram-Charlier A series, we introduce a new set of cumulants $q_n\{2k\}$ that are more natural to study distributions near Bessel-Gaussian. These new cumulants are obtained from $c_n\{2k\}$ where the collision geometry effect is extracted from it. By exploiting $q_2\{2k\}$, we introduce a new set of estimators for averaged ellipticity $\bar{v}_2$ which are more accurate compared to $v_2\{2k\}$ for $k>1$. As another application of $q_2\{2k\}$, we show we are able to restrict the phase space of $v_2\{4\}$, $v_2\{6\}$ and $v_2\{8\}$ by demanding the consistency of $\bar{v}_2$ and $v_2\{2k\}$ with $q_2\{2k\}$ equation. The allowed phase space is a region such that $v_2\{4\}-v_2\{6\}\gtrsim 0$ and $12 v_2\{6\}-11v_2\{8\}-v_2\{4\}\gtrsim 0$, which is compatible with the experimental observations.


I. INTRODUCTION
It is a well-established picture that the produced matter in the heavy ion experiment has collective behavior. Based on this picture, the initial energy density manifests itself in the final particle momentum distribution. Accordingly, as the main collectivity consequence, the final particle momentum distribution is extensively studied by different experimental groups in the past years. As a matter of fact, the experimental groups at Relativistic Heavy Ion Collider (RHIC) and Large Hadron Collider (LHC) measure the flow harmonics v n [1][2][3][4][5][6][7][8], the coefficients of the momentum distribution Fourier expansion in the azimuthal direction [9][10][11]. All these observations can be explained by several models based on the collective picture.
The reaction plane angle, the angle between the orientation of the impact parameter and a reference frame, is not under control experimentally. Additionally, the Fourier coefficients cannot be found reliably due to the low statistic in a single event. These enforce us to use an analysis more sophisticated than a Fourier analysis. There are several methods to find the flow harmonics experimentally, namely Event-Plane method [12], multiparticle correlation functions [13,14] and Lee-Yang zeros [15,16].
The most recent technique to find the flow harmonics is using the distribution of flow harmonic p(v n ). This distribution has been obtained experimentally by employing * hadi.mehrabpour@ipm.ir † s.f.taghavi@ipm.ir the unfolding technique [17].
We know that the initial shape of the energy density depends on the geometry of the collision and the quantum fluctuations at the initial state. As a result, the observed flow harmonics fluctuate event by event even if we fix the initial geometry of the collision. In fact, the event-by-event fluctuations are encoded in p(v n ) and experimentally observed flow harmonics as well. It is worthwhile to mention that the observed event-by-event fluctuations are a reflection of the initial state fluctuations entangled with the modifications during different stages of the matter evolution, namely collective expansion and the hadronization. For that reason, exploring the exact interpretation of the flow harmonics is crucial to understand the contribution of each stage of the evolution on the fluctuations. Moreover, there is no well-established picture for the initial state of the heavy ion collision so far. The interpretations of the observed quantities contain information about the initial state. This information can shed light upon the heavy ion initial state models too.
According to the theoretical studies, flow harmonic v n {2k} obtained from 2k-particle correlation functions are different, and their difference is sourced by nonflow effects [13,14] and event-by-event fluctuations as well [18]. We should say that experimental observations show that v 2 {2} is considerably larger than  [19,20]. Alternatively, the distribution p(v n ) is approximated by Bessel-Gaussian distribution (corresponds to a Gaussian distribution for v n fluctuations) as a simple model [10,21]. Based on this model, the difference between v 2 {2} and v 2 {4} is related to the width of the v 2 arXiv:1805.04695v1 [nucl-th] 12 May 2018 fluctuations. However, this model cannot explain the difference between other v 2 {2k}.
In the past years, several interesting studies about the non-Gaussian v n fluctuations have been done [22][23][24][25][26][27]. Specifically, it has been shown in Ref. [26] that the skewness of v 2 fluctuations is related to the difference v 2 {4} − v 2 {6}. Also the connection between kurtosis of v 3 fluctuations and the ratio v 3 {4}/v 3 {2} has been studied in Ref. [27]. It is worth noting that the deviation of v n fluctuations from Gaussian distribution immediately leads to the deviation of p(v n ) from Bessel-Gaussian distribution.
In the present work, we will introduce a systematic method to connect v n {2k} to distribution p(v n ). In Sec. II, we have an overview of the known concepts of cumulants, flow harmonic distributions and their relation with the averaged flow harmonicsv n . Sec. III is dedicated to the Gram-Charlier A series. Using its concepts, we find an approximated flow harmonic distribution in terms of c n {2k}. Specifically for the second harmonics, we show that the deviation of p(v 2 ) from Bessel-Gaussianity is quantified by the fine-splitting v 2 {2k} − v 2 {2 } where k, ≥ 2 and k = . These studies guide us to define a new set of cumulants q n {2k} where they depend on the event-by-event fluctuations only. In Sec. IV, we use the new cumulants to introduce more accurate estimations for the average ellipticity. As another application of new cumulants, we use q 2 {2k} to constrain the v 2 {4}, v 2 {6} and v 2 {8} phase space in Sec. V. We show that the phase space is restricted to a domain that v 2 {4} − v 2 {6} 0 1 and 12v 2 {6} − 11v 2 {8} − v 2 {4} 0. We present the conclusion in Sec. VI. The supplementary materials can be found in the appendices. We would like to emphasize that in the Appendix C, we found a one-dimensional distribution for p(v n ) which is different from that mentioned in Sec. III. Additionally, an interesting connection between p(v n ) expansion in terms of cumulants c n {2k} and a relatively new concept of the multiple orthogonal polynomials in mathematics is presented in the Appendix D.

II. FLOW HARMONIC DISTRIBUTIONS AND 2k-PARTICLE CUMULANTS
This section is devoted to an overview of already known concepts of cumulant and its application to study the collectivity in the heavy ion physics. We present this overview to smoothly move forward to the flow harmonic distribution and its deviation from Bessel-Gaussianity.

A. Correlation Functions vs. Distribution
According to the collective picture in the heavy ion experiments, the final particle momentum distribution is a consequence of the initial state after a collective evolution. In order to study this picture quantitatively, the initial anisotropies and flow harmonics are used extensively to quantify the initial energy density and final momentum distribution.
The final momentum distribution is studied by its Fourier expansion in the azimuthal direction, In the above, v n and ψ n are unknown parameters that can be found easily viav n ≡ v n e inψn = e inφ . Here, the averaging is obtained by using the distribution 1 N dN dφ in a given event. The parameterv n is called flow harmonic. Similar to initial anisotropies, sometimes we use the flow harmonics in Cartesian coordinates, v n,x = v n cos nψ n , v n,y = v n sin nψ n .
Even if the events are in the same centrality class, the initial energy density is different from one event to the other. One of the main sources of this difference is the quantum fluctuations in the distribution of the nucleons in the nucleus wave-function. It means that although we fix the global features of different events, we find different values for (ε n,x , ε n,y ) and consequently flow harmonic in different events. Suppose the reaction plane angle Φ RP is zero. Then, we can quantify the event-by-event fluctuations in terms of a two dimensional distribution p(ε n,x , ε n,y ) and p(v n,x , v n,y ) for the initial anisotropies and flow harmonics, respectively.
The initial anisotropies ε n are non-zero at each event due to the quantum fluctuations and the collision geometry. Although we expect that the contribution of the quantum fluctuations to the averaged initial anisotropies are vanishing, the collision geometry contribution to the averaged ε n can be non-zero. For instance, considering a set of non-central events in the same centrality class with Φ RP = 0, we expect ε 2,x = 0 and ε 2,y = 0 where the average is performed with respect to p(ε 2,x , ε 2,y ) (similarly v 2,x = 0 and v 2,y = 0) 2 . Note that ε 2,x specifies how much the initial energy density is elliptic while ε 2,y indicates its orientation with respect to the reaction plane. As a result, due to the geometry of the initial energy density, we have ε 2,x = 0 ( v 2,x = 0). Strictly speaking, the averaged ellipticityv 2 ≡ v 2,x is a manifestation of the geometrical initial ellipticity for events in a given centrality class irrespective of the fluctuations. In general, we are able to define averaged flow harmonic v n ≡ v n,x too. In odd harmonics, however, this average would be zero for spherical ion collisions with same sizes.
Furthermore, we should emphasize that the reaction plane angle is not experimentally under control. As a result, if we consider ψ n for the case Φ RP = 0, we should replace ψ n by ψ n + Φ RP where Φ RP is a random number in the range [0, 2π). The average e in(φ−ΦRP) (and consequentlyv n ) cannot be obtained due to the low multiplicity in a single event and the unknown value of Φ RP . This complication has been resolved in Refs. [13,14] by considering multiparticle correlation functions. Let us consider a two-particle distribution function dN/(dφ 1 dφ 2 ) and compute e in(φ1−φ2) . Here φ 1 and φ 2 are angles of two different particles in a single event. Using this combination of particle angles, we are able to perform the average of quantity e in(φ1−φ2) first in a single event and alternatively over many events in same centrality class. We can generalize the two-particle averaging to 2k-particles, and show it by It is called 2k-particles correlation functions. Using these moments, one can define 2k-particle cumulants, c n {2k}. Accordingly, one can estimate v n by using c n {2k} which is shown by v n {2k}. They are related to c n {2k} as v 2k n {2k} ∝ c n {2k} (see Eq. (12)). Now, we would like to look at c n {2k} from another point of view and rephrase the above picture of 2kparticle cumulants in terms of the flow harmonic distributions, p(v n,x , v n,y ). A practical way to study a distribution function p(r) is using two dimensional cumulants, where r stands for a generic two dimensional random variable (it can be either (ε n,x , ε n,y ) or (v n,x , v n,y )). Consider G(k) as the characteristic function of the probability distribution p(r). The characteristic function is, in fact, the Fourier transformed p(r), Consequently, we can define the cumulative function as 2 In fact, one can show that all vn,x for even n are non-zero for the ellipse-like distributions. For an analytical example see section one, footnote 6 in Ref. [29]. 3 The correlation functions e in(φ 1 +···+φ k −φ k+1 −···−φ k+k ) with k = k is not invariant under the shift φ i → φ i + α for an arbitrary angle α. As a result, the correlations function vanishes due to the randomness of the reaction plane angle. C(k) = log G(k). Two dimensional cumulants are obtained from where A mn and C m,n are the 2D cumulants in Cartesian and polar coordinates, respectively. The two dimensional cumulants have already been used by Teaney and Yan in Ref. [28] to quantify the initial energy density shape 4 . In the second line in the above, we used k = k 2 x + k 2 y , ϕ k = atan2(k x , k y ). In Appendix A, we study 2D cumulants in these two coordinates and their relations in more details.
The cumulants A mn and C m,n have been employed in studying p(v n,x , v n,y ) previously in the literature. Recall that we considered Φ RP = 0 for all events in defining p(v n,x , v n,y ). In this case, the cumulant 3 is related to the skewness of p(v n,x , v n,y ). For the case n = 2, this quantity is found for the first time in Ref. [26], and it is argued that In other words, A 30 is related to the fine-splitting between v 2 {4} and v 2 {6}. In Ref. [27], the kurtosis of p(v 3,x , v 3,y ) in the radial direction (A 40 + 2A 22 + A 04 ) has also been calculated, and it is shown it is proportional to v 4 3 {4} 5 . For events with zero reaction plane angle, the distribution p(v 2,x , v 2,y ) is not rotationally symmetric with respect to the origin due to the non-zero value of the averaged ellipticityv 2 = v 2,x . However, p(v 3,x v 3,y ) is rotationally symmetric because all the non-zero value of (v 3,x v 3,y ) at each event is due to the fluctuations. Nevertheless, the random Φ RP rotates the point (v n,x , v n,y ) with a random phase in the range [0, 2π) at each event. Considering p(v n,x v n,y ) as a normalized histogram, this randomness shuffles the points in histogram and make it rotationally symmetric if it has not this symmetry from the beginning. After shuffling flow harmonics, we show the corresponding distribution byp(v n,x , v n,y ) which is a rotationally symmetric for any harmonics and in any centralities.
We should say that the distributionp(v n,x , v n,y ) is experimentally accessible. In Ref. [17], the unfolding technique has been used by ATLAS collaboration to remove the statistical uncertainty (sourced by low statistics at 4 Here we use the two dimensional cumulants to study flow harmonic distributions. In Ref. [28], Amn and Cmn have been shown by W n,ab where n = 1, 2, . . . and a, b ∈ {x, y} in Cartesian coordinates and W 0,n , W s 0,n and W c 0,n in polar coordinates. 5 The coefficients of the proportionality in both skewness and radial kurtosis are also functions of vn{2k}. We should note that the skewness (radial kurtosis) vanishs is equal to zero (see Refs. [26,27] for the details).
each event) and non-flow effects from a distribution of "observed" flow harmonics (v obs n,x , v obs n,y ). In this case, the only unknown parameter to find an accurate p(v n,x v n,y ) is the reaction plane angle. Since there is no information in the azimuthal direction ofp(v n,x , v n,y ), we can simply average this direction out to find a one dimensional distribution 6 , Note that we can interchangeably usep(v n,x , v n,y ) or p(v n,x , v n,y ) in the above because obviously the effect of the random reaction plane angle and the azimuthal averaging are the same.
In polar coordinates, we havep(v n,x , v n,y ) ≡p(v n ). As a result, the characteristic function of the distributioñ p(v n,x , v n,y ) in polar coordinates is given by where we used Eq. (5) in the above. Here, J 0 (x) is the Bessel function of the first kind. Also, · · · 2D means averaging with respect top(v n,x , v n,y ) while · · · 1D specifies the averaging with respect to p(v n ). The Eq. (6) indicates that we can study the radial distribution p(v n ) instead ofp(v n,x , v n,y ) if we use G(k) = J 0 (kv n ) as the characteristic function of p(v n ) 7 .
The cumulants of p(v n ) can be found by expanding the cumulative function log J 0 (kv n ) in terms of ik. The coefficients of ik in this expansion (up to some convenient constants) are the desired cumulants, The cumulants c n {2m} in the above, are those obtained from 2k-particle correlation functions [14], One may wonder that the distribution p(v n ) contains more information than c n {2k} because the odd moments of p(v n ) are absent in the definitions of c n {2k} [30]. However, it is worth mentioning that the moments can be 6 We simply use the notation ϕ ≡ nψn. 7 We ignore the subscript 1D or 2D when it is not ambiguous.
found by expanding the characteristic function G(k) in terms of ik, for the radial distributions like p(v n ). Since the above series is convergent 8 , we can find the characteristic function G(k) by knowing the moments v 2k n . Having characteristic function in hand, we immediately find p(v n ) by inversing the last line in the Eq. (6) 9 , It means that by assuming the convergence of the series in Eq. (9) the distribution p(v n ) can be found completely by using only even moments. Equivalently, we can use the following argument: for the distributionp(v n,x , v n,y ), one simply finds that the only non-vanishing moments are v 2k nx v 2 ny . It means that in the polar coordinates only v 2(k+ ) n are present. Additionally, by finding the two dimensional cumulants of p(v n,x , v n,y ) in polar coordinates C m,n (Eq. (4)), we obtain that the only non-zero cumulants are C 2k,0 ∝ c n {2k} (see Appendix A).
As a result, in the presence of random reaction plane angle, c n {2k} are all information which we can find from the original p(v n,x , v n,y ), whether we use 2k-particle correlation functions or obtain it from the unfolded distribution p(v n ) principally. However, we should say that the efficiency of the two methods in removing single event statistical uncertainty and non-flow effects could be different which leads to different results in practice.
Furthermore, the whole information about the fluctuations are not encoded in p(v n,x , v n,y ).
In fact, the most general form of the fluctuations are encoded in a distribution as p(v 1,x , v 1,y , v 2,x , v 2,y , . . .). It is worth mentioning that the symmetric cumulants, which have been introduced in Ref. [32] and have been measured by ALICE collaboration [33], are nonvanishing. Additionally, the event-plane correlations (which are related to the moments v q m (v * n ) q m/n ) have been obtained by the ATLAS collaboration [34,35]. They are non-zero too. These measurements indicate that p(v 1,x , v 1,y , v 2,x , v 2,y , . . .) cannot be written as n p(v n,x , v n,y ). In the present work, however, we do not focus on the joint distribution and leave this topic for the studies in the future. Let us point out that moving forward to find a generic form for the moments of the flow harmonic distribution was already done in Ref. [36]. 8 It is an important question that is it possible to determine p(vn) uniquely from its moments [31] (see also Ref. [32])? Answering to this question is beyond the scope of the present work. Here we assume that p(vn) is M-determinate which means we can find it from its moments v 2q n in principal. 9 We use the orthogonality relation ∞ 0 k Jα(kr)Jα(kr ) dk = δ(r− r )/r.

B. Approximated Averaged Ellipticity
A question arises now: how much information is encoded in p(v n ) from the original p(v n,x , v n,y )? In order to answer this question, we first focus on n = 3. Unless there is no net triangularity for spherical ion collisions, the non-zero triangularity at each event comes from the fluctuations. Hence, we havev 3 = 0 for such an experiment. In this case, the event-by-event randomness of Φ RP is similar to the rotation of the triangular symmetry plane due to the event-by-event fluctuations. It means that p(v 3,x , v 3,y ) itself is rotationally symmetric, and the main features of p(v 3,x , v 3,y ) andp(v 3,x , v 3,y ) are same. As a consequence, p(v 3 ) or equivalently c 3 {2k} can uniquely reproduce the main features of p(v 3,x , v 3,y ).
However, it is not the case for n = 2 due to the non-zero averaged ellipticityv 2 . The distribution p(v 2,x , v 2,y ) is not rotationally symmetric and reshuffling (v 2,x , v 2,y ) leads to information loss from the original p(v 2,x , v 2,y ). Therefore, there is at least some information in p(v 2,x , v 2,y ) that we cannot obtain it from p(v 2 ) or c 2 {2k}. Nevertheless, it is still possible to find some features of p(v 2,x , v 2,y ) approximately. For instance, we mentioned earlier in this section that the skewness of this distribution in the v 2,x direction is proportional to The other important feature of p(v 2,x , v 2,y ) isv 2 which is not obvious how we can find from c 2 {2k}. In fact, we are able to approximately findv 2 in terms of c 2 {2k} by approximating p(v 2,x , v 2,y ). The most trivial approximation is a two dimensional Dirac delta function located at (v 2 , 0), This corresponds to the case that there are no fluctuations, and the only source for ellipticity is coming from a very clean initial geometry. Considering Eq. (5), the moments v 2q 2 can be easily obtained as follows v 2q Now by using Eq. (8), we find 6 2 , etc. It is common in the literature to show the averaged ellipticityv 2 which is approximated by c 2 {2k} as v 2 {2k}. Note that the quantity v 2 {2k} is defined for the case that the flow harmonic distribution is considered as a delta function. Furthermore, we can assume an ideal case that for any harmonics the distribution p(v n,x , v n,x ) has a sharp and clean peak around v n . By the mentioned assumption, we have [14], Nevertheless, we know thatv n can be non-zero for odd n when the collided ions are not spherical or have different sizes 10 .
Using delta function approximation for p(v 2,x , v 2,y ) is not compatible with the experimental observation. In this case, we have  [19,20]. We can improve the previous approximation by replacing the delta function with a Gaussian distribution. In this case, we model the fluctuations by the width of the Gaussian distribution. Let us assume that Using above and Eq. (5), one can simply find p( is the well-known Bessel-Gaussian distribution [10,21], Here, I 0 (x) is the modified Bessel function of the first kind. Now, we are able to find the moments v 2q 2 by using this approximated p(v 2 ). According to the relations in Eq. (8), we find Considering Eq. (12), we used the notation v 2 {2k} instead of c 2 {2k} in the above. This result explains the large difference between v 2 {2} and v 2 {2k} for k > 1. In fact, the presence of fluctuations is responsible for it. This description for the difference between v 2 {2} and other flow harmonics was argued first in Ref. [21]. It is found that the splitting between v 2 {2} and other flow harmonics contains information from the two dimensional distribution [21].
The above two examples bring us to the following remarks: • In order to relatev n to c n {2k}, one needs to estimate the functionality of p(v n ) wherev n is implemented in this estimation explicitly. We show this estimated distribution by p(v n ;v n ).
• The accuracy of the estimated distribution can be checked by studying the fine-splitting v n {2k} − v n {2 } and comparing it with the experimental data.
We should say that the first remark is very strong and we can estimatev n by a weaker condition. Obviously, if we estimate only one moment or cumulant of p(v n ) as a function ofv n , in principle, we can estimatev n by comparing the estimated moment or cumulant with the experimental data. But the question is how to introduce such a reasonable estimation practically. In the following sections, we introduce a method to estimatev n from a minimum information of p(v n ). One notes thatv 2 = v 2 {4} = · · · is true only if we approximate p(v 2 ) by Bessel-Gaussian distribution. In the next section, we find an approximated distribution around Bessel-Gaussianity.

III. RADIAL-GRAM-CHARLIER DISTRIBUTION AND NEW CUMULANTS
In Sec. II B, we argued that the quantityv n , which is truly related to the geometric features of the collision, can be obtained by estimating a function for p(v n,x , v n,y ). We observed that the Dirac delta function choice for p(v n,x , v n,y ) leads tov n = v n {2k} for k > 0, while by assuming the distribution as a 2D Gaussian located at (v n , 0) (the Bessel-Gaussian in one dimension) we find v n = v n {2k} for k > 1. One notes that in modeling the flow harmonic distribution by delta function or Gaussian distribution, the parameterv n is an unfixed parameter which is eventually related to the v n {2k}. In any case, the experimental observation indicates that v n {2k} are split, therefore, the above two models are not accurate enough.
Instead of modeling p(v n,x , v n,y ), we will try to model p(v n ) with an unfixed parameterv n , namely p(v n ;v n ). In this section, we introduce a series for this distribution such that the leading term in this expansion is the Bessel-Gaussian distribution. The expansion coefficients would be a new set of cumulants that specifies the deviation of the distribution from the Bessel-Gaussianity. In fact, by using these cumulants, we would be able to model p(v n ;v n ) more systematically.
It is well-known that a given distribution can be approximated by Gram-Charlier A series which approximate the distribution in terms of its cumulants (see Appendix B). Here we use this concept to find an approximation for p(v n ;v n ) in term of cumulants c n {2k}. One of the formal methods of finding the Gram-Charlier A series is using orthogonal polynomials. In addition to this well-known method, we will introduce an alternative method which is more practical for finding the series of a one dimensional p(v n ;v n ) around the Bessel-Gaussian distribution.
A. Gram-Charlier A series: 1D Distribution with Support R Before finding the approximated distribution around Bessel-Gaussian, let us practice the alternative method of finding Gram-Charlier A series by applying it to a onedimensional distribution p(x) with support (−∞, ∞) 12 . This method will be used in the next section to find the Radial-Gram-Charlier distribution for arbitrary harmonics.
The characteristic function for a one-dimensional distribution is e ikx , and the cumulants κ n of such a distribution is found from log e ikx = n=1 (ik) n κ n /n!. The first few cumulants are presented in the following, where the averages are performed with respect to p(x) in the right-hand side. Now, consider an approximation for the original distribution where its cumulants are coincident with the original p(x) only for a few first cumulants. We show this approximated distribution by p q (x) where the cumulants κ n for 1 ≤ n ≤ q are same as the cumulants of the original p(x).
Assume the following ansatz for this approximated distribution, where In the above, a i,k (except a 0,0 ) are unknown coefficients. Note that p 0 (x) is nothing but a Gaussian distribution located at the origin. One can find the unknown coefficients a i,k by using equations in (17) together with the normalization condition iteratively. In what follows we show how it works: let us present the moments obtained from p q (x) as x m q . Also, assume that the first moment (which is the first cumulant too) is zero. At the end, we recover the first moment by applying a simple shift. Now for the first iteration (q = 1) we have 1 1 = 1 + a 1,0 = 1 from the normalization condition and x 1 = σ 2 a 1,1 = κ 1 = 0 from the Eq. (17a). It is a linear two dimensional system of equations and the solution is a 1,0 = a 1,1 = 0. In the next step (q = 2), we have three equations (one normalization condition and two Eq. (17a) and Eq. (17b)). By considering κ 2 = σ 2 , we find a 2,0 = a 2,1 = a 2,2 = 0. However, the third iteration is non-trivial. The equations are The above equations can be solved easily, a 3,0 = a 3,2 = 0 and a 3, Here, He n (x) is the probabilistic Hermite polynomial defined as He n (x) = e x 2 /2 (−d/dx) n e −x 2 /2 . We are able to continue the iterative calculations to any order and find (20) In the above, h 0 = 1 and h 1 = 0 together with where γ n are the standardized cumulants defined as Note that in the Eq. (20), we arbitrarily shifted the distribution to the case that the first moment of p(x) is κ 1 . In addition, we assumed that the width of the Gaussian distribution is exactly equal to κ 2 . The Eq. (20) is the well-know Gram-Charlier A series for the distribution p(x). One could consider the Eq. (20) as an expansion in terms of Hermite polynomials. Using the fact that He n (x) are orthogonal with respect to the weight we can find the coefficients h n in Eq. (21) (see the textbook [37]). To this end, we change the coordinate as x → (x − κ 1 )/σ. As a result, we have By using the series form of the Hermite polynomial, we find h n as a function of p(x) moments. Rewriting moments in terms of cumulants (reverting the equations in (17)), lead to Eq. (20).

B. Radial-Gram-Charlier Distribution
Using standard methods, we can extend one dimensional Gram-Charlier A series (20) to two dimensions (see Appendix B 2), (23) where h mn are written in terms of two dimensional cumulants A mn (see Eqs. (B12)-(B14)), and N (r) is a two dimensional Gaussian distribution similar to Eq. (14) located at (µ x , µ y ) and σ x = σ y . It is worth noting that the concept of 2D Gram-Charlier A series has been employed in heavy ion physics first in Ref. [28] by Teaney and Yan. They used this series to study the energy density of a single event 13 . However, we use this to study flow harmonic distribution in the present work. Now let us consider a two dimensional Gram-Charlier A series for p(v n,x , v n,y ). By this consideration, one can find a corresponding series for p(v n ) by averaging out the azimuthal direction. We should say that the result of this averaging for the second and third harmonics are different. For n = 3, the distribution p(v 3,x , v 3,y ) is rotationally symmetric and, as we already remarked in the previous section, the whole information of the distribution is encoded in c 3 {2k}. As a result, we are able to rewrite the 2D cumulants A mn in terms of c 3 {2k}. It has been done in Ref. [27], and an expansion for p(v 3 ) has been found. On the other hand, for n = 2 the whole information of p(v 2,x , v 2,y ) is not in c 2 {2k}. Therefore, we are not able to rewrite all A mn in terms of c 2 {2k} after averaging out the azimuthal direction of a 2D Gram-Charlier A series.
For completeness, we studied the azimuthal averaging of Eq. (23) in the most general case in the Appendix C. In this appendix, we showed that the distribution in Ref. [27] is reproduced only by assuming A 10 = A 01 = 0. Also, we discussed about the information we can find from the averaged distribution compared to the two dimensional one. However, the method which we will follow in this section is different from that point out in Appendix C. Consequently, the most general series we will find here is not coincident with the distribution obtained in the Appendix C.
Before finding a Gram-Charlier A series for arbitrary harmonic, let us find the series for odd harmonics (mentioned in Ref. [27]) by employing orthogonal polynomials. The result will be used to find the series for the most general case later. Since we havev 3 = 0 for n = 3, the Bessel-Gaussian distribution reduces to a radial Gaussian distribution as (v 3 /σ 2 )e −v 2 3 /(2σ 2 ) . Moreover, the Laguerre polynomials L n (x) are orthogonal with respect to By changing the coordinate as 3 /(2σ 2 ) dv 3 ; the radial Gaussian distribution appears as the weight for orthogonality of L n (v 2 3 /(2σ 2 )). Hence, we can write a general distribution p(v 3 ) like where the coefficients odd 2n can be found by 14 Considering the series form of the Laguerre polynomial, we immediately find odd n in terms of moments v 2q 3 . Then one can invert the equations in (8) where in the above we defined the standardized cumulants Γ odd 2k as similar to Eq. (22). The expansion (24) together with Eq. (27) is exactly the series found in Ref. [27] which is true for any odd n. This approximated distribution is called the Radial-Gram-Charlier (RGC) distribution in Ref. [27].
It should be noted that it is a series for the case that v n = 0. In the following, we will try to find similar series for p(v n ;v n ) wherev n could be non-vanishing.
In order to find the Gram-Charlier A series for the distribution p(v n , ;v n ) in general case, we comeback to the iterative method explained in Sec. III A where we 14 In Eq. (24), we chose the coefficient expansion as n! for convenience. 15 Refer to the footnote 16 and set A 10 = 0.
found the distribution (20) by considering the ansatz (18) and iteratively solving the equations in (17).
Here, we assume that p q (v n ;v n ) is an approximation of p(v n ) where only the cumulants c n {2k} with 1 ≤ k ≤ q are same as the original distribution. Now suppose an ansatz that has the following properties: • Its leading order corresponds to the Bessel-Gaussian distribution.
Using such an ansatz, we calculate the moments v 2k n with some unknown parameters and find them by solving the equations in (8) iteratively. We introduce the following form for the ansatz, One simply finds that by choosing a 0,0 = 1 the dis- By comparing the above expansion with Eq. (26), we real- ). In order to reproduce the Eq. (24), we choose where 2i are unknown coefficients. Similar to the Sec. III A, we indicate the moments of p q (v n ;v n ) by v 2m n q . For the first iteration, we have the normalization condition 1 0 = 0 = 1. For the second iteration, the normalization condition is trivially satisfied 16 Obviously, there is no one-to-one correspondence between cn{2k} and Amn due to the losing information by averaging. Specifically, one can find cn{2} = A 2 10 + A 2 01 + A 20 + A 02 (see Eq. (C5a)). Note that by assuming Φ RP = 0 we have A 01 = vn,y = 0 and A 10 = vn,x =vn. Also, it is a reasonable assumption to consider that A 20 A 02 (see Refs. [26,27]). By choosing σ = σx = σy = A 20 A 02 , one approximates p(vn,x, vn,y) around a symmetric Gaussian distribution located at (vn, 0) where its width is exactly similar to the distribution p(vn,x, vn,y). In this case, we find cn{2} =v 2 n + 2σ 2 .
For the next iteration, we find that the normalization condition and Eq. (8a) are automatically satisfied, 1 2 = 1, v 2 n 2 = c n {2} while the Eq. (8b) leads to a non-trivial equation for 4 , Using the above equation, we immediately find 4 . In a similar way, we are able to continue this iterative calculation and find 2q from the only non-trivial algebraic equation at each step. A summary of the few first results are as follows: 0 = 1, 2 = 0 together with (35) One may wonder that, similar to two previous cases, are we able to write all the coefficients 2k in terms of some standardized cumulants? These standardized cumulants should smoothly approach to what is mentioned in Eq. (28). In fact, it motivates us to define a new set of cumulants as follows, Now, we can define the standardized form of this new sets of cumulants as follows, Using the above definitions, we can rewrite the coefficients 2k in the following form, which are in agreement with equations in (27) in the limit v n → 0.
Let us to summarize the series in (29) as follows whereQ i (v n ;v n ) is similar to Q i (v n ;v n ) in Eq. (30) up to a numerical factor, Recall that both distributions (20) and (24) could be found by using the orthogonality of He n (x) and L n (x). We can ask that is there any similar approach to find Eq. (39)? Surprisingly,Q i (v n ;v n ) is related to a generalized class of orthogonal polynomials which are called multiple orthogonal polynomials (see Ref. [38]). These generalized version of the polynomials are orthogonal with respect to more than one weight. Specifically, the polynomials related toQ i (v n ;v n ) have been introduced in Ref. [39]. In order to avoid relatively formal mathematical material here, we refer the interested reader to Appendix D where we briefly review the multiple orthogonal polynomials and re-derive Eq. (39) by employing them.
An important point about the distribution (39) is the convergence of its summation. For sure, finding the convergence condition of the infinite sum in Eq. (39) is beyond the scope of the present paper. However, if we find that at least a few first terms in Eq. (39) is sufficient to give a reasonable approximation of p(v n ), there is no concern about the convergence or divergence of this series practically 17 .
In order to show how much the distribution (39) is a good approximation, we need to have a sample for p(v n ) where itsv n is known. To this end, we generate heavy ion collision events by employing a hydrodynamic based event generator which is called iEBE-VISHNU [40]. The reaction plane angle is set to zero in this event generator. Thus, we can simply find p(v n,x v n,y ) and subsequentlȳ v n . The events are divided into sixteen centrality classes between 0 to 80 percent and at each centrality class we generate 14000 events. The initial condition model is set to be MC-Glauber. Let us recover the notation in Eq. (29) and assume that p q (v n ;v n ) is the distribution (39) where the summation is done up to i = q. We first compute the c 2 {2k} andv 2 from iEBE-VISHNU output and plug the results in Eq. (39). After that we can compare the original simulated distribution p(v n ) with estimated p q (v 2 ;v 2 ). The results are presented in the Fig. 1 for the events in 65-70%, 70-75% and 75-80% centrality classes in which we expect the distribution is deviated from the Bessel-Gaussian. In this figure, the black curve corresponds    to the Bessel-Gaussian distribution (p 0 (v n ;v n )) and the red, green and blue curves correspond to p q (v 2 ;v 2 ) with q = 2, 3 and 4, respectively. Recall that q = 1 has no contribution because 2 vanishes. As can be seen in the figure, the black curve shows that the distribution is de-  viated from Bessel-Gaussian and distribution p q (v 2 ;v 2 ) with q = 0 explains the generated data more accurately.

75-80% Centrality
In order to compare the estimated distributions more quantitatively, we plotted χ 2 /NDF which compare the estimated distribution p q (v 2 ;v 2 ) with iEBE-VISHNU output. We plotted the results in Fig. 2 for q = 0, 2, 3, 4, 5, 6 and 7 for the events in 65-70%, 70-75% and 75-80% centrality classes. The value of χ 2 /NDF associated with the Bessel-Gaussian distribution is much greater than others. Therefore, we multiplied its value by 0.878 to increase the readability of the figure.
As the Fig. 2 demonstrates, the Bessel-Gaussian distribution has less compatibility with the distribution. In addition, the quantity χ 2 /NDF becomes more close to one by increasing q 18 . It is relatively close to one for higher values of q. One may deduce from the figure that the series converges because χ 2 /NDF for q = 6 and q = 7 are very close to each other and very close to one. However, we should say that although it is an strong evidence for the series convergence, there is no guarantee that by adding higher terms the series remain stable. Additionally, we checked the convergence of the series for the case interested in heavy ion physics. This convergence might not be true for an arbitrary distribution in general. In any case, what we learn from the above arguments is that at least few first terms in the series (39) gives a good approximation compared to the original flow harmonic distributions.

C. New Cumulants
Let us come back to the cumulants in Eq. (36) and point out their properties. Considering the new cumulants q n {2k}, we note the following remarks: • Referring to Eq. (13), all the cumulants q n {2k} for k ≥ 1 are vanishing for the distribution δ(v n,x − v n , v n,y ).
The above remarks is an indication that q n {2k} contains information originated from the fluctuations only and the effect of the collision geometryv n is extracted from it. It is important to note that although we have found q n {2k} by RGC distribution inspiration, we think it is completely independent of it and there must be a more direct way to find q n {2k} independent of the RGC distribution.
Concerning the difference between c n {2k} and q n {2k} in terms of Gram-Charlier expansion, we should say that the cumulants c n {2k} appears as the coefficients of the expansion when we expand the distribution p(v n ) around a radial-Gaussian distribution (see Eq. (24)) while q n {2k} are those appear in the expansion around the Bessel-Gaussian distribution. Now, if the distribution under study is more Bessel-Gaussian rather than the radial-Gaussian, we need infinitely many c n {2k} cumulants to reproduce the correct distribution. For instance, for second harmonics all v 2 {2k} are non-zero and have approximately close values. It is because we are approximating a distribution which is more Bessel-Gaussian rather than a radial-Gaussian. On the other hand for third harmonics, we expect that the underling distribution is more radial-Gaussian, and practically we see a larger difference between v 3 {2} and v 3 {4} compared to the second harmonics [8]. Based on the above arguments, we deduce that q n {2k} are more natural choice for the case thatv n is non-vanishing.
Nevertheless, the cumulants q n {2k} (unlike c n {2k}) are not experimentally observable because of the presence ofv n in their definition. However, they are useful to systematically estimate the distribution p(v n ) and consequently estimate the parameterv n . This would be the topic of the next section.

IV. AVERAGED ELLIPTICITY AND FLOW HARMONIC FINE-SPLITTING
In this section, we would like to exploit the cumulants q n {2k} to find an estimation forv n . Note that if we had a prior knowledge about one of the q 2 {2k} or even any function of them (for instance g(q 2 {2}, q 2 {4}, . . .)), we could findv n exactly by solving the equation g(q 2 {2}, q 2 {4}, . . .) = 0 in principle. Because the cumulants c n {2k} are experimentally accessible, one would practically solve an equation g(v n ) = 0. We regret that we have not such a prior knowledge about q 2 {2k}, but we are still able to estimatev n approximately by assuming some properties for p(v n ).
Any given distribution can be quantified by q n {2k}. While p(v n ) is approximately Bessel-Gaussian, we can guess that In fact, it is confirmed by the simulation. The cumulants q 2 {2k} are obtained from the iEBE-VISHNU output and presnted in Fig. 3. Therefore, as already remarked, we expect that a few first cumulants q n {2k} is enough to quantify the main features of a distribution near Bessel-Gaussian. Let us concentrate on n = 2 from now on. Recall that q 2 {4} = q 2 {6} = · · · = 0 corresponds to Bessel-Gaussian distribution. This choice of cumulants equivalent to a distribution with v 2 {4} = v 2 {6} = · · · which is not compatible with the splitting of v 2 {2k} observed in the experiment. As we discussed at the beginning of this chapter, we can findv 2 by estimating any function of cumulants q 2 {2k}. Here we use the most simple guess for this function which is g(q 2 {2}, q 2 {4}, . . .) = q 2 {2k}. Therefore, the equation q 2 {2k} = 0 for each k ≥ 1 corresponds to an specific estimation for p(v 2 ).
For k = 1, we have q 2 {2} = 0 which meansv 2 = v 2 {2}. For this special choice, all the Γ 2k−2 in Eq. (37) diverge unless we set all other q 2 {2k} to zero too. As a result, this choice corresponds to the delta function for p(v 2,x , v 2,y ).
The first non-trivial choice is q 2 {4} = 0. Referring to Eq. (36b), we findv where in the abovev 2 {4} refers tov 2 which is estimated from q 2 {4} = 0. This is exactly the assumption that has  been made in Ref. [26] to find the skewness experimentally. By estimatingv 2 , we can find other q 2 {2k}. We present a few first cumulants in the following, where we used the notation for the fine-splitting between different v n {2k}'s. Furthermore, in Eqs. (42) we expanded q 2 {2k} in terms of finesplitting ∆ 2 {2k, 2 }. Note that the above q 2 {2k} are characterizing an estimated distribution p(v 2 ;v 2 {4}). For such an estimated distribution, q 2 {6} is proportional to the skewness introduced in Ref. [26]. Interestingly, q 2 {8} is proportional to ∆ 2 {4, 6} − 11∆ 2 {6, 8} which has been considered to be zero in Ref. [26]. However, here we see that this combination can be non-zero and its value is related to the cumulant q 2 {8}. The equations in (42) indicate that by assuming q 2 {4} = 0 all the other cumulants of p(v 2 ;v 2 {4}) are written in terms of the fine-splitting ∆ 2 {k, }. Therefore, the distribution p(v 2 ;v 2 {4}) satisfies all the fine-splitting structure of v 2 {2k} by construction.
One can simply check that how much the estimation v 2 {2k} is accurate by using simulation. We exploit again the iEBE-VISHNU event generator to compare the true value ofv 2 (v True 2 ) withv 2 {4} = v 2 {4}. The result is depicted in Fig. 4 by brown and green curves forv True 2 andv 2 {4}, respectively. As the figure illustrates,v 2 {4} is not compatible withv True 2 for centralities higher than 50% where we expect that Bessel-Gaussian distribution does not work well. We should note that all other v 2 {2k} for k > 2 never close to the true value ofv 2 in higher centralities because all of them are very close to v 2 {4}.
In order to improve the estimation ofv 2 , we set q 2 {6} = 0 in Eq. (36c). This equation has six roots where only two of them are real and positive. In addition, as can be seen from Fig. 4, the true value ofv 2 is always smaller than v 2 {4} (=v 2 {4}) in higher centralities. In fact, it is true for all v 2 {2k} for k > 2. Based on this observation, we demand the root to be smaller than v 2 {4}. In fact, we have checked that the equation q 2 {2k} = 0 for k = 3, 4, 5 has only one root which is real, positive and smaller than v 2 {2k} for k = 2, 3, 4, 5.
In Fig. 4,v 2 {6} is plotted by a red curve which is obtained by solving q 2 {6} = 0 numerically. As can be seen, it is more close to the real value ofv 2 rather than v 2 {4}. In fact, we are able to find this root analytically too,v (44) which is compatible with the red curve in Fig. 4 with a good accuracy. Using the estimator (44), we find other q 2 {2k} as follows, By comparing Eqs. (42) and (45), we note that q 2 {2k} (except q 2 {10}) are different because the estimated distributions p(v 2 ;v 2 {4}) and p(v 2 ;v 2 {6}) are different. We can go further and estimatev 2 by solving the equation q 2 {8} = 0. The result is plotted by blue curve in Fig. 4. Also, its analytical value can be found as follows, As Fig. 4 indicates, comparingv 2 {6}, the estimator v 2 {8} is a worse estimation ofv 2 (except between the range 30% to 50% centralities). We might expect that because q 2 {8} q 2 {6} (see Fig. 3), the quantityv 2 {8} should be a better approximation thanv 2 {6}. But this argument is not true. In fact, the cumulants q 2 {2k} for the true distribution are small, but they are non-zero at any centralities. Let us rewrite the Eqs. (36b)-(36d) as follows, where The estimatorv 2 {2k} (k = 2, 3, 4) can be found by solving the Eqs. (47a,b,c) where we set δ 2k = 1. Alternatively, by employing the actual value of δ 2k from the simulation we findv True 2 . In fact, the difference betweenv True 2 andv 2 {2k} is a manifestation of the inaccuracy in the setting δ 2k = 1. In other words, demanding q 2 {2k} = 0 is not exactly correct. Looking at the problem from this angle, by referring to Fig. 4, we realize that δ 4 = 1 is the most inaccurate approximation. Also, δ 6 = 1 is more accurate than δ 8 = 1. By using iEBE-VISHNU generated data, we can check the accuracy of the δ 2k = 1 estimation by comparing different values of δ 2k (for k = 2, 3, 4, 5) calculated from simulation. The result is plotted in Fig. 5. This figure confirms the difference between the estimatorsv 2 {2k} discussed above. The quantity δ 4 has the most deviation from unity. Also, we see that δ 8 deviates from unity for centralities above 55% while δ 6 (and δ 10 ) is more close to one up to 65% centrality. Moreover, δ 6 (δ 10 ) is larger than δ 8 for all centralities. This can be considered as a reason for the fact thatv 2 {8} is less accurate thanv 2 {6} (andv 2 {10}).
Furthermore, let us mention that the cumulant q 2 {6} changes its sign (see Fig. 3 and Fig. 5) for the centralities around 60-65%. It means it is exactly equal to zero at a specific point in this range, and we expect thatv 2 {6} becomes exactly equal tov True 2 at this point. This can be seen also in Fig. 4    The result is plotted in Fig. 6. In finding the estimated v 2 , we employed v 2 {2k} reported by ATLAS collaboration in Ref. [17]. The value ofv 2 {4} is exactly equal to v 2 {4} which is plotted by green curve in the figure. By plugging experimental values of v 2 {2k} into the relations (36c), (36d) and (36e) and setting them to zero, we have found numericallyv 2 {6} (red curve),v 2 {8} (blue curve) andv 2 {10} (black curve), respectively 19 . As can be seen, the errors ofv 2 {10} is too large for the present experimental data, and more precise observation is needed to find more accurate estimation. Exactly similar to the iEBE-VISHNU simulation, the value ofv 2 {8} is between v 2 {4} andv 2 {6}. Therefore, we expect that the true value of the averaged ellipticity to be close to the value 19 In details, all the Eqs. (36c)-(36e) were written in terms of moments v 2k 2 . Considering the experimental distribution p(v 2 ) reported in Ref. [17], we are able to produce the covariance matrix associated with statistical fluctuations of the moments v 2k 2 . Using the covariance matrix, we generated 10000 random number by using a multidimensional Gaussian distribution. Employing each random number, we solved the equations (36c)-(36e) numerically and found estimatedv 2 . We obtained the standard deviation of the finalv 2 distribution as the statistical error of thev 2 {2k}. ofv 2 {6} 20 .
In this section, we introduced a method to estimate p(v 2 ,v 2 ). By considering the cumulants c 2 {2k} of the true distribution p(v 2 ), we estimatedv 2 by assuming that the cumulant q 2 {2k} of p(v 2 ,v 2 ) is zero for a specific value of k. These estimations for q 2 {6} = 0 and q 2 {8} = 0 are presented analytically in Eq. (44) and Eq. (46) and also with red and blue curves in Fig. 4 numerically. We exploited hydrodynamic simulation to investigate the accuracy of our estimations. We found thatv 2 {6} is more accurate thanv 2 {8}. Finally, we found the experimental values forv Until now, we considered the cumulants v n {2k} as an input to find an estimation forv n . In the next section, we try to restrict the phase space of the allowed region of v n {2k} by using cumulants q n {2k}. At the first, we consider the Eq. (47b). The quantities v 2 {2k} andv 2 are real valued. Therefore, it is a welldefined and simple question that what are the allowed values of v 2 {4} and v 2 {6} such that the Eq. (47b) has at least one real root. The polynomial in the left-hand side of Eq. (47b) goes to positive infinity forv 2 → ±∞. As a result, it has at least one real root if the polynomial is negative in at least one of its minima. This condition is satisfied for Since δ 6 is unknown, there is no bound on v 2 {4} and v 2 {6}. Although we do not know the exact value of δ 6 , we know that 0.9 δ 1/6 6 1 based on our simulation (see Fig. 5). In this case, if we observe v 2 {6} ≤ v 2 {4} we immediately deduce the inequality Eq. (49). On the other hand, there is a possibility to observe that v 2 {6} is slightly greater than v 2 {4}. This observation means that δ 6 is definitely smaller than unity. Here, we show both cases by an approximate inequality as v 2 {4} v 2 {6} due to the smallness of δ 1/6 6 . 20 Comparing Fig. 4 with Fig. 6, one finds that the values ofv 2 {2k} from simulation is relatively smaller than that obtained from the real data. This deviation is due to the difference in p T range. In Fig. 6, we used the data from Ref. [17] where p T > 0.5 GeV, while the output of the iEBE-VISHNU is in the range p T 4 GeV. For a confirmation of iEBE-VISHNU output, we refer the reader to Ref. [19,20] where v 2 {4} is reported for p T below 3 GeV. The order of magnitude ofv 2 {2k} in our simulation is compatible with that mentioned in Ref. [19,20].  . 7. (Color online) The allowed region of v2{6}-v2{8} phase space for the two fixed values of v2{4} and δ6 = δ8 = 1.
Alternatively, it is known that the initial eccentricity point (ε 2,x , ε 2,y ) is bounded into a unit circle [24], and it leads to a negative skewness for p(ε 2,x , ε 2,y ) in noncentral collisions. By considering the hydrodynamic linear response, the skewness in p(ε 2,x , ε 2,y ) is translated into p(v 2,x , v 2,y ) skewness and condition v 2 {4} > v 2 {6} [26]. However, it is possible that the non-linearity of the hydrodynamic response changes the order in the inequality to the case that v 2 {6} is slightly greater than v 2 {4}. This is compatible with the result which we have found from a more general consideration.
Second, we concentrate on the Eq. (47c). Due to the complications in finding the analytical allowed values of v 2 {2k}, we investigate it numerically. First, we consider the case that δ 6 = δ 8 = 1. In this case, we fix a value for v 2 {4} and after that randomly generate v 2 {6} and v 2 {8} between 0 to 0.15. Putting the above generated and fixed values into Eq. (47c), we can findv 2 numerically. If the equation has at least one real solution, we accept (v 2 {6}, v 2 {8}), otherwise we reject it. The result is presented as scatter plots in Fig. 7. As can be seen from the figure, some region of the v 2 {2k} phase space is not allowed. The condition 11∆ 2 {6, 8} ≥ ∆ 2 {4, 6} (see the square root in (46)) indicates that the border of this allowed region can be identified with v 2 {8} = (12v 2 {6} − v 2 {4})/11 up to order ∆ 2 {2k, 2 }. This is shown by red line in Fig. 7. Alternatively, the numerically generated border of the allowed region slightly deviates from the analytical border line. It happens for the region that v 2 {4} is different from v 2 {6} and v 2 {8} considerably. The reason is that ∆ 2 {2k, 2 } is not small in this region, and the condition 11∆ 2 {6, 8} ≥ ∆ 2 {4, 6} is not accurate anymore.
Let us combine the constraint obtained from Eq. (47b) and Eq. (47c). For a more realistic study, we use the ATLAS data for v 2 {4} as an input. Instead of using a fixed value for v 2 {4}, we generate it randomly with a Gaussian distribution where it is centered around the central value of v 2 {4}, and the width equal to the error of v 2 {4}. The result for 40-45% centralities is presented in Fig. 8(a). For this case, we expect that the Bessel-Gaussian distribution works well. As a result, we assume δ 6 δ 8 1 (see Fig. 5). From ATLAS results For more peripheral collisions, we expect a non-zero value for δ 2k . In the 65-70% centrality class (the most peripheral class of events reported by ATLAS in Ref. [17]), we have v 2 {4} 0.093 ± 0.002. According to our simulation in this centrality class, we expect that the values of δ 6 and δ 8 to be 0.88 and 0.8, respectively. However, here we do not choose fixed values for δ 6 and δ 8 . Instead, we generate a random number between 0.8 to 1 and assign the result to both δ 6 and δ 8 . The result is presented in Fig. 8(b). Referring to this figure, the allowed region is compatible with the experiment similar to the previous case. For non-zero values of δ 6 and δ 8 , the allowed region can be identified by . These two constraints (similar to Fig. 8(a)) are presented by two bands in Fig. 8(b). In this case, the width of the bands is due to the inaccuracy in δ 6 and δ 8 together with v 2 {4}.
By considering the correlation between v 2 {6} and v 2 {8}, the experimental one sigma region of v 2 {6}-v 2 {8} would not be a simple domain. Nevertheless, for the present inaccurate case which is depicted in Fig. 8, we are able to restrict the one sigma domain by comparing it with the allowed region showed by the blue dots.
Let us summarize the constraint on the flow harmonic 0 to find the allowed region of v2{6}-v2{8} phase space. In this case, δ6, δ6 and v2{4} are not completely fixed. The region is compatible with the ATLAS data reported in Ref. [17].
fine-splitting as follows which correspond to the region filled by the blue dots in Fig. 8. Note that we used the approximate inequality in Eq. (50b) because of not only ignoring δ 1/8 8 but also ignoring terms with order O(∆ 2 ).
fied in all centralities. Furthermore, the first inequality in Eq. (51) is saturated while the second inequality is not saturated. Regarding the second inequality in Eq. (51), we should say that a more accurate experimental observation may split the values of v 2 {6}/v 2 {4} and 11v 2 {6}/12v 2 {4} + 1/12, but the black dots in Fig. 9 should not be placed on the top of the red dots with a considerable distance.
Finally we would like to mention that the Eq. (36e) does not lead to any constraint on v 2 {2k} for 0.5 δ 10 1. The reason is thatv 2 = 0 is a minimum of the following relation

VI. CONCLUSION AND OUTLOOK
In this work, we have employed the concept of Gram-Charlier A series to relate the distribution p(v n ) to c n {2k}. We have found an expansion around the Bessel-Gaussian distribution where the coefficients of the expansion was written in terms of a new set of cumulants q n {2k}. We have shown that the corrected Bessel-Gaussian distribution can fit the actual distribution p(v n ) much better than Bessel-Gaussian distribution. The new cumulants q n {2k} was written in terms of c n {2k} and averaged flow harmonicv n . Because the only non-vanishing new cumulants are q n {2} for Bessel-Gaussian distribution, they are more natural choice to study the distributions near Bessel-Gaussian compared to c n {2k}. In fact, the advantage of q n {2k} compared to c n {2k} is that these cumulants only depend on the fluctuations and the effect of the averaged flow harmonicsv n are removed.
By using the cumulants q n {2k}, we could systematically introduce different estimations for p(v n ) and consequently relate the averaged ellipticityv 2 to the flow harmonic fine-splitting v 2 {2k} − v 2 {2 } for k, ≥ 2 and k = .
As an specific example for thev 2 estimator, we have We have used iEBE-VISHNU event generator to compare the true value of thev 2 to the estimated one, also, we have shown that the estimatorv 2 {6} is more accurate than v 2 {2k} for k > 1. As another application of new cumulants, we have constrained the phase space of the flow is zero or greater than zero we need a more accurate observation.
One should note that we have shown the compatibility of allowed phase space of v 2 {2k} with experimental results of (high multiplicity) Pb-Pb collisions. Recently, the flow harmonics are measured for p-p, p-Pb and lowmultiplicity Pb-Pb collisions by ATLAS in Ref. [41]. In the light of q 2 {2k} cumulants, it would be interesting to study the similarity and difference between the splitting of v 2 {2k} in these systems and examine the compatibility of the results with the allowed region comes from q 2 {2k}.
Furthermore, we have only focused on the distribution p(v n ) in the present study. However, based on the observation of symmetric cumulants and event-plane correlations, we expect that a similar systematic study for distribution p(v 1 , v 2 , . . .) can connect this joint distribution to the observations. Such a study would be helpful to relate the initial state event-by-event fluctuations to the observation. This would be a fruitful area for further work.

ACKNOWLEDGMENTS
The authors would like to thank Jean-Yves Ollitrault for useful discussions and comments during "IPM Workshop on Collective Phenomena & Heavy Ion Physics". We also thank Ali Davody for providing us the iEBE-VISHNU data and discussions. We thanks Mojtaba Mohammadi Najafabadi and Ante Bilandzic for useful comments. We would like to thank Hessamaddin Arfaei, Navid Abbasi, Reza Goldouzian, Farid Taghinavaz and Davood Allahbakhshi for discussions. We thank to participants of "IPM Workshop on Collective Phenomena & Heavy Ion Physics". The Cartesian cumulants A mn can be found in terms of the moments x k y by using the first line of the Eq. (4). A few first cumulants can be found as follows, Note that for a fixed number χ we have A mn → χ m+n A mn by replacing x → χx and y → χy. We call m + n as the order of A mn . Consequently, there are m + n + 1 number of cumulants with the order m + n.
In order to find the cumulants C m,n in polar coordinates x = r cos ϕ and y = r sin ϕ, we use the second line of the Eq. (4). Considering the Jacobi-Anger identity we are able to write the Eq. (4) as follows, where in the left-hand side of the above we have used the series form of the Bessel function J n (kr).
In Eq. (A3), the combination (ik) 2m+n e −inϕ k is appeared in the left-hand side. It means that if we have odd |n| then the power of ik is odd and if we have even |n| then the power of ik is even. Therefore, the only non-vanishing C m,n are those both m and |n| are odd or even. The other consequence of the combination (ik) 2m+n e −inϕ k is that for |n| > m we have C m,n = 0. The reason is that in the right-hand side of the Eq. (A3), the combination (ik) m e in ϕ k is appeared. Therefore, in order to have a non-vanishing C m,n we need to have some terms in the left-hand side such that 2m + n = m and n = −n . It immediately leads to m = (m + n )/2 ≥ 0 and m + n = (m − n )/2 ≥ 0. One should note that the (m + n)! in the denominator of Eq. (A3) diverges if m + n < 0. As result ,we deduce that C m ,n can be nonvanishing if |n | ≤ m . Strictly speaking, for m ≥ 0 only cumulants C m,−m , C m,−m+2 , · · · C m,m−2 , C m,m (A4) are non-zero. Also, note that we have C m,−n = C * m,n because the distribution is a real function.
Based on the above considerations one can find all nontrivial values of C m,n in terms of moments r m e inϕ by using Eq. (A3). A few first two dimensional cumulants in polar coordinates are presented explicitly in the following The first cumulant, C 0,0 , is equal to zero by considering the normalization condition of the probability distribution. By redefinition W 0,n = C n,0 , W c n,m = (C m,n ), W s n,m = − (C m,n ), we find the cumulants in polar coordinate which has been obtained in Ref. [28]. In Ref. [28], the translational and rotational invariance have been used to eliminate W c 1,1 , W s 1,1 and W s 2,2 . In our notation, it is equivalent to C 1,1 = C 1,−1 = 0 and C 2,2 = C 2,−2 .
One notes that by replacing r with χr we have C m,n → χ m C m,n . In other words, the order of C m,n is indicated by m in polar coordinates. Referring to Eq. (A4), we find that there are m + 1 cumulants with order m in polar coordinates. As a result, we are able to find a one-toone relation between C m,n and A k with same order by equating two sides of the equation In the left-hand side of the above equality, we re-write m,n ((ik x ) m (ik y ) n A mn /(m!n!)) in polar coordinates. A few first cumulants in polar coordinates can be written in terms of A mn as follows, The cumulants C m,−n can be obtained easily by using the reality condition. One can invert the above equations to find A mn in terms of C m,n too, For rotationally symmetric distributions, only the moments r 2m survive in the polar coordinates. In such a case, the only non-zero polar cumulants are C 2k,0 . For instance, all the cumulants in Eq. (A7) vanish except C 2,0 where it is equal to r 2 /2. A few first C 2k,0 are given by C 6,0 = 5 16 r 6 − 9 r 4 r 2 + 12 r 2 3 , In Sec. III A, we have introduced an iterative method to find the Gram-Charlier A series for a one-dimensional distribution. Also, we shortly explained the standard method of finding the series by using the orthogonal polynomials. Here, we use another standard method of finding the Gram-Charlier A series.
Let us consider the Characteristic Function of a given distribution p(x) as G(t) = dx e itx p(x). Then one can find the cumulants of the distribution by using Cumulative Function, log G(t) = n=1 κ n (it) n /n!. (B1) Now assume that p(x) can be approximated by a known distribution, namely a Gaussian distribution, By setting the mean value µ = 0 and the width σ = 1 for simplicity, the generating function of a Gaussian distribution can be simply find as G N (t) = e −t 2 /2 . Referring to Eq. (B1), we have G(t) = e n=1 κn(it) n /n! . Additionally, we consider that the mean value and the variance of the distribution p(x) are κ 1 = µ = 0 and κ 2 = σ 2 = 1. It means that we are able to write G(t) for a generic distribution p(x) as follows, Now, we are able to find p(x) by using inverse Fourier transformation. By considering the relation dt(it) n e −itx e −t 2 /2 = (−d/dx) n dte −itx e −t 2 /2 , we have p(x) = e n=3 κn(−d/dx) n /n! dte −itx e −t 2 /2 /2π = e n=3 κn(−d/dx) n /n! e −x 2 /2 / √ 2π .

(B3)
By scaling x → x/σ and shifting x → x − µ , we find 21 Recall that the probabilistic Hermite polynomial is defined as He r (x) = e x 2 /2 (−d/dx) r e −x 2 /2 . As a result, one can simply find It is worth noting that the Eq. (B4) is exact. We are able to approximate this exact relation by expanding it in terms of number of derivatives. The result of such an expansion is called Gram-Charlier A series, where h n has presented in Eq. (21) (h 1 = 1, h 2 = 0). 21 Note that after scaling x → x/σ, we have (d/dx) → σ(d/dx). Additionally, κr scales to κr/σ r . These two scalings cancel each other and we find Eq. (B4).

Gram-Charlier A Series: Two Dimensions
Consider the following two dimensional Gaussian distribution, which is more general than Eq. (14). Referring to Eq. (3), we can find the characteristic function of Eq. (B6) as follows Now, assume that the mean values x and y ( · · · = dxdy · · · p(r)) are exactly equal to µ x and µ y , respectively, and alternatively set µ x = µ y = 0 by a shift. Then, the characteristic function of p(r) can be written as follows where in the last line A mn are exactly as A mn except Note that we could choose A 20 = σ x and A 02 = σ y similar to what we did in one dimension. But here we do not use this convention for the future purpose. We can find p(r) by an inverse Fourier transformation, where for finding the above we replaced (ik x ) m (ik x ) n by (−1) m+n (d m+n /dx m dy n ) in Eq. (B8). The Gram-Charlier A series can be found by expanding the exponential in Eq. (B10) in terms of number of derivatives. By recovering µ x and µ y by a reverse shift in r, we find In the above, we have used a trivial extension of Eq. (B5) to two dimensions. The coefficients h mn for m + n ≤ 2 in the Eq. (B11) are given as follows: Additionally, we have For m + n = 6 we have, The other h mn can be found accordingly.

Gram-Charlier A Series and Energy Density Expansion
We should say that the Gram-Charlier A series (Eq. (B11)) is exactly what has been used in Ref. [28] to quantify the initial energy density deviation from a rotationally symmetric Gaussian distribution 22 .
Due to the translational invariance, we can freely choose the origin of the coordinates such that {re ±iϕ } = 0. Using this and referring to Eq. (A5) and Eq. (B15), one finds (B16) 22 If we had considered A 20 = σx and A 02 = σy in Eq. (B9), we would have found the energy density distribution around a Gaussian distribution which is not rotationally symmetric.
It is common in the literature to define initial anisotropies as ε 1 ≡ ε 3,1 , ε 2 ≡ ε 2,2 and ε 3 ≡ ε 3,3 . According to Eq. (B16), initial anisotropies are itself cumulants. In fact, the parameter ε n quantifies the global features of the initial energy density. For instance, ε 1 , ε 2 and ε 3 quantify dipole asymmetry, ellipticity and triangularity of the distribution respectively. The symmetry planes Φ 1 ≡ Φ 3,1 , Φ 2 ≡ Φ 2,2 and Φ 3 ≡ Φ 3,3 quantify the orientation of the dipole asymmetry, ellipticity and triangularity with respect to a reference direction. Furthermore, one can relate two dimensional cumulants in Cartesian coordinates, A mn , to those in polar coordinates, C m,n by using Eq. (A8). Using them, we can find Now, we use Eq. (B11) to find a Gram-Charlier A series for the energy density. The result is the following, ρ(r, ϕ) ∝ e − r 2 {r 2 } 1 + 3 n=1 a n (r)ε n cos n(ϕ − Φ n ) , In the above, we assumed that σ x = σ y = {r 2 }/2. This result coincides with that has been presented in Ref. [28]. Cumulants C m,n , m > 3 cannot be written simply in terms of one moment ε m,n , however, it is always possible to find Gram-Charlier A series Eq. (B18) to any order beyond triangularity.
by averaging out the azimuthal direction of Eq. (B11). We should emphasize that this calculation has already been done in Ref. [27] for the case that p(r) is rotationally symmetric around the origin. However, we will try to find such a radial series for a more general case. For simplicity, we assume that µ y ≡ A 01 = 0 (µ x ≡ A 10 ) together with σ ≡ σ x = σ y = A 20 = A 02 . Also, we consider that A 11 = 0. These considerations are compatible with p(v n,x , v n,y ). For such case, h 20 = h 11 = h 02 = 0.
By averaging out the azimuthal direction of Eq. (B11), we will find a distribution which has the following form, p(r) = r σ 2 e − r 2 +µ 2 x 2σ 2 We can find the polynomials P 1,i (r) and P 2,i (r) by direct calculations. For i = 1 and 2, one simply finds that P 1,1 (r) = 1 and P 2,1 (r) = P 1,2 (r) = P 2,2 (r) = 0. The polynomials for i = 3 are given by, Finding P 1,4 (r) and P 2,4 (r) are straightforward, but they are more complicated than Eq. (C2). For that reason, we decided not to present them considering the limited space. Note that at each P a,i (r) only cumulants A mn (if there is any) with order m + n = i are present 23 . In Sec. II A, we have argued that the one-dimensional distributions like p(v n ) can be characterized by c n {2k}. As a result, we expect to relate A mn to c n {2k} by considering Eq. (C1) (replace r with v n and µ x withv n ). Considering Eq. (8) and using Eq. (C1) up to i = 4, we can find c n {2k} for k = 1, 2, 3 as follows, ). (C3c) The above result is not surprising. In fact, it is a general relation between A mn of the distribution p(v n,x , v n,y ) and c n {2k} of p(v n ). Recall the equality e ivnk cos(ϕ−ϕ k ) 2D = J 0 (kv n ) 1D in Eq. (6) where the average · · · 2D is performed with respect to the distributionp(v n,x , v n,y ). For the case that the average is performed with respect to p(v n,x , v n,y ), this relation should be replaced by 1 2π 2π 0 dϕ k e ivnk cos(ϕ−ϕ k ) 2D = J 0 (kv n ) 1D . (C4) Using above, Eq. (4) and Eq. (7), we find log 1 2π  The relation for c n {6} is more complicated. Note that by setting the simplifications we used in finding Eq. (C1), the above relations reduce to Eq. (C3a) and Eq. (C3b). In general, c n {2k} is written in terms of a large number of A pq , therefore, we are not able to write A pq in terms of cumulants c n {2k}. However, it is possible to ignore the cumulants A pq with p + q ≥ 4 at some cases. For n = 2, this truncation leads to a reasonable approximation for c 2 {4} and c 2 {6}, and we are able to find A 30 in terms of c 2 {2}, c 2 {4} and c 2 {6} which has been done in Ref. [26]. In Ref. [27], however, it has been shown that keeping 2D Cartesian cumulants up to forth order does not lead to a reasonable approximation for c 2 {8}. Therefore, we are not able to find forth order A mn in terms of c 2 {2k} simply.
Referring to Eq. (39), we see that the coefficients 2i (as a function of c n {2k} andv n ) appeared as an overall factors at each order of approximation. In contrast, the coefficients h mn in Eq. (C1) do not appear in the polynomial P a,i (r) as an overall factor. Now, let us compute the limit µ x → 0. A straightforward calculation shows that I 0 rµ x σ 2 P 1,3 (r) + rµ x 2σ 2 I 1 rµ x σ 2 P 2,3 (r) → 0.
As a result, the bracket in Eq. (C1) is equal to one up to i = 3. However, the bracket for i = 4 is non-trivial in the limit µ x → 0, I 0 rµ x σ 2 P 1,4 (r) + rµ x 2σ 2 I 1 rµ x σ 2 P 2,4 (r) The last line in the above can be written as This result is interesting because it shows that the distribution (C1) has simpler form only if we assume µ x ≡ A 10 → 0. Let us again consider the flow harmonic distribution (r → v n and µ x →v n ). By setting A 20 = A 02 = σ 2 and A 10 = A 01 = A 11 = 0, we find from (C5b) that Therefore, the explicit form of Eq. (C1) up to i = 4 is given by which is compatible with Eq. (24) and what has been found in Ref. [27]. We can check that for A 11 = 0 the above result is true too 24 .
It is worth noting that we have not assumed the original distribution to be rotationally symmetric. In fact, the original distribution can have non-zero A 30 or A 12 too, but they are not appeared in c n {2k} or the distribution (C7). It is due to the fact that their information is lost during the averaging.
One could ask that while h 12 and h 30 are present in the polynomials P 1,3 (v n ) and P 2,3 (v n ) in Eq. (C2), it maybe possible to find all h mn from Eq. (C1) by fitting it to an experimental p(v n ). We should say that always a combination of h mn appears in the polynomials P a,i (v n ). As a result, if we assume that the series is convergent and also we have access to an accurate experimental distribution, we can only find a combination of h mn due to the information loss during the averaging. Specifically, in the simple Eq. (C7) case, only A 40 + 2A 22 + A 04 can be found by fitting.
In any case, the distribution (C1) is different from that mentioned in Eq. (39). These two different series refer to two different truncations. Note that q n {2k} can be written in terms of A pq with p + q ≤ 2k. For instance, if we keep A mn up to forth order, all the cumulants q n {2k} are non-zero. Therefore, in principal, the summation (39) goes to infinity while the summation (C1) is truncated up to i = 4.
An example of MOPs can be found in Ref. [39] where the polynomials are orthogonal with respect to a weight vector (w ν,c (x), w ν+1,c (x)). Here, is defined for x > 0, ν > −1 and c > 0. In this study, we are interested in these specific MOPs because they are related toQ i (v n ;v n ) in Eq. (40).
For the present specific case, we have r = 2. As a result, the index n has the form (k, k) or (k + 1, k) referring to Eq. (D2). The explicit form of a few first polynomials P n for ν = 0 are given by, (D6) and