Non-Bessel–Gaussianity and flow harmonic fine-splitting

Both collision geometry and event-by-event fluctuations are encoded in the experimentally observed flow harmonic distribution p(vn)\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$p(v_n)$$\end{document} and 2k-particle cumulants cn{2k}\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$c_n\{2k\}$$\end{document}. In the present study, we systematically connect these observables to each other by employing a Gram–Charlier A series. We quantify the deviation of p(vn)\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$p(v_n)$$\end{document} from Bessel–Gaussianity in terms of harmonic fine-splitting of the flow. Subsequently, we show that the corrected Bessel–Gaussian distribution can fit the simulated data better than the Bessel–Gaussian distribution in the more peripheral collisions. Inspired by the Gram–Charlier A series, we introduce a new set of cumulants qn{2k}\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$q_n\{2k\}$$\end{document}, ones that are more natural to use to study near Bessel–Gaussian distributions. These new cumulants are obtained from cn{2k}\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$c_n\{2k\}$$\end{document} where the collision geometry effect is extracted from it. By exploiting q2{2k}\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$q_2\{2k\}$$\end{document}, we introduce a new set of estimators for averaged ellipticity v¯2\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\bar{v}_2$$\end{document}, ones which are more accurate compared to v2{2k}\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$v_2\{2k\}$$\end{document} for k>1\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$k>1$$\end{document}. As another application of q2{2k}\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$q_2\{2k\}$$\end{document}, we show that we are able to restrict the phase space of v2{4}\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$v_2\{4\}$$\end{document}, v2{6}\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$v_2\{6\}$$\end{document} and v2{8}\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$v_2\{8\}$$\end{document} by demanding the consistency of v¯2\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\bar{v}_2$$\end{document} and v2{2k}\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$v_2\{2k\}$$\end{document} with q2{2k}\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$q_2\{2k\}$$\end{document} equation. The allowed phase space is a region such that v2{4}-v2{6}≳0\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$v_2\{4\}-v_2\{6\}\gtrsim 0$$\end{document} and 12v2{6}-11v2{8}-v2{4}≳0\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$12 v_2\{6\}-11v_2\{8\}-v_2\{4\}\gtrsim 0$$\end{document}, which is compatible with the experimental observations.


Introduction
It is a well-established picture that matter produced in a heavy ion experiment shows collective behavior. Based on this picture, the initial energy density manifests itself in the final particle momentum distribution. Accordingly, as the main consequence of this collectivity, the final particle momentum distribution has extensively been studied by different experimental groups in the past years. As a matter of fact, the experimental groups at Relativistic Heavy Ion Collider a e-mail: hadi.mehrabpour@ipm.ir b e-mail: s.f.taghavi@tum.de (RHIC) and Large Hadron Collider (LHC) measure the flow harmonics v n [1][2][3][4][5][6][7][8], the coefficients of the momentum distribution Fourier expansion in the azimuthal direction [9][10][11]. All these observations can be explained by several models based on the collective picture.
Finding the flow harmonics is not straightforward, because the reaction-plane angle (the angle between the orientation of the impact parameter and a reference frame) is not under control experimentally. Additionally, the Fourier coefficients cannot be found reliably due to the low statistic in a single event. These issues enforce us to use an analysis more sophisticated than a Fourier analysis. There are several methods to find the flow harmonics experimentally, namely the event-plane method [12], multiparticle correlation functions [13,14] and Lee-Yang zeros [15,16]. The most recent technique to find the flow harmonics is using the distribution of the flow harmonic p(v n ). This distribution has been obtained experimentally by employing the unfolding technique [17,18].
It is well known that the initial shape of the energy density depends on the geometry of the collision and the quantum fluctuations at the initial state. As a result, the observed flow harmonics fluctuate event-by-event even if we fix the initial geometry of the collision. In fact, the event-by-event fluctuations are encoded in p(v n ) and experimentally observed flow harmonics as well. It is worthwhile to mention that the observed event-by-event fluctuations are a reflection of the initial state fluctuations entangled with the modifications during different stages of the matter evolution, namely collective expansion and hadronization. For that reason, exploring the exact interpretation of the flow harmonics is crucial to understand the contribution of each stage of the evolution on the fluctuations. Moreover, there has not been found a well-established picture for the initial state of the heavy ion collision so far. The interpretations of the observed quantities contain information about the initial state. This information can shed light upon the heavy ion initial state models too.
According to the theoretical studies, the flow harmonics v n {2k} obtained from 2k-particle correlation functions are different, and their difference is due to non-flow effects [13,14] and event-by-event fluctuations [19]. We should point out that experimental observations show that v 2 {2} is considerably larger than v 2 {4}, v 2 {6} and v 2 {8}. Additionally, all the ratios v 2 are different from unity [20,21]. Alternatively, the distribution p(v n ) is approximated by a Bessel-Gaussian distribution (corresponding to a Gaussian distribution for v n fluctuations) as a simple model [10,22]. Based on this model, the difference between v 2 {2} and v 2 {4} is related to the width of the v 2 fluctuations. However, this model cannot explain the difference between the other v 2 {2k}.
In the past years, several interesting studies of the non-Gaussian v n fluctuations have been done [23][24][25][26][27][28]. Specifically, it has been shown in Ref. [27] that the skewness of the v 2 fluctuations is related to the difference v 2 {4} − v 2 {6}. Also, the connection between kurtosis of v 3 fluctuations and the ratio v 3 {4}/v 3 {2} has been studied in Ref. [28]. It is worth noting that the deviation of v n fluctuations from a Gaussian distribution immediately leads to the deviation of p(v n ) from a Bessel-Gaussian distribution. In [29], the quantities v n {4} − v n {6} and v n {6} − v n {8} for a generic narrow distribution are computed.
In the present work, we will introduce a systematic method to connect v n {2k} to the distribution p(v n ). In Sect. 2, we have an overview of the known concepts of cumulants, flow harmonic distributions and their relation with the averaged flow harmonicsv n . Section 3 is dedicated to the Gram-Charlier A series in which we find an approximate flow harmonic distribution in terms of c n {2k}. Specifically for the second harmonics, we show that the deviation of p(v 2 ) from Bessel-Gaussianity is quantified by the fine-splitting v 2 {2k} − v 2 {2 } where k, ≥ 2 and k = . These studies guide us to defining a new set of cumulants q n {2k} where they depend on the event-by-event fluctuations only. In Sect. 4, we use the new cumulants to introduce more accurate estimations for the average ellipticity. As another application of new cumulants, we use q 2 {2k} to constrain the v 2 {4}, v 2 {6} and v 2 {8} phase space in Sect. 5. We show that the phase space is restricted to a domain where v 2 {4} − v 2 {6} 0 1 and 12v 2 {6} − 11v 2 {8} − v 2 {4} 0. We present the conclusion in Sect. 6. The supplementary materials can be found in the appendices. We would like to emphasize that in Appendix C, we found a one-dimensional distribution for p(v n ) which is different from that mentioned in Sect. 3. Additionally, an interesting connection between the expansion of p(v n ) in terms of the cumulants c n {2k} and the relatively new con- 1 In Ref. [27], the constraint v 2 {4} > v 2 {6} is deduced from the initial eccentricity, and the fact that the initial eccentricity is bounded to a unit circle. cept of multiple orthogonal polynomials in mathematics is presented in Appendix D.

Flow harmonic distributions and 2k-particle cumulants
This section is devoted to an overview of already well-known concepts concerning the cumulant and its application to study the collectivity in the heavy ion physics. We present this overview to smoothly move forward to the harmonic distribution of the flow and its deviation from Bessel-Gaussianity.

Correlation functions vs. distribution
According to the collective picture in the heavy ion experiments, the final particle momentum distribution is a consequence of the initial state after a collective evolution. In order to study this picture quantitatively, the initial anisotropies and flow harmonics are used extensively to quantify the initial energy density and final momentum distribution. The initial energy (or entropy) density of a single event can be written in terms of the initial anisotropies, namely the ellipticity and triangularity. Specifically, ellipticity and triangularity (shown by ε 2 and ε 3 , respectively) are cumulants of the distribution indicating how much it is deviated from a two-dimensional rotationally symmetric Gaussian distribution [30].
The final momentum distribution is studied by its Fourier expansion in the azimuthal direction, In the above, v n and ψ n are unknown parameters, which can be found easily viav n ≡ v n e inψ n = e inφ . Here, the averaging is obtained by using the distribution 1 N dN dφ in a given event. The parameterv n is called flow harmonic. Instead of a complex form, we occasionally use the flow harmonics in Cartesian coordinates, v n,x = v n cos nψ n , v n,y = v n sin nψ n .
Low multiplicity in a single event, randomness of the reaction plane angle and non-flow effects are the main challenges in extracting the experimental values of the flow harmonics. By defeating these experimental challenges, one would be able to find the distribution p(v n,x , v n,y ) for each centrality class which is a manifestation of the event-by-event fluctuations. In finding p(v n,x , v n,y ), we rotate the system at each single event such that the reaction-plane angle Φ RP is set to be zero. In this case, the averaged ellipticityv 2 ≡ v 2,x is a manifestation of the geometrical initial ellipticity for events in a given centrality class irrespective of the fluctuations. In general, we are able to define the averaged flow harmonic v n ≡ v n,x too. In odd harmonics, however, this average would be zero for spherical ion collisions with the same sizes.
A practical way to study a distribution function p(r) is using two-dimensional cumulants, where r stands for a generic two-dimensional random variable. Consider G(k) as the characteristic function of the probability distribution p(r). The characteristic function is, in fact, the Fourier transformed p(r), Consequently, we can define the cumulative function as C(k) = log G(k). Two-dimensional cumulants are obtained from where A mn and C m,n are the 2D cumulants in Cartesian and polar coordinates, respectively. The two-dimensional cumulants have already been used to quantify the initial energy density shape by Teaney and Yan in Ref. [30]. However, we use the two-dimensional cumulants to study the harmonic distribution of the flows in the present study 2 . In the second line in the above, we used k = k 2 x + k 2 y , ϕ k = atan2(k x , k y ). In Appendix A, we study 2D cumulants in these two coordinates and their relations in more detail.
The random Φ RP rotates the point (v n,x , v n,y ) with a random phase in the range [0, 2π) at each event. As a result, the distribution p(v n,x , v n,y ) is replaced byp(v n,x , v n,y ), which is a rotationally symmetric distribution. 3 We should mention that the distributionp(v n,x , v n,y ) is experimentally accessible. In Ref. [18], the unfolding technique has been used by the ATLAS collaboration to remove the statistical uncertainty (due to the low statistics at each event) and non-flow effects from a distribution of the "observed" flow harmonics (v obs n,x , v obs n,y ). In this case, the only unknown parameter in finding an accurate p(v n,x , v n,y ) is the reaction-plane angle. Since there is no information in the azimuthal direction of p(v n,x , v n,y ), we can simply average out this direction to find a one-dimensional distribution, 4 2 In Ref. [30], A mn has been shown by W n,ab (n = 1, 2, . . . and a, b ∈ {x, y}). Also C mn has been found by W 0,n , W s 0,n and W c 0,n . 3 In general p(v n,x , v n,y ) is not rotationally symmetric. For instance p(v 2,x , v 2,y ) does not have this symmetry in a non-central collision of spherically symmetric ions, while p(v 3,x , v 3,y ) does, for the same collisions. 4 We simply use the notation ϕ ≡ nψ n .
Note that we can interchangeably usep(v n,x , v n,y ) or p(v n,x , v n,y ) in the above because obviously the effect of the random reaction-plane angle and the azimuthal averaging are the same.
In polar coordinates, we havep(v n,x , v n,y ) ≡p(v n ). As a result, the characteristic function of the distributioñ p(v n,x , v n,y ) in polar coordinates is given by where we used Eq. (5) in the above. Here, J 0 (x) is the Bessel function of the first kind. Also, · · · 2D means averaging with respect top(v n,x , v n,y ), while · · · 1D specifies the averaging with respect to p(v n ). Equation (6) indicates that we can study the radial distribution The cumulants of p(v n ) can be found by expanding the cumulative function log J 0 (kv n ) in terms of ik. The coefficients of ik in this expansion (up to some convenient constants) are the desired cumulants, where c n {2m} are those obtained from 2k-particle correlation functions [13,14], The cumulants c n {2k} ∝ v 2k n {2k} (see Eq. (12)) are indicating the characteristics of the distribution p(v n ) while the cumulants A mn (or equivalently C m,n ) in Eq. (4) are shown the characteristics of p(v n,x , v n,y ). The interconnection between v n {2k} and A mn have been studied previously in the literature. We point out that in order to define p(v n,x , v n,y ), we considered Φ RP = 0 for all events. In this case, the cumulant A 30 = (v n,x − v n,x ) 3 is related to the skewness of p(v n,x , v n,y ). For the case n = 2, this quantity is found for the first time in Ref. [27], and it is argued In other words, A 30 is related to the fine-splitting between v 2 {4} and v 2 {6}. In Ref. [28], the kurtosis of p(v 3,x , v 3,y ) in the radial direction has also been calculated, and it is shown it is proportional to v 4 3 {4} 6 . One may wonder whether the distribution p(v n ) contains more information than c n {2k} because the odd moments of p(v n ) are absent in the definitions of c n {2k} [29]. However, it is worth mentioning that the moments can be found by expanding the characteristic function G(k) in terms of ik, for the radial distributions like p(v n ). Since the above series is convergent, 7 we can find the characteristic function G(k) by finding the moments v 2k n . Having the characteristic function in hand, we immediately find p(v n ) by inversing the last line in Eq.
It means that by assuming the convergence of the series in Eq. (9) the distribution p(v n ) can be found completely by using only even moments. Equivalently, we can use the following argument: for the distributionp(v n,x , v n,y ), one simply finds the only nonvanishing moments are v 2k n x v 2 n y . It means that in the polar coordinates only v 2(k+ ) n are present. Additionally, by finding the two-dimensional cumulants ofp(v n,x , v n,y ) in polar coordinates C m,n (Eq. (4)), we find that the only non-zero cumulants are C 2k,0 ∝ c n {2k} (see Appendix A).
As a result, in the presence of random reaction-plane angle, c n {2k}'s are all we can learn from the original p(v n,x , v n,y ), whether we use 2k-particle correlation functions or obtain it from the unfolded distribution p(v n ) in principle. However, we should note that the efficiency of the two methods in removing single event statistical uncertainty and non-flow effects could be different, which leads to different results in practice.
Furthermore, the whole information about the fluctuations is not encoded in p(v n,x , v n,y ). In fact, the most general form of the fluctuations are encoded in a distribution as p(v 1,x , v 1,y , v 2,x , v 2,y , . . .). It is worth mentioning that the symmetric cumulants, which have been 6 The coefficients of the proportionality in both skewness and radial kurtosis are also functions of v n {2k}. We should note that the skewness [27,28] for the details). 7 It is an important question whether is it possible to determine p(v n ) uniquely from its moments [32] (see also Ref. [33])? Answering to this question is beyond the scope of the present work. Here we assume that p(v n ) is M-determinate which means we can find it from its moments v 2q n in principle. 8 We use the orthogonality relation ∞ 0 k J α (kr)J α (kr ) dk = δ(r − r )/r . introduced in Ref. [33] and have been measured by the ALICE collaboration [34], are non-vanishing. Additionally, the event-plane correlations (which are related to the moments v q m (v * n ) q m/n ) have been obtained by the ATLAS collaboration [35,36]. They are non-zero too. These measurements indicate that p(v 1,x , v 1,y , v 2,x , v 2,y , . . .) cannot be written as n p(v n,x , v n,y ). In the present work, however, we do not focus on the joint distribution and leave this topic for studies in the future. Let us point out that moving forward to find a generic form for the moments of the harmonic distribution of the flow was already done in Ref. [37].

Approximated averaged ellipticity
A question arises now: how much information is encoded in p(v n ) from the original p(v n,x , v n,y )? In order to answer this question, we first focus on n = 3. Unless there is no net triangularity for spherical ion collisions, the non-zero triangularity at each event comes from the fluctuations. Hence, we havev 3 = 0 for such an experiment. In this case, the eventby-event randomness of Φ RP is similar to the rotation of the triangular symmetry plane due to the event-by-event fluctuations. It means that p(v 3,x , v 3,y ) itself is rotationally symmetric, and the main features of p(v 3,x , v 3,y ) andp(v 3,x , v 3,y ) are the same. As a consequence, p(v 3 ) or equivalently c 3 {2k} can uniquely reproduce the main features of p(v 3,x , v 3,y ).
However, it is not the case for n = 2 due to the nonzero averaged ellipticityv 2 . The distribution p(v 2,x , v 2,y ) is not rotationally symmetric and reshuffling (v 2,x , v 2,y ) leads to information loss from the original p(v 2,x , v 2,y ). Therefore, there is at least some information in p(v 2,x , v 2,y ) that we cannot obtain from p(v 2 ) or c 2 {2k}. Nevertheless, it is still possible to find some features of p(v 2,x , v 2,y ) approximately. For instance, we mentioned earlier in this section that the skewness of this distribution in the v 2,x direction is The other important feature of p(v 2,x , v 2,y ) isv 2 , for which it is not obvious how to find it from c 2 {2k}. In fact, we are able to approximately findv 2 in terms of c 2 {2k} by approximating p(v 2,x , v 2,y ). The most trivial approximation is a two-dimensional Dirac delta function located at (v 2 , 0), This corresponds to the case that there are no fluctuations, and the only source for ellipticity is coming from an ideally elliptic initial geometry. Considering Eq. (5), the moments v 2q 2 can easily be obtained as follows: Now by using Eq. (8), we find c 2 {2} =v 2 2 , c 2 {4} = −v 4 2 , c 2 {6} = 4v 6 2 , etc. It is common in the literature to show the averaged ellipticityv 2 , which is approximated by c 2 {2k} as v 2 {2k}. Note that the quantity v 2 {2k} is defined for the case that the flow harmonic distribution is considered to be a delta function. Furthermore, we can assume the ideal case: that for any harmonics the distribution p(v n,x , v n,x ) has a sharp and clean peak aroundv n . By the assumption mentioned, we have [14] v Nevertheless, we know thatv n can be non-zero for odd n when the collided ions are not spherical or have different sizes. 9 The delta function approximation for p(v 2,x , v 2,y ) is not compatible with the experimental observation. In this case, we have  [20,21]. We can improve the previous approximation by replacing the delta function with a Gaussian distribution. In this case, we model the fluctuations by the width of the Gaussian distribution. Let us assume that p(v 2,x , v 2,y ) Using the above and Eq. (5), one can simply find is the well-known Bessel Gaussian distribution [10,22], Here, I 0 (x) is the modified Bessel function of the first kind. Now, we are able to find the moments v 2q 2 by using this approximated p(v 2 ). According to the relations in Eq. (8), 9 The assumptions which have been used in Ref. [14] to find Eq. (12) are exactly equivalent to considering p(v n,x , v n,y ) as a delta function. 10 We consider the reasonable assumption that the widths of the Gaussian distribution in the v 2,x and v 2,y directions are the same. we find where we used the notation v 2 {2k} introduced in Eq. (12). This result explains the large difference between v 2 {2} and v 2 {2k} for k > 1. In fact, the presence of fluctuations is responsible for it. This description for the difference between v 2 {2} and other flow harmonics was argued first in Ref. [22]. It is found that the splitting between v 2 {2} and other flow harmonics contains information from the two-dimensional distribution [22].
The above two examples lead to the following remarks: -In order to relatev n to c n {2k}, one needs to estimate the shape of p(v n ) wherev n is implemented in this estimation explicitly. We show this estimated distribution by p(v n ;v n ). We should say that the first remark is very strong and we can estimatev n by a weaker condition. Obviously, if we estimate only one moment or cumulant of p(v n ) as a function ofv n , in principle, we can estimatev n by comparing the estimated moment or cumulant with the experimental data. But the question is how to introduce such a reasonable estimation practically. In the following sections, we introduce a method to estimatev n from a minimum information of p(v n ). One notes thatv 2 = v 2 {4} = · · · is true only if we approximate p(v 2 ) by a Bessel-Gaussian distribution. In the next section, we find an approximated distribution around Bessel-Gaussianity.

Radial-Gram-Charlier distribution and new cumulants
In Sect. 2.2, we argued that the quantityv n , which is truly related to the geometric features of the collision, can be obtained by estimating a function for p(v n,x , v n,y ). We observed that the Dirac delta function choice for p(v n,x , v n,y ) leads tov n = v n {2k} for k > 0, while by assuming the distribution to be a 2D Gaussian located at (v n , 0) (the Bessel-Gaussian in one dimension) we findv n = v n {2k} for k > 1.
One notes that in modeling the harmonic distribution of the flow by a delta function or Gaussian distribution, the parameterv n is an unfixed parameter, which is eventually related to the v n {2k}. In any case, the experimental observation indi-cates that the v n {2k} are split; therefore, the above two models are not accurate enough. Instead of modeling p(v n,x , v n,y ), we will try to model p(v n ) with an unfixed parameterv n , namely p(v n ;v n ). In this section, we introduce a series for this distribution such that the leading term in this expansion is the Bessel-Gaussian distribution. The expansion coefficients would be a new set of cumulants that specifies the deviation of the distribution from Bessel-Gaussianity. In fact, by using these cumulants, we would be able to model p(v n ;v n ) more systematically.
It is well known that a given distribution can be approximated by a Gram-Charlier A series which approximates the distribution in terms of its cumulants (see Appendix B). Here we use this concept to find an approximation for p(v n ;v n ) in terms of the cumulants c n {2k}. One of the formal methods of finding the Gram-Charlier A series is using orthogonal polynomials. In addition to this well-known method, we will introduce an alternative method, which is more practical for finding the series of a one-dimensional p(v n ;v n ) around the Bessel-Gaussian distribution.

Gram-Charlier A series: 1D distribution with support R
Before finding the approximated distribution around the Bessel-Gaussian, let us have practice with the alternative method of finding Gram-Charlier A series by applying it to a one-dimensional distribution p(x) with support (−∞, ∞). 11 This method will be used in the next section to find the radial-Gram-Charlier distribution for arbitrary harmonics.
The characteristic function for a one-dimensional distribution is e ikx , and the cumulants κ n of such a distribution are found from log e ikx = n=1 (ik) n κ n /n!. The first few cumulants are presented in the following: where the averages are performed with respect to p(x) in the right-hand side. Now, consider an approximation for the original distribution where its cumulants are coincident with the original p(x) only for a few first cumulants. We show this approximated distribution by p q (x) where the cumulants κ n for 1 ≤ n ≤ q are the same as the cumulants of the original p(x). 11 A standard method for finding the Gram-Charlier A series of p(x) is reviewed in Appendix B.1.
Assume the following ansatz for this approximated distribution: where In the above, a i,k (except a 0,0 ) are unknown coefficients. Note that p 0 (x) is nothing but a Gaussian distribution located at the origin. One can find the unknown coefficients a i,k by using the equations in (17) together with the normalization condition iteratively. In what follows we show how it works: let us present the moments obtained from p q (x) as x m q . Also, assume that the first moment (which is the first cumulant too) is zero. At the end, we recover the first moment by applying a simple shift. Now for the first iteration (q = 1) we have 1 1 = 1 + a 1,0 = 1 from the normalization condition and x 1 = σ 2 a 1,1 = κ 1 = 0 from Eq. (17a). It is a linear two-dimensional system of equations and the solution is a 1,0 = a 1,1 = 0. In the next step (q = 2), we have three equations [one normalization condition and the two equations (17a) and (17b)]. By considering κ 2 = σ 2 , we find a 2,0 = a 2,1 = a 2,2 = 0. However, the third iteration is non-trivial. The equations are The above equations can be solved easily, a 3,0 = a 3,2 = 0 and a 3, We are able to continue the iterative calculations to any order and find In the above, h 0 = 1 and h 1 = 0 together with where γ n are the standardized cumulants defined as Note that in Eq. (20), we arbitrarily shifted the distribution to the case that the first moment of p(x) is κ 1 . In addition, we assumed that the width of the Gaussian distribution is exactly equal to κ 2 . Equation (20) is the well-know Gram-Charlier A series for the distribution p(x). One could consider Eq. (20) as an expansion in terms of Hermite polynomials. Using the fact that He n (x) are orthogonal with respect to the weight we find the coefficients h n in Eq. (21) (see Red. [38]). To this end, we change the coordinate as x → (x − κ 1 )/σ . As a result, we have By using the series form of the Hermite polynomial, we find h n as a function of p(x) moments. Rewriting moments in terms of cumulants [reverting the equations in (17)], one finds Eq. (20).

Radial-Gram-Charlier distribution
Using standard methods, we can extend the one-dimensional Gram-Charlier A series (20) to two dimensions (see Appendix B.2), where h mn are written in terms of two-dimensional cumulants A mn [see Eqs. (72)-(74)], and N (r) is a two-dimensional Gaussian distribution similar to Eq. (14) located at (μ x , μ y ) and σ x = σ y . It is worth noting that the concept of a 2D Gram-Charlier A series has been employed in heavy ion physics first in Ref. [30] by Teaney and Yan. They used this series to study the energy density of a single event. 12 However, we use this to study the harmonic distribution of the flow in the present work. Now let us consider a two-dimensional Gram-Charlier A series for p(v n,x , v n,y ). By this consideration, one can find a corresponding series for p(v n ) by averaging out the azimuthal direction. We should say that the results of this averaging for the second and third harmonics are different.
For n = 3, the distribution p(v 3,x , v 3,y ) is rotationally symmetric and, as we already remarked in the previous section, the whole information of the distribution is encoded in c 3 {2k}. As a result, we are able to rewrite the 2D cumulants A mn in terms of c 3 {2k}. It has been done in Ref. [28], and an expansion for p(v 3 ) has been found. On the other hand, for n = 2 the whole information of p(v 2,x , v 2,y ) is not in c 2 {2k}. Therefore, we are not able to rewrite all A mn in terms of c 2 {2k} after averaging out the azimuthal direction of a 2D Gram-Charlier A series.
For completeness, we study the azimuthal averaging of Eq. (23) in the most general case in Appendix C. In this appendix, we show that the distribution in Ref. [28] is reproduced only by assuming A 10 = A 01 = 0. Also, we discuss the information we find from the averaged distribution compared to the two-dimensional one. However, the method which we will follow in this section is different from that pointed out in Appendix C. Consequently, the most general series we will find here is not coincident with the distribution obtained in Appendix C.
Before finding a Gram-Charlier A series for arbitrary harmonic, let us find the series for odd harmonics (mentioned in Ref. [28]) by employing orthogonal polynomials. The result will be used to find the series for the most general case later. Since we havev 3 = 0 for n = 3, the Bessel-Gaussian distribution reduces to a radial Gaussian . Moreover, the Laguerre polynomials L n (x) are orthogonal with respect to the weight By changing the coordinate as 3 /(2σ 2 ) dv 3 ; the radial Gaussian distribution appears as the weight for orthogonality of L n (v 2 3 /(2σ 2 )). Hence, we can write a general distribution where the coefficients odd 2n can be found by 13 Considering the series form of the Laguerre polynomial, 13 In Eq. (24), we chose the coefficient expansion as n! for convenience.
we immediately find odd n in terms of moments v where in the above we defined the standardized cumulants Γ odd 2k as similar to Eq. (22). The expansion (24) together with Eq. (27) is exactly the series found in Ref. [28] which is true for any odd n. This approximated distribution is called the radial-Gram-Charlier (RGC) distribution in Ref. [28].
It should be noted that it is a series for the case thatv n = 0. In the following, we will try to find a similar series for p(v n ;v n ) wherev n could be non-vanishing.
In order to find the Gram-Charlier A series for the distribution p(v n , ;v n ) in the general case, we come back to the iterative method explained in Sect. 3.1 where we found the distribution (20) by considering the ansatz (18) and iteratively solving the equations in (17).
Here, we assume that p q (v n ;v n ) is an approximation of p(v n ) where only the cumulants c n {2k} with 1 ≤ k ≤ q are the same as the original distribution. Now suppose an ansatz that has the following properties: -Its leading order corresponds to the Bessel-Gaussian distribution. -In the limitv n → 0, the distribution approaches (24).
Using such an ansatz, we calculate the moments v 2k n with some unknown parameters and find them by solving the equations in (8) iteratively.
We introduce the following form for the ansatz: where 14 Refer to the footnote 15 and set A 10 = 0.
One simply finds that by choosing a 0,0 = 1 the distribution By comparing the above expansion with Eq. (26), we realize 2σ 2 )). In order to reproduce Eq. (24), we choose where 2i are unknown coefficients. Similar to Sect. 3.1, we indicate the moments of p q (v n ;v n ) by v 2m n q . For the first iteration, we have the normalization condition 1 0 = 0 = 1. For the second iteration, the normalization condition is trivially satisfied, For the next iteration, we find that the normalization condition and Eq. (8a) are automatically satisfied, Using the above equation, we immediately find 4 . In a similar way, we are able to continue this iterative calculation and find 2q from the only non-trivial algebraic equation at each step. A summary of the few first results is as follows: 0 = 1 and 2 = 0; moreover, One may wonder if, similar to the two previous cases, we are able to write all the coefficients 2k in terms of some stan- 15 Obviously, there is no one-to-one correspondence between c n {2k} and A mn due to the loss of information by averaging. Specifically, one find c n {2} = A 2 10 +A 2 01 +A 20 +A 02 (see Eq. (84a)). Note that by assuming Φ RP = 0 we have A 01 = v n,y = 0 and A 10 = v n,x =v n . Also, it is a reasonable assumption that A 20 A 02 [see Refs. [27,28]]. By choosing σ = σ x = σ y = A 20 A 02 , one approximates p(v n,x , v n,y ) around a symmetric Gaussian distribution located at (v n , 0) where its width is exactly similar to the distribution p(v n,x , v n,y ). In this case, we find c n {2} =v 2 n + 2σ 2 . dardized cumulants. These standardized cumulants should smoothly approach the contents of Eq. (28). In fact, it motivates us to define a new set of cumulants, n . Now, we can define the standardized form of this new set of cumulants as follows: Using the above definitions, we can rewrite the coefficients 2k in the following form: which are in agreement with the equations in (27) in the limit v n → 0.
Let us to summarize the series in (29) as follows: Recall that the two distributions (20) and (24) could be found by using the orthogonality of He n (x) and L n (x). We can ask if there is any similar approach to finding Eq. (39). Surprisingly,Q i (v n ;v n ) is related to a generalized class of orthogonal polynomials which are called multiple orthogonal polynomials (see Ref. [39]). These generalized versions of the polynomials are orthogonal with respect to more than one weight. Specifically, the polynomials related toQ i (v n ;v n ) have been introduced in Ref. [40]. In order to avoid relatively formal mathematical material here, we refer the interested reader to Appendix D where we briefly review the multiple orthogonal polynomials and re-derive Eq. (39) by employing them.
An important point as regards the distribution (39) is the convergence of its summation. For sure, finding the convergence condition of the infinite sum in Eq. (39) is beyond the scope of the present paper. However, if we find that at least a few first terms in Eq. (39) are sufficient to give a reasonable approximation of p(v n ), then there is no concern about the convergence or divergence of this series practically. 16 In order to show in how far the distribution (39) is a good approximation, we need to have a sample for p(v n ) where itsv n is known. To this end, we generate heavy ion collision events by employing a hydrodynamic based event generator which is called iEBE-VISHNU [41]. The reaction-plane angle is set to zero in this event generator. Thus, we can simply find p(v n,x v n,y ) and subsequentlyv n . The events are divided into 16 centrality classes between 0 to 80 percent and at each centrality class we generate 14000 events. The initial condition model is set to be MC-Glauber. Let us recover the notation in Eq. (29) and assume that p q (v n ;v n ) is the distribution (39) where the summation is done up to i = q. We first compute the c 2 {2k} andv 2 from iEBE-VISHNU output and plug the results in Eq. (39). After that we can compare the original simulated distribution p(v n ) with the estimated p q (v 2 ;v 2 ). The results are presented in Fig. 1 for the events in 65-70%, 70-75% and 75-80% centrality classes, in which we expect the distribution is deviated from the Bessel-Gaussian. In this figure, the black curve corresponds to the Bessel-Gaussian distribution ( p 0 (v n ;v n )) and the red, green and blue curves correspond to p q (v 2 ;v 2 ) with q = 2, 3 and 4, respectively. Recall that q = 1 has no contribution because 2 vanishes. As can be seen in the figure, the black curve shows that the distribution is deviated from Bessel-Gaussian and the distribution p q (v 2 ;v 2 ) with q = 0 explains the generated data more accurately.
In order to compare the estimated distributions more quantitatively, we plotted χ 2 /NDF, comparing the estimated distribution p q (v 2 ;v 2 ) with iEBE-VISHNU output. We plotted the results in Fig. 2 for q = 0, 2, 3, 4, 5, 6 and 7 for the events in the 65-70%, 70-75% and 75-80% centrality classes. The value of χ 2 /NDF associated with the Bessel-Gaussian distribution is much greater than the others. Therefore, we multiplied its value by 0.878 to increase the readability of the figure.
As Fig. 2 demonstrates, the Bessel-Gaussian distribution has less compatibility with the distribution. In addition, the 16 This is an argument presented in Ref. [45]. In the same reference the convergence condition of Eq.    Fig. 1 Comparing the harmonic distribution of the flow from iEBE-VISHNU (shaded region) and the Gram-Charlier A series different approximations quantity χ 2 /NDF becomes closer to 1 by increasing q. 17 It is relatively close to 1 for higher values of q. One may deduce from the figure that the series converges because χ 2 /NDF for q = 6 and q = 7 are very close to each other and very close to 1. However, we should say that although it is an strong evidence for the series convergence, there is no guaranty that  The above remarks indicate that q n {2k} contains information originating from the fluctuations only and the explicit effect of the collision geometryv n is extracted from it. 18 It is important to note that although we have found q n {2k} by RGC distribution inspiration, we think it is completely independent of that and there must be a more direct way to find q n {2k} independent of the RGC distribution.

75-80% Centrality
Concerning the difference between c n {2k} and q n {2k} in terms of the Gram-Charlier expansion, we should mention that the cumulants c n {2k} appear as the coefficients of the It is because we are approximating a distribution which is more Bessel-Gaussian rather than radial-Gaussian. On the other hand for the third harmonics, we expect that the underling distribution is more radial-Gaussian, and practically we see a larger difference between v 3 {2} and v 3 {4} compared to the second harmonics [8]. Based on the above arguments, we deduce that the q n {2k} are a more natural choice for the case thatv n is non-vanishing. Nevertheless, the cumulants q n {2k} (unlike c n {2k}) are not experimentally observable because of the presence ofv n in their definition. However, they are useful to systematically estimate the distribution p(v n ) and consequently estimate the parameterv n . This will be the topic of the next section.

Averaged ellipticity and harmonic fine-splitting of the flow
In this section, we would like to exploit the cumulants q n {2k} to find an estimation forv n . Note that if we had prior knowledge about one of the q 2 {2k} or even any function of them Any given distribution can be quantified by q n {2k}. While p(v n ) is approximately Bessel-Gaussian, we can guess that In fact, this is confirmed by the simulation. The cumulants q 2 {2k} are obtained from the iEBE-VISHNU output and presented in Fig. 3. Therefore, as already remarked, we expect that the few first cumulants q n {2k} are enough to quantify the main features of a distribution near Bessel-Gaussian. Let us concentrate on n = 2 from now on. Recall that q 2 {4} = q 2 {6} = · · · = 0 corresponds to a Bessel-Gaussian distribution. This choice of cumulants is equivalent to a distribution with v 2 {4} = v 2 {6} = · · · , which is not compatible with the splitting of v 2 {2k} observed in the experiment. As we discussed at the beginning of this chapter, we findv 2 by estimating any function of cumulants q 2 {2k}. Here we use the most simple guess for this function, which is g(q 2 {2}, q 2 {4}, . . .) = q 2 {2k}. Therefore, the equation q 2 {2k} = 0 for each k ≥ 1 corresponds to a specific estimation for p(v 2 ).
The first non-trivial choice is q 2 {4} = 0. Referring to Eq. (36b), we find where in the abovev 2 {4} refers tov 2 , which is estimated from q 2 {4} = 0. This is exactly the assumption that has been made in Ref. [27] to find the skewness experimentally. By estimatingv 2 , we find the other q 2 {2k}. We present a few first cumulants in the following: where we used the notation for the fine-splitting between different v n {2k}'s. Furthermore, in Eq. (42) we expanded q 2 {2k} in terms of finesplitting Δ 2 {2k, 2 }. Note that the above q 2 {2k} are characterizing an estimated distribution p(v 2 ;v 2 {4}). For such an estimated distribution, q 2 {6} is proportional to the skewness introduced in Ref. [27]. Interestingly, q 2 {8} is proportional to Δ 2 {4, 6}−11Δ 2 {6, 8}, which has been considered to be zero in Ref. [27]. However, here we see that this combination can be non-zero and its value is related to the cumulant q 2 {8}. In fact, the same quantity can be computed for a generic narrow distribution [29]. In turns out that this quantity can be non-vanishing in the small fluctuation limit.
The equations in (42)    In order to improve the estimation ofv 2 , we set q 2 {6} = 0 in Eq. (36c). This equation has six roots, where only two of them are real and positive. In addition, as can be seen from Fig. 4, the true value ofv 2 is always smaller than v 2 {4} (=v 2 {4}) in higher centralities. In fact, it is true for all v 2 {2k} for k > 2. Based on this observation, we demand the root to be smaller than v 2 {4}. In fact, we have checked that the equation q 2 {2k} = 0 for k = 3, 4, 5 has only one root which is real, positive and smaller than v 2 {2k} for k = 2, 3, 4, 5.
In Fig. 4,v 2 {6} is plotted by a red curve which is obtained by solving q 2 {6} = 0 numerically. As can be seen, it is clorse to the real value ofv 2 rather than v 2 {4}. In fact, we are able to find this root analytically too, which is compatible with the red curve in Fig. 4 with good accuracy. Using the estimator (44), we find the other q 2 {2k} as follows: By comparing Eqs. (42) and (45), we note that q 2 {2k} (except We can go further and estimatev 2 by solving the equation q 2 {8} = 0. The result is plotted by a blue curve in Fig. 4. Also, its analytical value can be found as follows: The quantity δ 2k for k = 2, 3, 4 and 5 with respect to centrality class As Fig. 4 indicates, comparingv 2 {6}, the estimatorv 2 {8} is a worse estimation ofv 2 (except between the range 30% to 50% centralities). We might expect that because q 2 {8} q 2 {6} (see Fig. 3), the quantityv 2 {8} should be a better approximation thanv 2 {6}. But this argument is not true. In fact, the cumulants q 2 {2k} for the true distribution are small, but they are non-zero at any centralities. Let us rewrite Eqs. (36b)-(36d) as follows: where The estimatorv 2 {2k} (k = 2, 3, 4) can be found by solving Eqs. (47)a,b,c) where we set δ 2k = 1. Alternatively, by employing the actual value of δ 2k from the simulation we findv True 2 . In fact, the difference betweenv True 2 andv 2 {2k} is a manifestation of the inaccuracy in the setting δ 2k = 1. In other words, demanding q 2 {2k} = 0 is not exactly correct. Looking at the problem from this angle, by referring to Fig. 4, we realize that δ 4 = 1 is the most inaccurate approximation. Also, δ 6 = 1 is more accurate than δ 8 = 1.
By using iEBE-VISHNU generated data, we can check the accuracy of the δ 2k = 1 estimation by comparing different values of δ 2k (for k = 2, 3, 4, 5) calculated from the simulation. The result is plotted in Fig. 5. This figure confirms the difference between the estimatorsv 2 {2k} discussed above. The quantity δ 4 has the greatest deviation from unity. Also, we see that δ 8 deviates from unity for centralities above 55% while δ 6 (and δ 10 ) is closer to 1 up to 65% centrality. Moreover, δ 6 (δ 10 ) is larger than δ 8 for all centralities. This can be considered as a reason for the fact thatv 2 {8} is less accurate thanv 2 {6} (andv 2 {10}).  Fig. 6 The averaged flow harmonic estimatorv 2 {2k} obtained from the ATLAS experimental data [18] Furthermore, let us mention that the cumulant q 2 {6} changes its sign (see Figs. 3 and 5) for the centralities around 60-65%. It means it is exactly equal to 0 at a specific point in this range, and we expect thatv 2 {6} becomes exactly equal tov True 2 at this point. This can be seen also in Fig. 4 where the red curve (v 2 {6}) crosses the brown curve (v True 2 ). The situation for q 2 {10} is very similar to q 2 {6}. As a result, it is not surprising that we find the estimatorv 2 {10} to be similar tov 2 {6}. We have foundv 2 {10} by solving q 2 {10} = 0 numerically. The result is plotted by a black curve in Fig. 4. As can be seen, the results ofv 2 {6} and v 2 {10} are approximately similar. Now, we are in a position to estimate thev 2 of the real data by usingv 2 {2k}. According to the above discussions, we expect thatv 2 {6},v 2 {8} andv 2 {10} are closer to the real value of the averaged ellipticityv 2 compared to v 2 {4} (or any other v 2 {2k} for k > 2). The result is plotted in Fig. 6. In finding the estimatedv 2 , we employed v 2 {2k} reported by the ATLAS collaboration in Ref. [18]. The value ofv 2 {4} is exactly equal to v 2 {4}, which is plotted by the green curve in the figure. By plugging experimental values of v 2 {2k} into Eqs. (36c), (36d) and (36e) and setting them to 0, we have numerically foundv 2 {6} (red curve),v 2 {8} (blue curve) and v 2 {10} (black curve), respectively. 19 The errors ofv 2 {10} are too large for the present experimental data, and a more precise observation is needed to find a more accurate estimation. Exactly similar to the iEBE-VISHNU simulation, the value 19 In detail, all the Eqs. (36c)-(36e) were written in terms of the moments v 2k 2 . Considering the reported experimental distribution p(v 2 ) in Ref. [18], we are able to produce the covariance matrix associated with statistical fluctuations of the moments v 2k 2 . Using the covariance matrix, we generated 10,000 random numbers by using a multidimensional Gaussian distribution. Employing each random number, we solved Eqs. (36c)-(36e) numerically and found the estimatedv 2 . We obtained the standard deviation of the finalv 2 distribution as the statistical error of thev 2 {2k}. ofv 2 {8} is between v 2 {4} andv 2 {6}. 20 Therefore, we expect the true value of the averaged ellipticity to be close to the value ofv 2 {6}. 21 In this section, we introduced a method to estimate p(v 2 ,v 2 ). By considering the cumulants c 2 {2k} of the true distribution p(v 2 ), we estimatedv 2 by assuming that the cumulant q 2 {2k} of p(v 2 ,v 2 ) is zero for a specific value of k. These estimations for q 2 {6} = 0 and q 2 {8} = 0 are presented analytically in Eqs. (44) and (46) and also with red and blue curves in Fig. 4 numerically. We exploited a hydrodynamic simulation to investigate the accuracy of our estimations. We found thatv 2 {6} is more accurate thanv 2 {8}. Finally, we found the experimental values forv 2 {6},v 2 {8} andv 2 {10}.
Until now, we considered the cumulants v n {2k} as an input to find an estimation forv n . In the next section, we try to restrict the phase space of the allowed region of v n {2k} by using cumulants q n {2k}.

Constraints on the flow harmonics phase space
Referring to Eqs. (44) and (46), we see that these estimators lead to real values forv 2 In this section, we would like to investigate these constraints and their validity range.
We first consider Eq. (47b). The quantities v 2 {2k} andv 2 are real valued. Therefore, it is a well-defined and simple question what the allowed values are of v 2 {4} and v 2 {6} such that Eq. (47b) has at least one real root. The polynomial in the left-hand side of Eq. (47b) goes to positive infinity for v 2 → ±∞. As a result, it has at least one real root if the polynomial is negative in at least one of its minima. This condition is satisfied for Since δ 6 is unknown, there is no bound on v 2 {4} and v 2 {6}. Although we do not know the exact value of δ 6 , we know that 0.9 δ 1/6 6 1 based on our simulation (see Fig. 5). In this case, if we take into account v 2 {6} ≤ v 2 {4}, we immediately deduce the inequality in Eq. (49). On the other hand, we 20 We have computed the quantitiesε 2 {4},ε 2 {6} andε 2 {8} for an elliptic-power distribution [25] which is a simple analytical model for the initial state distribution. We have observed exactly the same hierarchy, and we have seen thatε True is closer toε 2 {6}. This is evidence that this behavior is generic for the distributions of interest in heavy ion physics. 21 Comparing Fig. 4 with Fig. 6, one finds that the values ofv 2 {2k} from simulation are relatively smaller than that obtained from the real data. This deviation is due to the difference in p T range. In Fig. 6, we used the data from Ref. [18] where p T > 0.5 GeV, while the output of the iEBE-VISHNU is in the range p T 4 GeV. For a confirmation of iEBE-VISHNU output, we refer the reader to Ref. [20,21], where v 2 {4} is reported for p T below 3 GeV. The order of magnitude ofv 2 {2k} in our simulation is compatible with that mentioned in Refs. [20,21]. may observe that v 2 {6} is slightly greater than v 2 {4}. This observation means that δ 6 is definitely smaller than unity. Here, we show both cases by an approximate inequality as v 2 {4} v 2 {6} due to the smallness of δ 1/6 6 . Alternatively, it is well known that the initial eccentricity point (ε 2,x , ε 2,y ) is bounded into a unit circle [25], and it leads to a negative skewness for p(ε 2,x , ε 2,y ) in non-central collisions. By considering the hydrodynamic linear response, the skewness in p(ε 2,x , ε 2,y ) is translated into the skewness of p(v 2,x , v 2,y ) and the condition v 2 {4} > v 2 {6} [27]. However, it is possible that the non-linearity of the hydrodynamic response changes the order in the inequality to the case that v 2 {6} is slightly greater than v 2 {4}. This is compatible with the result which we have found from a more general consideration. Now, we concentrate on Eq. (47c). Due to the complications in finding the analytical allowed values of v 2 {2k}, we investigate it numerically. First, we consider the case that δ 6 = δ 8 = 1. In this case, we fix a value for v 2 {4} and after that randomly generate v 2 {6} and v 2 {8} between 0 to 0.15. Putting the above generated and fixed values into Eq. (47c), we findv 2 numerically. If the equation has at least one real solution, we accept (v 2 {6}, v 2 {8}), otherwise we reject it. The result is presented as scatter plots in Fig. 7. As can be seen from the figure, some region of the v 2 {2k} phase space is not allowed. The condition 11Δ 2 {6, 8} ≥ Δ 2 {4, 6} (see the square root in (46)) indicates that the border of this allowed region can be identified with v 2 {8} = (12v 2 {6} − v 2 {4})/11 up to order Δ 2 {2k, 2 }. This is shown by a red line in Fig. 7. Alternatively, the numerically generated border of the allowed region slightly deviates from the analytical border line. It happens for the region that v 2 {4} is considerably different from v 2 {6} and v 2 {8}. The reason is that Δ 2 {2k, 2 } is not small in this region, and the condition 11Δ 2 {6, 8} ≥ Δ 2 {4, 6} is not accurate anymore.
Let us combine the constraint obtained from Eqs. (47b) and (47c). For a more realistic study, we use the ATLAS data for v 2 {4} as an input. Instead of using a fixed value for v 2 {4}, we generate it randomly with a Gaussian distribution where it is centered around the central value of v 2 {4}, and we have a width equal to the error of v 2 {4}. The result for 40-45% centralities is presented in Fig. 8a. For this case, we expect that the Bessel-Gaussian distribution works well. As a result, we assume δ 6 δ 8 1 (see Fig. 5). From the ATLAS results For more peripheral collisions, we expect a non-zero value for δ 2k . In the 65-70% centrality class (the most peripheral class of events reported by ATLAS in Ref. [18]), we have v 2 {4} 0.093 ± 0.002. According to our simulation in this centrality class, we expect the values of δ 6 and δ 8 to be 0.88 and 0.8, respectively. However, here we do not choose fixed values for δ 6 and δ 8 . Instead, we generate a random number between 0.8 to 1 and assign the result to both δ 6 and δ 8 . The result is presented in Fig. 8b. Referring to this figure, the allowed region is compatible with the experiment similar to the previous case. For non-zero values of δ 6 and δ 8 , the allowed region can be identified by . These two constraints (similar to Fig. 8a) are presented by two bands in Fig. 8b. In this case, the width of the bands is due to the inaccuracy in δ 6 and δ 8 together with v 2 {4}.
By considering the correlation between v 2 {6} and v 2 {8}, the experimental one sigma region of the v 2 {6}-v 2 {8} space would not be a simple domain. Nevertheless, for the present inaccurate case which is depicted in Fig. 8, we are able to restrict the one sigma domain by comparing it with the allowed region showed by the blue dots.
Let us summarize the constraint on the harmonic finesplitting of the flow as follows: which correspond to the region filled by the blue dots in Fig. 8. Note that we used the approximate inequality in Eq. (50b) because of not only ignoring δ 1/8 8 but also ignoring terms with order O(Δ 2 ).
Let us point out that the inequalities in Eq. (50) can be written as 11 12 In Ref. [18], the ratios v 2 {6}/v 2 {4} and v 2 {8}/v 2 {4} have been calculated experimentally. In Fig. 9, 0 to find the allowed region of v 2 {6}-v 2 {8} phase space. In this case, δ 6 , δ 6 and v 2 {4} are not completely fixed. The region is compatible with the ATLAS data reported in Ref. [18] tralities. Furthermore, the inequality in the right-hand side of Eq. (51) is not saturated, while we need a more accurate experimental observation to decide on the saturation of the inequality in the left-hand side.  Fig. 9 Experimental values of ratios v 2 {6}/v 2 {4} and 11v 2 {6}/12v 2 {4} + 1/12 with respect to the centrality class from the ATLAS data [18] Finally, we would like to mention that Eq. (36e) does not lead to any constraint on v 2 {2k} for 0.5 δ 10 1. The reason is thatv 2 = 0 is a minimum of the following relation: and its value in the minimum is −456 δ 10 v 10 2 {10}. It means that for δ 10 > 0 the equation always has real roots.

Conclusion and outlook
In the present work, we have employed the concept of the Gram-Charlier A series to relate the distribution p(v n ) to c n {2k}. We have found an expansion around the Bessel-Gaussian distribution where the coefficients of the expansion have been written in terms of a new set of the cumulants q n {2k}. We have shown that the corrected Bessel-Gaussian distribution can fit the actual distribution p(v n ) much better than the Bessel-Gaussian distribution. The new cumulants q n {2k} were written in terms of c n {2k} and the averaged flow harmonicv n . Because the only non-vanishing new cumulants are q n {2} for a Bessel-Gaussian distribution, they are a more natural choice to study the distributions near the Bessel-Gaussian case than c n {2k}.
By using the cumulants q n {2k}, we could systematically introduce different estimations for p(v n ) and consequently relate the averaged ellipticityv 2 to the flow harmonic finesplitting v 2 {2k} − v 2 {2 } for k, ≥ 2 and k = . As a specific example for thev 2 estimator, we have shown that We have used the iEBE-VISHNU event generator to compare the true value of thev 2 to the estimated one; also, we have shown that the estimatorv 2 {6} is more accurate than v 2 {2k} for k > 1. As another application of new cumulants, we have constrained the phase space of the flow harmonics v n {2k} to the region we need a more accurate experimental observation for the  quantity 12v One should note that we have shown the compatibility of the allowed phase space of v 2 {2k} with experimental results of (high-multiplicity) Pb-Pb collisions. Recently, the flow harmonics were measured for p-p, p-Pb and low-multiplicity Pb-Pb collisions by ATLAS [42]. In the light of q 2 {2k} cumulants, it would be interesting to study the similarity and difference between the splitting of v 2 {2k} in these systems and examine the compatibility of the results with the allowed region comes from q 2 {2k}.
Furthermore, we have only focused on the distribution p(v n ) in the present study. However, based on the observation of symmetric cumulants and event-plane correlations, we expect that a similar systematic study for the distribution p(v 1 , v 2 , . . .) can connect this joint distribution to the observations. Such a study would be helpful to relate the initial state event-by-event fluctuations to the observation. This would be a fruitful area for further work.
Acknowledgements The authors would like to specially thank Hessamaddin Arfaei, HM's supervisor, for useful discussions, comments and providing guidance over HM's work during this project. We would like to thank Jean-Yves Ollitrault for useful discussions and comments during the "IPM Workshop on Collective Phenomena & Heavy Ion Physics". We also thank Ali Davody for providing us with the iEBE-VISHNU data and for discussions. We thank Mojtaba Mohammadi Najafabadi and Ante Bilandzic for useful comments. We would like to thank Navid Abbasi, Davood Allahbakhshi, Giuliano Giacalone, Reza Goldouzian and Farid Taghinavaz for discussions. We thank the participants of the "IPM Workshop on Collective Phenomena & Heavy Ion Physics".

Data Availability Statement
This manuscript has associated data in a data repository. [Authors' comment: The experimental data used in the present study was published by ATLAS Collaboration [18].] Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecomm ons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. Funded by SCOAP 3 .

A Two-dimensional cumulants in cartesian and polar coordinates
The Cartesian cumulants A mn can be found in terms of the moments x k y by using the first line of the Eq. (4). A few first cumulants can be found as follows: Note that for a fixed number χ we have A mn → χ m+n A mn by replacing x → χ x and y → χ y. We call m + n as the order of A mn . Consequently, there are m + n + 1 number of cumulants with the order m + n.
In order to find the cumulants C m,n in polar coordinates x = r cos ϕ and y = r sin ϕ, we use the second line of Eq. (4). Considering the Jacobi-Anger identity we are able to write Eq. (4) as follows: where in the left-hand side we have used the series form of the Bessel function J n (kr). In Eq. (54), the combination (ik) 2m+n e −inϕ k has appeared in the left-hand side. It means that if we have odd |n|, then the power of ik is odd, and if we have even |n|, then the power of ik is even. Therefore, the only non-vanishing C m,n are those for which both m and |n| are odd or even. The other consequence of the combination (ik) 2m+n e −inϕ k is that we have C m,n = 0 for |n| > m. The reason is that in the right-hand side of the Eq. (54), the combination (ik) m e in ϕ k has appeared. Therefore, in order to have a non-vanishing C m,n we need to have some terms in the left-hand side such that 2m + n = m and n = −n . It immediately leads to m = (m + n )/2 ≥ 0 and m + n = (m − n )/2 ≥ 0. One should note that the (m + n)! in the denominator of Eq. (54) diverges if m + n < 0. As a result, we deduce that C m ,n can be non-vanishing if |n | ≤ m . Strictly speaking, for m ≥ 0 only the cumulants are non-zero. Also, note that we have C m,−n = C * m,n because the distribution is a real function.
Based on the above considerations one can find all nontrivial values of C m,n in terms of moments r m e inϕ by using Eq. (54). A few first two-dimensional cumulants in polar coordinates are presented explicitly in the following: (56e) The first cumulant, C 0,0 , is equal to zero by considering the normalization condition of the probability distribution. By redefinition we find the cumulants in polar coordinate which has been obtained in Ref. [30]. In Ref. [30], the translational and rotational invariance have been used to eliminate W c 1,1 , W s 1,1 and W s 2,2 . In our notation, it is equivalent to C 1,1 = C 1,−1 = 0 and C 2,2 = C 2,−2 .
One notes that by replacing r with χr we have C m,n → χ m C m,n . In other words, the order of C m,n is indicated by m in polar coordinates. Referring to Eq. (55), we find that there are m + 1 cumulants with order m in polar coordinates. As a result, we are able to find a one-to-one relation between C m,n and A k with the same order by equating the two sides of the equation, In the left-hand side of the above equality, we rewrite m,n ((ik x ) m (ik y ) n A mn /(m!n!)) in polar coordinates. A few first cumulants in polar coordinates can be written in terms of A mn as follows: The cumulants C m,−n can be obtained easily by using the reality condition. One can invert the above equations to find A mn in terms of C m,n too, For rotationally symmetric distributions, only the moments r 2m survive in the polar coordinates. In such a case, the only non-zero polar cumulants are C 2k,0 . For instance, all the cumulants in Eq. (58) vanish except C 2,0 where it is equal to r 2 /2. A few of the first C 2k,0 are given by In Sect. 3.1, we have introduced an iterative method to find the Gram-Charlier A series for a one-dimensional distribution. Also, we briefly explained the standard method of finding the series by using the orthogonal polynomials. Here, we use another standard method of finding the Gram-Charlier A series. Let us consider the characteristic function of a given distribution p(x) as G(t) = dx e it x p(x). Then one can find the cumulants of the distribution by using the cumulative function, Now assume that p(x) can be approximated by a known distribution, namely a Gaussian distribution, By setting the mean value μ = 0 and the width σ = 1 for simplicity, the generating function of a Gaussian distribution can be simply found to be G N (t) = e −t 2 /2 . Referring to Eq. (61), we have G(t) = e n=1 κ n (it) n /n! . Additionally, we consider the fact that the mean value and the variance of the distribution p(x) are κ 1 = μ = 0 and κ 2 = σ 2 = 1. It means that we are able to write G(t) for a generic distribution p(x) as follows: Now, we are able to find p(x) by using an inverse Fourier transformation. By considering the relation dt By scaling x → x/σ and shifting x → x − μ , we find 22 Recall that the probabilistic Hermite polynomial is defined as He r (x) = e x 2 /2 (−d/dx) r e −x 2 /2 . As a result, one can simply find It is worth noting that Eq. (64) is exact. We are able to approximate this exact relation by expanding it in terms of number of derivatives. The result of such an expansion is called a Gram-Charlier A series, where h n was presented in Eq. (21) (h 1 = 1, h 2 = 0).

B.2 Gram-Charlier A series: two dimensions
Consider the following two-dimensional Gaussian distribution: which is more general than Eq. (14). Referring to Eq. (3), we find the characteristic function of Eq. (66) as follows: Now, assume that the mean values x and y ( · · · = dxdy · · · p(r)) are exactly equal to μ x and μ y , respectively, and alternatively set μ x = μ y = 0 by a shift. Then the characteristic function of p(r) can be written as follows: where in the last line the A mn are exactly as A mn except Note that we could choose A 20 = σ x and A 02 = σ y similar to what we did in one dimension. But here we do not use this convention for the future purpose. We can find p(r) by an inverse Fourier transformation, where for finding the above we replaced (ik x ) m (ik x ) n with (−1) m+n (d m+n /dx m dy n ) in Eq. (68). The Gram-Charlier A series can be found by expanding the exponential in Eq. (70) in terms of a number of derivatives. By recovering μ x and μ y by a reverse shift in r, we find In the above, we have used a trivial extension of Eq. (65) to two dimensions. The coefficients h mn for m + n ≤ 2 in Eq. (71) are given as follows (we set A 11 = 0 for simplicity): h 00 = 1, The other h mn can be found accordingly.

B.3 Gram-Charlier A series and energy density expansion
We should mention that the Gram-Charlier A series [Eq. (71)] is exactly what has been used in Ref. [30] to quantify the initial energy density deviation from a rotationally symmetric Gaussian distribution. 23 In Ref. [30], the cumulants of the energy density ρ(r) have been expanded in terms of the momentŝ where m, n = 0, 1, . . . and {· · · } = · · · ρ(r)dxdy ρ(r)dxdy .
Due to the translational invariance, we can freely choose the origin of the coordinates such that {r e ±iϕ } = 0. Using this and referring to Eq. (56) and Eq. (75), one finds {r 2 }ε 2,2 e −2iΦ 2,2 , It is common in the literature to define the initial anisotropies as ε 1 ≡ ε 3,1 , ε 2 ≡ ε 2,2 and ε 3 ≡ ε 3,3 . According to Eq. (76), the initial anisotropies are itself cumulants. In fact, the parameter ε n quantifies the global features of the initial energy density. For instance, ε 1 , ε 2 and ε 3 quantify the dipole asymmetry, ellipticity and triangularity of the distribution, respectively. The symmetry planes Φ 1 ≡ Φ 3,1 , Φ 2 ≡ Φ 2,2 and Φ 3 ≡ Φ 3,3 quantify the orientation of the dipole asymmetry, ellipticity and triangularity with respect to a reference direction. Furthermore, one can relate two-dimensional cumulants in Cartesian coordinates, A mn , to those in polar coordinates, C m,n by using Eq. (59). Using them, we find A 02 = 1 2 {r 2 }(1 + ε 2 cos 2Φ 2 ), Now, we use Eq. (71) to find a Gram-Charlier A series for the energy density. The result is the following: In the above, we assumed that σ 2 x = σ 2 y = {r 2 }/2. This result coincides with that presented in Ref. [30]. The cumulants C m,n , m > 3 cannot be written simply in terms of one moment ε m,n , however, it is always possible to find a Gram-Charlier A series, see Eq. (78), to any order beyond triangularity.

C A radial distribution from 2D Gram-Charlier A series
The Bessel-Gaussian distribution (15) was obtained by averaging out the azimuthal direction of a two-dimensional Gaussian distribution (14). Alternatively, one can do the same calculation and find a one-dimensional series by averaging out the azimuthal direction of Eq. (71). We should emphasize that this calculation has already been done in Ref. [28] for the case that p(r) is rotationally symmetric around the origin. However, we will try to find such a radial series for a more general case. For simplicity, we assume that μ y ≡ A 01 = 0 (μ x ≡ A 10 ) together with σ ≡ σ x = σ y = A 20 = A 02 . Also, we consider the fact that A 11 = 0. These considerations are compatible with p(v n,x , v n,y ). For such a case, h 20 = h 11 = h 02 = 0.
By averaging out the azimuthal direction of Eq. (71), we will find a distribution which has the following form: We can find the polynomials P 1,i (r ) and P 2,i (r ) by direct calculations. For i = 1, 2, one simply finds that P 1,1 (r ) = 1 and P 2,1 (r ) = P 1,2 (r ) = P 2,2 (r ) = 0. The polynomials for i = 3 are given by Finding P 1,4 (r ) and P 2,4 (r ) is straightforward, but they are more complicated. For that reason, we decided not to present them considering the limited space. Note that for each P a,i (r ) only cumulants A mn (if there are any) with order m + n = i are present 24 . In Sect. 2.1, we have argued that the one-dimensional distributions like p(v n ) can be characterized by c n {2k}. As a result, we expect to relate A mn to c n {2k} by considering Eq. (80) (replace r with v n and μ x withv n ). Considering Eq. (8) and using Eq. (80) up to i = 4, we find c n {2k} for k = 1, 2, 3 as follows: The above result is not surprising. In fact, it is a general relation of A mn of the distribution p(v n,x , v n,y ) with c n {2k} of p(v n ). Recall the equality e iv n k cos(ϕ−ϕ k ) 2D = J 0 (kv n ) 1D in Eq. (6) where the average · · · 2D is performed with respect to the distributionp(v n,x , v n,y ). For the case that the average is performed with respect to p(v n,x , v n,y ), this relation should be replaced by 1 2π 2π 0 dϕ k e iv n k cos(ϕ−ϕ k ) Using the above, Eq. (4) and Eq. (7), we find  The relation for c n {6} is more complicated. Note that by setting the simplifications we used in finding Eq. (80), the above relations reduce to Eq. (82a) and Eq. (82b).
In general, c n {2k} is written in terms of a large number of A pq ; therefore, we are not able to write A pq in terms of cumulants c n {2k}. However, it is possible to ignore the cumulants A pq with p + q ≥ 4 in some cases. For n = 2, One could wonder that, while h 12 and h 30 are present in the polynomials P 1,3 (v n ) and P 2,3 (v n ) in Eq. (81), it may be possible to find all h mn from Eq. (80) by fitting it to an experimental p(v n ). We should mention that always a combination of h mn appears in the polynomials P a,i (v n ). As a result, if we assume that the series is convergent and we have access to an accurate experimental distribution, we can only find a combination of h mn due to the information loss during the averaging. Specifically, in the simple case of Eq. (86), only A 40 + 2A 22 + A 04 can be found by fitting.
In any case, the distribution (80) is different from that mentioned in Eq. (39). These two different series refer to two different truncations. Note that q n {2k} can be written in terms of A pq with p + q ≤ 2k. For instance, if we keep A mn up to fourth order, all the cumulants q n {2k} are non-zero. Therefore, in principle, the summation (39) goes to infinity while the summation (80) is truncated up to i = 4.

D RGC distribution from multiple orthogonal polynomials
In Sect. 3, we have discussed how the orthogonality conditions of He n (x) and L n (x) can be used to find Eqs. (20) and (24), respectively. However, ordinary orthogonal polynomials cannot be employed for the more general distribution series (39). Recall that in one dimension the polynomials P n (x) and P m (x) are orthogonal with respect to the measure w(x)dx if P n (x)P m (x)w(x) dx = ζ n δ nm . This relation can be equivalently written as x m P n (x)w(x)dx = 0 for 0 ≤ m < n [39]. We will show that we can obtain Eq. (39) by employing a generalization of the orthogonal polynomials. These generalized orthogonal polynomials are called multiple orthogonal polynomials (MOPs) [39]. The MOPs are orthogonal with respect to more than one weight. Let us define r different weights w(x) = (w 1 (x), . . . , w r (x)) and a multi-index n = (n 1 , . . . , n r ) (n i ∈ N 0 ). Then the type I multiple orthogonal polynomial is given as P n = (P 1,n , . . . , P r,n ) where P j,n is a polynomial with degree at most n j . 26 These polynomials obey the following orthogonality condition: 26 Another class of MOPs are called type II multiple orthogonal polynomial. In this class, a multi-indexed monic polynomial P n satisfies the following r orthogonality conditions: where 0 ≤ i ≤ r [39]. Type II MOPs are not used here.
where N = r i=1 n i , and a dot indicates the inner product. Note that, for r = 1, the above relation returns to the ordinary orthogonal polynomial.
Here, we restrict ourselves to the weakly complete systems. In this case, the multi-index n always has the following form [43,44]: Here 0 ≤ j < r . In this case, we can define a one-to-one map between a single index i and n as i(n) = rk + j + 1. Consequently, we are able to define Y i(n) (x) = P n • w(x).