Probabilistic Stirling Numbers of the Second Kind and Applications

Associated with each complex-valued random variable satisfying appropriate integrability conditions, we introduce a different generalization of the Stirling numbers of the second kind. Various equivalent definitions are provided. Attention, however, is focused on applications. Indeed, such numbers describe the moments of sums of i.i.d. random variables, determining their precise asymptotic behavior without making use of the central limit theorem. Such numbers also allow us to obtain explicit and simple Edgeworth expansions. Applications to Lévy processes and cumulants are discussed, as well.


Introduction
The classical Stirling numbers play an important role in many branches of mathematics and physics as ingredients in the computation of diverse quantities. In particular, the Stirling numbers of the second kind S(j, m), counting the number of partitions of {1, . . . , j} into m nonempty, pairwise disjoint subsets, are a fundamental tool in many combinatorial problems. Such numbers can be defined in various equivalent ways (cf. Abramowitz and Stegun [1, p. 824] and Comtet [7, Chap. 5]). Two of the most useful are the following. Let j = 0, 1, . . . and m = 0, 1, . . . , j. Then S(j, m) can be explicitly defined as

S(j, m) = \frac{1}{m!} \sum_{i=0}^{m} (-1)^{m-i} \binom{m}{i} i^j,   (1)

or via their generating function as

\frac{(e^z - 1)^m}{m!} = \sum_{j=m}^{\infty} S(j, m) \frac{z^j}{j!}.   (2)

Motivated by various specific problems, different generalizations of the Stirling numbers S(j, m) have been considered in the literature (see, for instance, Hsu and Shiue [11], Luo and Srivastava [13], Cakić et al. [6], and El-Desouky et al. [9]). In [3], we considered the following probabilistic generalization. Let (Y_k)_{k≥1} be a sequence of independent copies of a real-valued random variable Y having a finite moment generating function, and denote S_k = Y_1 + · · · + Y_k, k = 1, 2, . . . (S_0 = 0). Then, the Stirling numbers of the second kind associated with Y are defined by

S_Y(j, m) = \frac{1}{m!} \sum_{i=0}^{m} (-1)^{m-i} \binom{m}{i} E S_i^j.   (3)

Observe that formula (3) recovers (1) when Y = 1. The motivations behind definition (3) come from certain problems in analytic number theory, such as extensions in various ways of the classical formula for sums of powers on arithmetic progressions (cf. [3]), and from explicit expressions for higher-order convolutions of Appell polynomials (see [4]).
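As a quick sanity check of (1), and of the fact that definition (3) collapses to (1) for Y = 1 (since then S_i = i and E S_i^j = i^j), the sketch below compares the explicit formula with the standard partition recurrence S(j, m) = m S(j − 1, m) + S(j − 1, m − 1); the function names are ours, not from [3].

```python
from math import comb, factorial

def stirling2(j, m):
    # Explicit formula (1); with Y = 1, definition (3) reduces to this,
    # since then S_i = i and E S_i^j = i^j.
    return sum((-1) ** (m - i) * comb(m, i) * i ** j for i in range(m + 1)) // factorial(m)

def stirling2_rec(j, m):
    # Independent reference: the partition recurrence S(j, m) = m S(j-1, m) + S(j-1, m-1).
    if j == m:
        return 1
    if m == 0 or m > j:
        return 0
    return m * stirling2_rec(j - 1, m) + stirling2_rec(j - 1, m - 1)

assert all(stirling2(j, m) == stirling2_rec(j, m) for j in range(10) for m in range(j + 1))
```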
In this paper, we extend definition (3) to complex-valued random variables Y and show its usefulness in various classical topics of probability theory. In this regard, we show in Sect. 3 that the moments E S_n^j can be written in closed form in terms of the Stirling numbers S_Y(j, m). When Y is real-valued and centered, two remarkable consequences deserve to be mentioned. First, we can directly obtain the precise asymptotic behavior of E S_n^j as far as rates of convergence and leading coefficients are concerned, without appealing to the central limit theorem. Monotonicity properties of the sequence E S_n^{2j}/(n)_j, n ≥ j, where (n)_j = n(n − 1) · · · (n − j + 1), are also derived in a simple way. We point out that monotonicity results in the central limit theorem seem to be rather scarce. (In this respect, Teicher [19] and Kane [12] showed that P(S_n ≥ 0) converges monotonically for various choices of the law of S_n.) Second, from a computational point of view, we can evaluate E S_n^j for n ≥ j in terms of E S_n^j for n < j and S_Y(j, ⌊j/2⌋). In Sect. 4, we deal with analogous properties referring to Lévy processes and centered subordinators.
Concerning rates of convergence in the central limit theorem, Edgeworth expansions provide great accuracy in the approximation at the price of rather involved technicalities (cf. Petrov [15], Hall [10], Barbour [5], and Rinott and Rotar [16]). In Sect. 5, we give explicit and relatively simple full Edgeworth expansions whose coefficients depend on the Stirling numbers S_{Y+iZ}(j, m), where the real-valued random variables Y and Z are independent and Z has the standard normal distribution. The order of magnitude of such expansions is that of n^{−(r−1)/2}, whenever E Y^k = E Z^k, k = 1, 2, . . . , r, for some r ≥ 2. In Sect. 6, we show that the cumulants of a random variable Y can also be described by means of S_Y(j, m). Finally, in Sect. 2 we gather some equivalent definitions of S_Y(j, m) when Y is complex-valued, without proofs, since they are similar to those previously given in [3] for real-valued random variables Y.

Probabilistic Stirling Numbers
The following notations will be used throughout the paper. Let N be the set of positive integers and N_0 = N ∪ {0}. Unless otherwise specified, we assume that j, m ∈ N_0, x ∈ C, and z ∈ C satisfies |z| < R, where R > 0 may change from line to line. We always consider measurable exponentially bounded functions f : C → C, i.e., |f(x)| ≤ e^{R|x|}. We denote by I_j(x) = x^j the jth monomial function and by (x)_j the descending factorial, that is, (x)_j = x(x − 1) · · · (x − j + 1), with (x)_0 = 1. Finally, we set j ∧ m = min(j, m) and denote by ⌊y⌋ the integer part of y ∈ R.
Let G_0 be the set of complex-valued random variables Y having a finite moment generating function in a neighborhood of the origin, i.e., E e^{R|Y|} < ∞ for some R > 0. For any r ∈ N, we consider a random variable β(r) having the beta density ρ_r(θ) = r(1 − θ)^{r−1}, 0 ≤ θ ≤ 1, whereas we set β(0) = 1. Note that β(1) is uniformly distributed on [0, 1]. For any r ∈ N_0, let (Y_k)_{k≥1} and (β_k(r))_{k≥1} be two sequences of independent copies of Y ∈ G_0 and β(r), respectively, and assume that both sequences are mutually independent. We denote The following two important special cases will also be denoted On the other hand, consider the difference operator together with the iterates Such generalized difference operators were used by Mrowiec et al. [14] and Dilcher and Vignat [8] in different analytic contexts. Observe that is the usual mth forward difference of f. In general, the iterates in (7) have a cumbersome expression. However, we have the following formulas stated in [3], where it is understood that

If, in addition, f is m times differentiable, then
The Stirling numbers of the second kind S_Y(j, m), m ≤ j, associated with the random variable Y ∈ G_0 are defined as in (3). Observe that this definition is justified in the sense that as follows by choosing f = I_j and x = 0 in Lemma 1. Such numbers are characterized in the following result (cf. [3]).
Equivalently, the numbers S_Y(j, m) are defined via their generating function as For the classical Stirling numbers S(j, m), expression (9) gives us This formula was already obtained by Sun [18]. Theorem 1 allows us to obtain explicit expressions of S_Y(j, m) for different choices of the random variable Y (see [3]). In many cases, such numbers are actually real numbers. This happens, for instance, if Y = U + iV, where U and V are independent real-valued random variables and V has a real characteristic function (in particular, if V = 0 or if V has the standard normal distribution). In fact, let t ∈ R with |t| < R. Since E e^{tY} = E e^{tU} E e^{itV} is real, we see that both sides in (10) are real when choosing z = t. This shows the claim. Finally, if Y is nonnegative, then S_Y(j, m) is nonnegative as well, as follows from (5) and (9).
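Although the display for (9) is not reproduced here, its classical special case is the familiar expansion (e^z − 1)^m/m! = Σ_{j≥m} S(j, m) z^j/j!, which the following sketch (our own helper names) verifies coefficient by coefficient in exact rational arithmetic.

```python
from fractions import Fraction
from math import comb, factorial

def stirling2(j, m):
    # Explicit formula for the classical Stirling numbers of the second kind.
    return sum((-1) ** (m - i) * comb(m, i) * i ** j for i in range(m + 1)) // factorial(m)

def egf_coeffs(m, order):
    # Truncated power series of (e^z - 1)^m / m!: multiply out m copies of
    # e^z - 1 = sum_{k>=1} z^k / k!, keeping terms up to z^order.
    poly = [Fraction(0)] * (order + 1)
    poly[0] = Fraction(1)
    factor = [Fraction(0)] + [Fraction(1, factorial(k)) for k in range(1, order + 1)]
    for _ in range(m):
        new = [Fraction(0)] * (order + 1)
        for a in range(order + 1):
            if poly[a]:
                for b in range(order + 1 - a):
                    new[a + b] += poly[a] * factor[b]
        poly = new
    return [c / factorial(m) for c in poly]

# The coefficient of z^j in (e^z - 1)^m / m! should be S(j, m)/j!.
coeffs = egf_coeffs(3, 8)
assert all(coeffs[j] == Fraction(stirling2(j, 3), factorial(j)) for j in range(9))
```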

Moments
In this section, we give closed-form expressions for the moments of S_n, as defined in (6), in terms of the probabilistic Stirling numbers S_Y(j, m), and discuss some of their consequences. In this respect, for any r ∈ N, denote by G_r the subset of G_0 consisting of those random variables Y = U + iV such that In the case in which U and V are independent, observe that and so on. Also observe that if U and V are independent copies of a random variable having the standard normal distribution, then Y ∈ G_r, for any r ∈ N, since The following auxiliary result will be used in Sect. 5.
Lemma 2 Let r ∈ N and let U and V be two independent real-valued random variables such that U ∈ G_0 and V has the standard normal distribution. Then, Proof Let Z be an independent copy of V. By (13), we see that Assume that E U^k = E V^k, k = 1, . . . , r, and let s = 1, . . . , r. By (14), we have To show the reverse implication, we use induction on r. For r = 1, the result is obviously true. Assume that the result is true for some r ∈ N. Let Y ∈ G_{r+1} ⊆ G_r. By the induction assumption, E U^k = E V^k = E Z^k, k = 1, . . . , r. We thus have from (14) This shows the reverse implication and completes the proof.
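As an illustration of Lemma 2 (under the reading that Y = U + iV ∈ G_r means that the first r moments of Y vanish), take U Rademacher, i.e. P(U = ±1) = 1/2, and V = Z standard normal: their moments agree up to order 3 and differ at order 4, so U + iZ should lie in G_3 but not in G_4. The sketch below checks this with exact arithmetic; all names are ours.

```python
from math import comb

def moment_U(k):
    # Moments of a Rademacher variable U with P(U = 1) = P(U = -1) = 1/2.
    return 1 if k % 2 == 0 else 0

def moment_Z(k):
    # Moments of a standard normal Z: 0 for odd k, (k - 1)!! for even k.
    if k % 2 == 1:
        return 0
    out = 1
    for i in range(1, k, 2):
        out *= i
    return out

def moment_Y(k):
    # E (U + iZ)^k via the binomial theorem, using independence of U and Z.
    return sum(comb(k, l) * moment_U(k - l) * (1j ** l) * moment_Z(l) for l in range(k + 1))

# Moments of U and Z agree up to order 3 and differ at order 4, so the first
# three moments of U + iZ vanish while the fourth does not: U + iZ lies in G_3 \ G_4.
assert all(moment_Y(k) == 0 for k in (1, 2, 3))
assert moment_Y(4) == -2
```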
The interesting feature of the random variables Y in the subset G_r, r ∈ N, is that their corresponding Stirling numbers satisfy S_Y(j, m) = 0 for j < m(r + 1), as shown in the following result. This property has remarkable consequences for evaluating the moments E S_n^j, as seen in the remaining results of this section, as well as for the Edgeworth expansions considered in Sect. 5.
Proof We start with the following identity, which follows from the formula for the remainder term in Taylor's theorem: where β(r+1) is the random variable defined in (4). Replacing z by zY in this formula and then taking expectations, we have from (5) and (11) Thus, the result follows from (10) with the change of index j = m(r + 1) + k.
Theorem 3 Let Y ∈ G_r, for some r ∈ N_0. Denote τ_r(j) = ⌊j/(r + 1)⌋. For any n ∈ N_0, we have

E S_n^j = \sum_{m=0}^{τ_r(j) ∧ n} S_Y(j, m) (n)_m.   (15)

Moreover, for any n ≥ τ_r(j), we have Proof Note that By (10) and Theorem 2, we see that This, together with (17), shows (15). On the other hand, if n = τ_r(j), formula (16) directly follows from definition (3). Assume that n > τ_r(j). The following combinatorial identity can be easily shown by induction on p. Using (3), (15) and the preceding identity, we have This, together with definition (3), shows (16). The proof is complete.
The classical Stirling numbers of the second kind S(j, m) can also be defined by means of the equations

n^j = \sum_{m=0}^{j ∧ n} S(j, m) (n)_m,   n ∈ N_0.   (18)

In this sense, formula (15) may be thought of as the probabilistic counterpart of (18).
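Formula (15) can be checked numerically. In the sketch below (our own helper names, with the hypothetical choice Y ~ Bernoulli(1/2), so that S_n ~ Binomial(n, 1/2) and every quantity is an exact rational), S_Y(j, m) is computed from definition (3) and the moment identity is verified for small j and n.

```python
from fractions import Fraction
from math import comb, factorial

def moment(n, j):
    # E S_n^j for S_n ~ Binomial(n, 1/2), computed exactly.
    return sum(Fraction(comb(n, k), 2 ** n) * k ** j for k in range(n + 1))

def stirling_Y(j, m):
    # Definition (3): S_Y(j, m) = (1/m!) sum_{i=0}^m (-1)^(m-i) C(m, i) E S_i^j.
    return sum((-1) ** (m - i) * comb(m, i) * moment(i, j) for i in range(m + 1)) / factorial(m)

def falling(n, m):
    # The descending factorial (n)_m = n (n-1) ... (n-m+1).
    out = 1
    for i in range(m):
        out *= n - i
    return out

# Check E S_n^j = sum_{m=0}^{j ∧ n} S_Y(j, m) (n)_m, the r = 0 case of (15).
for n in range(8):
    for j in range(8):
        assert moment(n, j) == sum(stirling_Y(j, m) * falling(n, m) for m in range(min(j, n) + 1))
```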
This result extends the well-known upper bound for the classical Stirling numbers of the second kind, namely The case in which Y ∈ G_1 is real-valued deserves special attention. First, denote by σ^2 = E Y^2 its variance and define the real-valued random variable Ŷ whose distribution function is given by where F_Y is the distribution function of Y and it is assumed that σ^2 > 0. Note that for any function f we have In the trivial case in which Y = 0, a.s., we define Ŷ = 0, a.s., so that formula (19) still holds. Second, consider the random variable W_m(Ŷ) as defined in (6). Finally, recall that if Z is a random variable having the standard normal distribution, then With these ingredients, we give the following.
whenever j ≥ 2m, whereas S_Y(j, m) = 0 for j < 2m. In particular, Proof The proof of (21) follows along the lines of that of Theorem 2, by taking into account (20) and the fact that as follows from (19). The identities in (22) are a consequence of (21) and the equalities The proof is complete.

Corollary 3 Let Y ∈ G_1 be real-valued. Then,
As a consequence, the sequence E S_n^{2j}/(n)_j, n ≥ j, decreases to E(σZ)^{2j}. Moreover, for any n ≥ j, we have Proof As follows from (21), the Stirling numbers S_Y(2j, m) are positive. This, together with (22) and (23), implies that the sequence E S_n^{2j}/(n)_j, n ≥ j, decreases to S_Y(2j, j) = E(σZ)^{2j}. The remaining assertions readily follow from Theorem 3 by choosing r = 1. The proof is complete.
Let Y ∈ G_1 be real-valued. Traditionally, the problem of convergence concerning the moments E S_n^j, as n → ∞, is carried out by first establishing the central limit theorem and afterward showing (see, for instance, von Bahr [20]) that The explicit expressions in Corollaries 2 and 3 directly give us the precise asymptotic behavior of the moments E S_n^j as far as rates of convergence and leading coefficients are concerned. Note, in particular, that the odd moments E S_n^{2j+1} have the order of magnitude of (n)_j (resp. (n)_{j−1}) with leading coefficients S_Y(2j+1, j) (resp. S_Y(2j+1, j−1)) in the case that E Y^3 ≠ 0 (resp. E Y^3 = 0), as follows from (22).
On the other hand, E S_n^{2j}/(n)_j decreasingly converges to E(σZ)^{2j}. This monotonicity property is no longer true, in general, for the odd moments, since the leading coefficient of E S_n^{2j+1}/(n)_j depends on E Y^3. Another consequence of formula (24) is that, with the help of (22), we can quickly compute the moments E S_n^j for any n ≥ j in terms of the corresponding moments for n < j.
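The monotonicity in Corollary 3 can also be observed numerically. Below is a sketch (our own helper names) with the hypothetical choice of Y Rademacher, P(Y = ±1) = 1/2, so that σ = 1 and E S_n^{2j}/(n)_j should decrease strictly to E Z^{2j} = (2j − 1)!!.

```python
from fractions import Fraction
from math import comb

def even_moment(n, j):
    # S_n = 2B - n with B ~ Binomial(n, 1/2); compute E S_n^{2j} exactly.
    return sum(Fraction(comb(n, k), 2 ** n) * (2 * k - n) ** (2 * j) for k in range(n + 1))

def falling(n, m):
    # The descending factorial (n)_m.
    out = 1
    for i in range(m):
        out *= n - i
    return out

j = 3
double_fact = 1 * 3 * 5                 # (2j - 1)!! = 15 = E Z^6 for j = 3
ratios = [even_moment(n, j) / falling(n, j) for n in range(j, 25)]
assert all(a > b for a, b in zip(ratios, ratios[1:]))   # strictly decreasing in n
assert all(r > double_fact for r in ratios)             # bounded below by the limit
```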

Lévy Processes and Centered Subordinators
Lévy processes are the continuous-time analogue of sums of independent identically distributed random variables in discrete time. It therefore seems plausible to obtain for such processes moment results similar to those given in the preceding section. Recall that a Lévy process (Y(t))_{t≥0} is a stochastically continuous process starting at the origin and having independent stationary increments. A subordinator (X(t))_{t≥0} is a Lévy process having right-continuous nondecreasing paths.
Let (W(t))_{t≥0} be a zero mean square integrable Lévy process whose characteristic function is given by (cf. Steutel and van Harn [17, p. 181]) where K(dx) is a Lévy measure on R, which puts mass 0 on {0} and satisfies The characteristic function (25) can be written as where β(2) is defined in (4) and U is a random variable independent of β(2), with distribution function We see from (26) that E W(t) = 0 and E W^2(t) = tκ_2, t ≥ 0, so that κ_2 is the variance of W(1). Now, let (B(t))_{t≥0} be a standard Brownian motion on R, independent of (W(t))_{t≥0}, and define the Lévy process (Y(t))_{t≥0} by setting Observe that E Y(t) = 0, E Y^2(t) = t(κ_2 + σ^2), t ≥ 0, and as follows from (26) and (27). Let V be a random variable uniformly distributed on [0, 1] and independent of β(2) and U. Then, we can rewrite (28) as where On the other hand, a subordinator (X(t))_{t≥0} is called centered if E(X(t) − t) = 0 and E(X(t) − t)^2 < ∞, t ≥ 0. In such a case, the characteristic function of X(t) is then given by (cf. Steutel and van Harn [17, p. 107] and [2]) where T is a nonnegative random variable. Denote by τ^2 = E T. This notation comes from the fact that E(X(t) − t)^2 = tτ^2, t ≥ 0. Consider the nonnegative random variable T* whose distribution function is given by and equal to zero for y < 0, where F_T is the distribution function of T and it is assumed that τ^2 > 0. In the case in which T = 0, a.s., we simply define T* = 0, a.s. Observe that for any function f we have It turns out that (cf. [2]) where the random variables T* and β(2) are independent. The main difference between formulas (29) and (31) is that T is real-valued, whereas T* is nonnegative. We finally observe that if (X(t))_{t≥0} is the standard Poisson process, then T = T* = 1, whereas for the gamma process, the random variables T and T* have the probability densities ρ(θ) = e^{−θ} and ρ*(θ) = θe^{−θ}, θ ≥ 0, respectively.
Once representations (29) and (31) are given, we can obtain closed-form expressions for the moments of Y(t) and X(t) − t in a simple way, as the following result shows.

Theorem 4 Assume that T and T*, appearing in (30) and (31), respectively, belong to G_0. For any j ∈ N_0 and t ≥ 0, we have

Moreover, the functions g_{2j}(t) and h_j(t) are completely monotonic.
Proof The identities in Theorem 4 follow by expanding the characteristic functions given in (29) and (31), and recalling (20). The last statements concerning complete monotonicity follow from the fact that E W_m(T)^{2(j−m)} and E W_m(T*)^{j−2m} are nonnegative for m ≤ ⌊j/2⌋. The proof is complete.
With respect to Theorem 4, similar comments to those made after Corollary 3 are valid. Details are omitted.

Edgeworth Expansions
Let y ∈ R and let Z be a random variable having the standard normal density g(y) = (2π)^{−1/2} e^{−y^2/2}. Denote by G(y) the standard normal distribution function. Recall that the Hermite polynomials (H_n(y))_{n≥0} are defined by g(y)H_n(y) = (−1)^n g^{(n)}(y).
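The defining relation g(y)H_n(y) = (−1)^n g^{(n)}(y), together with g′(y) = −y g(y), yields the iteration H_{n+1}(y) = y H_n(y) − H_n′(y). A minimal sketch (our own helper names, polynomials as coefficient lists indexed by power of y) cross-checks this against the classical three-term recurrence H_{n+1} = y H_n − n H_{n−1}.

```python
def poly_derivative(p):
    # d/dy of a polynomial given as a coefficient list (index = power of y).
    return [i * c for i, c in enumerate(p)][1:] or [0]

def shift_up(p):
    # Multiply a polynomial by y.
    return [0] + p

def poly_sub(a, b):
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [x - y for x, y in zip(a, b)]

def hermite(n):
    # From g H_n = (-1)^n g^(n) and g' = -y g: H_{n+1} = y H_n - H_n'.
    p = [1]
    for _ in range(n):
        p = poly_sub(shift_up(p), poly_derivative(p))
    return p

def hermite_rec(n):
    # Independent reference: the three-term recurrence H_{n+1} = y H_n - n H_{n-1}.
    a, b = [1], [0, 1]
    if n == 0:
        return a
    for k in range(1, n):
        a, b = b, poly_sub(shift_up(b), [k * c for c in a])
    return b

assert hermite(4) == hermite_rec(4) == [3, 0, -6, 0, 1]   # H_4(y) = y^4 - 6y^2 + 3
```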
Let Y ∈ G_0 be a real-valued random variable having an integrable characteristic function. Suppose that E Y = 0 and E Y^2 = 1. Denote by F_n(y) the distribution function of S_n/√n. Under such circumstances, it is well known (see, for instance, Petrov [15, p. 117]) that We will show that the Edgeworth expansion of F_n(y) − G(y) can be described in a simple way in terms of the Stirling numbers associated with the complex-valued random variable where Y and Z are supposed to be independent. To this end, fix r = 2, 3, . . . and consider the sets We are in a position to state the following.
Theorem 5 Let Y ∈ G_0 be a real-valued random variable having an integrable characteristic function. Assume that E Y^k = E Z^k, k = 1, 2, . . . , r, for some r ≥ 2. For any n ∈ N and y ∈ R, we have Proof Let ξ ∈ R. By (34), the integrand in (33) can be written as By Lemma 2, the random variable defined in (34) belongs to G_r. We therefore have from Theorem 2 and (35) Hence, integrating (37) with respect to ξ, the conclusion follows from (32). The proof is complete.
Fix r ≥ 2. Compared with the usual full Edgeworth expansions, Theorem 5 gives us an explicit and relatively simple expansion of F_n(y) − G(y), making clear, at the same time, that its order of magnitude is that of n^{−(r−1)/2}, provided that the first moments of Y and Z (up to the order r ≥ 2) coincide. The coefficients in this expansion depend on the Stirling numbers S_Y(j, m) associated with the complex-valued random variable defined in (34). As noted after Theorem 1, such numbers are actually real numbers, which can be evaluated by means of Theorem 2.

Cumulants
Recall that the cumulant generating function of a random variable Y ∈ G_0 is defined as where the coefficients (κ_j(Y))_{j≥1} are called the cumulants of Y. Such cumulants can be written in terms of the Stirling numbers S_Y(j, m), as shown in the following result. In view of (38), this shows the first equality in (39). The second one readily follows from definition (3) and the well-known combinatorial identity

\sum_{l=0}^{p} \binom{s + l}{l} = \binom{s + p + 1}{p},   p, s ∈ N_0.
The proof is complete.
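As a numeric cross-check on the notion (with our own helper names, independent of formula (39)): cumulants can be recovered from moments via the standard recursion κ_j = E Y^j − Σ_{i=1}^{j−1} C(j−1, i−1) κ_i E Y^{j−i}. For Y Poisson with unit mean, E Y^j = Σ_m S(j, m), the jth Bell number, and every cumulant equals 1.

```python
from math import comb, factorial

def stirling2(j, m):
    # Classical Stirling numbers of the second kind via the explicit formula.
    return sum((-1) ** (m - i) * comb(m, i) * i ** j for i in range(m + 1)) // factorial(m)

def cumulants_from_moments(moments):
    # moments[j] = E Y^j (moments[0] = 1); returns [kappa_1, ..., kappa_{len-1}]
    # via kappa_j = m_j - sum_{i=1}^{j-1} C(j-1, i-1) kappa_i m_{j-i}.
    kappa = []
    for j in range(1, len(moments)):
        kappa.append(
            moments[j]
            - sum(comb(j - 1, i - 1) * kappa[i - 1] * moments[j - i] for i in range(1, j))
        )
    return kappa

# For Y ~ Poisson(1): the moments are the Bell numbers and every cumulant is 1.
bell = [sum(stirling2(j, m) for m in range(j + 1)) for j in range(10)]
assert bell[:7] == [1, 1, 2, 5, 15, 52, 203]
assert cumulants_from_moments(bell) == [1] * 9
```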
We finally mention that, in certain particular cases, we find simpler formulas for the cumulants than those given in (39). For instance, in the case of the Lévy processes considered in (29), it can be checked that κ_j(Y(t)) = t(σ^2 + κ_2) E T^{j−2}, j = 2, 3, . . . , t ≥ 0.
A similar formula holds for the centered subordinators defined in (31).