An iterative algorithm to bound partial moments

This paper presents an iterative algorithm that bounds the lower and upper partial moments of an arbitrary univariate random variable X by using the information contained in a sequence of finite moments of X. The obtained bounds on the partial moments imply bounds on the moments of the transformation f(X) for a certain function f: R → R. Two examples illustrate the performance of the algorithm.


Introduction
Statistical computations in insurance, inventory management, Bayesian point estimation, and other areas often involve the computation of partial moments (Winkler et al. 1972). Bawa and Lindenberg (1977) derive a capital asset pricing model from utility functions based on lower partial moments. The i-th upper (lower) partial moment of a univariate real-valued random variable X with cumulative distribution function F is defined as the i-th moment in excess of (below) a certain threshold a ∈ R:

μ_i^+(a) := E[(max(X − a, 0))^i],  μ_i^−(a) := E[(max(a − X, 0))^i].  (1)

This paper presents an algorithm that bounds the partial moments of X using the information contained in a sequence of (full) moments of X. This approach is strongly related to the moment problem, which originally asked whether a given sequence of moments corresponds to at least one univariate probability measure. The problem has been discussed extensively in the mathematical literature (Akhiezer 1965; Kreĭn and Nudelman 1977; Shohat and Tamarkin 1943; Stoyanov 2013). The admissible support of the probability measure splits the moment problem into three subproblems: the support can be unrestricted (Hamburger moment problem), restricted to the positive half-line (Stieltjes), or restricted to a bounded interval (Hausdorff). The algorithm presented in this paper fits into the Hamburger moment problem, as it allows X to have support in both tails.

✉ Sander Muns, s.muns@uvt.nl. Tilburg University, Netspar, P.O. Box 90153, 5000 LE Tilburg, Netherlands.
In case a sequence of moments is known to correspond to at least one probability measure, an important follow-up question involves what information on the distribution is contained in this sequence. Research on this question has focused on several characteristics: (i) class of probability measure (Berg 1995;Gut 2002;Lin 1997;Pakes et al. 2001;Stoyanov 2000) (ii) tail of the distribution (Goria and Tagliani 2003;Lindsay and Basak 2000) (iii) mode (Gavriliadis 2008) (iv) cumulative distribution function (Mnatsakanov 2008b;Mnatsakanov and Hakobyan 2009) (v) density function (Gavriliadis and Athanassoulis 2009;Mnatsakanov 2008a) (vi) Shannon entropy (Milev et al. 2012) The above-mentioned literature can differ in the admissible support of the probability measure.
By using a sequence of moments, this paper adds to the literature on the moment problem by presenting an iterative algorithm that bounds the partial moments (1) at a = 0. More specifically, the algorithm bounds the partial moments of the positive part, max(X, 0), and the negative part, max(−X, 0), of a univariate real-valued random variable X by using the information contained in known (ratios of) subsequent finite moments of X. If one is interested in the partial moments at some a ≠ 0, one can obtain the moments of X_a := X − a from the sequence of moments of X and bound the moments of max(X_a, 0) and max(−X_a, 0).
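The shift X_a := X − a is a pure moment transformation: the moments of X_a follow from those of X via the binomial theorem. A minimal sketch of this step (illustrative code, not part of the paper):

```python
from math import comb

def shifted_moments(mu, a):
    """Given mu[k] = E[X^k] for k = 0..n, return E[(X - a)^k] via the
    binomial theorem: E[(X - a)^k] = sum_j C(k, j) (-a)^(k-j) E[X^j]."""
    n = len(mu) - 1
    return [sum(comb(k, j) * (-a) ** (k - j) * mu[j] for j in range(k + 1))
            for k in range(n + 1)]

# Example: X standard normal, so mu = [1, 0, 1, 0, 3]; shift by a = 1.
print(shifted_moments([1, 0, 1, 0, 3], 1.0))
```

The resulting sequence feeds directly into the algorithm applied to max(X_a, 0) and max(−X_a, 0).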
The bounds on the partial moments imply bounds on unobserved higher order moments of X . Using a Taylor expansion, one can obtain bounds on the moments of the transformation f (X ) for a certain function f : R → R.
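Concretely, the link between partial moments and the Taylor expansion rests on the decomposition X = X^+ − X^−. The following is a sketch consistent with the identity used later in the paper; the displayed relations are not the paper's numbered equations:

```latex
% Sketch: for i >= 1, X^i = (X^+)^i + (-1)^i (X^-)^i, hence
\mu_i \;=\; \mathbb{E}[X^i] \;=\; \mu_i^+ + (-1)^i\,\mu_i^- .
% Odd i (so (-1)^i = -1):
\underline{\mu}_i^+ - \overline{\mu}_i^- \;\le\; \mu_i \;\le\; \overline{\mu}_i^+ - \underline{\mu}_i^- .
% Even i:
\underline{\mu}_i^+ + \underline{\mu}_i^- \;\le\; \mu_i \;\le\; \overline{\mu}_i^+ + \overline{\mu}_i^- .
% These bounds on \mu_i enter the Taylor expansion around zero:
\mathbb{E}[f(X)] \;=\; \sum_{i=0}^{\infty} \frac{f^{(i)}(0)}{i!}\,\mu_i .
```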
Where applicable, procedures from related literature can be added to our algorithm. If the random variable X belongs to the class of Pearson distributions,¹ a recursive relationship for its partial moments is available in Winkler et al. (1972). If X has a known finite support, the information on the support implies additional bounds (Barnett et al. 2002), and suitable numerical optimization routines can be invoked (Dokov and Morton 2005; Frauendorfer 1988; Kall 1991). Because the sequence of known moments of X typically consists of more than six moments, an unrestricted support can hamper a numerical optimization routine that imposes the sequence of moments as constraints.
The paper proceeds as follows. Preliminaries are in Sect. 2. Section 3 presents the iterative algorithm that bounds partial moments and shows how these bounds imply bounds on other expressions. In Sect. 4, the algorithm is demonstrated for two example distributions. Section 5 is reserved for concluding remarks and points of discussion.

¹ The density function of a Pearson distribution is proportional to exp(−∫ (x − a)/(b_0 + b_1 x + b_2 x²) dx) with constants a, b_0, b_1, and b_2. Example distributions include the normal, beta, uniform, Gamma, Chi-squared, Student's t, and Pareto distributions.
Preliminaries

For a quasi-degenerate random variable X, there exists x ∈ R with P(X ∈ {0, x}) = 1. A non-quasi-degenerate (nqd) random variable X is a random variable that is not quasi-degenerate; thus P(X ∈ {0, x}) < 1 for all x ∈ R. A degenerate X is necessarily quasi-degenerate, while a nqd X is necessarily non-degenerate.
Define the i-th moment ratio of X as m_i := μ_i/μ_{i−1}. The ratios m_i^+ and m_i^− refer to the i-th moment ratio of the positive part X^+ and the negative part X^−, respectively. Both ratios can be referred to as a partial moment ratio of X. Lower bounds and upper bounds are denoted by underlines and bars, respectively. For instance, a lower and an upper bound on μ_i^+ are written as μ̲_i^+ and μ̄_i^+.

The algorithm
This section outlines the algorithm in a number of lemmas and a theorem. By using some simple constraints, the first lemma provides bounds on partial moments and partial moment ratios. This enables an initialization of the bounds. Consider the Christoffel function associated with the cdf F of a non-degenerate X, defined in terms of the orthonormal polynomials {P_j} of F, where P_0(x) ≡ 1 and x_j^(1), …, x_j^(j) are the j zeros of the j-th degree polynomial P_j. Using the definitions above, the following lemma from Gavriliadis and Athanassoulis (2009) bounds the distribution function at these zeros.
Lemma 2 Suppose the moments μ_1, …, μ_{2n} of a non-degenerate X are known (n ∈ N^+). The cumulative distribution function F then satisfies the stated bounds at the j zeros x_j^(1), …, x_j^(j) of each polynomial P_j. Let x̂^(1) < ⋯ < x̂^(N) represent the strictly increasing sequence of the N distinct zeros of the set of polynomials {P_j}_{j=1}^n, and define the increasing step functions L(x) and U(x) from these bounds, with L(x) = 0 for x < x̂^(1) and U(x) = 1 for x > x̂^(N).
Notice that for L(x) (respectively U(x)), one can simply take for each j the largest (smallest) index i that satisfies the corresponding condition. Using these bounds on the cdf F(x), Lemma 3 produces lower bounds on the partial moments.

Lemma 3 (Initialization II) Given the notation above, lower bounds on the partial moments μ_i^+ and μ_i^− follow from the step functions L and U.

Upper bounds on the partial moments μ_i^+ and μ_i^− are not available along the lines of Lemma 3. The iterative algorithm we develop derives upper bounds on μ_i^+ and μ_i^− under certain assumptions. In addition, the algorithm can sharpen the lower bounds obtained in Lemma 3.
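The principle behind Lemma 3 can be sketched numerically: an upper step function on the cdf yields a lower bound on an upper partial moment through the tail-integral representation μ_i^+ = ∫_0^∞ i x^{i−1}(1 − F(x)) dx. The function below illustrates this principle only; it is not the paper's exact formula over the polynomial zeros:

```python
def lower_bound_upper_partial(i, xs, u_vals, x_max):
    """Lower bound on mu_i^+ = E[max(X,0)^i] = int_0^inf i x^(i-1)(1 - F(x)) dx
    from an upper step function on the cdf: F(x) <= u_vals[k] on [xs[k], xs[k+1]).
    xs must be increasing and start at 0; mass beyond x_max is ignored,
    which can only make the bound smaller (so it stays valid)."""
    grid = list(xs) + [x_max]
    lb = 0.0
    for k in range(len(xs)):
        a, b = grid[k], grid[k + 1]
        # integral of i x^(i-1) over [a, b) equals b^i - a^i
        lb += (1.0 - u_vals[k]) * (b ** i - a ** i)
    return lb
```

For X uniform on [0, 1] with the crude step bound U = 0.5 on [0, 0.5) and U = 1 on [0.5, 1), the bound for i = 1 is 0.25, which indeed lies below the true value E[X^+] = 0.5.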
The next lemma is useful to apply in the iterative algorithm as well as for proving some inequalities.

Lemma 4 (Moment ordering)
(i) For nonnegative X and n ≥ 2, or for arbitrary X and even n ≥ 2, μ_{n−1}² ≤ μ_{n−2} μ_n. (ii) Equivalently, in terms of moment ratios, m_{n−1} ≤ m_n. The inequalities are all strict if and only if X is nqd.
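The ordering in Lemma 4 is easy to verify numerically; for an Exp(1) variable the moment ratios are m_n = n exactly. A quick Monte Carlo sanity check (illustrative, not part of the paper):

```python
import random

# Check Lemma 4 on Exp(1): mu_n = n!, so m_n = mu_n / mu_(n-1) = n, and the
# moment ratios of this nonnegative nqd variable are strictly increasing.
random.seed(0)
xs = [random.expovariate(1.0) for _ in range(100_000)]
mu = [sum(x ** n for x in xs) / len(xs) for n in range(5)]
ratios = [mu[n] / mu[n - 1] for n in range(1, 5)]  # approximately [1, 2, 3, 4]
assert all(ratios[k] < ratios[k + 1] for k in range(3))
```

Equivalently, the assertion checks μ_{n−1}² < μ_{n−2} μ_n for n = 2, 3, 4 on the sample moments.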
Lemma 5 produces bounds on the moment ratio of certain summations of nonnegative random variables.
Lemma 5 Consider Y = Σ_{j∈J} Y_j, where each Y_j: Ω → R_0^+ is a nonnegative random variable with strictly positive first moment. Let i ∈ N^+.
(i) Suppose the following two conditions (a) and (b) both hold. (ii) If Y is a mixture, i.e., for almost all outcomes ω ∈ Ω at most one Y_j(ω) is nonzero, then the stated bounds hold. (iv) Suppose the following two conditions both hold; then for each nonempty I ⊆ J, Σ_{j∈I} Y_j is log-concave and m_i(Σ_{j∈I} Y_j) ≤ m_i(Y).

Sufficient condition (b) in (i) reflects that the moment ratio m_i(Y_j) must be separable into two parts with a common functional form across i and j. Nonnegative distributions with moment ratios that satisfy this condition include the Gamma distribution and distributions with a single scalar parameter, such as the half-normal distribution and uniform distributions of the type U[0, b] with b > 0. The moment ratio of the lognormal distribution does not have a separable form. It is natural to inquire about the necessity of the sufficient conditions in Lemma 5. We provide for each condition a counterexample in which the sufficient condition is not satisfied and the corresponding inequality fails to hold.
Lemma 6 (i) Suppose X = Σ_{j∈I} X_j is a mixture of random variables X_j: Ω → R, i.e., for almost all outcomes ω ∈ Ω at most one X_j(ω) is nonzero. (ii) Suppose X = Y − Z with Y = Σ_j Y_j and the following conditions hold: (a) Y and Z are independent and nonnegative random variables; (b) the components Y_j are mutually independent; (c) each Y_j has a log-concave distribution; then m_i(X^+) ≤ m_i(Y). (iii) If X = Y − Z is a mixture of nonnegative random variables Y and Z, i.e., P({Y = 0} ∪ {Z = 0}) = 1, then X^+ ≡ Y. Similar relations apply to m_i^− := m_i(X^−) and m_i(Z). Condition (c) in Lemma 6(ii) on log-concave distributions is satisfied by many univariate distributions. Examples include the exponential distribution, the normal distribution, the uniform distribution, and the Gamma distribution with shape parameter at least one. The Gamma distribution with shape parameter less than one, the Student's t-distribution, and the log-normal distribution are not log-concave.
We provide for each sufficient condition in Lemma 6 an example where the condition is not satisfied and the corresponding inequality is violated. (i) Inequality (9) fails to hold in the nonmixture case X ≡ 2 with X_1 ≡ X_2 ≡ 1: here X is not a mixture and the inequality is violated. (ii)(a) The independence assumption (a) fails to hold, while assumptions (b) and (c) are satisfied. Inequality (10) is not satisfied: m_2(X^+) = 7/9 > 2/3 = m_2(Y). (b) Consider again the counterexample of Lemma 5(iv), where Y = Y_1 + Y_2 with the log-concave distributions Y_1 ∼ Exp(1) and Y_2 ∼ Exp(1/2), and let Z ≡ 5. This satisfies conditions (a) and (c). To violate (b), impose a perfect negative correlation on the quantiles of Y_1 and Y_2. It can be verified that m_2(X^+) = 3.88… > 3.81… = 6 − (2/9)π² = m_2(Y). (c) Suppose P(Y = 1) = 9/10, P(Y = 9) = 1/10, and Z ≡ 1. While conditions (a) and (b) are satisfied, condition (c) on log-concavity of each Y_j fails to hold. Inequality (10) also fails to hold: m_2(X^+) = 8 > 5 = m_2(Y). (iii) The nonmixture case with Y ≡ 2 and Z ≡ 1 corresponds for each i ∈ N^+ to X ≡ X^+ ≡ 1 and m_i^+ = 1 ≠ 2 = m_i(Y).

The next theorem presents the theoretical novelty of the paper. For a nqd X, the moment μ_i^s is bounded in terms of the bounds on m_i^s, which are in turn derived in the proof from the bounds on the moments of X^{−s}. The requirement of a nqd X can easily be verified, since the moments of a quasi-degenerate distribution are uniquely characterized by finite m_1 = m_2 = ⋯ and μ_i = (μ_n)^{i/n} (i ∈ N_0^+). To identify a quasi-degenerate X, it therefore suffices to check whether μ_i = (μ_n)^{i/n} holds for some odd i and even n. The requirements on i and n exclude X that are restricted to the set {−x, 0, x} for some x ∈ R.
Theorem 1 (μ_i^s in terms of the bounds on m_i^s) Consider a nqd X with known moments μ_{n−2}, μ_{n−1}, μ_n, with n ≥ 2 even. Assume that the stated conditions on the bounds on the partial moment ratios hold. The moments of X^s are then bounded accordingly. The next lemma bounds μ_i^s using previously obtained bounds.
Lemma 7 (μ_i^s in terms of the bounds on m_j^s, μ_j^s, and μ_j^{−s}) For i ∈ N^+ and s ∈ {+, −}, the stated inequalities hold. Lemma 8 bounds the moment ratios in terms of the bounds on the moments.

Lemma 8 (m_i^s in terms of the bounds on μ_j^s)
The bounds in (22) feed into the iterative algorithm (Algorithm 1) below, which bounds the partial moments μ_i^+ and μ_i^− and the corresponding moment ratios m_i^+ and m_i^−. The moments μ_1, …, μ_n of the random variable X are known. Before proceeding to the next step, each step is executed for i = 1, …, n [and also for i = 0 in steps (iii)-(iv)]. The index j denotes an index that may depend on i. (iv) Bound μ_i^+ and μ_i^− in terms of the bounds on μ_j^+, μ_j^−, m_j^+, and m_j^− (Lemma 7). (v) Bound m_i^+ and m_i^− in terms of the bounds on μ_j^+ and μ_j^− (Lemma 8). (vi) Stop if the change in the bounds since step (ii) is smaller than some predetermined tolerance; otherwise go to step (ii).
Within each step, apply all applicable inequalities. The different assumptions, particularly those in Lemmas 5 and 6, must be checked in advance. Some may be valid for a given random variable X, while others may not.
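The overall structure of Algorithm 1 is a fixed-point iteration on a collection of bound pairs. A generic skeleton, with a hypothetical `update` callback standing in for steps (ii)-(v) (an illustration of the control flow, not the paper's implementation):

```python
def iterate_bounds(update, bounds, tol=1e-10, max_iter=1000):
    """Generic skeleton of an iterative bound-tightening loop.
    `bounds` maps a quantity name to a (lower, upper) pair; `update` is
    assumed to return new bounds with lower bounds never decreasing and
    upper bounds never increasing.  Stops when the largest change in any
    bound since the previous iteration falls below `tol`."""
    iters = 0
    for k in range(max_iter):
        new = update(bounds)
        change = max(max(abs(new[key][0] - bounds[key][0]),
                         abs(new[key][1] - bounds[key][1])) for key in bounds)
        bounds = new
        iters = k + 1
        if change < tol:
            break
    return bounds, iters
```

As in step (vi) of Algorithm 1, the stopping condition compares the change in the bounds against a predetermined tolerance.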
The following algorithm applies Lemma 6(iii) and is helpful for bounding unobserved higher order moments.
For a certain ν ∈ N^+, construct bounds on μ_i^s using bounds on m_i^s, where i = n + 1, …, n + ν. Conditions on X^s are needed for an upper bound on the partial moments μ_i^s (i > n) in Algorithm 2(ii). To see this, note that the tail of any X^s can admit P(X^s = x) = 1/x^{i−ε} for arbitrarily large x and ε ∈ (0, i); the contribution of this mass point to the i-th moment is x^i · x^{−(i−ε)} = x^ε, which is unbounded as x → ∞. The bounds on the moments of X^+ and X^− from Algorithm 2 provide bounds on the unobserved higher order moments of X through the identity μ_i = μ_i^+ + (−1)^i μ_i^−. This can be used in a Taylor expansion of the transformation f(X).
For an expansion of f (X ) around a nonzero a, one can consider using max(X a , 0) and max(−X a , 0) with X a := X − a.

Examples
The example in Sect. 4.1 reports bounds on partial moments for a case using Algorithm 1. Section 4.2 contains an example that uses Algorithm 1 and bounds a Taylor series expansion by extrapolating the results to higher order moments.

Example 1: a sum of random variables
Consider X = (1/2)T − U + Z, with T exponentially distributed with intensity one, U uniformly distributed on [0, 1], and Z standard normally distributed. The variables T, U, and Z are independent. Since the odd moments of Z are zero, the moments μ_i of X can be obtained from the expansion in (24), where ⌊i/2⌋ denotes the largest natural number not greater than i/2. We obtain the i-th moment of X from (24) for i = 0, 1, …, 16.
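The moments of X can be checked independently of (24) by expanding E[(T/2 − U + Z)^i] with the multinomial theorem, using E[T^a] = a!, E[U^b] = 1/(b + 1), and the even normal moments (c − 1)!!. A sketch of this check (not the paper's formula (24)):

```python
from math import comb, factorial

def moment_X(i):
    """E[X^i] for X = T/2 - U + Z with independent T ~ Exp(1), U ~ U[0,1],
    Z ~ N(0,1), via the multinomial theorem over the three components."""
    total = 0.0
    for a in range(i + 1):
        for b in range(i - a + 1):
            c = i - a - b
            if c % 2:  # odd moments of Z vanish
                continue
            ez = factorial(c) // (2 ** (c // 2) * factorial(c // 2))  # (c-1)!!
            coef = comb(i, a) * comb(i - a, b)
            total += coef * (0.5 ** a * factorial(a)) * ((-1) ** b / (b + 1)) * ez
    return total

print([round(moment_X(i), 6) for i in range(4)])
```

For instance, E[X] = 1/2 − 1/2 + 0 = 0 and E[X²] = Var(X) = 1/4 + 1/12 + 1 = 4/3, which the function reproduces.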
We are interested in bounding the moments of X^+ := max(X, 0) and X^− := max(−X, 0). The distribution of X is equal in distribution to a mixture of two distributions (26), and both components in (26) consist of three independent log-concave distributions. Upper bounds on m_i^+ := m_i(X^+) and m_i^− := m_i(X^−) are initialized by applying Lemma 6(i)-(ii) and then Lemma 5(iv), see (27)-(28). The initial moment ratios in (27)-(28) follow from the moments of T and U in (25). As a measure of convergence after iteration k, consider the change in the width of the error bounds (30), where the first subscript i of each moment bound refers to the order of the partial moment, and the second subscript k refers to the iteration number, with iteration 0 immediately after the initialization. A small ε_k indicates a small improvement in the error bounds. Using ε_k < 10⁻¹⁰ as a stopping condition, Algorithm 1 stops after 83 iterations and 151 ms of CPU time.³ Table 1 reports the bounds on the moments μ_i^+ of X^+ and μ_i^− of X^−. By (20)-(21), the difference μ̄_i^s − μ̲_i^s is independent of the sign s ∈ {+, −} (3rd column). This difference relative to the moment μ_i tends to decrease with the order i of the moment (4th column). This can be helpful for bounding Taylor expansions, because the moments of the highest observed order are most important for bounding unobserved higher order moments of X (here, the moments of order greater than sixteen).
The difference between the bounds relative to μ + i is also decreasing with the order (7th column). In contrast, the difference between the bounds relative to μ − i increases for the higher order moments to 43.7% (rightmost column). This higher percentage may reflect that the sequence of 16 moments cannot uniquely pin down the characteristics of particularly X − . The reason is that X + tends to be larger than X − , as indicated by the moment bounds. As such, the higher order moments of X are mainly determined by the higher order moments of X + .
Table 2 reports the convergence by iteration for the partial moments of order 10. Here, 25 iterations suffice for each of the bounds to differ by less than 0.03% from the bounds on the 10-th moment after the final iteration. The final bounds on μ_10^+ and μ_10^− differ from each other by 2.2 and 22.0%, respectively. Table 3 indicates that, for this example, executing 25 iterations gives bounds within at most 0.14% of the final bounds on the 16-th partial moments of X. The relative difference between the final bounds on μ_16^+ is, by coincidence, also 0.14%. Figure 1 depicts the series ε_k as a function of the iteration k. The linear trend in ε_k suggests that for some ã, b̃ ∈ R, log(ε_k) ≈ ã + b̃k. (31)
A small number of numerical experiments suggest that the exponential convergence of ε k in (31) might hold in general. A proof of this conjecture is beyond the scope of this paper.

Table 1 Moments of X from (24) and final bounds on the partial moments of X^s from Algorithm 1 (s ∈ {+, −})

The product ∏_{k=1}^∞ (1 − ε_k) is a measure of the cumulative decrease in the width of the error bounds over all iterations. Provided (31) is the correct model and b̃ ≤ 0, the lower bound and upper bound do not converge to each other, where a = e^ã and b = e^b̃. An estimation of the linear regression (31) using ε_20, …, ε_83 of this example gives ã = 1.45 and b̃ = −0.299. This predicts a small cumulative relative decrease in the width of the error bounds after iteration 83 (33). Because ε_k is a maximum over different orders of moments, the value in (33) can be interpreted as an upper bound on the cumulative decrease for the partial moments of order 0, 1, …, 16.

In some cases, initial bounds on the moment ratios can be difficult to obtain. Instead of the initial bounds (27)-(28), suppose we were to initialize the upper bounds on the moment ratios by (34). By Lemma 5(iii), this gives less strict initial bounds on the moment ratios than (27)-(28) (Table 4). The effect of the initial bounds on the final bounds is substantial, as can be seen by comparing Tables 1 and 5. Each difference μ̄_i^s − μ̲_i^s is higher in Table 5 than in Table 1. This underlines the importance of providing initial bounds in Algorithm 1 that are as strict as possible. In particular, the bounds on the moments of X^− are sensitive to the initial bounds: the accuracy of the bounds on the highest order moment of X^− decreases by three orders of magnitude when the less strict initial bounds in (34) are imposed (rightmost column in Tables 1 and 5).

Example 2: the exponential function on a quadratic form
Suppose one is interested in E exp(X) with the quadratic form X = Z'ÃZ = (1/2)Z'(Ã + Ã')Z, where Z ∼ N(0, I) has dimension d and Ã is a d × d matrix with eigenvalues less than 1/2. Diagonalize the symmetric matrix A := (1/2)(Ã + Ã') as VΛV', with Λ a diagonal matrix whose diagonal entries are the eigenvalues λ_1 ≤ ⋯ ≤ λ_d < 1/2. Since V'Z ∼ Z, the random variable X is a weighted sum of d independent Chi-squared distributions with one degree of freedom.

Table 5 Moments of X from (24) and final bounds on the partial moments of X^s from Algorithm 1 (s ∈ {+, −}); instead of (27)-(28) as in Table 1, the less strict initial bounds from Lemma 5(iii) are used

We can infer the exact outcome of E exp(X) from the moment generating function of a Chi-squared distribution (35). The expectation of exp(X) is infinite if max_i λ_i ≥ 1/2. The outcome in (35) enables a direct comparison with the bounds that we obtain. It should be stressed that an exact representation of E[f(X)] is in general unavailable.
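The exact value (35) is straightforward to evaluate. A sketch, assuming the eigenvalue representation of X given above (E exp(Σ_i λ_i χ²_1) = ∏_i (1 − 2λ_i)^{−1/2}, the standard chi-squared MGF):

```python
from math import prod, inf

def e_exp_quadratic(eigs):
    """Exact E[exp(X)] for X = Z'AZ with Z ~ N(0, I) and A symmetric with
    eigenvalues `eigs`, via the chi-squared moment generating function:
    E exp(sum_i lam_i * chi2_1) = prod_i (1 - 2 lam_i)^(-1/2).
    The expectation is infinite if any eigenvalue is >= 1/2 (cf. (35))."""
    if max(eigs) >= 0.5:
        return inf
    return prod((1.0 - 2.0 * lam) ** -0.5 for lam in eigs)
```

This exact value is what the bounds obtained from (39) are compared against in Tables 6-8.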
Consider the Taylor series expansion (36). The moments of X are given by (Magnus 1986, Lemma 3) a summation over all ν = (n_1, …, n_i) with each n_j ∈ N_0 and Σ_{j=1}^i j·n_j = i. This procedure is computationally expensive for moments of high order i. A procedure based on Algorithm 2 enables us to bound high moments of X, and thus E exp(X) in (36). The bounds follow from a few additional steps that we outline below.
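As a cheaper alternative to the partition sum in (37) (a substitute technique, not the paper's procedure), the moments of a weighted sum of independent χ²_1 variables follow from its cumulants κ_j = 2^{j−1}(j − 1)! Σ_l λ_l^j via the standard moments-from-cumulants recursion:

```python
from math import comb, factorial

def quad_form_moments(eigs, n):
    """Moments E[X^i], i = 0..n, of X = sum_l lam_l * chi2_1 from the
    cumulants kappa_j = 2^(j-1) (j-1)! sum_l lam_l^j and the recursion
    mu_i = sum_{j=1}^{i} C(i-1, j-1) kappa_j mu_{i-j}."""
    kappa = [0.0] + [2 ** (j - 1) * factorial(j - 1) * sum(l ** j for l in eigs)
                     for j in range(1, n + 1)]
    mu = [1.0]
    for i in range(1, n + 1):
        mu.append(sum(comb(i - 1, j - 1) * kappa[j] * mu[i - j]
                      for j in range(1, i + 1)))
    return mu
```

For a single eigenvalue equal to one this reproduces the χ²_1 moments 1, 3, 15, 105, …, matching the partition-sum formula.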
Using a polar coordinate system, it can be verified that X ∼ R·λ(U_d), where the direction vector U_d follows a uniform distribution on the unit d-sphere S_d, λ(u) := u'Λu, and R, the squared distance to the origin, follows a χ²(d) distribution. The latter distribution is a Gamma distribution with shape parameter d/2 and scale parameter 2. This means that X is an infinite mixture of Gamma distributions X_u, with each component characterized by some u ∈ S_d. More specifically, each component X_u in the mixture is a Gamma distribution scaled by λ(u). The scaling parameter λ(u) varies between λ_min = min_i λ_i = λ_1 and λ_max = max_i λ_i = λ_d. For each component X_u, a positive (negative) λ(u) indicates that the corresponding Gamma distribution is added (subtracted). We loosely write X_u ∼ Gamma(d/2, 2λ(u)) for all λ(u), including negative λ(u).
Define the scaled remainder ξ_n(Y) of a random variable Y by the functional in (38). For a random variable Y that degenerates at x, i.e., Y ≡ x, the functional in (38) can be written as a function of x in terms of 1F1(a, b; x), the confluent hypergeometric function of the first kind with parameters a and b. By (36) and (38), E exp(X) can be expressed through ξ_n(X^+) and ξ_n(−X^−), see (39). The moments μ_1, …, μ_n are obtained from (37), while Algorithm 1 produces bounds on μ_n^+ and μ_n^−. The following lemma derives bounds on ξ_n(X^+) and ξ_n(−X^−) in (39).
Lemma 9 (i) The functionals ξ_n(X^+) and ξ_n(−X^−) are bounded as stated, where 2F1 is the Gaussian hypergeometric function.
The bounds in Lemma 9(i) can deal with any distribution of X, in particular any eigenvalues of A. In contrast, the two bounds in Lemma 9(ii), which are based on the Gaussian hypergeometric function 2F1, require that each absolute eigenvalue of A is less than 1/2, because X_u has scale parameter θ(u) = 2u'Au = 2λ(u) with u ∈ S_d.
We present example cases where the minimal eigenvalue λ_min is −0.1, −0.2, −0.3, or −0.4, while the maximal eigenvalue λ_max is 0.1, 0.2, 0.3, or 0.4. The other eigenvalues are either equally spaced between λ_min and λ_max, or equally split between the two extremes. Define r and w as in (40), where μ̲_Y and μ̄_Y are a lower and an upper bound on the mean μ_Y of Y, respectively. The ratio r represents the size of the maximal error relative to the mean μ_Y; a small r indicates more accurate bounds. The weight w ∈ [0, 1] is the normalized location of μ_Y on the interval [μ̲_Y, μ̄_Y]. A value of w close to zero (one) reflects that the lower (upper) bound is the more accurate bound on μ_Y. The bounds are equally accurate if w = 1/2. Algorithm 1 stops after iteration k if ε_k < 10⁻⁶ with ε_k as in (30), or if k = 100. Subsequently, the expression in (39) can be bounded. Table 6 reports several statistics for the case with two eigenvalues (d = 2). The mean μ_Y is in all cases of the same order of magnitude. The maximal relative error r is minimal when both extreme eigenvalues λ_min and λ_max are close to zero; a similar observation holds for d = 20 in Table 7 (eigenvalues equally spaced on the interval [λ_min, λ_max]) and in Table 8 (10 eigenvalues at each of λ_min and λ_max). Indeed, we can perfectly estimate μ_Y = 1 for the case where each eigenvalue equals zero.
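Equation (40) itself is not reproduced above; the following formulas are a plausible reading of the verbal description of r and w (hypothetical, flagged as such in the code):

```python
def accuracy_stats(lo, hi, mu):
    """Hypothetical reading of (40): r is the maximal error of the bounds
    [lo, hi] relative to the true mean mu, and w is the normalized location
    of mu inside [lo, hi] (w near 1 means the upper bound is the tighter one)."""
    r = max(hi - mu, mu - lo) / mu
    w = (mu - lo) / (hi - lo)
    return r, w

# Bounds [0.9, 1.2] around a true mean of 1.0.
print(accuracy_stats(0.9, 1.2, 1.0))
```

With these definitions, r = 0.2 and w = 1/3 for the example above, i.e., the lower bound is the more accurate one.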
It follows from w in Table 6 that with d = 2, the upper bound tends to be more accurate than the lower bound if λ max is high, thus when X + tends to be large. This observation is reversed for d = 20 (Tables 7 and 8). Compared to λ max , the value of λ min has a smaller impact on w. The accuracy decreases with the dimension d and the magnitude of the eigenvalues of A. More specifically, we observe a lower accuracy r if d = 20 and max(|λ min | , |λ max |) ≥ 0.3.
The computation times in Tables 6, 7, and 8 are the mean CPU time of 5,000 computations of each model. The standard error is at most 0.12 ms. The computation time and the number of iterations are lower in cases where |λ_min| = |λ_max|.

Discussion and conclusions
This paper has presented an iterative algorithm that bounds the lower and upper partial moments at a ∈ R of a random variable X with a known finite sequence of moments. In a numerical example, the higher order partial moments in particular can have narrow bounds. The obtained bounds imply bounds on unobserved higher order moments of X, which is useful for bounding moments of the transformation f(X). In another application, the transformation f(x) = e^x is considered for the quadratic form X = Z'AZ, where Z follows a multivariate normal distribution. Numerical experiments suggest that the obtained bounds on E exp(X) are most accurate if X^+ is not too large. The accuracy depends on the dimension as well as the eigenvalues of A.

The first inequality in (3) follows from the stated identity. The second and third inequalities in (3) follow from the stated inequalities (n ∈ N^+).

Proof of Lemma 2 See Theorem 3.1 in Gavriliadis and Athanassoulis (2009).

Proof of Lemma 3 The j zeros x_j^(1), …, x_j^(j) of each polynomial P_j are real and distinct (Akhiezer 1965; Szegö 1975). Lemma 2 implies that the functions L and U are a lower bound and an upper bound of the cdf F of X, respectively, where j_0^+, d_j^L, and d_j^U (j = 1, …, N) are defined in Lemma 3.

Proof of Lemma 4 (i) Apply the Cauchy-Schwarz inequality with the inner product ⟨Y, Z⟩ := E[YZ]. Substituting Y = X^{n/2−1} and Z = X^{n/2} leads to the desired result for n > 2, and for n = 2 provided μ_0 := P(X ≠ 0) = 1. An additional step is needed for the case where n = 2 and μ_0 < 1. Consider the Cauchy-Schwarz inequality μ̃_1² = (E[X̃])² ≤ E[X̃⁰]E[X̃²] = μ̃_0 μ̃_2 with X̃ ∼ X | X ≠ 0. Since μ̃_i = μ_i / P(X ≠ 0), multiplying both sides of this inequality by [P(X ≠ 0)]² gives the result for n = 2 and μ_0 < 1.
The Cauchy-Schwarz inequality holds with equality if and only if (iff) Y = λZ for some constant λ, Y ≡ 0, or Z ≡ 0. None of these three conditions holds if X is nqd. It follows that μ 2 n−1 = μ n μ n−2 holds iff X is quasi-degenerate. (ii) Follows from μ 1 > 0 and (i).
A convex combination of the larger m_i(Y_j), j ≥ j_0, is larger than a convex combination of this combination and another convex combination of the smaller m_i(Y_j) with j < j_0. Let w^(i) denote the corresponding weights. By (41), the measure w^(i+1) stochastically dominates the measure w^(i). Because {h_j}_{j∈J} is a positive nondecreasing sequence and w^(i+1) stochastically dominates w^(i), (42) holds for each r ∈ J. Inequality (5) follows from rearranging (42). Hence, by independence of each {Y_r}_{r∈J} and Lemma 4(ii), the stated bound holds after substituting (44). (iv) As the case I = J is straightforward, assume I ⊂ J. Define the complement of I as I^c := J \ I. Since each Y_j is independent and log-concave, the convolutions Y_I := Σ_{j∈I} Y_j, Y_{I^c} := Σ_{j∈I^c} Y_j, and Y = Y_I + Y_{I^c} have log-concave distributions (Ibragimov 1956). Denote the probability density functions (pdfs) of the nonnegative random variables Y_I, Y_{I^c}, and Y accordingly. This proves that log(f_{Y_I}(x)/f_Y(x)), and thus the ratio f_{Ỹ_I}/f_Ỹ, is nonincreasing on [0, ∞). Therefore, E[Ỹ_I] ≤ E[Ỹ], and thus m_i(Y_I) ≤ m_i(Y) for any i ≥ 1.

Proof of Lemma 6
We prove the bounds on m_i^+ := m_i(X^+) and m_i(Y); the statements on m_i^− and m_i(Z) follow analogously.
(i) Since X = Σ_{j∈I} X_j is a mixture, the positive part of X is X^+ = Σ_{j∈I} X_j^+. Apply Lemma 5(ii). (ii) Since each Y_j is independent and log-concave, the convolution Y = Σ_{j∈J} Y_j is log-concave (Ibragimov 1956). Denote the probability density functions (pdfs) of X and Y by f_X(x) := ∫_0^∞ f_Y(x + z) dF_Z(z) and f_Y(x), respectively. For any i ≥ 1, consider X̃ and Ỹ with the stated pdfs. Notably, the derivation above starts with bounds on the moments of X^− and ends with bounds on the moments of X^+.
In (54), an upper bound on μ_n^s corresponds to a lower bound on k^s. The function k^s is continuous on (0, ∞) and monotonically increasing on the interval ([s m_n]^+, ∞). Provided s m_n > 0, k^s monotonically decreases on (0, s m_n). By (14), 1/m_n^s ≥ 1/(s m_n), such that m_n^s gives the maximal upper bound on μ_n^s in (54). Thus, using m_n^s (instead of s m_n) can tighten the upper bound on μ_n^s. This proves the last inequality in (16).

Proof of Lemma 7
The inequality in (17) follows from Lemma 4(i). Next, consider Hölder's inequality E|YZ| ≤ (E|Y|^p)^{1/p} (E|Z|^q)^{1/q}. The first inequality in (18) is obtained by substituting Y = (X^s)^j, Z ≡ 1, p = i/j, and q = p/(p − 1). The first inequality in (19) follows from Y = (X^s)^i, p = j/i, and q = p/(p − 1). The second inequality in both (18) and (19) follows similarly.

Proof of Lemma 8
We prove the case with strict bounds, since the case with equality signs is trivial. Both inequalities in (22) then follow. We prove the properties of r_{m,n} in Table 9. Notice that r_{1,n}(x) = ξ_n(x) = (1F1(1, n + 2; x) − 1)/(n + 1)!.