Approximations to the distribution of sum of independent non-identically gamma random variables

Calculating the distribution of the sum of independent non-identically distributed random variables is necessary in many scientific fields. Computing the probability of the corresponding significance point is important when the sum involves a finite number of random variables. However, this probability becomes difficult to evaluate as the number of random variables increases. Under these circumstances, a more accurate approximation of the distribution function is extremely important. We apply a saddlepoint approximation to the upper probabilities of the distribution of the sum of independent non-identically distributed gamma random variables under finite sample sizes. In this study, we compared the results from the saddlepoint approximation with those from normal and moment-based approximations to identify the most appropriate method for approximating the distribution function.


Introduction
The distribution of the sum of independent identically distributed gamma random variables is well known. However, many scientific applications require the distribution of the sum of independent non-identically distributed (i.n.i.d.) gamma random variables. For example, this distribution is needed to calculate total waiting times when the component times are assumed to be independent exponential or gamma random variables. In addition, engineers calculate total excess water flow into a dam as the sum of i.n.i.d. gamma random variables. To calculate the exact probability distribution of the sum of i.n.i.d. gamma random variables, the probability of all possible elements consistent with the sum must be computed. Mathai [12] derived the distribution of the sum of i.n.i.d. gamma random variables by inverting the moment-generating function. Additionally, Moschopoulos [13] calculated the distribution of the sum of i.n.i.d. gamma random variables using a simple recursive relation. For details of the gamma distribution family, we refer the reader to Khodabin and Ahmadabadi [9].
However, the densities derived by Mathai [12] and Moschopoulos [13] are expressed as infinite series. Computing them is intractable in practice, especially as the number of random variables increases. An exact calculation is feasible by applying the standard inversion formula to the characteristic function in computer algebra systems, such as Mathematica. However, in these calculations, the probability is itself estimated with an approximation method. Approximation methods are widely used and have been studied extensively. From a practical point of view, approximations are typically precise and straightforward to implement in various statistical software programs. Hence, obtaining a more accurate approximation for evaluating the density or the distribution function of sums of i.n.i.d. random variables remains an important area of research in statistics.
In this study, we describe the use of approximation methods to calculate the distribution of the sum of i.n.i.d. gamma random variables in Sect. "A Saddlepoint approximation to the distribution of sum of i.n.i.d. gamma random variables". Furthermore, we discuss the order of the errors of the suggested approximation for the given distribution. For the approximation presented in this paper, we used the saddlepoint formula employed previously by Daniels [2,3] and developed by Lugannani and Rice [11]. The saddlepoint approximation can be obtained for any statistic or random variable that possesses a cumulant generating function. Additionally, the saddlepoint approximation yields accurate probabilities in the tails of the distribution. Saddlepoint approximations have been used with great success by several researchers. Excellent discussions of their applications to a range of distributional problems are found in the following studies: Jensen [8], Huzurbazar [7], Kolassa [10], and Butler [1]. Recently, Eisinga et al. [4] discussed the use of the saddlepoint approximation for the sum of i.n.i.d. binomial random variables. Additionally, Murakami [14] and Nadarajah [15] considered the use of the saddlepoint approximation for the sum of i.n.i.d. uniform and beta random variables, respectively. In Sect. "Numerical results", we discuss the results obtained from using the saddlepoint approximation. In Sect. "Concluding remarks", we summarize our conclusions.

A Saddlepoint approximation to the distribution of sum of i.n.i.d. gamma random variables
In this section, we discuss the saddlepoint approximation to the distribution of the sum of independent non-identically distributed gamma random variables. We assume that $X_1, \ldots, X_n$ are independent gamma random variables with shape parameters $\alpha_i > 0$ and scale parameters $\beta_i > 0$ for $i = 1, \ldots, n$. Next, we let $S_n = X_1 + X_2 + \cdots + X_n$. The moment-generating function of $S_n$ is
$$M_n(s) = \prod_{i=1}^{n} (1 - \beta_i s)^{-\alpha_i}, \qquad s < 1/\max_i \beta_i.$$
It is important to note that Mathai [12] derived the density function of the sum of i.n.i.d. gamma random variables by inverting its moment-generating function; the resulting expression holds for $x > 0$ and involves $\rho = \alpha_1 + \cdots + \alpha_n$ and the Pochhammer symbol $(y)_z$. In addition, Moschopoulos [13] obtained the density function of the sum of i.n.i.d. gamma random variables using a simple recursive relation that starts from $\delta_0 = 1$. It is difficult to evaluate the exact density of $S_n$ with increasing $n$.
Herein, we consider an approximation to the distribution of $S_n$. The cumulant generating function of $S_n$ is
$$\kappa_n(s) = -\sum_{i=1}^{n} \alpha_i \log(1 - \beta_i s).$$
Using the cumulant generating function, the mean, $\mu$, and variance, $\sigma^2$, of $S_n$ are given below:
$$\mu = \kappa_n'(0) = \sum_{i=1}^{n} \alpha_i \beta_i, \qquad \sigma^2 = \kappa_n''(0) = \sum_{i=1}^{n} \alpha_i \beta_i^2.$$
According to Daniels [3], the saddlepoint approximation of the density function of $S_n$ is
$$f^{*}(v) = \left\{2\pi \kappa_n''(\hat{s})\right\}^{-1/2} \exp\left\{\kappa_n(\hat{s}) - \hat{s} v\right\},$$
where $\hat{s}$ is the root of $\kappa_n'(s) = v$, which is readily solved numerically by the Newton-Raphson algorithm.
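As a rough illustration of the saddlepoint density described above, the following Python sketch solves $\kappa_n'(s) = v$ by Newton-Raphson and then evaluates $f^{*}(v)$. The function names (`cgf`, `cgf_d1`, `cgf_d2`, `solve_saddlepoint`, `saddlepoint_density`), the starting value $s = 0$, the step-halving safeguard, and the tolerance are assumptions made for this sketch and are not taken from the paper.

```python
import math

def cgf(s, alpha, beta):
    """kappa_n(s) = -sum_i alpha_i * log(1 - beta_i * s), for s < 1/max(beta)."""
    return -sum(a * math.log(1.0 - b * s) for a, b in zip(alpha, beta))

def cgf_d1(s, alpha, beta):
    """First derivative kappa_n'(s)."""
    return sum(a * b / (1.0 - b * s) for a, b in zip(alpha, beta))

def cgf_d2(s, alpha, beta):
    """Second derivative kappa_n''(s)."""
    return sum(a * b ** 2 / (1.0 - b * s) ** 2 for a, b in zip(alpha, beta))

def solve_saddlepoint(v, alpha, beta, tol=1e-12, max_iter=200):
    """Newton-Raphson root of kappa_n'(s) = v for v > 0 (sketch-level safeguards)."""
    upper = 1.0 / max(beta)  # kappa_n is finite only for s < upper
    s = 0.0                  # assumed starting value; kappa_n'(0) is the mean
    for _ in range(max_iter):
        step = (cgf_d1(s, alpha, beta) - v) / cgf_d2(s, alpha, beta)
        s_new = s - step
        while s_new >= upper:  # halve the step if it would leave the domain
            step *= 0.5
            s_new = s - step
        if abs(s_new - s) <= tol * (1.0 + abs(s_new)):
            return s_new
        s = s_new
    raise RuntimeError("Newton-Raphson did not converge")

def saddlepoint_density(v, alpha, beta):
    """First-order saddlepoint density f*(v) of S_n."""
    s_hat = solve_saddlepoint(v, alpha, beta)
    k2 = cgf_d2(s_hat, alpha, beta)
    exponent = cgf(s_hat, alpha, beta) - s_hat * v
    return math.exp(exponent) / math.sqrt(2.0 * math.pi * k2)
```

For a single gamma variable the saddlepoint equation has the closed-form root $\hat{s} = 1/\beta - \alpha/v$, which gives a convenient check of the solver. In that case $f^{*}(v)$ differs from the exact density only by a constant factor that does not depend on $v$.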
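Several approaches have been used to further reduce the error of the saddlepoint approximation [5]. For example, one method uses a higher order approximation by including adjustments for the third and fourth cumulants [3]. A higher order saddlepoint approximation multiplies $f^{*}(v)$ by the following correction term:
$$1 + \frac{\kappa_n^{(4)}(\hat{s})}{8\left\{\kappa_n''(\hat{s})\right\}^{2}} - \frac{5\left\{\kappa_n^{(3)}(\hat{s})\right\}^{2}}{24\left\{\kappa_n''(\hat{s})\right\}^{3}},$$
where
$$\kappa_n^{(3)}(s) = 2\sum_{i=1}^{n} \frac{\alpha_i \beta_i^{3}}{(1 - \beta_i s)^{3}}, \qquad \kappa_n^{(4)}(s) = 6\sum_{i=1}^{n} \frac{\alpha_i \beta_i^{4}}{(1 - \beta_i s)^{4}}.$$
The approximate tail probabilities of $S_n$ are determined by numerically integrating Eq. (1). An alternative approach is to use the Lugannani and Rice [11] formula for the continuous tail probability approximation as follows:
$$\Pr(S_n \ge v) \approx 1 - \Phi(\hat{w}) + \phi(\hat{w})\left(\frac{1}{\hat{u}} - \frac{1}{\hat{w}}\right),$$
where $\phi(\cdot)$ is the standard normal density function, $\Phi(\cdot)$ is the corresponding cumulative distribution function, and
$$\hat{w} = \operatorname{sgn}(\hat{s})\sqrt{2\left\{\hat{s} v - \kappa_n(\hat{s})\right\}}, \qquad \hat{u} = \hat{s}\sqrt{\kappa_n''(\hat{s})},$$
where $\operatorname{sgn}(\hat{s}) = 1$, $-1$, or $0$ if $\hat{s}$ is positive, negative, or zero, respectively.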
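The Lugannani-Rice tail approximation can be sketched in a few lines of Python. The block below is only an illustration under stated assumptions: the function names (`cgf_derivs`, `solve_saddlepoint`, `lugannani_rice_sf`), the Newton-Raphson safeguards, and the threshold `1e-5` for switching to the limiting value at $\hat{s} = 0$ are choices made for this sketch, not details from the paper. The limiting value used when $v$ is essentially equal to the mean is the standard one, $1/2 - \kappa_n^{(3)}(0) / \{6\sqrt{2\pi}\,\kappa_n''(0)^{3/2}\}$.

```python
import math

def cgf_derivs(s, alpha, beta):
    """Return kappa_n(s) and its first three derivatives for the gamma sum."""
    k0 = k1 = k2 = k3 = 0.0
    for a, b in zip(alpha, beta):
        d = 1.0 - b * s
        k0 -= a * math.log(d)
        k1 += a * b / d
        k2 += a * b ** 2 / d ** 2
        k3 += 2.0 * a * b ** 3 / d ** 3
    return k0, k1, k2, k3

def solve_saddlepoint(v, alpha, beta, tol=1e-12, max_iter=200):
    """Newton-Raphson root of kappa_n'(s) = v, kept inside s < 1/max(beta)."""
    upper = 1.0 / max(beta)
    s = 0.0  # assumed starting value
    for _ in range(max_iter):
        _, k1, k2, _ = cgf_derivs(s, alpha, beta)
        step = (k1 - v) / k2
        s_new = s - step
        while s_new >= upper:  # halve the step if it would leave the domain
            step *= 0.5
            s_new = s - step
        if abs(s_new - s) <= tol * (1.0 + abs(s_new)):
            return s_new
        s = s_new
    raise RuntimeError("Newton-Raphson did not converge")

def lugannani_rice_sf(v, alpha, beta):
    """Approximate upper probability P(S_n >= v) by the Lugannani-Rice formula."""
    s_hat = solve_saddlepoint(v, alpha, beta)
    k0, _, k2, k3 = cgf_derivs(s_hat, alpha, beta)
    if abs(s_hat) < 1e-5:
        # v is (almost) the mean: the formula is singular, so use its limit.
        return 0.5 - k3 / (6.0 * math.sqrt(2.0 * math.pi) * k2 ** 1.5)
    w = math.copysign(math.sqrt(2.0 * (s_hat * v - k0)), s_hat)
    u = s_hat * math.sqrt(k2)
    normal_tail = 0.5 * math.erfc(w / math.sqrt(2.0))          # 1 - Phi(w)
    normal_pdf = math.exp(-0.5 * w * w) / math.sqrt(2.0 * math.pi)
    return normal_tail + normal_pdf * (1.0 / u - 1.0 / w)
```

A simple way to exercise the sketch is the sum of two exponential variables with scales 1 and 2, whose exact upper probability is $2e^{-v/2} - e^{-v}$; the approximation can be compared against that closed form at a few values of $v$.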

Numerical results
In this section, we investigated the upper probability using the saddlepoint approximations to $S_n$. In this study, we focused on the Lugannani-Rice formula. Note that Mathai [12] obtained a normal approximation as $n \to \infty$.
Moschopoulos [13] derived the density of $S_n$ as an infinite series. We truncated the infinite series after a finite number of terms to meet an acceptable precision, as published by Moschopoulos [13], and we bounded the truncation error of the sum of the first $\ell + 1$ terms. In addition, we used another approximation method for the distribution of $S_n$, a moment-based approximation proposed by Ha and Provost [6]. The distribution of $S_n$ is approximated by the polynomially adjusted density $f_k(v)$, where $m(k)$ and $E(M^k)$ denote the $k$th moment of the adjusted distribution $\psi(v)$ and the $k$th moment of $S_n$, respectively. Herein, we consider the approximation adjusted with the skew-normal distribution, which uses $\xi = \min(0.99, |\zeta|)$. Note that we obtained $\xi_0 = 1$. An important step for the proposed method is to determine the optimal degrees for the polynomials. We followed the selection rule, which is based on the integrated squared differences between density approximations, as previously published by Ha and Provost [6].
For this study, the following notations were utilized: exact probability of $S_n$, $E_P$ (as proposed by Moschopoulos [13]); normal approximation, $A_N$; saddlepoint approximation with Lugannani-Rice formula, $A_L$; moment-based approximation with skew-normal polynomial, $A_M$; and the relative error of approximations, r.e. (Tables 1, 2, 3).
We used different values for $\tilde{\alpha} = (\alpha_1, \alpha_2, \ldots, \alpha_n)$ and $\tilde{\beta} = (\beta_1, \beta_2, \ldots, \beta_n)$. These values were grouped into cases 1-3, with $\alpha_i$ and $\beta_i$ specified for $n = 5$, 10 and 15. We observed that the $A_L$ approximation was more accurate than the $A_M$ approximation in all cases tested. Therefore, we suggest estimating the probability using the $A_L$ approximation in cases with large $n$.
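To make the comparison in this section more concrete, the following Python sketch computes a truncated-series reference value for the upper probability, a normal approximation, and a relative error. It rests on several assumptions that should be checked against Moschopoulos [13]: the series is taken in its commonly cited form, in which $S_n$ is a mixture of gamma distributions with shapes $\rho + k$ and common scale $\beta_1 = \min_i \beta_i$, with weights $C\delta_k$ generated by the recursion started from $\delta_0 = 1$. The truncation bound used here (one minus the accumulated mixture weight) is a simple bound chosen for the illustration and is not necessarily the bound used in the paper. The definition of the relative error as $|A - E_P| / E_P$ and all function names are likewise hypothetical.

```python
import math

def gamma_sf(shape, x):
    """P(G > x) for a unit-scale gamma variable, using the power series of the
    regularized lower incomplete gamma function (adequate for moderate x)."""
    if x <= 0.0:
        return 1.0
    term = total = 1.0
    m = 0
    while term > 1e-17 * total:
        m += 1
        term *= x / (shape + m)
        total += term
    log_prefactor = shape * math.log(x) - x - math.lgamma(shape + 1.0)
    return 1.0 - math.exp(log_prefactor) * total

def moschopoulos_sf(v, alpha, beta, tol=1e-10, max_terms=5000):
    """Truncated-series upper probability P(S_n > v) and a truncation bound."""
    b1 = min(beta)
    rho = sum(alpha)
    c = math.exp(sum(a * math.log(b1 / b) for a, b in zip(alpha, beta)))
    ratios = [1.0 - b1 / b for b in beta]
    gammas = []      # gammas[k - 1] holds gamma_k
    deltas = [1.0]   # delta_0 = 1
    x = v / b1
    sf = c * gamma_sf(rho, x)
    mass = c         # accumulated mixture weight C * (delta_0 + ... + delta_k)
    k = 0
    while 1.0 - mass > tol and k < max_terms:
        k += 1
        gammas.append(sum(a * r ** k for a, r in zip(alpha, ratios)) / k)
        d = sum(i * gammas[i - 1] * deltas[k - i] for i in range(1, k + 1)) / k
        deltas.append(d)
        mass += c * d
        sf += c * d * gamma_sf(rho + k, x)
    # The omitted weights sum to 1 - mass and each omitted tail is at most 1.
    return sf, max(1.0 - mass, 0.0)

def normal_sf(v, alpha, beta):
    """Normal approximation to P(S_n > v) using the mean and variance of S_n."""
    mu = sum(a * b for a, b in zip(alpha, beta))
    var = sum(a * b * b for a, b in zip(alpha, beta))
    return 0.5 * math.erfc((v - mu) / math.sqrt(2.0 * var))

def relative_error(approx, exact):
    """Assumed definition of r.e.: |approx - exact| / exact."""
    return abs(approx - exact) / exact
```

For example, `moschopoulos_sf(8.0, [1.0, 1.0], [1.0, 2.0])` returns the reference upper probability together with its truncation bound, and `relative_error(normal_sf(8.0, [1.0, 1.0], [1.0, 2.0]), reference)` then measures how far the normal approximation is from that reference in the upper tail.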

Concluding remarks
In this paper, we considered both the saddlepoint and moment-based approximations to the distribution of the sum of i.n.i.d. gamma random variables. The saddlepoint approximation was an accurate method for calculating the distribution. From our results, we determined that the precision of the saddlepoint approximation was superior to that of both the normal and moment-based approximations.