Efficient Importance Sampling for Large Sums of Independent and Identically Distributed Random Variables

We discuss estimating the probability that the sum of nonnegative independent and identically distributed random variables falls below a given threshold, i.e., $\mathbb{P}(\sum_{i=1}^{N}{X_i} \leq \gamma)$, via importance sampling (IS). We are particularly interested in the rare event regime when $N$ is large and/or $\gamma$ is small. The exponential twisting is a popular technique for similar problems that, in most cases, compares favorably to other estimators. However, it has some limitations: i) it assumes the knowledge of the moment generating function of $X_i$ and ii) sampling under the new IS PDF is not straightforward and might be expensive. The aim of this work is to propose an alternative IS PDF that approximately yields, for certain classes of distributions and in the rare event regime, at least the same performance as the exponential twisting technique and, at the same time, does not introduce serious limitations. The first class includes distributions whose probability density functions (PDFs) are asymptotically equivalent, as $x \rightarrow 0$, to $bx^{p}$, for $p>-1$ and $b>0$. For this class of distributions, the Gamma IS PDF with appropriately chosen parameters retrieves approximately, in the rare event regime corresponding to small values of $\gamma$ and/or large values of $N$, the same performance of the estimator based on the use of the exponential twisting technique. In the second class, we consider the Log-normal setting, whose PDF at zero vanishes faster than any polynomial, and we show numerically that a Gamma IS PDF with optimized parameters clearly outperforms the exponential twisting IS PDF. Numerical experiments validate the efficiency of the proposed estimator in delivering a highly accurate estimate in the regime of large $N$ and/or small $\gamma$.


Introduction
Efficient estimation of rare event probabilities finds various applications in the performance evaluation/prediction of wireless communication systems operating over fading channels [27].In particular, the left-tail of the cumulative distribution function (CDF) of sums of nonnegative independent and identically distributed (i.i.d.) random variables is an example of a rare event probability that is of practical importance.More specifically, the outage probability at the output of equal gain combining (EGC) and maximum ratio combining (MRC) receivers can be expressed as the CDF of the sum of fading channel envelops (for EGC) and fading channel gains (for MRC) [9].
The accurate estimation of the left-tail of the CDF of sums of random variables requires the use of variance reduction techniques because the naive Monte Carlo sampler is computationally expensive [3,21,25].Moreover, the existing closed-form approximations [13,14,18,22,23,29,30] fail to be accurate when the tail of the CDF is considered.The literature is rich in works in which variance reduction techniques were developed to efficiently estimate rare event probabilities corresponding to the left-tail of the CDF of sums of random variables, see [1,4,5,8,9,11,16] and the references therein.For instance, the authors in [4] used exponential twisting, which is a popular importance sampling (IS) technique, to propose a logarithmically efficient estimator of the CDF of the sum of i.i.d.Log-normal random variables.The logarithmic efficiency is a popular property in rare event simulation used to ensure estimators' efficiency [9].Let α be an unbiased estimator of α, i.e., E[α] = α.We say that α is logarithmically efficient if lim α→0 log(E[ α2 ]) log(α 2 ) = 1.In [16], the CDF of the sum of correlated Log-normal random variables was considered.The authors developed an IS estimator based on shifting the mean of the corresponding multivariate Gaussian distribution.Under mild assumptions, they proved that their proposed estimator is logarithmically efficient.Based on [16] and under the assumption that the left-tail sum distribution is determined by only one dominant component, the authors in [1] combined IS with a control variate technique to construct an estimator with the asymptotically vanishing relative error property, which is the most desired property in the field of rare event simulations [21].In [9], two unified IS approaches were developed using the hazard rate twisting concept [7,20] to efficiently estimate the CDF of sums of independent random variables.The first estimator is shown to be logarithmically efficient, whereas the second achieves the bounded relative error property for i.i.d.sums of random variables and under the given assumption that was shown to hold for most of the practical distributions used to model the amplitude/power of fading channels.The bounded relative error is a stronger criterion than the logarithmic efficiency.We say that an unbiased estimator α of α achieves the bounded relative error property if var [ α]   α 2 is asymptotically bounded when α goes to 0, see [9] The efficiency of the above mentioned estimators was studied when the number of summand N was kept fixed.More specifically, recall that the objective is to efficiently estimate the probability that the sum of nonnegative i.i.d.random variables falls below a given threshold, i.e., P( N i=1 X i ≤ γ).A close look at the above mentioned estimators shows that the efficiency results were proved when the rarity parameter γ decreases whereas N is kept fixed.However, in most cases, the efficiency of the existing estimators is considerably affected when N increases.This represents the main motivation of the present work.We aim to introduce a highly accurate estimator that efficiently estimate P( N i=1 X i ≤ γ) in the rare event regime when N is large and/or γ is small.
It is well-acknowledged that the exponential twisting technique compares favorably, in most cases, to existing estimators.It is the optimal IS probability density function (PDF) in the sense that it minimizes the Kullback-Leibler (KL) divergence with respect to the underlying PDF under certain constraints [24].However, it has some limitations.First, it requires the knowledge of the moment generating function of X i , i = 1, 2, • • • , N .Second, sampling according to the new IS PDF is not straightforward and might be expensive.Moreover, the twisting parameter is not available in a closed-form expression and needs to be estimated numerically.Motivated by the above limitations, we summarize the main contributions of the present work as follows: • We propose an alternative IS estimator that approximately yields, for certain classes of distributions and in the rare event regime, at least the same efficiency as the one given by the estimator based on exponential twisting and at the same time does not introduce the above limitations.
• The first class includes distributions whose PDFs vanish at zero polynomially.For this class of distributions, the Gamma IS PDF with appropriately chosen parameters retrieves approximately, in the regime of rare events corresponding to small values of γ and/or large values of N , the same performances as the exponential twisting PDF.
• The above result does not apply to the Log-normal setting as the corresponding PDF approaches zero faster than any polynomials.We show numerically that in this setting, the Gamma IS PDF with optimized parameters achieves a substantial amount of variance reduction compared to the one given by exponential twisting.
• Numerical comparisons with some of the existing estimators validate that the proposed estimator can deliver highly accurate estimates with low computational cost in the rare event regime corresponding to large N and/or small γ.
The paper is organized as follows.In section 2, we define the problem setting and motivate the work.In section 3, we introduce the exponential twisting approach and present its limitations.The main contribution of this work is presented in section 4, where we show that the Gamma IS PDF with optimized parameters retrieves approximately, for certain classes of distributions and in the rare event regime, at least the same performance as the exponential twisting technique.Finally, numerical experiments are shown in section 5 to compare the proposed estimator with various existing estimators.
2 Problem Setting and Motivation where P hX (•) is the probability under which the random vector , for any Borel measurable set A in R N , we have P hX (X ∈ A) = A h X (x)dx.As an application, the quantity of interest α(γ, N ) could represent the outage probability at the output of EGC and MRC wireless receivers operating over fading channels.In fact, the instantaneous signal to noise ratio (SNR) at EGC or MRC diversity receivers is given as follows [9] where N is the number of diversity branches, Es N0 is the SNR per symbol at the transmitter, R i , i = 1, 2, ..., N , is the fading channel envelope and The outage probability is defined as the probability that the SNR falls below a given threshold.Using (2), it can be easily shown that the outage probability at the output of EGC and MRC receivers can be expressed as the CDF of the sum of fading channel envelops (for EGC) and fading channel gains (for MRC), and hence can be expressed as in (1).We focus on the estimation of α(γ, N ) when N is large and/or γ is small.Before delving into the core of the paper, we illustrate via a simple example that the efficiency of an IS estimator, that performs well when γ decreases and N is not sufficiently large, can deteriorate when we increase the values of N .
We first write the quantity of interest as where w i is equal in distribution to Xi γ conditional on the event {X i ≤ γ}, i = 1, 2, .• • • , N , and h w (w) = N i=1 f w (w i ) with f w (•) is the PDF of w i , i.e., the conditional PDF of Xi γ given the event { Xi γ ≤ 1}, and is given by f w (w) = γfX (γw) FX (γ) 1 (w<γ) .Note that E hw [•] denotes the expectation under h w (•).The estimator is then given by estimating the right-hand side term of (4) by the naive Monte Carlo method where (w N ) , k = 1, • • • , M , are independent realizations sampled according to h w (•).Note that this estimator can be understood as applying IS with IS PDF being the truncation of the underlying PDF over the hypercube [0, γ] N .It can be easily proved that for fixed N , this estimator achieves the desired bounded relative error property with respect to the rarity parameter γ for distributions that satisfy f w (x) ∼ bx p as x approaches zero and for p > −1 and b > 0, see [8].This property means that the squared coefficient of variation, defined as the ratio between the variance of an estimator and its squared mean, remains bounded as γ → 0, see [21].More precisely, when this property holds, the number of required samples to meet a fixed accuracy requirement remains bounded independently of how small α(γ, N ) is.The question now is what happens when N is large.Using the Chernoff bound, we obtain for all η > 0 where E fw [•] denotes the expectation under f w (•).The squared coefficient of variation of α(γ, N ) in ( 4) is given by In particular, when η = 1, the squared coefficient of variation (which is asymptotically equal to 1/P hw ( N i=1 w i ≤ 1) in the regime of rare events) is lower bounded by exp (−1 − N log (E fw [exp(−w)])).This shows that the squared coefficient of variation increases at least exponentially, which proves that the efficiency of the estimator deteriorates when N is large.

Exponential Twisting
In this section, we review the popular exponential twisting IS approach and enumerate its limitations in estimating the quantity of interest.When applicable, it is well-acknowledged that the exponential twisting technique is expected to produce a substantial amount of variance reduction and to compare favorably, in most cases, to other estimators [4].For distributions with light right tails and under the i.i.d.assumption, the estimator based on exponential twisting can be proved, under some regularity assumptions, to be logarithmically efficient when the probability of interest is either P hX ( N i=1 X i > γ) and γ → +∞ or P hX ( N i=1 X i > γN ) and N → +∞ [3].In the left tail setting, which is the region of interest in the present work, the exponential twisting was shown in [4] to achieve the logarithmic efficiency property in the case of i.i.d.Log-normal random variables when the probability of interest is P hX ( N i=1 X i < N γ) and either N → +∞ or γ → 0.
In [24], the exponential twisting technique was also shown to be optimal in the sense that it minimizes the KL divergence with respect to the underlying PDF under the constraint that the rare set {x ∈ R N + , such that N i=1 x i ≤ γ} is no longer rare.The IS PDF is selected to be the solution of the following optimization problem, see [24], The solution of this problem is given as (see [24] for a more general setting) Hence, by writing h * X (x) = N i=1 f * X (x i ), we clearly observe that the optimal density is given by exponentially twisting each univariate PDF f X (•) ] and the optimal twisting parameter θ * satisfies Since the left-tail of sums of random variables is considered in this work, we have that θ * → −∞ as γ → 0 and/or N → +∞ [4].Using the exponential twisting technique, the IS estimator of α(γ, N ) using Observe, however, that the exponential twisting technique has some restrictive limitations.The main one is that sampling according to f * X (•) is not straightforward.One generally needs the use of an acceptance-rejection technique, the complexity of which can be dramatic when the probability of acceptance is relatively small.In such a case, the computational complexity of the algorithm can be huge and even worse than the naive Monte Carlo method.There are other less critical drawbacks.First, computations are much simpler if the moment generating function M (θ) is known in closed-form.Such a requirement does not hold in general.Also, the twisting parameter θ * does not have, in general, a closed-form expression, and hence, it should be approximated numerically.

Gamma Family as IS PDF
The objective of this paper is to propose an alternative IS PDF that approximately yields, for certain classes of distributions that include most of the common distributions and in the rare event regime corresponding to large N and/or small γ, at least the same performance as the exponential twisting technique and at the same time does not introduce serious limitations.We distinguish three scenarios depending on how the PDF f X (•) approaches zero.
4.1 f X (x) ∼ b as x goes to 0 and b > 0 is a constant Recall that the exponential twisting IS PDF satisfies We choose θ to be equal to θ such that Through simple computation, we obtain θ = − N γ .To conclude, when f (x) ∼ b and b > 0, we propose an IS PDF given by the exponential distribution with rate N γ .
4.2 f X (x) = x p g(x) with g(x) ∼ b as x goes to 0, p > −1, and b > 0 is a constant Using the same methodology as in section 4.1, the IS PDF that we consider is Therefore, the new PDF corresponds to the Gamma PDF with shape parameter p + 1 and scale parameter −1/θ.The normalizing constant is M (θ) = Γ(p+1) (−θ) p+1 .Hence, the value θ is chosen to be equal to θ such that N and is given by Using the Gamma IS PDF in (7), the proposed IS estimator of α(γ, N ) using In Table I, we provide a non-exhaustive list of distributions that belong to section 4.2 (note that distributions in section 4.2 include those in section 4.1).These distributions are among the most used distributions to model the amplitudes and powers of wireless communications fading channels.
Remark 1.It is worth mentioning that for distributions satisfying f X (x) = x p g(x) with g(x) ∼ b as x goes to 0, p > −1, and b > 0 is a constant, the proposed approach with the Gamma IS PDF in (7) with parameters p and θ in (8) achieves approximately, as γ decreases to 0 and/or N increases, the same performance as the one given by the exponential twisting without introducing a Functions I ξ (•), and K ξ (•) are respectively the modified Bessel functions of the first kind and order ξ and the second kind and order ξ [15].serious limitations.Let A 1 and A 2 be the second moments of the proposed and the exponential twisting estimators, respectively.Then, the ratio between A 1 and A 2 has the following expression First observe that x p dx for sufficiently small negative values of θ.Moreover, recall that θ * and θ go to −∞ as either γ → 0 or N → ∞, and that θ * and θ M ( θ) = γ N , respectively.Thus, as γ → 0 and/or N → ∞, we obtain that θ * is well-approximated by θ, and hence M (θ * ) is wellapproximated by b M ( θ).Finally, using the latter two approximations and the fact that g(x) ∼ b as x goes to 0, we conclude from (9) that A 1 is approximately equal to A 2 when γ goes to 0. For large values of N , the same conclusion can be deduced by observing that E Thus, the random variables X 1 , X 2 , • • • , X N take, when sampled according to the IS PDFs, sufficiently small values when N is sufficiently large.

The Log-normal Case
Distributions that do not approach 0 polynomially are much more difficult to handle and need to be tackled on a case-by-case basis.In this work, we consider the case of the sum of i.i.d.standard Log-normal random variables.The density decreases to 0 at a faster rate than any polynomials and thus the Gamma distribution with fixed shape parameter will not recover the results given by the use of the exponential twisting technique.Note that in [4], the exponential twisting technique was applied to the sum of i.i.d.standard Log-normals by i) providing an unbiased estimator of the moment generating function, ii) approximating the value of θ, and iii) using acceptance-rejection to sample from the IS PDF.
The main difficulty is that the PDF of the Log-normal distribution does not have a Taylor expansion at x = 0.The first estimator we propose is based on truncating the support [0, +∞] and only working on [a, +∞] with a = δγ/N .This allows the use of a Taylor expansion at x = a.This procedure, however, introduces a bias that needs to be controlled.We show numerically that this estimator exhibits better performances than the one based on exponential twisting.Moreover, we observe that, in the regime of rare events, the proposed estimator achieves approximately the same performances as the Gamma IS PDF with shape parameter equal to 2. This is the main motivation behind introducing a second estimator whose IS PDF is a Gamma PDF with optimized parameters.The numerical results show that the second estimator achieves substantial variance reduction with respect to the first estimator.

Biased estimator
We rewrite the quantity of interest as where δ is a fixed value belonging to [0, 1).The first factor on the right-hand side has a known closed-form expression.Let fX (•) be the PDF of Next, we write the second factor on the right hand side of (10) as follows: with hX (x) = N i=1 fX (x i ).The exponential twisting IS PDF is then given by Now, by using the Taylor expansion of fX (•) at the point x = δγ/N , we write where ξ x,δ,N is between δγ N and x.Hence, the approximate exponential twisting IS PDF is given by fX with the notation fX = fX ( δγ N ) and f ′ X = f ′ X ( δγ N ).We assume that δγ N is strictly less than exp(−1) to ensure that f ′ X > 0. This assumption is not restrictive, as we are interested in the rare event regime corresponding to N large and/or γ small.Through a simple computation, we get The value of θ that solves The remaining part is to sample from fX (•).To do this, we write fX (x) = − fX exp(θδγ/N ) where f1 (x) = − θ exp(θx) exp(θδγ/N ) and f2 (x) = θ 2 (x−δγ/N ) exp(θx) exp(θδγ/N ) are two valid PDFs for x > δγ/N .
The question that remains is related to controlling the bias through a proper choice of the parameter δ.
Then, the global relative error can be upper bounded as follows: where α1,is,M is the IS estimator of α 1 (γ, N ) based on M i.i.d.realizations sampled according to hX (x) = N i=1 fX (x i ) where the the PDF fX (•) is given in (11) .
The parameter δ is then chosen to control the bias term in (12), that is the first term on the right-hand side of (12).The second term on the right-hand side is the statistical relative error of estimating α 1 (γ, N ) by α1,is,M .From the Central Limit Theorem (CLT), this error term is approximately proportional to the coefficient of variation of α1,is,M .
To achieve a global relative error of order ǫ, it is sufficient to bound the two error terms, i.e., the statistical relative error and the relative bias, by ǫ/2.Hence, the value of δ is selected such that the following inequality holds The following lemma provides the relation between δ and ǫ such that ( 13) is fulfilled.
Proof.We first write that On the other hand, we have Therefore, we get By equating the right-hand side of the above inequality with ǫ/2, we obtain and hence the proof is concluded.

The Gamma family as an IS PDF
When we consider a sufficiently small value of δ in the above analysis, we observe from the expression of the IS PDF in (11) that the proposed estimator with the IS PDF in (11) achieves approximately the same performance as the Gamma IS PDF with shape parameter equal to 2. This suggests investigating whether the Gamma family can achieve further variance reduction with respect to the approach in the previous subsection.Note that the advantage of using the Gamma family as IS PDFs compared to the approach in the previous subsection is that the estimator is unbiased.Recall that the Gamma PDF is given by fX where θ > 0 and k > 0 are the scale and shape parameters.The value of θ is chosen to be equal to θ = γ N k to ensure that the expected value of each of the The likelihood ratio is then given by The second moment of the IS estimator is bounded by The last upper bound is found by maximizing the function x → −(log(x)) 2 − 2k log(x) for x > 0. Next, using Stirling's formula for the gamma function where C is a constant.Next, the value of k is chosen such that it minimizes the above right-hand side term.The solution of this minimization problem is given as follows: Note that when N is large and/or γ is small, the value of k * satisfies k * ∼ log( N γ ).

Numerical results
In this section, we show some selected numerical results to compare the performance of the proposed estimators compared to some of the existing estimators.
We consider three scenarios depending on the distribution of X i , i = 1, 2, • • • , N : the Weibull, the Gamma-Gamma, and the Log-normal distributions.Note that the proposed approach is not restricted to these three distributions (see Table I for a non-exhaustive list of distributions that can be handled).
We recall that the squared coefficient of variation of an unbiased estimator α(γ, N ) of α(γ, N ) has the following expression Note that, from the CLT, the number of required samples to meet ǫ statistical relative error with 95% confidence is equal to (1.96) 2 SCV(α(γ, N ))/ǫ 2 .Therefore, when we compare two estimators, the one with the smaller squared coefficient of variation exhibits better performance than the other.

Weibull Case
In this section, we assume that X i , i = 1, 2, • • • , N , are distributed according to the Weibull distribution whose PDF is given in Table I.The comparison is made with respect to the second IS approach of [9] that is based on using the hazard rate twisting (HRT).In Figure 1 and Figure 2, we plot the squared coefficient of variations given by the HRT technique and the proposed approach for two different values of the shape parameter: k = 1.5 and k = 0.5, respectively.The value of α(γ, N ) ranges approximately from 10 −20 to 10 −6 (respectively from 10 −16 to 10 −6 ) using the system's parameters of Figure 1 (respectively of Figure 2).These figures show that the proposed approach clearly outperforms the one based on HRT.For instance, when k = 1.5, λ = 1, γ = 0.5, and N = 12, the proposed approach is approximately 270 times more efficient than the one based on HRT.More specifically, to meet the same accuracy, the number of samples needed by the approach based on HRT should be approximately 270 times the number of samples needed by the proposed approach.
In the next experiment, we aim to compare the proposed approach with the HRT one when N is fixed and γ decreases.In Figure 3, we compare the efficiency of both approaches in terms of squared coefficient of variations plotted as a function of γ for two scenarios depending on the value of N (N = 8 and N = 10).In this case, the value of α(γ, N ) ranges approximately from 10 −16 to 10 −6 for N = 8 and from 10 −22 to 10 −8 for N = 10.We observe a clear outperformance of the proposed approach compared to the one based on using HRT for both values of N .While the HRT approach was proved in [9] to achieve the bounded relative error property with respect to γ and for a fixed value of N , it is clear from Figure 3 that the asymptotic bound increases substantially with respect to N , and hence the performance of the HRT approach is dramatically affected by increasing N .On the other hand, we observe that increasing the value of N  has a minor effect on the efficiency of the proposed approach, i.e., the squared coefficient of variation is approximately unchanged for both values of N and for the considered range of γ.This numerical observation suggests to conclude that the proposed approach satisfies the bounded relative error property with an asymptotic bound that increases with a very slow rate, compared to the one given by the HRT approach, as we increase N .For illustration, the proposed approach is approximately 18 (respectively 64) times more efficient than the HRT one when N = 8 (respectively N = 10) and γ = 0.2.Note that the previous observations are valid independently of the value of α(γ, N ) (see Figure 3, where the squared coefficient of variation is approximately constant for a fixed value of N and for the considered range of γ).This experiment and the numerical results  in Figures 1 and 2 validate the ability of the proposed approach to deliver a very accurate and efficient estimate of α(γ, N ) when N increases and/or γ decreases.

Gamma-Gamma Case
The Gamma-Gamma distribution is used for various challenging applications in wireless communications.For instance, it exhibited a good fit to experimental data and was used to model wireless radio-frequency channels [26] and to model atmospheric turbulences in free-space optical communication systems [17].The PDF of X i is given in Table I.In Figure 4, we compare the proposed approach with the one in [6] by plotting the corresponding squared coefficient of variations as a function of N and for a fixed value of γ.Note that in [6], the proposed IS PDF is simply another Gamma-Gamma PDF with shifted mean.We call this method the IS-based mean-shifted approach.The range of the quantity of interest α(γ, N ) is approximately from 10 −18 to 10 −5 .We observe that the proposed estimator outperforms the one in [6].Also, we observe that the outperformance of the proposed estimator compared to the one based on mean shifting increases as we increase N .Moreover, we should note here that the cost per sample (in terms of CPU time) of the approach in [6] is twice the cost of the proposed approach.This is because a Gamma-Gamma random variable is generated by the product of two independent Gamma random variables, see [12].For illustration, we observe from Figure 4 that when N = 12, the proposed approach is approximately 2.5 times (five times if we include the computing time in the comparison) more efficient than the one of [6].

Log-normal Case
The Log-normal distribution can be used to model several types of attenuation including shadowing [28], and weak-to-moderate turbulence channels in free-space optical communications [2].The standard Log-normal PDF (the associated Gaussian random variable has zero mean and unit variance) is given by Figure 5 shows the squared coefficient of variation given by the exponential twisting [4], and the two proposed approaches, i.e., the one based on the biased estimator and the other based on using the Gamma distribution as an IS PDF.The value of α(γ, N ) ranges approximately from 10 −20 to 10 −2 .For the considered range of N , we observe that out of these three approaches, it is the one using the Gamma distribution as an IS PDF that outperforms the others.When N = 9, it is approximately 30 times more efficient than the one based on exponential twisting.In addition to the efficiency in terms of number of samples, it is worth recalling that the exponential twisting technique developed in [4] is computationally expensive in terms of computing time compared to the proposed approaches.Moreover, Figure 5 also shows that the approach based on the biased estimator achieves better performances than the one based on exponential twisting.It is important to mention here that, for the comparison to be fair, the required number of samples of the biased estimator should be multiplied by 4. This follows from the error analysis in (12), in which the statistical relative error should be bounded by ǫ/2, where ǫ is the required global relative error.In Figure 6, we plot the squared coefficient of variations given by the three approaches as a function of γ and for two different values of N (N = 8 and N = 10).The quantity of interest α(γ, N ) ranges approximately from 10 −15 to 10 −6 for N = 8 and from 10 −21 to 10 −9 for N = 10.We observe that the approach based on using the Gamma distribution as an IS PDF clearly asymptotically outperforms the two other approaches.For both values of N , the outperformance increases as we decrease γ.Moreover, the biased estimator exhibits better performances than the exponential twisting one for both values of N and for the considered range of γ values.Furthermore, increasing N has a considerable negative effect on the performances of the exponential twisting and the biased IS-based approaches.On the other hand, Figure 6 shows that increasing N does not largely effect the performance of the IS estimator based on the use of the Gamma distribution as an IS PDF.For illustration, the approach based on using the Gamma distribution as an IS PDF is approximately 15 times (respectively 35) more efficient that the exponential twisting one when N = 8 (respectively N = 10) and γ = 0.6.
It is important to mention that the outperformance of the estimator based on using the Gamma distribution as an IS PDF over the one based on using the biased estimator is expected.As it was mentioned above, the latter approach gives approximately the same performance as the Gamma distribution with shape parameter equal to 2 while the former one uses the Gamma distribution as an IS PDF with an optimized shape parameter (the shape parameter was chosen to minimize an upper bound of the second moment of the proposed estimator, see the expression of k * in ( 17)).
All of the above comparisons have been carried out in terms of the number The computing time is the time in seconds needed to get an estimator of α(γ, N ) using M i.i.d.samples of α(γ, N ).When comparing two estimators, the one that exhibits less WNRV is more efficient than the other estimator.More precisely, an estimator is efficient in terms of WNRV than another estimator means that it achieves less relative error for a given computational budget, or equivalently it needs less computing time to achieve a fixed relative error.Using the same setting as in Figure 6, we plot in Figure 7 the WNRV metric as a function of γ for two scenarios depending on the value of N (N = 8 and N = 10).We observe that as α(γ, N ) is getting smaller, it is the approach based on using the Gamma PDF as an IS PDF that outperforms the two other approaches in terms of WNRV (the efficiency increases as the event becomes rarer).It is worth recalling that the WNRV of the approach based on biased PDF should be multiplied by 4 in order for the analysis to be fair (this follows from the error analysis that was performed in section 4.3.1).Moreover, Figure 7 shows that, in addition to reducing the variance, as shown in Figure 6, the approach based on using the Gamma IS PDF also reduces the computing time compared to the one using the exponential twisting technique.To see that, for N = 10 and γ = 0.6, the approach based on using the Gamma IS PDF is approximately 35 times (respectively 340 times) more efficient than the one based on exponential twisting when using the squared coefficient of variation metric (respectively the  WNRV metric).More specifically, the Gamma based IS approach approximately reduces the computing time by a factor of 10 with respect to the exponential twisting approach.

Conclusion
We developed efficient importance sampling estimators to estimate the rare event probabilities corresponding to the left-tail of the cumulative distribution function of large sums of nonnegative independent and identically distributed random variables.The proposed estimators achieve asymptotically at least the same performance as the exponential twisting technique, in the regime of rare events and for certain classes of distributions that include most of the common distributions.The main conclusion is that the Gamma PDF with suitably chosen parameters achieves for most of the common distributions substantial variance reduction, and at the same time avoids the restrictive limitations of the exponential twisting technique.The numerical results validate the efficiency of the proposed approach in being able to accurately and efficiently estimate the quantity of interest in the rare event regime corresponding to large N and/or small γ.One possible extension of the present work is to connect it to the works in [10,19] by creating a sequence of approximate measures corresponding to increasing the values of N .

Figure 1 :
Figure 1: Squared coefficient of variation as a function of N where X i are i.i.d.Weibull random variables with rate λ = 1, k = 1.5, and γ = 0.5.

Figure 2 :
Figure 2: Squared coefficient of variation as a function of N where X i are i.i.d.Weibull random variables with rate λ = 1, k = 0.5, and γ = 0.01.

Figure 3 :
Figure 3: Squared coefficient of variation as a function of γ where X i are i.i.d.Weibull random variables with rate λ = 1, k = 1.5.

Figure 4 :
Figure 4: Squared coefficient of variation as a function of N where X i are i.i.d.Gamma-Gamma random variables with m = 4, k = 1.7,Ω = 1, and γ = 0.5.

Figure 5 :
Figure 5: Squared coefficient of variation as a function of N where X i are i.i.d.standard Log-normal random variables with γ = 0.5, and ǫ = 0.05.

Figure 6 :
Figure 6: Squared coefficient of variation as a function of γ where X i are i.i.d.standard Log-normal random variables with ǫ = 0.05.

Figure 7 :
Figure 7: WNRV as a function of γ where X i are i.i.d.standard Log-normal random variables with ǫ = 0.05.

Table I :
Some PDF asymptotics around zero a