State-dependent Importance Sampling for Estimating Expectations of Functionals of Sums of Independent Random Variables

Estimating the expectations of functionals applied to sums of random variables (RVs) is a well-known problem encountered in many challenging applications. Generally, closed-form expressions of these quantities are out of reach. A naive Monte Carlo simulation is an alternative approach. However, this method requires numerous samples for rare event problems. Therefore, it is paramount to use variance reduction techniques to develop fast and efficient estimation methods. In this work, we use importance sampling (IS), known for its efficiency in requiring fewer computations to achieve the same accuracy requirements. We propose a state-dependent IS scheme based on a stochastic optimal control formulation, where the control is dependent on state and time. We aim to calculate rare event quantities that could be written as an expectation of a functional of the sums of independent RVs. The proposed algorithm is generic and can be applied without restrictions on the univariate distributions of RVs or the functional applied to the sum. We apply this approach to the log-normal distribution to compute the left tail and cumulative distribution of the ratio of independent RVs. For each case, we numerically demonstrate that the proposed state-dependent IS algorithm compares favorably to most well-known estimators dealing with similar problems.


Introduction 1.Motivation
In a probabilistic model, rare events are important events that infrequently happen with very small probabilities.Estimating these probabilities has become a substantial area of research because of its many applications, such as queuing systems, insurance risk, financial engineering, and wireless communication [6,14,35,41].Typical examples occur in the context of communication systems, arXiv:2201.01340v2[cs.IT] 26 Oct 2022 where the rare event could be an event in which the system fails to operate properly.For illustration, one can encounter the problem of estimating failure probabilities of the order of 10 −9 for sophisticated networks, such as ultra-reliable fifth or sixth generation (5G or 6G) systems [19].
Calculating rare event quantities that could be written as an expectation of a functional of the sums of independent RVs is of paramount practical interest in many challenging applications.For instance, in financial engineering, calculating the value-at-risk (VaR) requires computing the left tails of the sums of RVs (i.e., the probability that the sum is less than a sufficiently small threshold).Another relevant example is calculating the probability that the signal-to-interferenceplus-noise ratio is less than a given threshold in communication systems.Under some particular fading environments, this probability can be expressed as a cumulative distribution function (CDF) of the ratio of independent RVs.

Literature Review
Various researchers have proposed closed-form approximations of the left and right tails of sums of RVs [12,27,29,43,45,47,49].However, these approximation methods are not generic.Moreover, their accuracy is not always guaranteed for all scenarios, as it can degrade for certain system parameters.The Monte Carlo (MC) method can be used as a generic tool to cope with these problems.However, it is well acknowledged that estimating rare event quantities with the naive MC sampler requires a prohibitively large number of simulation runs [42].Variance reduction techniques have been used extensively to improve the computational work of the naive MC method.In this context, importance sampling (IS) is among the most popular variance reduction techniques that provide accurate estimates of rare event probabilities with a reduced number of simulation runs when appropriately used [42].
Variance reduction techniques have been widely discussed in the literature, and particular focus has been devoted to proposing algorithms for the efficient simulation of the right tail of sums of RVs (i.e., the probability that the sum exceeds a sufficiently large threshold).In particular, for distributions with light right tails (i.e., decaying at an exponential rate or faster), under some regularity assumptions, the popular exponential twisting IS approach [6] satisfies the logarithmic efficiency property, which is a useful metric used to assess the efficiency of an estimator.In contrast, for heavy-tailed distributions, such as the case of log normals and Weibulls with shape parameters strictly less than 1, the exponential twisting method is inapplicable.Therefore, efficient algorithms have been developed for estimating tail probabilities involving heavy-tailed RVs.In this context, [3] provided the first logarithmically efficient estimator for such probabilities using the conditional MC idea.Other authors [10] have proposed an estimator with a bounded relative error under distributions with regularly varying tails, which was further extended to more general scenarios (e.g., see [8,9,26,34,37]).In addition to estimators based on the conditional MC, various stateindependent IS techniques have been proposed [2,39,40,44].
State-independent changes of measure for estimating certain rare events involving sums of heavytailed RVs cannot achieve logarithmic efficiency [11].Therefore, more complex state-dependent IS algorithms have been proposed in the literature over the last few years to estimate probabilities for sums of heavy-tailed independent RVs.Of valuable interest are studies developed in [21,22,23,24,30,32].The researchers in [23] developed an efficient state-dependent IS estimator with a bounded relative error under distributions with regularly varying heavy tails.The estimator can also be adapted to provide strongly efficient algorithms in light-tailed situations.A related approach, based on the construction of Lyapunov inequalities, has also been developed [24] for constructing strongly efficient estimators for large deviation probabilities of regularly varying random walks.These algorithms use a parametric family of change of measure based on mixtures that are appropriately selected using Lyapunov bounds.Moreover, stochastic control and game theory have been used to build efficient state-dependent IS schemes to simulate rare events [30,31,33].For instance, in the heavy-tailed setting, the authors in [30] constructed dynamic IS estimators with a nearly asymptotically optimal relative error for independent and identically distributed (i.i.d.) nonnegative regularly varying RVs.They considered a parametric family of change of measure whose parameters are determined by solving a deterministic, discrete-time control problem.The closest work to the proposed approach is in [31], where the authors proposed an approach based on connecting IS with stochastic optimal control (SOC).The scope of [31] is limited to the i.i.d.case and distributions with finite moment generating functions.In this work, independence is the only assumption we make.The connection between IS and SOC has been investigated for other scenarios, such as diffusions [38], and for stochastic reaction networks approximated by the Tau-Leap scheme [36].The dynamics in [36] evolve according to discrete-time discrete-space Markov chains, whereas the dynamics in this work evolve according to discrete-time continuous-space Markov chains.
Few researchers have recently addressed the left-tail region (i.e., the probability that sums of nonnegative RVs fall below a sufficiently small threshold [7,13,16,17,19,20]).For instance, [7] considered the specific setting of the i.i.d.sum of log-normal RVs.The approach was based on the exponential twisting technique and is logarithmically efficient.The work of [17] proposed two unified hazard rate twisting (HRT)-based approaches that estimate the outage capacity values for generalized independent fading channels.The first estimator achieves logarithmic efficiency for arbitrary fading models, whereas the second achieves the bounded relative error criterion for most well-known fading variates and logarithmic efficiency for the log-normal case.Recently, [19] proposed an IS scheme based on sample rejection applied to the case of the independent Rayleigh, correlated Rayleigh, and i.i.d.Rice fading models, showing that the estimator satisfies the bounded relative error property.

Main Contributions
In this paper, we propose a generic state-dependent IS approach to estimate rare event probabilities that could be written as an expectation of a functional of the sums of independent RVs.We adopt a SOC formulation to determine the optimal IS parameters, minimizing the variance or, equivalently, the second moment of the estimator within a preselected class of measures.After formulating the SOC problem and describing the algorithm used to derive the optimal controls, which are optimal IS parameters, we apply the proposed algorithm to two examples: the computation of the left-tail probability in a log-normal setting, and the computation of the CDF of the ratio of independent log-normal RVs.The proposed algorithm is generic and not restricted to the lognormal environment.The algorithm can be applied to compute the quantity of interest without restrictions on the distribution of the univariate RVs in the sum or the expression of the functional applied to the sum.Numerical simulations demonstrate the superior performance of the proposed estimator in terms of the number of samples and computational work to meet a given prescribed tolerance (TOL) compared to the existing state-of-the-art estimators dealing with similar problems.
The rest of the paper is organized as follows.Section 2 describes the problem setting, presents applications that fall within the scope of applicability of the proposed approach, and introduces the concept of IS.Section 3 contains the main work, explaining the state-dependent IS scheme via a novel SOC formulation, followed by presenting the algorithm.Section 4 applies the proposed algorithm to two applications in wireless communications.The proposed algorithm compares favorably to some well-known estimators dealing with similar problems.

Problem Setting
This section states the objective of the method and enumerates some applications that fall within the scope of its applicability.Next, this section introduces the concept of the naive MC method.Finally, it presents the IS technique, one of the most popular variance reduction techniques.

Objective
We consider X = (X 1 , X 2 , • • • , X N ) t to be a random vector comprising independent positive components with probability density functions (PDFs) f X 1 (.), f X 2 (.), . . ., f X N (.) and a joint PDF f (x) = N n=1 f Xn (x n ).In this work, X i , i = 1, • • • , N are one-dimensional vectors.However, this approach is still applicable to the multidimensional case.We let S N = N n=1 X i and g : R + → R be a given function.We aim to develop a state-dependent IS algorithm via a connection to an SOC formulation to estimate rare event quantities that could be written as follows:

Right and left tail
One of the problems that can be written as (1) is calculating the right-tail probability of sums of RVs (i.e., the probability that the sum is larger than a sufficiently large threshold), which arises in many areas of engineering.This probability can be expressed as corresponding to (1), where g(x) = 1 (x≥γ th ) .
As a practical example, the right-tail probability of the sums of RVs may represent the ruin probability of an insurance company.In this setting, S N represents the total sum of claims, and γ th is the initial reserve.The claims X 1 , • • • , X N can be modeled by heavy-tailed distributions [4].In the Cramer-Lundberg model, this probability can be expressed as (2) [6].Moreover, calculating lefttail probabilities occurs extensively in many applications.In these cases, the quantity of interest can be expressed as which is in the form of (1), where g(x) = 1 (x≤γ th ) .
One of the most relevant examples is estimating the VaR, defined as the 1−α quantile of the loss distribution, for a sufficiently small value of α.We let a portfolio be based on N assets with upcoming prices X 1 • • • , X N , which can be modeled using log-normal distributions [7].The VaR of S N at the level of α (VaR α (S N )) is defined as the value such that the probability of a loss larger than that value is equal to 1−α [1,46].In other words, VaR α (S N ) = F −1 S N (α), where F S N (.) is the CDF of S N .
Another challenging application is the analysis of wireless communication systems.The outage probability (OP), defined as the probability that the signal-to-noise ratio (SNR) falls below a given threshold γ th [48], is equivalent to computing the CDF of sums of the SNRs of the received signals.Hence, it can be expressed as in (3).

CDF of the ratio of independent RVs
Another performance metric that can be expressed as in (1) is the OP in the presence of co-channel interference and noise.For single-input, single-output (SISO) systems, the OP is expressed as [18] where X 0 denotes the desired user signal power, X 1 , • • • , X N represent the received powers of the N interfering signals, and η indicates the variance of the additive white Gaussian noise.We assume that X 0 , • • • , X N are independent.Through conditioning on X 1 , X 2 , • • • , X N and using the law of total expectation, we write ( 4) as where F X 0 (•) is the CDF of the RV X 0 , corresponding to the form in (1) with g(x) = F X 0 (γ th (x+η)).

Importance Sampling
The naive MC estimator of the quantity of interest in (1) is where M denotes the number of simulation runs, and However, the naive MC method is computationally expensive, requiring a substantial number of simulation runs to meet a given accuracy when considering rare event probabilities.Using appropriate variance reduction techniques, such as IS, is necessary to overcome the failure of naive MC simulations and considerably reduce the computational work.The idea is to perform a change of measure under which the rare event is generated with a higher probability than under the original distribution [42].The IS technique consists of writing α as where and E f [•] denotes the expectation under which the vector X has the joint PDF f (•).The IS estimator is expressed as where {X (k) } M k=1 represents independent realizations of X sampled according to f (•).When g(x) > 0, x ∈ R + , the optimal change of measure minimizing the variance of the IS estimator is given by This optimal change of measure yields zero variance; thus, it is called the zero variance change of measure.However, using such a change of measure is impractical because it assumes the knowledge of α, which is the unknown quantity.

IS via an SOC Formulation
This section explains the SOC formulation and how to link it to IS to construct the state-dependent IS estimator.Then, it introduces the HRT family as a change of measure.Finally, this section describes the steps of the proposed state-dependent IS algorithm.

State-dependent IS Approach
The idea we adopt is to link the problem of finding an efficient change of measure to a SOC problem.
To apply SOC to the current static problem, we embed it with the evolution of a Markov chain with the following dynamics: where S 0 = 0. Instead of sampling X n+1 according to f X n+1 (•), we perform a change of measure such that, given S n , X n+1 is distributed according to fX n+1 (•; µ n+1 (S n )), where µ n+1 is a function of S n .With this idea, the new joint PDF can be written as where The objective is to determine the optimal controls µ n : R minimize the second moment of the IS estimator.Therefore, assuming that the second moment of the estimator is finite, we define the cost function for where D = {µ : R + → A} represents the set of admissible Markov controls.More precisely, for µ ∈ D, µ : R + → A.
We also define the value function as follows: The above SOC formulation is flexible because the RVs are dependent.The same observation holds for the optimal change of measure (10).Therefore, if the family of PDFs fXn (.; µ n ) is sufficiently large, we can expect the SOC formulation to deliver an estimator with a performance close to that of the optimal estimator.Next, the question is how to solve the minimization problem and determine the optimal controls µ n , n = 1, 2, • • • , N .The idea is to solve it sequentially by going backward in time.In Proposition 1, we state the dynamic programming equation solved by the value function u.
Proof.For simplicity, we assume that the optimal control is attained: Step 1 We let µ * n+1 , • • • , µ * N be the optimal control minimizing (17).Then, we obtain Knowing X n+1 and S n , 2 is deterministic.Thus, using the Markov property of S n , we obtain Hence, the following inequality holds: Step 2 We choose the control µ + n+1 to be arbitrary and, given the value of S n+1 , we select the optimal controls µ * n+2 , • • • , µ * N .Then, the following lower bound holds: Taking the minimum over all controls µ + n+1 (s) yields Hence, the proof is concluded using ( 21) and (22).
Remark: We can prove the proposition without the assumption that the minimum is attained.For that, we use a minimizing sequence µ i of controls, satisfying

Hazard Rate Twisting Family
The choice the family of PDFs fXn (.; µ n ) , n = 1, • • • , N in this work is based on the well-known HRT.The HRT technique was originally developed to deal with the right tail of sums of heavytailed RVs [15,40].
We define the hazard rate λ X i (•) associated with the RV X i as where We also define the hazard function as From ( 24) and ( 25), the PDF of X i can be expressed as The HRT change of measure is obtained by twisting the hazard rate of each component X i , i = 1, • • • , N by a quantity µ i < 1 as follows: Moreover, µ i should satisfy 0 ≤ µ i < 1, i = 1, • • • , N to efficiently address the estimation of the right tail of the sum distribution.Consequently, the tail of the resulting distribution becomes much heavier to the right than the original.However, this feature is unsuitable for dealing with the left tail.Two approaches were proposed in [17] to adjust the HRT to handle the left-tail region.The first is based on twisting the RVs −X 1 , • • • , −X N instead of the original variates X 1 , • • • , X N .The second approach applies the HRT approach to X 1 , • • • , X N using a negative twisting parameter.
Considering the appropriate twisting parameter, we employ the HRT change of measure given by (27), and the set A in this case is given by A = (−∞, 1).By doing so, the value function is given by

Algorithm
Based on the results stated in the proposition, we propose a numerical algorithm to approximate the optimal controls µ n , where n = 1, • • • , N .We start by truncating the space R + and work in the interval [0, S], where S is a large number in R + .There are particular cases that we treat, where S is naturally chosen.For instance, when estimating P(S N ≤ γ th ), due to the nonnegativity of X i , u(n, s) = 0 for s ≥ γ th and n = 0, • • • , N .In this case, S is set equal to γ th .In the general case, S is selected to be sufficiently large.At each step of the backward algorithm, we use linear extrapolation to compute the value function for s > S.
We consider a mesh in the one-dimensional s-space: The algorithm is summarized as follows: Step 1: For each s k in the mesh, we solve the following: and This step is not expensive because we must compute a one-dimensional integral for each point in the mesh and perform an optimization problem for the parameter µ.When the HRT family is used, the optimization problem becomes equivalent to determining the root of a nonlinear equation.
Step 2: After obtaining u (N − 1, s k ) for all s k in the grid, the next step again applies the result of the proposition to obtain an approximation of u(N − 2, s k ) and To perform this step, we must know u(N − 1, s) for all s that are not necessarily in the grid.
To overcome this problem, we proceed by interpolating between the points u (N − 1, s k ), where k = 0, 1, • • • , K. As mentioned, linear extrapolation is employed for s > S when needed.
Step 3: After computing µ n (s k ) for n = 1, 2, • • • , N and all s k in the grid k = 0, 1, 2, • • • , K, the following step is to solve for µ n , n = 1, 2, • • • , N by going forward in time.More specifically, we start at S 0 = 0 and sample from fX 1 (•, µ 1 ) to obtain S 1 .Further, µ 1 (0) was already computed in the resolution of the backward problem.We compute µ 2 as After computing µ 2 , we simulate S 2 as S 2 = s1 +X 2 , with X 2 sampled from fX 2 (.; µ 2 ).We continue repeating this procedure until we reach µ N and then sample X N .In the case of smooth controls, the optimization problem (32) can be avoided using interpolation between controls, obtained in the backward step, on the grid, s 1 , • • • , s K .
Step 4: The forward problem is repeated M times.The proposed IS estimator is given as 4 Numerical Results This section presents selected numerical results to illustrate the performance of the proposed IS scheme.First, the methodology adopted to demonstrate the performance of the proposed approach is discussed.The motivation for using the improved version of the proposed method, called the aggregate method, is explained.Then, the proposed algorithm is applied to estimate the OP at the output of diversity receivers with and without co-channel interference in the log-normal environment.

Methodology
Within the broad applicability of the proposed estimator, we focused on applying it to calculate the left-tail probability and the CDF of the ratio of independent RVs.We used the proposed estimator to estimate the OP at the output of diversity receivers with and without co-channel interference.We considered the case in which the antennae are sufficiently spaced to assume that independent RVs can model fading channels.We considered the log-normal fading environment that exhibits a good fit for realistic propagation channels.We demonstrated that the proposed approach achieves a substantial reduction of the variance compared to other well-known IS algorithms.
In both applications, the objective was to efficiently estimate the following: where X 1 , • • • , X N denote i.i.d.log-normal RVs with parameters m and σ 2 .The PDF of X i , i = 1, • • • , N, is expressed as follows: For the second application, we let X 0 be a log-normal RV with parameters m 0 and σ 2 0 .We employed the HRT change of measure in (27) to build the estimator.Hence, we call this approach the HRT-SOC IS approach, and the corresponding estimator is denoted by T HRT-SOC which is expressed as follows: where g(x) = 1 (x≤γ th ) in the first application, and g(x) = F X 0 (γ th (x + η)) in the second application.
In this setting, each step of the backward algorithm can be expressed, for The controls µ n (s k ) are obtained by solving the following equation: For the forward step, assuming that the control is smooth (motivated by numerical observations), we can compute the controls by interpolating between the points µ n (s k ), k = 0, • • • , K. To sample from the change of measures fX i (•), i = 1.• • • , N , we used the inverse CDF technique.In [14], the authors revealed that the inverse CDF of the HRT of a log-normal RV X i is given by where µ i is the twisting parameter corresponding to X i and Φ(•) is the CDF of the standard normal distribution.This formula can be generalized to other distributions as [17, eq. ( 65)].
The relative error serves as a measure of the efficiency of the estimators.The relative error of the naive MC estimator and the proposed IS estimator are defined respectively through the central limit theorem [6] as where C is the confidence constant equal to 1.96 for the 95% confidence interval.
We compared the estimator defined in (36) to other existing estimators when calculating the OP at the output of diversity receivers with and without co-channel interference.For instance, using the log-normal setting with the HRT technique allows us to compare the estimator with the approach in [14], which used the HRT without SOC (i.e., the control is constant, independent of the state and time).We denote this method as HRT.In the numerical experiments, the HRT-SOC technique reduces the variance substantially compared to other approaches.However, it requires additional time, called a backward cost, to determine the optimal controls.We let M HRT and M HRT-SOC be the number of required simulation runs for the HRT estimator T HRT and the proposed estimator T HRT-SOC , respectively, to ensure a relative error equal to TOL.The total costs of the HRT-SOC and HRT approaches are expressed as follows: where T b is the time required in the backward algorithm to calculate a single control, and T f represents the cost per sample in the forward step (approximately the same for both approaches).Figures 2 and 4 illustrate that the variance reduction compared to the HRT technique increases as the quantity of interest becomes rarer.Thus, we determine that M HRT M HRT-SOC , especially for rare regions.Consequently, we expect that, for the regime of rare events and a fixed N , the backward time can be neglected compared to the forward cost of the HRT, which is presented in Figure 3.When the backward time dominates the forward time of the HRT, we propose an improved version of the HRT-SOC estimator.We call this version the aggregate method (HRT-SOC-AG), which aims to reduce the backward cost without considerably affecting the variance reduction.

Aggregate Method
The idea for the aggregate method is to divide the sum S N into B blocks and compute the controls for each block rather than for each X i , i = 1, • • • , N .Doing so reduces the backward cost from In other words, if we select B blocks, such that B ≤ N , we consider the following dynamics: where n m = m j=1 b j , and b m , m = 1, 2, • • • , B are chosen such that n B = B j=1 b j = N .We adopted the same control µ m (S n m−1 ) for each With this proposed approach, we decreased the cost of the backward step with the price of increasing the variance.
To determine µ X 1 , • • • , µ X B , we used the dynamics proposed in ( 43) instead of the initial dynamics (11) to define a reformulated dynamic programming equation.We employed the same steps as those followed in the proof of the proposition, but instead of conditioning on X n+1 , we conditioned on X nm+1 , • • • , X nm+bm .Applying the same control µ m+1 for each Instead of solving the above equation, we propose minimizing its approximate upper bound, which becomes clearer in the next two subsections.

OP at the Output of Diversity Receivers in a Log-normal Environment without Co-channel Interference
The computation of the OP at the output of diversity receivers is equivalent to evaluating the CDF of the sum of the SNRs.Therefore, the interest in the first application is in the estimation of the left-tail region of the following form: We compared the approach to the HRT technique (see [14]) and the exponential twisting estimator (see [7]).We also used the improved version to achieve better results.When applying the aggregate method, instead of solving (45), we propose to minimize an approximate upper bound of it.More precisely, for the i.i.d.log-normal case, holds asymptotically, i.e. when the sum nm+bm j=nm+1 t j is sufficiently small, where X has the same distribution as X j , j = n m + 1, • • • , n m + b m .This result can be proven using the asymptotic result of the tail of a Normal distribution in [5].Using the inequality (47), the twisting parameters µ X m+1 are then selected as the argmin of the following approximated upper bound where f nm+bm j=nm+1 X j (•) is the PDF of nm+bm j=nm+1 X j .Given that the PDF of sums of i.i.d.log-normal RVs is unknown, we suggest approximating it using a univariate log-normal PDF f Y m+1 (•), whose parameters are computed using moment matching (see [28]).Finally, we obtain (50) Figure 2 plots the number of samples, required for the various approaches, to ensure TOL = 5% as a function of γ th .The range of γ th ensures a range of probabilities between 2 × 10 −12 and 6 × 10 −6 .For the aggregate method, we selected a constant parameter b (i.e., b m = 2 for all m = 1, 2 ).The choice of the parameter K = 20 is motivated by Figure 1, which plots the variance as a function of K.A larger K results in a smaller variance.The backward step is costly when K is large.Further, the variance reduction for K > 20 is minimal compared to the increased cost of solving the backward problem.  Figure 2 indicates that the number of samples required by naive MC simulations increases faster as the threshold decreases.In addition, the HRT-SOC approach requires the smallest number of simulation runs and saves a considerable number of samples compared to the HRT approach.For example, the number of simulations reduces by about 41,775 times for a small threshold (4 dB), corresponding to an OP value of 2 × 10 −12 .In contrast, the HRT-SOC-AG requires an additional number of samples, compared to the HRT-SOC approach, to reach a 5% relative error, indicating that the variance has increased as expected.However, we still obtained better variance reduction compared to the HRT technique.
We further studied the computational work for each method.Figure 3 plots the total time required for the exponential twisting, HRT, HRT-SOC, and HRT-SOC-AG techniques to ensure a 5% relative error as a function of the threshold.We also plotted the time required by the HRT-SOC and HRT-SOC-AG techniques to demonstrate the time required for the backward step compared to that required for the forward step.
The proposed estimator is the best for computational time for small thresholds (corresponding to an OP of less than 3.6 × 10 −8 ).As the event becomes rarer, the time gap between the proposed approach and other IS techniques increases significantly.Additionally, Figures 2 and 3 reveal that the HRT approach requires numerous samples to estimate the OP of the order of 2×10 −12 with good accuracy.However, for an OP greater than 3.6 × 10 −8 , the proposed approach is more expensive than others due to the additional computational time for the backward step for each threshold, which exceeds the time the remaining techniques when the number of samples is not sufficiently large.Nevertheless, this was enhanced when we used the improved version.The HRT-SOC-AG reduces the CPU time by about 1.7 times compared to the HRT-SOC approach for γ th ≥ 5 dB.Thus, with this choice of b, the efficiency of the aggregate method regarding time reduction exceeds the loss in variance.This choice of b m , m = 1, • • • , B is not optimal.Despite this, it provides better results than the HRT-SOC approach.Another possible experiment is to study the efficiency as a function of the number N of antennae for a fixed threshold and investigate the number of simulation runs required for each method and the computational time (Figures 4 and 5, respectively).The range of the OP is between 10 −5 and 2.5 × 10 −12 when using a range between nine and 13 antennae and a fixed threshold γ th = 6 dB.   Figure 4 indicates that the HRT-SOC approach is more efficient and requires fewer simulation runs than the HRT and the exponential twisting approaches.For N = 13, the proposed method requires 7,455 times fewer simulation runs than the HRT technique to meet the same accuracy requirements.In addition, the variance reduction for the HRT-SOC-AG technique depends on whether N is odd or even.Moreover, the HRT-SOC-AG method requires more simulation runs than the HRT-SOC technique to reach a fixed precision TOL, but it is more efficient in terms of CPU time for N ≤ 12.When the event becomes rarer (for small γ th and large N ), the improved approach with a fixed choice of b becomes less efficient in terms of CPU time than the HRT-SOC approach.In these cases, the number of samples is large enough that the backward time is neglected.Thus, reducing the variance rather than the cost of the backward step is more efficient.These results demonstrate that the choice of b m , m = 1, • • • , B is crucial and should be adaptively chosen to provide better results.More precisely, for fixed parameters γ th , TOL and N , the following optimization problem should be solved: such that The above optimization problem reveals that an optimal choice of b m in the case of a very rare event is b m = 1, m = 1, • • • , B, where B = N .However, when the event becomes less rare, an optimal choice of B is to take a single block (i.e., b 1 = N ).By doing so, the HRT-SOC-AG technique reduces to the HRT technique because the controls are state-independent in this case.Future work can be devoted to solving the previous optimization problem.Using optimal values of b m , we expect the HRT-SOC-AG estimator to achieve better performance.
4.4 OP in the Presence of Co-channel Interference in a Log-normal Environment for SISO Systems We consider a SISO system and recall that the OP in the presence of co-channel interference and noise is expressed as follows: where X 1 , • • • , X N are the interfering power signal and are assumed to be i.i.d.log-normal RVs with parameters m and σ 2 .The PDF of N i=1 X i is denoted by f N i=1 X i (•).To motivate the need for IS to efficiently estimate P out , Figure 6 plots the quantities f N i=1 X i , g, and the optimal IS PDF, which is proportional to gf N i=1 X i .The product gf N i=1 X i in Figure 6 is not normalized (i.e., it is an unnormalized PDF).Sampling from the original PDF of N i=1 X i is not efficient (i.e., when sampling from the original PDF, most samples fall in the region where g takes almost zero values).Hence, the computation of P out behaves like a rare event problem and can be addressed using the proposed HRT-SOC technique.The comparison is made concerning the estimator of [18], which is based on a covariance matrix scaling (CS) technique.It transforms the problem of evaluating the OP to computing the probability that a sum of correlated log-normal RVs exceeds a certain threshold.The estimator in [18] is given by The expressions of m, Σ, and θ are given in [18, eq. ( 6)], [18, eq. ( 7)], and [18, eq. ( 19)] respectively.We also compared the proposed approach to the exponentially tilted (ET) estimator of [25].We also used the HRT-SOC-AG method proposed in the previous subsection to further improve the computational work of the HRT-SOC technique.The reformulated dynamic programming equation is Next, using the following inequality, proven in [40], which is particularly satisfied in the case of i.i.d.log-normal RVs and holds for nm+bm j=nm+1 t j that are large enough: we can write (56) The large value of nm+bm j=nm+1 t j is motivated by Figure 6, which illustrates that the change of measure tends to increase the value of the sum in the regime of rare events.We studied the efficiency of the four IS schemes regarding the number of samples necessary to ensure a fixed accuracy requirement.To this end, Figure 7 plots the number of samples to ensure TOL = 5% as a function of γ th .This figure reveals that the HRT-SOC approach saves numerous samples compared to other approaches.For instance, the CS technique requires approximately 2,000 times as many simulations as the HRT-SOC scheme needs.The aggregate method did not affect the variance reduction.
We further investigated the gain in terms of the required computational time.Figure 8 presents the total CPU time needed by the four techniques to achieve the fixed accuracy TOL.The HRT-SOC approach requires less CPU time than the ET approach for the range of considered thresholds.In particular, when γ th = −30 dB, it is 13 times more efficient than the ET scheme.Compared to the CS approach, the HRT-SOC technique is more efficient when γ th < −25 dB, corresponding to an OP less than 3 × 10 −8 .The required computational time for the HRT-SOC technique is almost the same in the considered threshold range, whereas the CS and ET approaches require much more time as the threshold decreases.Moreover, the HRT-SOC-AG technique requires less time than the HRT-SOC technique using b = 2 to estimate the quantity of interest α.Therefore, the improved approach widens the region over which the proposed approach outperforms the CS approach.Our approaches are 2000 times (respectively 65 times) more efficient than the CS (respectively the ET) approaches for all values of TOL.Furthermore, Figure 10 demonstrates that the required time for the proposed methods compared to the other algorithms unchanged for the considered range of TOL.Moreover, similarly to the previous conclusions, the computational time required by the proposed algorithm is less than that needed by the ET algorithm for all TOL.Additionally, the superior performance of the method compared to the CS approach is critical for small values of TOL.Finally, the HRT-SOC-AG method increases the threshold, below which the proposed method performs better than the CS approach, from 0.045 to 0.058.

Conclusions Summary
We developed a generic state-dependent IS algorithm to efficiently estimate rare event quantities that could be written as an expectation of a functional of the sums of independent RVs.These problems have applications in the performance analysis of wireless communications systems operating over fading channels.Within a preselected class of a change of measures, the optimal IS parameters are determined via the connection to an SOC formulation.The numerical experiments verified the ability of the proposed approach to accurately and efficiently estimate the quantity of interest in the rare event regime.The proposed approach yields a substantial variance reduction compared with other well-known estimators.Additionally, the estimator requires less CPU time than the other proposed approaches in rare regions.We also proposed an aggregate method to improve efficiency further in terms of computational time.

Possible Extensions
For future research, the present work can be extended in many directions.One possible direction is to optimize the aggregate method by solving the optimization problem (51).A further interesting extension is to consider multivariate RVs when estimating the quantity of interest.In this case, RVs should be mutually independent, but the components of each RV are not necessarily independent.As the backward cost increases exponentially with the dimensions, we could employ cheaper approximation methods to calculate the controls, such as neural networks.

Figure 9
Figure 9 confirms the high gains of the proposed methods compared to all other IS approaches.