Estimation of parameters of Weibull–Gamma distribution based on progressively censored data

In this paper, the estimation of the parameters of a three-parameter Weibull–Gamma distribution based on progressively type-II right censored samples is studied. The maximum likelihood, Bayes, and parametric bootstrap methods are used for estimating the unknown parameters as well as some lifetime parameters: the reliability function, the hazard function, and the coefficient of variation. Approximate confidence intervals (CIs) for the unknown parameters, as well as for the reliability function, hazard function, and coefficient of variation, are constructed based on the s-normal approximation to the asymptotic distribution of the maximum likelihood estimators (MLEs) and of the log-transformed MLEs. In addition, two bootstrap CIs are proposed. Bayes estimates of the unknown parameters and the corresponding credible intervals are obtained using a Gibbs-within-Metropolis–Hastings sampling procedure. Furthermore, the Bayes estimates are obtained under both the balanced squared error loss and the balanced linear-exponential (LINEX) loss. The analysis of a simulated data set is also presented for illustrative purposes. Finally, a Monte Carlo simulation study is carried out to compare the precision of the Bayes estimates with that of the MLEs and the two bootstrap estimates, and to compare the performance of the corresponding CIs.


Introduction
In industrial life testing and medical survival analysis, very often the object of interest is lost or withdrawn before failure, or the object's lifetime is only known to lie within an interval. Hence, the obtained sample is called a censored sample (or an incomplete sample). Some of the major reasons for the removal of experimental units are saving working units for future use, reducing the total time on test, and lowering the associated cost. Right censoring is one of the censoring techniques used in life-testing experiments. The most common right censoring schemes are type-I and type-II censoring, but these conventional schemes do not have the flexibility of allowing the removal of units at points other than the terminal point of the experiment. For this reason, a more general censoring scheme called progressive type-II right censoring has been proposed. Progressive type-II censoring is a useful scheme in which a specific fraction of the individuals at risk may be removed from the experiment at each of several ordered failure times. Schematically, a progressively type-II censored sample can be described as follows. Suppose that $n$ independent items are put on a life test with continuous, identically distributed failure times $X_1, X_2, \ldots, X_n$. Suppose further that a censoring scheme $(R_1, R_2, \ldots, R_m)$ is fixed in advance such that, immediately following the first failure $X_1$, $R_1$ surviving items are removed from the experiment at random, and immediately following the second failure $X_2$, $R_2$ surviving items are removed at random. This process continues until, at the time of the $m$th observed failure $X_m$, the remaining $R_m$ surviving items are removed from the test.
The $m$ ordered observed failure times, denoted by $X_{1:m:n}, X_{2:m:n}, \ldots, X_{m:m:n}$, are called progressively type-II right censored order statistics of size $m$ from a sample of size $n$ with progressive censoring scheme $(R_1, R_2, \ldots, R_m)$. It is clear that $n = m + \sum_{i=1}^{m} R_i$. The special case $R_1 = R_2 = \cdots = R_{m-1} = 0$, so that $R_m = n - m$, is the case of conventional type-II right censored sampling. Also, when $R_1 = R_2 = \cdots = R_m = 0$, so that $m = n$, the progressively type-II right censoring scheme reduces to the case of no censoring (ordinary order statistics). Many authors have discussed inference under progressive type-II censoring using different lifetime distributions; see, for example, Basak et al. (2009), Kim et al. (2011), Ng et al. (2005), Balakrishnan and Lin (2003), Asgharzadeh (2006), Madi and Raqab (2009), Fernandez (2004), Mahmoud et al. (2014c) and Soliman et al. (2015). A thorough overview of the subject of progressive censoring, together with an excellent review, is given in Balakrishnan (2007). Aggarwala and Balakrishnan (1998) developed an algorithm to simulate general progressively type-II censored samples from the uniform or any other continuous distribution. The joint probability density function for a progressively type-II censored sample of size $m$ from a sample of size $n$ is given by (7), where $c = n(n - 1 - R_1)(n - 2 - R_1 - R_2) \cdots \bigl(n - \sum_{i=1}^{m-1} (R_i + 1)\bigr)$; for details see Balakrishnan and Aggarwala (2000). The Weibull–Gamma distribution is appropriate for the phenomenon of loss of signal in telecommunications, called fading, which arises when multipath is superimposed on shadowing. The Weibull–Gamma distribution was introduced by Bithas (2009). A random variable $X$ has a Weibull–Gamma distribution if its probability density function (PDF) and the corresponding cumulative distribution function (CDF) are given by (1) and (2), respectively. The Weibull–Gamma distribution with parameters $\alpha$, $\beta$ and $\lambda$ will be denoted by WGD$(\alpha, \beta, \lambda)$.
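The uniform-based simulation algorithm of Aggarwala and Balakrishnan mentioned above can be sketched in a few lines. The following Python helper is an illustrative sketch (not the authors' code): it draws the progressively type-II censored uniform order statistics $U_1 < \cdots < U_m$ for a given scheme $(R_1, \ldots, R_m)$; a sample from any continuous distribution then follows by applying its quantile function.

```python
import numpy as np

def progressive_type2_uniform(R, rng=None):
    """Progressively type-II censored uniform order statistics
    U_1 < ... < U_m for the censoring scheme R = (R_1, ..., R_m),
    using the uniform-based construction of Balakrishnan and Sandhu (1995)."""
    rng = np.random.default_rng(rng)
    m = len(R)
    W = rng.uniform(size=m)
    # Exponent for V_i is i plus the removals at the last i failure times.
    e = np.array([i + sum(R[m - i:]) for i in range(1, m + 1)], dtype=float)
    V = W ** (1.0 / e)
    # U_i = 1 - V_m * V_{m-1} * ... * V_{m-i+1}, which is increasing in i.
    U = np.array([1.0 - np.prod(V[m - i:]) for i in range(1, m + 1)])
    return U

# Example: n = 9 items, m = 4 observed failures, scheme R = (2, 0, 0, 3).
U = progressive_type2_uniform([2, 0, 0, 3], rng=1)
assert np.all(np.diff(U) > 0) and np.all((U > 0) & (U < 1))
```

A WGD sample would then be obtained as $X_i = F^{-1}(U_i)$ with $F$ the CDF in (2).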
Its reliability function $S(t) = 1 - F(t)$ and hazard function $h(t) = f(t)/S(t)$ follow directly from (1) and (2). It is noted that if $\alpha = 1$ and $\lambda = 1$, the WGD reduces to the standard Pareto distribution.
For more details about the WGD and its properties, see Molenberghs and Verbeke (2011) and Mahmoud et al. (2014a, b). The coefficient of variation is used in numerous areas of science, such as biology, economics, and psychology, and in engineering in queueing and reliability theory; see, for example, Sharma and Krishna (1994). Nairy and Rao (2003) gave a summary of the uses of the coefficient of variation in a number of areas. Given a set of observations from WGD$(\alpha, \beta, \lambda)$, the sample coefficient of variation ($CV$) is often estimated by the ratio of the sample standard deviation to the sample mean, or, equivalently, $CV = \sqrt{E(X^2) - (E(X))^2}/E(X)$, where $E(X)$ and $E(X^2)$ are the first and second moments of the WGD$(\alpha, \beta, \lambda)$ and $\Gamma(z)$ is the gamma function, $\Gamma(z) = \int_0^\infty y^{z-1} e^{-y}\, dy$. The theoretical $CV$ for the WGD then follows according to (6). Molenberghs and Verbeke (2011) gave a summary of the Weibull–Gamma frailty model, its infinite moments, and its connection to the generalized log-logistic, logistic, Cauchy, and extreme value distributions. Mahmoud et al. (2014a) discussed the recurrence relations for moments of dual generalized order statistics from the WGD and its characterizations. Mahmoud et al. (2014b) established new recurrence relations satisfied by the single and product moments of progressively type-II right censored order statistics from the non-truncated and truncated WGD, and derived approximate moments of progressively type-II right censored order statistics from this distribution.
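The two equivalent forms of the coefficient of variation above can be checked numerically. The snippet below is a small illustrative sketch (helper names are ours): the sample $CV$ is the sample standard deviation over the sample mean, and the moment form $\sqrt{m_2/m_1^2 - 1}$ agrees with it when population (ddof = 0) moments are used.

```python
import numpy as np

def sample_cv(x, ddof=1):
    """Sample coefficient of variation: standard deviation / mean."""
    x = np.asarray(x, dtype=float)
    return x.std(ddof=ddof) / x.mean()

def cv_from_moments(m1, m2):
    """CV from the first two raw moments: sqrt(m2 / m1**2 - 1)."""
    return np.sqrt(m2 / m1 ** 2 - 1.0)

x = np.array([1.2, 0.8, 1.5, 2.1, 0.9])
# Moment form with empirical raw moments matches the ddof=0 sample CV.
assert np.isclose(cv_from_moments(x.mean(), np.mean(x ** 2)),
                  sample_cv(x, ddof=0))
```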
The great success story of modern-day Bayesian statistics is the Markov chain Monte Carlo (MCMC) technique, of which the Gibbs sampler is an important special case. MCMC methods permit the numerical calculation of posterior distributions in situations far too complicated for analytic expression; see Brooks (1998) for a review. The Gibbs sampler requires only the specification of the conditional posterior distribution of each parameter. In situations where those distributions are simple to sample from, the approach is easily implemented. In other situations, the more complex Metropolis-Hastings approach needs to be considered; see Gamerman (1997) and Gupta et al. (2008). In the present paper, the author develops a hybrid strategy combining the Metropolis algorithm within the Gibbs sampler for obtaining samples from the posterior arising from the WGD. To the best of our knowledge, statistical inference for the unknown parameters of the WGD has not yet been studied under progressive type-II censoring. In this paper, maximum likelihood and Bayesian inference for the unknown parameters, as well as the reliability function, hazard function, and coefficient of variation, are studied under progressive type-II censoring. The asymptotic confidence intervals of the reliability function, hazard function, and coefficient of variation are approximated by the delta and bootstrap methods. An MCMC procedure to estimate the parameters and the corresponding credible intervals is also discussed.
The layout of the paper is as follows. Section 2 discusses the maximum likelihood estimators (MLEs) of the unknown parameters, reliability function, hazard function, and coefficient of variation. Asymptotic confidence intervals based on the maximum likelihood estimates are presented in Sect. 3. In Sect. 4, we introduce two parametric bootstrap procedures to construct confidence intervals for the unknown parameters, reliability function, hazard function, and coefficient of variation. Section 5 provides the conditional distributions required for implementing the Markov chain Monte Carlo approach. A simulated example illustrating the approach is given in Sect. 6. Monte Carlo simulation results are presented in Sect. 7. Finally, we conclude the paper in Sect. 8.

Maximum likelihood inference
Suppose that $x = (X_{1:m:n}, X_{2:m:n}, \ldots, X_{m:m:n})$ is a progressively type-II censored sample drawn from a Weibull–Gamma population whose pdf and cdf are given by (1) and (2), with censoring scheme $(R_1, R_2, \ldots, R_m)$. From (1), (2) and (7), the likelihood function is then given by (11). The log-likelihood function $\ell = \ln L(\alpha, \beta, \lambda \mid x)$, without the normalizing constant, is obtained from (11) as (12). Calculating the first partial derivatives of $\ell$ with respect to $\alpha$, $\beta$ and $\lambda$ and equating each to zero, we get the likelihood equations (13)-(15). From (14) we obtain the MLE of $\beta$, expressed in terms of the other parameters, as (16). Since Eqs. (13)-(16) do not have explicit closed-form solutions, the Newton-Raphson iteration method is used to obtain the estimates. The algorithm is described as follows: 1. Use the method of moments, or any other method, to estimate the parameters $\alpha$, $\beta$ and $\lambda$ as the starting point of the iteration; denote the estimates by $(\alpha_0, \beta_0, \lambda_0)$ and set $k = 0$. 2. Calculate $(\partial \ell/\partial \alpha, \partial \ell/\partial \beta, \partial \ell/\partial \lambda)$ at $(\alpha_k, \beta_k, \lambda_k)$ and the observed Fisher information matrix $I^{-1}(\alpha, \beta, \lambda)$, given in the next section.
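The Newton-Raphson iteration above can be written generically. The sketch below (an illustrative helper, not the paper's MATHEMATICA code) iterates $\theta_{k+1} = \theta_k - H^{-1}(\theta_k)\, s(\theta_k)$, where $s$ is the score vector and $H$ the Hessian of the log-likelihood; since the WGD score equations (13)-(16) are not reproduced here, it is demonstrated on a one-parameter exponential likelihood, whose MLE is known to be $n/\sum x_i$.

```python
import numpy as np

def newton_raphson_mle(score, hessian, theta0, tol=1e-8, max_iter=100):
    """Generic Newton-Raphson iteration for the likelihood equations
    score(theta) = 0; `score` returns the gradient of the log-likelihood
    and `hessian` its matrix of second partial derivatives."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(max_iter):
        step = np.linalg.solve(hessian(theta), score(theta))
        theta = theta - step
        if np.max(np.abs(step)) < tol:
            break
    return theta

# Demonstration: exponential(rate) with data summing to 6 over n = 3,
# so the MLE of the rate is 3/6 = 0.5.
score = lambda t: np.array([3.0 / t[0] - 6.0])
hessian = lambda t: np.array([[-3.0 / t[0] ** 2]])
mle = newton_raphson_mle(score, hessian, [0.4])
assert abs(mle[0] - 0.5) < 1e-8
```

For the WGD, `score` and `hessian` would be the derivatives of (12), with the moment estimates as the starting point, as in step 1 of the algorithm.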
Moreover, using the invariance property of MLEs, the MLEs of $S(t)$, $h(t)$ and $CV$ can be obtained by replacing $\alpha$, $\beta$ and $\lambda$ with $\hat\alpha$, $\hat\beta$ and $\hat\lambda$.

Asymptotic confidence intervals
As indicated by Vander Wiel and Meeker (1990), the most common method of setting confidence bounds for the parameters is to use the asymptotic normal distribution of the MLEs. The asymptotic variances and covariances of the MLEs $\hat\alpha$, $\hat\beta$ and $\hat\lambda$ are given by the entries of the inverse of the Fisher information matrix $I_{ij} = E\left[-\partial^2 \ell / \partial \phi_i \partial \phi_j\right]$, where $i, j = 1, 2, 3$ and $\Phi = (\phi_1, \phi_2, \phi_3) = (\alpha, \beta, \lambda)$. Unfortunately, exact closed forms for these expectations are difficult to obtain. Therefore, the observed Fisher information matrix $\hat I_{ij} = \left[-\partial^2 \ell / \partial \phi_i \partial \phi_j\right]_{\Phi = \hat\Phi}$, obtained by dropping the expectation operator $E$, will be used to construct confidence intervals for the parameters; see Cohen (1965). The entries of the observed information matrix are the second partial derivatives of the log-likelihood function, which can easily be obtained. The asymptotic variance-covariance matrix $[V]$ of the MLEs is then obtained by inverting the observed information matrix $\hat I(\alpha, \beta, \lambda)$. It is well known that, under some regularity conditions (see Lawless 1982), $(\hat\alpha, \hat\beta, \hat\lambda)$ is approximately distributed as multivariate normal with mean $(\alpha, \beta, \lambda)$ and covariance matrix $I^{-1}(\alpha, \beta, \lambda)$. Thus, the $(1-\gamma)100\,\%$ approximate confidence intervals (ACIs) for $\alpha$, $\beta$ and $\lambda$ are given by $\hat\phi_i \pm Z_{\gamma/2}\sqrt{\widehat{\mathrm{Var}}(\hat\phi_i)}$, where $Z_{\gamma/2}$ is the percentile of the standard normal distribution with right-tail probability $\gamma/2$. Furthermore, to construct asymptotic confidence intervals for the reliability function, hazard function, and coefficient of variation, we need their variances. To find approximate estimates of the variances of $\hat S(t)$, $\hat h(t)$ and $\widehat{CV}$, we use the delta method discussed in Greene (2000). According to this method, the variance of each estimator is approximated by $\nabla^\top [V]\, \nabla$, where $\nabla$ denotes the gradient of $\hat S(t)$, $\hat h(t)$ or $\widehat{CV}$ with respect to $\alpha$, $\beta$ and $\lambda$, evaluated at the MLEs.
Thus, the $(1-\gamma)100\,\%$ ACIs for $S(t)$, $h(t)$ and $CV$ follow in the same way. The main disadvantage of the approximate $(1-\gamma)100\,\%$ CI is that it may yield a negative lower bound even though the parameter takes only positive values; in such a case the negative value is replaced by zero. However, a transformation of the MLE can be used to correct this inadequate performance of the normal approximation. Meeker and Escobar (1998) suggested using the normal approximation for the log-transformed MLE. Thus, a two-sided $(1-\gamma)100\,\%$ normal-approximation CI can be constructed for each component of $\Theta = (\alpha, \beta, \lambda, S(t), h(t), CV)$, where $\hat\Theta = (\hat\alpha, \hat\beta, \hat\lambda, \hat S(t), \hat h(t), \widehat{CV})$.
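The two interval constructions just described, and the delta-method variance, can be sketched as follows. This is an illustrative sketch with our own helper names: the log-transformed interval $[\hat\theta/w,\ \hat\theta\, w]$ with $w = \exp\{Z_{\gamma/2}\,\mathrm{se}(\hat\theta)/\hat\theta\}$ follows Meeker and Escobar (1998) and always has a positive lower bound.

```python
import numpy as np
from statistics import NormalDist

def delta_var(grad, V):
    """Delta-method variance grad' V grad, where V is the inverse of the
    observed information matrix and grad the gradient of the function of
    interest (S(t), h(t) or CV) evaluated at the MLEs."""
    g = np.asarray(grad, dtype=float)
    return float(g @ V @ g)

def asymptotic_ci(theta_hat, var_hat, gamma=0.05, log_transform=False):
    """Normal-approximation CI for a positive parameter.  With
    log_transform=True, the interval is built on the log scale and
    exponentiated back, keeping the lower bound positive."""
    z = NormalDist().inv_cdf(1 - gamma / 2)
    se = np.sqrt(var_hat)
    if log_transform:
        w = np.exp(z * se / theta_hat)
        return theta_hat / w, theta_hat * w
    return theta_hat - z * se, theta_hat + z * se

# Plain vs log-transformed 95% intervals for theta_hat = 2, Var = 0.25.
lo, hi = asymptotic_ci(2.0, 0.25)
llo, lhi = asymptotic_ci(2.0, 0.25, log_transform=True)
assert llo > 0 and llo < 2.0 < lhi
```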

Bootstrap confidence intervals
A parametric bootstrap interval provides much more information about the population value of the quantity of interest than a point estimate does. Moreover, confidence intervals based on asymptotic results do not perform very well for small sample sizes. For this reason, two parametric bootstrap procedures are provided to construct bootstrap confidence intervals for $\alpha$, $\beta$, $\lambda$, $S(t)$, $h(t)$ and $CV$. The first is the percentile bootstrap (Boot-p) confidence interval, based on the idea of Efron (1982). The second is the bootstrap-t (Boot-t) confidence interval, proposed by Hall (1988). Boot-t is developed from a studentized 'pivot' and requires an estimator of the variance of the MLE of $\alpha$, $\beta$, $\lambda$, $S(t)$, $h(t)$ and $CV$.
(5) Compute the statistic $T^*$, defined as the studentized pivot of the bootstrap estimate. Let $G_2(z) = P(T^* \le z)$ be the cumulative distribution function of $T^*$ for a given $z$; the Boot-t bounds are then defined through the corresponding quantiles of $G_2$.
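The Boot-p procedure can be summarized in a short sketch. The helper below is illustrative (the function names and arguments are ours, not the paper's): `resample_fn` would draw a progressively censored sample from the model fitted at the MLEs, and `estimate_fn` would refit the model; the Boot-p interval is then the pair of empirical $\gamma/2$ and $1-\gamma/2$ quantiles of the bootstrap estimates.

```python
import numpy as np

def boot_p_ci(estimate_fn, resample_fn, B=1000, gamma=0.05, rng=None):
    """Percentile bootstrap (Boot-p) CI in the sense of Efron (1982):
    re-estimate the quantity of interest on B parametric resamples and
    take the empirical gamma/2 and 1 - gamma/2 quantiles."""
    rng = np.random.default_rng(rng)
    est = np.array([estimate_fn(resample_fn(rng)) for _ in range(B)])
    return np.quantile(est, [gamma / 2, 1 - gamma / 2])

# Toy demonstration: bootstrap the mean of a normal model with true mean 5.
ci = boot_p_ci(np.mean, lambda rng: rng.normal(5.0, 1.0, size=100),
               B=400, rng=0)
assert ci[0] < 5.0 < ci[1]
```

The Boot-t interval would additionally studentize each bootstrap estimate by its estimated standard error before taking quantiles, as in step (5) above.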

Bayes estimation using MCMC
In this section we obtain the Bayes estimates and the corresponding credible intervals of the unknown parameters $\alpha$, $\beta$ and $\lambda$, as well as the lifetime parameters $S(t)$, $h(t)$ and $CV$. It is assumed here that the parameters $\alpha$, $\beta$ and $\lambda$ are independent and follow gamma prior distributions, where the hyperparameters $a_i$ and $b_i$, $i = 1, 2, 3$, are assumed to be nonnegative and known. The posterior distribution of the parameters $\alpha$, $\beta$ and $\lambda$, denoted by $\pi^*(\alpha, \beta, \lambda \mid x)$, can be obtained up to proportionality by combining the likelihood function (11) with the prior (27) via Bayes' theorem, and it can be written as (28). Therefore, the Bayes estimate of any function of the parameters, say $g(\alpha, \beta, \lambda)$, under the squared error loss function can be obtained as (29). The multiple integrals in (29) cannot be evaluated analytically. In this case, we use the MCMC technique to generate samples from the posterior distribution and then compute the Bayes estimates of the unknown parameters and construct the corresponding credible intervals. From (28), the joint posterior density function of $\alpha$, $\beta$ and $\lambda$ can be written as (30), and the conditional posterior densities of $\alpha$, $\beta$ and $\lambda$ as (31)-(33). It can easily be seen that the conditional posterior density of $\beta$ in (32) is a gamma density with shape parameter $(m + a_2)$ and a scale parameter depending on the data and on $\alpha$ and $\lambda$; thus, samples of $\beta$ can easily be generated using any gamma-generating routine. The conditional posteriors of $\alpha$ and $\lambda$ in (31) and (33) do not have standard forms, although plots of both show that they are similar to normal distributions (see Figs. 1 and 2). Hence Gibbs sampling is not a straightforward option, and the Metropolis-Hastings (M-H) sampler is required for the implementation of the MCMC methodology. Given the conditional distributions (31)-(33), below is a hybrid algorithm with Gibbs sampling steps for updating the parameter $\beta$ and M-H steps for updating $\alpha$ and $\lambda$.
To run the Gibbs sampler algorithm, we start with the MLEs $\hat\alpha$, $\hat\beta$ and $\hat\lambda$. We then draw samples from the various full conditionals in turn, using the most recent values of all the other conditioning variables, until a systematic pattern of convergence is achieved. The following steps illustrate the Metropolis-Hastings-within-Gibbs process: (1) Start with an initial guess $(\alpha^{(0)}, \beta^{(0)}, \lambda^{(0)})$.
In order to guarantee convergence and to remove the effect of the initial values, the first $M$ simulated variates are discarded. The retained samples $\alpha^{(j)}$, $\beta^{(j)}$, $\lambda^{(j)}$, $j = M+1, \ldots, N$, are then used to compute the Bayes estimates and the credible intervals of $\alpha$, $\beta$ and $\lambda$.
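The hybrid scheme above can be sketched generically. The code below is an illustrative sketch with our own helper names: `draw_beta` stands for the exact gamma draw from the full conditional (32), while `log_cond_alpha` and `log_cond_lam` stand for the log densities (31) and (33) (returning `-inf` outside the support); $\alpha$ and $\lambda$ are updated by random-walk M-H steps with symmetric normal proposals.

```python
import numpy as np

def mh_step(log_dens, current, step, rng):
    """One random-walk Metropolis-Hastings update with a symmetric normal
    proposal; with a symmetric proposal the acceptance ratio reduces to
    the ratio of target densities."""
    prop = current + step * rng.standard_normal()
    if np.log(rng.uniform()) < log_dens(prop) - log_dens(current):
        return prop
    return current

def hybrid_sampler(draw_beta, log_cond_alpha, log_cond_lam,
                   init, n_iter=12000, step=0.1, rng=None):
    """Gibbs-within-M-H sketch: beta drawn exactly from its gamma full
    conditional, alpha and lambda updated by M-H."""
    rng = np.random.default_rng(rng)
    a, b, l = init
    chain = np.empty((n_iter, 3))
    for t in range(n_iter):
        b = draw_beta(a, l, rng)                          # Gibbs step
        a = mh_step(lambda x: log_cond_alpha(x, b, l), a, step, rng)
        l = mh_step(lambda x: log_cond_lam(x, a, b), l, step, rng)
        chain[t] = (a, b, l)
    return chain
```

Discarding the first $M$ rows of `chain` as burn-in gives the retained sample described above.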

Bayes estimation using balanced loss functions
In order to make statistical inferences more practical and applicable, we often need to choose an asymmetric loss function. A number of asymmetric loss functions have been proposed; one of the most popular is the LINEX loss function, introduced by Varian (1975) and studied by several others, among them Ebrahimi et al. (1991). Recently, a more general loss function, called the balanced loss function (see Jozani et al. 2012), has been proposed. It takes the form $L_{\rho,\omega,\delta_0}(\theta, \delta) = \omega\, \rho(\delta_0, \delta) + (1 - \omega)\, \rho(\theta, \delta)$, where $\rho$ is an arbitrary loss function and $\delta_0$ is a chosen a priori 'target' estimator of $\theta$, obtained, for instance, using the criterion of maximum likelihood, least squares, or unbiasedness. The loss $L_{\rho,\omega,\delta_0}$, which depends on the observed value of $\delta_0(X)$, reflects a desire for closeness of $\delta$ both to the target estimator $\delta_0$ and to the unknown parameter $\theta$, with the relative importance of these two criteria governed by the choice of $\omega \in [0, 1)$. Bayesian estimators under $L_{\rho,\omega,\delta_0}$ can be related to the Bayesian solutions of the unbalanced case, i.e., $L_{\rho,\omega,\delta_0}$ with $\omega = 0$. $L_{\rho,\omega,\delta_0}$ can be specialized to various choices of loss function, such as the absolute value, entropy, LINEX, and a generalization of squared error losses. In (38), the choice $\rho(\theta, \delta) = (\delta - \theta)^2$ leads to the balanced squared error loss (BSEL) function (see Ahmadi et al. 2009) in the form (39), and the corresponding Bayes estimate of the unknown parameter $\theta$ under BSEL is given by (40). The balanced linear-exponential (BLINEX) loss function with shape parameter $q$ ($q \neq 0$) is obtained with the choice $\rho(\theta, \delta) = e^{q(\delta - \theta)} - q(\delta - \theta) - 1$; see Zellner (1986). Hence the Bayes estimate of the unknown parameter $\theta$ under the BLINEX loss function is given by (41). It is clear that the balanced loss functions are quite general: they include the MLE and both the symmetric and asymmetric Bayes estimates as special cases.
For example, from (40), with $\omega = 1$ the Bayes estimate under the balanced squared error loss function reduces to the ML estimate, and with $\omega = 0$ it reduces to the Bayes estimate relative to the (symmetric) squared error loss function. Similarly, the Bayes estimator under the balanced LINEX loss function in (41) reduces to the ML estimate when $\omega = 1$, and for $\omega = 0$ it reduces to the case of the (asymmetric) LINEX loss function. Let $\theta$ be any of $\alpha$, $\beta$, $\lambda$, $S(t)$, $h(t)$ or $CV$, and suppose that we judge convergence to have been reached after $M$ iterations of the MCMC algorithm. The approximate posterior mean under balanced squared error loss is then obtained by averaging the retained draws, which yields the approximate Bayes estimate of $\theta$ under BSEL. Similarly, averaging $e^{-q\theta^{(j)}}$ over the retained draws gives the approximate posterior expectation needed under balanced LINEX loss, and hence the approximate Bayes estimate of $\theta$ under BLINEX. By sorting $\alpha^{(j)}$, $\beta^{(j)}$, $\lambda^{(j)}$, $S^{(j)}(t)$, $h^{(j)}(t)$ and $CV^{(j)}$, $j = M+1, \ldots, N$, in ascending order and using the method proposed by Chen and Shao (1999), the approximate $100(1-\gamma)\,\%$ CRIs for $\theta$ are obtained.
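Both balanced-loss estimates are simple functions of the retained MCMC draws and the target estimator (here the MLE). The sketch below is illustrative (helper names are ours): under BSEL the estimate is $\omega\,\delta_0 + (1-\omega)\,\bar\theta$, and under BLINEX it is $-(1/q)\ln\{\omega\, e^{-q\delta_0} + (1-\omega)\, \overline{e^{-q\theta}}\}$, where the bars denote averages over the retained draws.

```python
import numpy as np

def bsel_estimate(samples, target, omega):
    """Bayes estimate under balanced squared error loss:
    omega * target + (1 - omega) * (approximate posterior mean)."""
    return omega * target + (1.0 - omega) * np.mean(samples)

def blinex_estimate(samples, target, omega, q):
    """Bayes estimate under balanced LINEX loss with shape q != 0."""
    s = np.asarray(samples, dtype=float)
    return -np.log(omega * np.exp(-q * target)
                   + (1.0 - omega) * np.mean(np.exp(-q * s))) / q

# Sanity checks on the special cases noted in the text:
draws = [1.0, 2.0, 3.0]
assert abs(bsel_estimate(draws, 2.5, 1.0) - 2.5) < 1e-12   # omega=1 -> MLE
assert abs(bsel_estimate(draws, 2.5, 0.0) - 2.0) < 1e-12   # omega=0 -> post. mean
assert abs(blinex_estimate(draws, 2.5, 1.0, 0.5) - 2.5) < 1e-12
```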
Also, the 95 % bootstrap (Boot-p and Boot-t) confidence intervals (CIs) are displayed in Table 1. We now compute the Bayes estimates of $\alpha$, $\beta$, $\lambda$, $S(t)$, $h(t)$ and $CV$. We assume informative gamma priors for $\alpha$, $\beta$ and $\lambda$, with hyperparameters $a_i = 1$ and $b_i = 2$, $i = 1, 2, 3$. As pointed out earlier, the posterior analysis has been carried out using a hybrid strategy combining Metropolis steps within the Gibbs chain. We generated 12000 MCMC samples as suggested in Sect. 5. The initial values of the three parameters $\alpha$, $\beta$ and $\lambda$ for running the MCMC sampler were taken to be their maximum likelihood estimates, i.e., $(\alpha^{(0)}, \beta^{(0)}, \lambda^{(0)}) = (\hat\alpha, \hat\beta, \hat\lambda)$. Burn-in, the number of initial iterations that must be discarded from the generated values, is an issue that may be encountered: for any starting values $\alpha^{(0)}$, $\beta^{(0)}$ and $\lambda^{(0)}$, the first $M$ values of the generated Markov chain may be far from the converged, stationary part of the sequence. To determine $M$, a number of diagnostic tests have been proposed in the literature that address the convergence problem. One of them is the trace plot: a plot of the sampled values at each iteration, with the x-axis giving the iteration number and the y-axis the sampled value. In a trace plot, a lack of convergence is evidenced by trending in the sampled values, such that the algorithm never levels off to a stable, stationary state. Figure 3 shows the trace plots of the first 10000 MCMC outputs for the posterior distributions of $\alpha$, $\beta$, $\lambda$, $S(t)$, $h(t)$ and $CV$; visually, the MCMC procedure converges very well. We provide histograms of the generated values of $\alpha$, $\beta$, $\lambda$, $S(t)$, $h(t)$ and $CV$ in Fig. 4. We discard the first 2000 samples as burn-in; $M = 2000$ samples is enough to erase the effect of the starting point (initial values).
Therefore, the MCMC samples can be used for constructing the approximate credible intervals and for estimating the parameters and any functions of them. A sample of size 10000 is used to make (approximate) Bayesian inferences, including the posterior mean, median, mode, and credible interval of each parameter of interest, the latter constructed from the 2.5 and 97.5 % quantiles. Table 1 lists the 95 % probability intervals for the parameters, reliability function, hazard function, and coefficient of variation. The MCMC results for the posterior mean, median, mode, standard deviation (S.D.), and skewness (Ske.) of $\alpha$, $\beta$, $\lambda$, $S(t)$, $h(t)$ and $CV$ are displayed in Table 2.
The results of the Bayes estimates relative to both BSEL and BLINEX, with different values of the shape parameter $q$ of the LINEX loss function and various values of $\omega$, for the parameters $\alpha$, $\beta$ and $\lambda$ as well as $S(t = 0.4)$, $h(t = 0.4)$ and $CV$, are displayed in Table 3.
It is well known that the LINEX loss function becomes nearly symmetric for $q$ close to zero and then approximately behaves like the squared error loss function itself. In addition, we observed that the resulting estimates for $q = 0.0001$ are approximately equal to the corresponding squared error Bayes estimates.
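This behaviour follows from a Taylor expansion: with error $d = \delta - \theta$, $e^{qd} - qd - 1 = (qd)^2/2 + O((qd)^3)$, i.e. a rescaled squared error loss for small $q$. A quick numerical check (illustrative sketch):

```python
import numpy as np

def linex_loss(d, q):
    """LINEX loss e^{q d} - q d - 1 for estimation error d."""
    d = np.asarray(d, dtype=float)
    return np.exp(q * d) - q * d - 1.0

# For q = 1e-4 the LINEX loss is numerically indistinguishable from the
# quadratic (q**2 / 2) * d**2 over a range of errors, which explains why
# the q = 0.0001 estimates match the squared error Bayes estimates.
d = np.linspace(-2.0, 2.0, 9)
assert np.allclose(linex_loss(d, 1e-4), (1e-8 / 2.0) * d ** 2, rtol=1e-3)
```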

Monte Carlo simulation study
In order to compare the estimators of the parameters, as well as of the reliability function, hazard function, and coefficient of variation of the WGD, Monte Carlo simulations were performed using 1000 progressively type-II censored samples for each simulation setting. All computations were performed using MATHEMATICA ver. 8. To generate progressively type-II censored samples from the WGD, we used the algorithm proposed by Balakrishnan and Sandhu (1995) with parameters $\alpha = 2$, $\beta = 2$ and $\lambda = 3$. We assume informative gamma priors for $\alpha$, $\beta$ and $\lambda$, with hyperparameters $a_i = 1$ and $b_i = 2$, $i = 1, 2, 3$. The Bayes estimates were computed for two distinct values of $\omega$, namely 0 and 0.6, and several censoring schemes (CSs) were considered. The performance of the resulting estimators of $\alpha$, $\beta$, $\lambda$, $S(t)$, $h(t)$ and $CV$ was assessed in terms of the mean square error (MSE), computed for $k = 1, 2, \ldots, 6$, with $\varphi_1 = \alpha$, $\varphi_2 = \beta$, $\varphi_3 = \lambda$, $\varphi_4 = S(t)$, $\varphi_5 = h(t)$ and $\varphi_6 = CV$, as $\mathrm{MSE}(\hat\varphi_k) = \frac{1}{1000}\sum_{i=1}^{1000}\bigl(\hat\varphi_k^{(i)} - \varphi_k\bigr)^2$. Also, we compare the CIs obtained using the asymptotic distributions of the MLEs, the two bootstrap CIs, and the MCMC CRIs. The comparison is made in terms of the average CI length/credible interval length (ACL) and the coverage percentage (CP). For each simulated sample, we computed the 95 % CIs, checked whether the true value lay within each interval, and recorded the length of the interval; this procedure was repeated 1000 times. The estimated coverage probability was computed as the number of CIs that covered the true value divided by 1000, while the estimated expected width of the CI was computed as the sum of the lengths of all the intervals divided by 1000. The MSEs of the estimates are shown in Tables 4, 5, 6, 7, 8 and 9, and the ACL and CP of the 95 % CIs are shown in Tables 10, 11 and 12.
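The MSE, CP, and ACL criteria defined above reduce to a few lines of code. The following is an illustrative sketch (helper names are ours) of how the tabulated quantities would be computed from the 1000 replicated estimates and interval endpoints.

```python
import numpy as np

def mse(estimates, true_value):
    """Empirical mean square error over replicated simulations."""
    e = np.asarray(estimates, dtype=float)
    return np.mean((e - true_value) ** 2)

def coverage_and_length(lowers, uppers, true_value):
    """Coverage percentage (fraction of CIs containing the true value)
    and average CI length over replicated simulations."""
    lo = np.asarray(lowers, dtype=float)
    up = np.asarray(uppers, dtype=float)
    cp = np.mean((lo <= true_value) & (true_value <= up))
    acl = np.mean(up - lo)
    return cp, acl

# Tiny demonstration with two replications and true value 2:
assert mse([1.0, 3.0], 2.0) == 1.0
cp, acl = coverage_and_length([1.0, 2.5], [3.0, 3.0], 2.0)
```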

Conclusion
The purpose of this paper has been to develop different methods to estimate, and to construct confidence intervals for, the parameters as well as the reliability function, hazard function, and coefficient of variation of the Weibull–Gamma distribution under progressively type-II censored samples. The MLEs of the unknown parameters are obtained, and different confidence intervals are proposed using asymptotic distributions as well as parametric bootstrap methods. Bayesian estimates of the unknown parameters are also proposed. Since the Bayes estimators cannot be obtained in explicit form, they must be computed numerically; for this reason we have used the MCMC technique, and the Bayes estimates with respect to the informative prior are observed to work quite well in this case. The Bayes estimates have also been obtained under balanced loss functions. The theoretical results have been applied to a numerical example for illustrative purposes. A simulation study was conducted to examine and compare the performance of the proposed methods for different sample sizes $(n, m)$ and different CSs (I, II, III). From the results, we observe the following:
1. From Tables 4, 5, 6, 7, 8 and 9, as the sample size increases, the MSEs decrease, and the Bayes estimates have the smallest MSEs for $\alpha$, $\beta$, $\lambda$, $S(t)$, $h(t)$ and $CV$. Hence, the Bayes estimates perform better than the MLEs and the bootstrap methods in all cases considered.
2. From Tables 4, 5, 6, 7, 8 and 9, it can be seen that bootstrap-t performs better than the percentile bootstrap and the MLEs, since its MSEs are smaller than those of the percentile bootstrap and the MLEs for $\alpha$, $\beta$, $\lambda$, $S(t)$, $h(t)$ and $CV$.
3. When $\omega = 0$, the Bayes estimates of $\alpha$, $\beta$, $\lambda$, $S(t)$, $h(t)$ and $CV$ are better in the sense of having smaller MSEs.
4. The Bayes estimates under BLINEX with $q = 0.5$ are better, in the sense of having smaller MSEs, for both $\omega = 0$ and $\omega = 0.6$.
5. For fixed values of the sample size $n$ and the number of observed failures $m$, scheme I performs better than schemes II and III in the sense of having smaller MSEs.
6. From Tables 10, 11 and 12, it can be seen that the MCMC CRIs give more accurate results than the approximate CIs and the bootstrap CIs, since the lengths of the former are smaller than those of the latter, for the different sample sizes, numbers of observed failures, and schemes.
7. The bootstrap-t CIs are better than the percentile bootstrap CIs and the ACIs in the sense of having smaller widths.
8. For fixed sample sizes and numbers of observed failures, scheme I, in which censoring occurs after the first observed failure, gives shorter lengths for all three types of CI than the other two schemes.