Maximum likelihood estimation of the Weibull distribution with reduced bias

In this short note, we derive a new bias adjusted maximum likelihood estimate for the shape parameter of the Weibull distribution with complete data and type I censored data. The proposed estimate of the shape parameter is significantly less biased and more efficient than the corresponding maximum likelihood estimate, while being simple to compute using existing maximum likelihood software procedures.


Introduction
The Weibull distribution, with probability density function

f(y; θ) = (k/λ) (y/λ)^(k−1) exp(−(y/λ)^k),  y > 0,  (1)

where θ = (k, λ), k > 0 is the shape parameter and λ > 0 is the scale parameter, is a popular distribution in the analysis of survival data. Given data y = (y_1, …, y_n), a common approach to estimating the parameters of a Weibull distribution, θ, is via the method of maximum likelihood (ML), in which the parameters are set to the values that maximise the log-likelihood of the data

ℓ(θ; y) = n log k − n k log λ + (k − 1) Σ_{i=1}^n log y_i − Σ_{i=1}^n (y_i/λ)^k,  (2)

that is,

θ̂_ML(y) = argmax_θ ℓ(θ; y).  (3)

The ML estimate of λ is

λ̂(y) = ( (1/n) Σ_{i=1}^n y_i^k̂ )^(1/k̂),

while the ML estimate of k, k̂(y), is defined implicitly by the estimating equation

n/k + Σ_{i=1}^n log y_i − n ( Σ_{i=1}^n y_i^k log y_i ) / ( Σ_{i=1}^n y_i^k ) = 0,  (4)

and must be obtained by numerical optimisation.
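For illustration, the estimating equation for k can be solved numerically with a few lines of code; the following is a minimal sketch using bisection (the function name is ours, not from any package), exploiting the fact that the profile score is monotonically decreasing in k:

```python
import math

def weibull_mle(y, lo=1e-6, hi=100.0, tol=1e-10):
    """ML estimates (k, lambda) for complete Weibull data.

    Solves the profile score equation for k by bisection:
        n/k + sum(log y_i) - n * sum(y_i^k log y_i) / sum(y_i^k) = 0,
    then recovers lambda = (mean of y_i^k)^(1/k).
    """
    n = len(y)
    sum_logs = sum(math.log(v) for v in y)

    def score(k):
        s1 = sum(v ** k for v in y)
        s2 = sum((v ** k) * math.log(v) for v in y)
        return n / k + sum_logs - n * s2 / s1

    # The score is decreasing in k, so bisect for its unique root.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    k = 0.5 * (lo + hi)
    lam = (sum(v ** k for v in y) / n) ** (1.0 / k)
    return k, lam
```

The returned pair satisfies both estimating equations to numerical precision, which is easy to verify by substituting it back.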
The ML estimate of the Weibull distribution scale parameter λ has negligible bias, even for relatively small sample sizes. In contrast, the ML estimate of the shape parameter k is known to be strongly biased for small sample sizes. Ross [15] derived a simple bias-reduction adjustment formula for the ML estimate of k,

k̂_R(y) = ( (n − 2) / (n − 0.68) ) k̂_ML(y),  (5)

and later extended his approach to censored data [16]. Hirose [8] proposed an alternative bias correction method for data with no censoring, derived by fitting a non-linear function to simulation results. Teimouri and Nadarajah [22] develop improved ML estimates for the Weibull distribution based on record statistics. In contrast, Yang and Xie [24] use the modified profile likelihood proposed by Cox and Reid [5, 6] to derive an alternative ML estimate of k (MLC) from a modified estimating equation. Using simulations, Yang and Xie showed that their estimate of k is less biased than the ML estimate and more efficient than the estimate (5) proposed by Ross. In a follow-up paper, Shen and Yang [17] developed a profile ML estimate of k for both complete and censored samples, and showed that it outperformed MLC in simulations with complete data.

arXiv:2209.14567v2 [stat.ME] 9 Feb 2023
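Ross's adjustment (5) is a one-line rescaling of the ML estimate; a minimal sketch (the function name is ours):

```python
def ross_shape_correction(k_ml, n):
    """Ross's bias-reduction adjustment, equation (5), applied to the
    complete-data ML estimate of the Weibull shape parameter."""
    return (n - 2) / (n - 0.68) * k_ml
```

Since (n − 2)/(n − 0.68) < 1 for n > 2, the adjusted estimate always shrinks the (upwardly biased) ML estimate towards zero.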
In this paper, we introduce new bias adjusted maximum likelihood estimates for the Weibull distribution for both complete and type I censored data. In addition, we derive a novel formula for the Kullback-Leibler (KL) [9] divergence between two Weibull distributions under type I censoring, a result that does not appear to be widely known.

Type I Censored Data
In survival analysis, one typically does not observe complete data and instead has joint realisations of the random variables (Y = y, Δ = δ), where Y = min(T, c) and Δ = I(T ≤ c), the random variable T denotes the survival time, and c > 0 is the fixed censoring time. The data comprise the survival time T = t of an item if this does not exceed the corresponding censoring time c (i.e., T ≤ c); otherwise, we only know that the item survived beyond time c (i.e., T > c).
The log-likelihood of data D = {(y_1, δ_1), …, (y_n, δ_n)} under type I censoring is

ℓ(θ; D) = d log k − d k log λ + (k − 1) Σ_{i=1}^n δ_i log y_i − Σ_{i=1}^n (y_i/λ)^k,

where d = Σ_{i=1}^n δ_i is the number of uncensored observations. The maximum likelihood (ML) estimate of λ is then

λ̂(y) = ( (1/d) Σ_{i=1}^n y_i^k̂ )^(1/k̂),

and k̂(y) is obtained from the estimating equation

d/k + Σ_{i=1}^n δ_i log y_i − d ( Σ_{i=1}^n y_i^k log y_i ) / ( Σ_{i=1}^n y_i^k ) = 0.

As in the case of complete data, the ML estimate of k for type I censored data has large bias for small sample sizes and for large amounts of censoring. Based on the modified profile likelihood approach, Yang and Xie [24] propose an alternative estimate of k; we note that their score function requires d > 1 to yield a positive estimate for k. Yang and Xie demonstrated that their proposed estimate of k is less biased and more efficient than the regular ML estimate.
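The censored score equation is solved the same way as in the complete-data case; a self-contained sketch (our function, not from any package):

```python
import math

def weibull_mle_censored(y, delta, lo=1e-6, hi=100.0, tol=1e-10):
    """ML estimates (k, lambda) for type I censored Weibull data.

    y[i] = min(survival time, censoring time); delta[i] = 1 if the event
    was observed, 0 if censored.  Solves the profile score equation
        d/k + sum_i delta_i log y_i - d * sum_i y_i^k log y_i / sum_i y_i^k = 0
    by bisection, then lambda = (sum_i y_i^k / d)^(1/k), with d = sum(delta).
    """
    d = sum(delta)
    if d < 1:
        raise ValueError("need at least one uncensored observation")
    obs_logs = sum(dl * math.log(v) for v, dl in zip(y, delta))

    def score(k):
        s1 = sum(v ** k for v in y)
        s2 = sum((v ** k) * math.log(v) for v in y)
        return d / k + obs_logs - d * s2 / s1

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    k = 0.5 * (lo + hi)
    lam = (sum(v ** k for v in y) / d) ** (1.0 / k)
    return k, lam
```

Note that all observations, censored or not, enter the sums over y_i^k, while only uncensored observations contribute to the d log k and Σ δ_i log y_i terms.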
Shen and Yang [17] derived new second- and third-order bias correction formulas for the shape parameter of the Weibull distribution without censoring and under general right-censoring models. Although the new estimate is shown to be effective in correcting bias, it must be computed through bootstrap simulation. The same procedure was later extended to Weibull regression with complete and general right-censored data [18].
More recently, Choi et al. [3] examine a different problem of Weibull parameter overestimation caused by mass occurrences of (censored) events in the early time period and develop an expectation maximization (EM) algorithm to reduce bias.
Maximum likelihood estimation of the Weibull distribution under more sophisticated censoring schemes has also been studied. Progressive hybrid censoring and generalized progressively hybrid censored data were examined in [11] and [25], respectively. Ng and Wang [14] and Teimouri [21] study ML estimation of the Weibull distribution with progressively type I interval censored data. An R package for both progressively type I and type II censored data was developed in [20]. Additionally, ML estimation of the Weibull distribution with generalized type I censored data and block censoring was examined in [19] and [26], respectively.
A simple adjustment to maximum likelihood estimates to reduce estimation bias

In a landmark paper, Cox and Snell [7] derived an approximation to the finite sample bias of ML estimates for independent, but not necessarily identically distributed, data (see Appendix A for details). The ML estimate with reduced bias, θ̃_ML, is given by

θ̃_ML = θ̂_ML − Bias(θ̂_ML),  (11)

where the Cox and Snell formula for Bias(θ̂_ML) is given in Appendix A and is evaluated at the usual ML estimate θ̂_ML. A benefit of this bias approximation formula is that it can be computed even if the ML estimate is not available in closed form. A similar approach was used to derive bias adjusted ML estimates for the unit Weibull distribution [13] and the inverse Weibull distribution [12], with complete data only. We now extend these results to the Weibull distribution with complete data and type I censored data.

Theorem 1. The finite sample bias of the ML estimate (3) for the Weibull distribution with complete data is given in closed form by (12), an expression of order O(1/n) involving the Riemann zeta function ζ(·). ML estimates of k and λ with reduced bias can be obtained from (11).
Proof. The proof involves the application of the Cordeiro and Klein [4] approach (see (22) and (23) in Appendix A) to the Weibull distribution (1). It is well known that the expected Fisher information matrix for the Weibull distribution is

K = n ( (π²/6 + (1 − γ)²)/k²   (γ − 1)/λ
        (γ − 1)/λ               k²/λ² ),

where γ ≈ 0.5772 is the Euler-Mascheroni constant, and that its inverse is available in closed form. Direct calculation yields the entries of the 2 × 4 matrix A (see (23) in Appendix A). Substituting K⁻¹ and A into (22) and simplifying completes the proof.
From (12), we observe that the ML estimate of k is upwardly biased for any finite n. A key advantage of the proposed bias adjusted estimate is that it can be trivially computed in any software that implements ML Weibull estimation. We now derive a similar correction for the more complex case of type I censoring.

Theorem 2. The finite sample bias of the maximum likelihood estimate (3) for the Weibull distribution with type I censored data is given in closed form in terms of p = 1 − exp(−z_c), the proportion of uncensored observations, where z_c = (c/λ)^k, and the lower incomplete gamma function (16),

γ(a, z) = ∫_0^z t^(a−1) e^(−t) dt,

whose j-th derivative with respect to a is

γ^(j)(a, z) = ∫_0^z (log t)^j t^(a−1) e^(−t) dt.

For brevity, we use the shorthand notation γ_j ≡ γ^(j)(1, z_c) to denote the j-th derivative of the incomplete gamma function evaluated at (1, z_c). As in the case of complete data, the ML estimate of k with reduced bias can be obtained from (11).
Proof. The expected Fisher information matrix for the Weibull distribution with type I censoring is given in [23]. The entries of the matrix A follow by direct calculation; in the limit of no censoring they involve ψ^(2)(1), the polygamma function of order 2 (i.e., the second derivative of the digamma function) evaluated at 1. As expected, the matrix A for type I censored data converges to the corresponding matrix for complete data as p → 1. Substituting K⁻¹ and A into (22) and simplifying completes the proof.
Figure 1 shows the bias adjustment f(p) as a function of the proportion of uncensored observations p. As the proportion of uncensored observations p → 1 (i.e., no censoring), f(p) → 1.3795 approximately, as expected. Conversely, f(p) → ∞ as the proportion of censored data increases (i.e., p → 0).
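Given the limiting value f(p) → 1.3795 for complete data, the correction (11) for the shape parameter reduces, to first order, to a one-line rescaling of the ML estimate. A sketch under the assumption that the complete-data bias behaves as Bias(k̂) ≈ f·k/n with f ≈ 1.3795 (the p → 1 limit of f(p) in Figure 1); the function name is ours:

```python
def shape_bias_adjusted(k_ml, n, f=1.3795):
    """First-order bias-adjusted shape estimate, k_ml - Bias(k_ml),
    assuming the complete-data approximation Bias(k) ~ f * k / n with
    f ~ 1.3795 (the p -> 1 limit of the adjustment f(p) in Figure 1)."""
    return k_ml * (1.0 - f / n)
```

The multiplicative form 1 − f/n is directly comparable to Ross's factor (n − 2)/(n − 0.68) ≈ 1 − 1.32/n.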
Remark. As noted in the introduction, the ML estimate of the scale parameter λ has negligible bias even for small sample sizes. For complete data, this finite sample bias, computed using the Cox and Snell methodology, is a closed-form expression involving the Euler-Mascheroni constant γ ≈ 0.5772. For type I censored data, the finite sample bias additionally depends on the proportion of uncensored observations p, with γ_j ≡ γ^(j)(1, z_c) again denoting the j-th derivative of the incomplete gamma function (16) evaluated at (1, z_c).

Simulation
We performed a simulation to examine the finite sample behaviour of the new bias adjusted ML estimates of k for both complete and type I censored data. In all simulations, the scale parameter of the data generating model was set to λ* = 1 without loss of generality. Due to the scale invariance of the maximum likelihood estimator and the negligible bias in estimating λ*, the simulation results for other values of λ* are expected to yield similar conclusions.

Complete data
For each run of the simulation, we generated n data points from the model Weibull(k*, λ* = 1), where n ∈ {10, 20, 50} and the shape parameter was set to k* ∈ {0.5, 1, 5, 10}. Regular maximum likelihood (ML) estimates, our proposed bias adjusted maximum likelihood estimates (MMLE), conditional maximum likelihood estimates (MLC) proposed by Yang and Xie [24], and the profile maximum likelihood estimates (MLP) of Shen and Yang [17] were then computed from the data. We used the second-order bias reduction of Shen and Yang, as it was virtually indistinguishable from the third-order formula in our tests. We performed 10^5 simulations for each combination of (k*, n) and recorded the average bias, mean squared error and Kullback-Leibler (KL) divergence [9] from the data generating model (see Appendix B). Simulation results are shown in Table 1, with the KL results omitted for ease of presentation.
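The complete-data experiment can be reproduced in miniature. The following sketch (our code, reduced to 300 replications for speed) exhibits the upward bias of the ML shape estimate for n = 10:

```python
import math, random

def simulate_shape_bias(k_true=1.0, n=10, reps=300, seed=1):
    """Monte Carlo sketch of the upward bias of the ML shape estimate for
    complete Weibull samples with lambda* = 1, as in the simulation study.
    Returns the average ML estimate of k over `reps` replications."""
    rng = random.Random(seed)

    def mle_shape(y):
        m = len(y)
        logs = sum(math.log(v) for v in y)
        lo, hi = 1e-6, 100.0
        for _ in range(100):  # bisection on the profile score for k
            k = 0.5 * (lo + hi)
            s1 = sum(v ** k for v in y)
            s2 = sum((v ** k) * math.log(v) for v in y)
            if m / k + logs - m * s2 / s1 > 0:
                lo = k
            else:
                hi = k
        return 0.5 * (lo + hi)

    total = 0.0
    for _ in range(reps):
        # inverse-transform sampling: Weibull(k, 1) = (-log U)^(1/k)
        y = [(-math.log(1.0 - rng.random())) ** (1.0 / k_true) for _ in range(n)]
        total += mle_shape(y)
    return total / reps
```

With k* = 1 and n = 10, the average ML estimate is noticeably above the true value, consistent with a bias of roughly 1.3795 k/n.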
All three bias adjusted ML estimates of k result in a significant reduction in bias compared to the usual ML estimate, and the KL divergences of the competing estimates are virtually identical. Unlike both the MLC and MLP estimates, our bias adjusted ML estimate of k is simple to compute in software via existing Weibull ML estimation procedures and does not require the use of the parametric bootstrap.

Type I censored data
We also conducted a similar experiment in the setting of type I censored data. For each iteration of the simulation, we generated n data points from the model Weibull(k*, λ* = 1), where n ∈ {10, 20, 30}; the shape parameter was again set to k* ∈ {0.5, 1, 5, 10}. The proportion of uncensored observations was p ∈ {0.3, 0.5, 0.7, 0.9}. In addition to the bias and the mean squared error in estimating the shape parameter, we computed the Kullback-Leibler (KL) divergence [9] between the data generating model and each estimated model (see Appendix B).
The newly proposed bias adjusted estimate of k (MMLE) was compared to the standard ML estimate, the conditional maximum likelihood estimate (MLC) proposed by Yang and Xie [24], and the profile maximum likelihood estimate (MLP) of Shen and Yang [17]. The third-order profile ML estimate suffered from numerical stability issues for small n and large amounts of censoring, occasionally resulting in a negative estimate of k*; hence all comparisons were made with the second-order variant. We restricted the experiments to exclude data sets where the number of uncensored observations d (= Σ_i δ_i) was less than two, as MLC may result in negative estimates of k for d < 2; we note this does not cause a problem for our proposed MMLE method. The results of these simulations, averaged over 10^5 runs for each combination of (n, p, k*), are shown in Table 2, with the KL results omitted for ease of presentation.
We observe that our MMLE estimate of k is more efficient and less biased than the standard ML estimate of k for all tested values of (n, p, k*). The conditional ML estimate of k is, in general, more biased and has higher mean squared error than the MLP and MMLE estimates. In terms of bias reduction, the profile ML estimate of k is virtually identical to our MMLE for n ≥ 30. For small sample sizes (n = 20) and higher levels of censoring (p ≤ 0.5), the MMLE estimate appears superior to MLC and MLP in terms of bias, mean squared error and KL divergence. Additionally, in contrast to the profile ML method, our MMLE estimate is easily computed without the need for numerical simulation, and as such can be easily integrated into any software that implements fitting of the Weibull distribution to complete and type I censored data.

Real data
To illustrate the usefulness of our new bias adjusted maximum likelihood estimates, we consider real data on failure voltages from [10] (p. 240) that is also analysed in [17]. The data consist of failure voltages of two types of electrical cable insulation (type 1 and type 2), with 20 specimens of each, and are shown in Table 3. The estimates of the shape parameter proposed in [24] and [17] are significantly closer to our bias adjusted estimates than to the original maximum likelihood estimates, which exhibit significant upward bias. As expected, bias adjusted estimates of the scale parameter are all approximately equal to the corresponding maximum likelihood estimates.
As a further example, we consider the criminal recidivism data first published in [1]. This data consists of 432 survival times of individuals released from Maryland state prisons in the 1970s and followed up for 52 weeks after release (i.e., all censored observations were censored at 52 weeks). Approximately 75% of the observations were censored, indicating a relatively high degree of censoring. Assuming that the survival times are Weibull distributed, regular ML estimates of the shape and scale parameters were found to be k̂_ML = 1.37 and λ̂_ML = 123.68, respectively. In contrast, our bias adjusted MMLE estimates were k̂_MMLE = 1.35 and λ̂_MMLE = 123.68. As expected, all three bias adjusted estimates examined in this manuscript were similar to the ML estimates due to the relatively large sample size.
We then randomly sampled 20 observations from the original data without replacement; the observed failure times in this subsample were 9, 27, 35, 43 and 46 weeks, with the remaining 15 observations censored at 52 weeks. The ML estimate of the shape parameter from this subsample was found to be k̂_ML = 1.72; note this is substantially higher than the ML estimate of 1.37 obtained on the full data sample. In contrast, our bias adjusted MMLE estimate from this subsample was k̂_MMLE = 1.39, which is very close to the estimate obtained on the full sample. This again demonstrates that the ML estimate is strongly upwardly biased, especially in the case of smaller sample sizes and high degrees of censoring.
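The subsample fit reported above can be reproduced directly from the censored score equation; a self-contained check (our code, using only the failure times listed in the text):

```python
import math

# Random subsample of the recidivism data described above: five observed
# failure times and fifteen observations censored at 52 weeks.
y = [9.0, 27.0, 35.0, 43.0, 46.0] + [52.0] * 15
delta = [1] * 5 + [0] * 15

d = sum(delta)                              # number of uncensored observations
obs_logs = sum(dl * math.log(v) for v, dl in zip(y, delta))
lo, hi = 1e-6, 20.0
for _ in range(200):                        # bisect the censored profile score
    k = 0.5 * (lo + hi)
    s1 = sum(v ** k for v in y)
    s2 = sum((v ** k) * math.log(v) for v in y)
    if d / k + obs_logs - d * s2 / s1 > 0:
        lo = k
    else:
        hi = k
k_ml = 0.5 * (lo + hi)                      # ML shape estimate, ~1.72
lam_ml = (s1 / d) ** (1.0 / k)              # ML scale estimate
```

Running this recovers the shape estimate of about 1.72 quoted above, confirming the upward bias of the ML estimate on this heavily censored subsample.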

Discussion
Our proposed MMLE approach to first-order bias correction results in improved performance compared to the standard ML estimate in small to medium sample sizes for both complete and type I censored data. The methodology introduced here can also be extended to more sophisticated censoring plans. As an example, consider progressive type I interval censoring (PTIC) [2]. Here, we have n items entering a life experiment at time T_0 = 0. The items are monitored only at m > 0 pre-selected times T_1 < T_2 < … < T_m, with the experiment scheduled to terminate at the last observation time T_m. At each inspection time T_i (i = 1, …, m), the number of failures Y_i for the time interval (T_{i−1}, T_i] is recorded and R_i surviving items are removed from the experiment at random. The number of removed items may be pre-specified as a percentage p_i of the remaining surviving items X_i; that is, R_i = ⌊p_i X_i⌋, where 0 < p_i ≤ 1, ⌊z⌋ is the largest integer less than or equal to z, and p_m = 1, as all surviving items are removed from the experiment at time T_m. Thus, PTIC data may be summarised by the m triplets {Y_i, R_i, T_i}_{i=1}^m. To obtain first-order bias adjusted ML estimates for the Weibull distribution under PTIC, we require the expected Fisher information matrix as well as the expected third-order derivatives of the log-likelihood function. A general expression for the expected Fisher information matrix under PTIC is given by Theorem 3.3 of [21], while Theorem 3.4 of [21] derives the expected third-order derivatives for an arbitrary log-likelihood under PTIC. These two formulas are easily specialised to Weibull distributed survival times.
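The PTIC sampling scheme just described can be sketched in a few lines; the following simulates the m triplets {Y_i, R_i, T_i} for Weibull survival times (the function name and arguments are ours, and this illustrates only the data generating scheme, not the bias-adjusted estimation):

```python
import math, random

def simulate_ptic(n, k, lam, times, probs, seed=0):
    """Generate progressively type I interval censored (PTIC) Weibull data.

    Returns the m triplets (Y_i, R_i, T_i): failures in (T_{i-1}, T_i],
    survivors removed at T_i, and the inspection time.  probs[-1] must be 1
    so that all remaining survivors are removed at the final inspection.
    """
    rng = random.Random(seed)
    # Weibull(k, lam) survival times by inversion: t = lam * (-log(1-u))^(1/k)
    alive = [lam * (-math.log(1.0 - rng.random())) ** (1.0 / k)
             for _ in range(n)]
    out, prev = [], 0.0
    for t_i, p_i in zip(times, probs):
        failed = sum(1 for t in alive if prev < t <= t_i)   # Y_i
        alive = [t for t in alive if t > t_i]               # X_i survivors
        r_i = int(p_i * len(alive))                         # floor(p_i * X_i)
        for t in rng.sample(alive, r_i):                    # random removals
            alive.remove(t)
        out.append((failed, r_i, t_i))
        prev = t_i
    return out
```

Because p_m = 1, every item is eventually either recorded as a failure in some interval or removed at some inspection time, so the Y_i and R_i counts always sum to n.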
A limitation of the Cox and Snell first-order bias adjustment approach in the case of PTIC is that an analytic expression for the bias correction is not easily available, and the estimator must instead be implemented numerically in software. This is because the expected Fisher information matrix and the expected third-order derivatives are somewhat long and cumbersome due to interval censoring and the summation over the m monitoring times. To fit Weibull distributed data under PTIC, we recommend the R package bccp [20], which features a numerical implementation of the Cox and Snell bias adjustment approach under progressive type I and type II interval censoring for a wide range of distributions.

Appendices

A Cox and Snell approximation
Let θ ∈ R^p, where p > 0 is the number of free parameters, which is p = 2 in the case of the Weibull model. Cox and Snell showed that the bias for the s-th element of the ML estimate θ̂_ML can be written as

Bias(θ̂_s) = Σ_{i=1}^p Σ_{j=1}^p Σ_{l=1}^p κ^{s,i} κ^{j,l} ( κ_{ij}^{(l)} − (1/2) κ_{ijl} ) + O(n^{−2}),

for s = 1, …, p, where the cumulants are

κ_{ij} = E[ ∂²ℓ(θ)/∂θ_i ∂θ_j ],  κ_{ijl} = E[ ∂³ℓ(θ)/∂θ_i ∂θ_j ∂θ_l ],  κ_{ij}^{(l)} = ∂κ_{ij}/∂θ_l,

for i, j, l = 1, …, p, and κ^{i,j} is the (i, j)-th entry of the inverse of the expected Fisher information matrix K = {−κ_{ij}}. Following Cordeiro and Klein [4], we can compactly write this in matrix notation as

Bias(θ̂_ML) = K^{−1} A vec(K^{−1}),  (22)

where A = [A^{(1)} | A^{(2)} | … | A^{(p)}] is the (p × p²) matrix with entries

a_{ij}^{(l)} = κ_{ij}^{(l)} − (1/2) κ_{ijl},  (23)

for i, j, l = 1, …, p.
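The matrix form (22)-(23) is straightforward to transcribe for numerical use; a direct sketch (the function name and example matrices are ours):

```python
def cox_snell_bias(K_inv, A):
    """Cordeiro-Klein matrix form of the Cox-Snell bias approximation,
    Bias(theta) = K^{-1} A vec(K^{-1}), for p free parameters.
    K_inv is the p x p inverse expected Fisher information; A is the
    p x p^2 matrix built from the cumulants as in (23)."""
    p = len(K_inv)
    # vec(K^{-1}) stacks the columns of K_inv into a length-p^2 vector
    vec_k = [K_inv[i][j] for j in range(p) for i in range(p)]
    # form K^{-1} A (p x p^2), then multiply by vec(K^{-1})
    ka = [[sum(K_inv[i][m] * A[m][j] for m in range(p)) for j in range(p * p)]
          for i in range(p)]
    return [sum(ka[i][j] * vec_k[j] for j in range(p * p)) for i in range(p)]
```

For the Weibull model, plugging in the closed-form K⁻¹ and A from Theorems 1 and 2 recovers the bias expressions used in the main text.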

B Kullback-Leibler divergence
For the case of complete data, the Kullback-Leibler (KL) divergence between the data generating model Weibull(k_0, λ_0) and the approximating model Weibull(k_1, λ_1) is

KL = log( k_0 / λ_0^{k_0} ) − log( k_1 / λ_1^{k_1} ) + (k_0 − k_1)( log λ_0 − γ/k_0 ) + (λ_0/λ_1)^{k_1} Γ(k_1/k_0 + 1) − 1,

where γ ≈ 0.5772 is the Euler-Mascheroni constant. Assuming type I censoring, the KL divergence between two Weibull models Weibull(k_0, λ_0) and Weibull(k_1, λ_1) admits a closed form that additionally depends on the censoring time c; as noted in the introduction, this result does not appear to be widely known.
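The complete-data divergence is easy to compute directly; a minimal sketch (our function), which vanishes when the two models coincide and reduces to the familiar exponential-distribution divergence when k_0 = k_1 = 1:

```python
import math

EULER_GAMMA = 0.5772156649015329

def kl_weibull(k0, lam0, k1, lam1):
    """KL divergence between Weibull(k0, lam0) and Weibull(k1, lam1)
    for complete data (the closed form used as a simulation metric)."""
    return (math.log(k0 / lam0 ** k0) - math.log(k1 / lam1 ** k1)
            + (k0 - k1) * (math.log(lam0) - EULER_GAMMA / k0)
            + (lam0 / lam1) ** k1 * math.gamma(k1 / k0 + 1.0)
            - 1.0)
```

For k_0 = k_1 = 1 this reduces to log(λ_1/λ_0) + λ_0/λ_1 − 1, the KL divergence between two exponential distributions.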

Table 1: Bias and mean squared error for maximum likelihood (ML), conditional maximum likelihood of Yang and Xie (MLC), profile maximum likelihood of Shen and Yang (MLP) and our bias adjusted maximum likelihood (MMLE) estimates of k*, computed over 10^5 simulations with λ* = 1. Compared to MLC, our proposed estimate yields smaller mean squared error and KL divergence, especially as k increases; the profile ML estimate has a slightly smaller bias than our estimate.

Figure 1: Bias adjustment f(p) for the maximum likelihood estimate of the Weibull distribution shape parameter k as a function of the proportion of uncensored observations p = 1 − exp(−(c/λ)^k).

Table 2: Bias and mean squared error for maximum likelihood (ML), conditional maximum likelihood of Yang and Xie (MLC), profile maximum likelihood of Shen and Yang (MLP) and our bias adjusted maximum likelihood (MMLE) estimates of k*, computed over 10^5 simulations with λ* = 1; p denotes the proportion of uncensored observations.

Table 3: Failure voltages (measured in kV/mm) for two types of electrical cable insulation (type 1 and type 2), with 20 specimens of each. Assuming that the failure voltages can be modelled adequately by the Weibull distribution, the ML estimates of the shape and scale parameters for type 1 cables are k̂_ML = 9.38 and λ̂_ML = 47.78, respectively, and for type 2 cables, the ML estimates are k̂_ML = 9.14 and λ̂_ML = 59.12. The newly proposed MMLE estimates of the shape parameter for type 1 and type 2 cables are easily obtained from the corresponding ML estimates using (12).