On empirical cumulative residual entropy and a goodness-of-fit test for exponentiality

The cumulative residual entropy (CRE) is a new measure of information and an alternative to the Shannon differential entropy in which the density function is replaced by the survival function. This new measure overcomes deficiencies of the differential entropy while extending the Shannon entropy from the discrete case to the continuous case. Some properties of the cumulative residual entropy, its estimation and its applications have been studied by many researchers. The objective of this paper is twofold. In the first part, we give a central limit theorem for the empirical cumulative residual entropy based on a right-censored random sample from an unknown distribution. In the second part, we use the CRE of the comparison distribution function to propose a goodness-of-fit test for the exponential distribution. The performance of the test statistic is evaluated using a simulation study. Finally, some numerical examples illustrating the theory are also given.

outcome. Shannon (1948) was able to formulate the measurement of the uncertainty contained in a single event. The uncertainty contained in a discrete random variable is then taken to be the weighted average of the uncertainty of each single event. Formally, for a discrete random variable $X$ with probability mass function $p(x) = P(X = x)$, the Shannon entropy is defined as

$$H(X) = -\sum_{x} p(x)\log p(x). \tag{1}$$

Note that $H(X) = -E[\log p(X)]$, and an immediate extension leads us to its continuous analog, called the differential entropy. That is, for a non-negative continuous random variable with density function $f(x)$, the differential entropy, which we denote by $H_c(X)$, is defined as

$$H_c(X) = -\int_0^\infty f(x)\log f(x)\,dx. \tag{2}$$

Nevertheless, it is well known (cf. Di Crescenzo and Longobardi (2009) and references therein) that this extension does not preserve some basic properties of an information measure; for instance, the differential entropy can take on negative values. Recently, among various attempts to define alternative information theoretic measures, Rao et al. (2004) proposed the cumulative residual entropy (CRE) and studied its properties. This measure replaces the density function by the survival function. For a non-negative random variable $X$ with distribution function $F$ and survival function $\bar F = 1 - F$, the CRE is defined as follows:

$$\mathcal{E}(X) = -\int_0^\infty \bar F(x)\log \bar F(x)\,dx. \tag{3}$$

Properties of the CRE can be found in Rao (2005), Di Crescenzo and Longobardi (2009), and Navarro et al. (2010). Di Crescenzo and Longobardi (2009) also introduced and studied the cumulative entropy, denoted by $CE(X)$, as an analog of the CRE obtained by using the distribution function in (3) instead of the survival function. That is,

$$CE(X) = -\int_0^\infty F(x)\log F(x)\,dx. \tag{4}$$

Asadi and Zohrevand (2007) considered the dynamic version of the CRE corresponding to the residual lifetime variable. Applications of the CRE to image alignment and to measuring similarity between images can be found in Wang and Vemuri (2007) and references therein.
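The contrast between the differential entropy and the CRE can be checked numerically. The sketch below is only an illustration: the helper names (`integrate`, `cre`, `differential_entropy`), the midpoint rule, and the truncation of the exponential integral at 60 times the mean are choices made here, not part of the paper. It evaluates both measures for a Uniform(0, a) variable with a = 0.5, where the differential entropy equals log a and is negative while the CRE equals a/4, and the CRE of an exponential variable, which equals its mean.

```python
import math

def integrate(g, a, b, n=20_000):
    """Composite midpoint rule for g on [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def xlogx(p):
    """p * log(p), with the convention 0 * log(0) = 0."""
    return p * math.log(p) if p > 0.0 else 0.0

def cre(survival, upper):
    """CRE: minus the integral of S(x) log S(x) over [0, upper]."""
    return -integrate(lambda x: xlogx(survival(x)), 0.0, upper)

def differential_entropy(density, upper):
    """Differential entropy: minus the integral of f(x) log f(x) over [0, upper]."""
    return -integrate(lambda x: xlogx(density(x)), 0.0, upper)

a = 0.5  # Uniform(0, a): density 1/a, survival function 1 - x/a on [0, a]
h_unif = differential_entropy(lambda x: 1.0 / a, a)  # analytically log(a), negative for a < 1
cre_unif = cre(lambda x: 1.0 - x / a, a)             # analytically a / 4

lam = 2.0  # exponential with mean lam; the integral is truncated at 60 * lam
cre_exp = cre(lambda x: math.exp(-x / lam), 60.0 * lam)  # analytically lam

print(h_unif, cre_unif, cre_exp)
```

The uniform case shows the deficiency mentioned above: the differential entropy is negative, whereas the CRE stays non-negative.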
Due to the extensive applications of various information criteria in studying biological and engineering systems, it is incumbent on practitioners to estimate the CRE when no prior information is available on the underlying distribution of $X$. Rao et al. (2004) considered the empirical CRE as a plug-in estimator of the CRE, obtained by replacing the survival function by the empirical survival function, and showed that it is a strongly consistent estimator of the CRE. Similarly, Di Crescenzo and Longobardi (2009) used the empirical cumulative entropy to estimate the cumulative entropy. They also proved the strong consistency of the empirical cumulative entropy and provided a central limit theorem based on a random sample from an exponential distribution. It is also worth mentioning that similar results have been obtained for other information measures. For instance, Abraham and Sankaran (2006) introduced and studied Renyi's information measure for residual lifetime distributions. Maya et al. (2013) proposed several nonparametric estimators of Renyi's information measure for the residual lifetime distribution based on complete and censored data and established their asymptotic properties under suitable regularity conditions.

Let $X_1, \ldots, X_n$ be independent positive random variables with continuous distribution function $F(t)$, survival function $\bar F(t) = 1 - F(t)$, and cumulative hazard function $\Lambda(t) = -\log \bar F(t)$. Assume that the $X_i$ are censored on the right by independent and identically distributed positive random variables $T_i$ (with survival function $C(x)$), which are also independent of the $X_i$. Define $Z_i = \min\{X_i, T_i\}$ and $\delta_i = 1$ or $0$ according to whether $X_i \le T_i$ or $X_i > T_i$, respectively. Then the available data are $\{(Z_1, \delta_1), \ldots, (Z_n, \delta_n)\}$. A well-known estimate of $\bar F$ is the Kaplan-Meier estimator $\hat{\bar F}$ (Kaplan and Meier 1958), which is given by

$$\hat{\bar F}(t) = \prod_{i:\,Z_{(i)} \le t} \left(\frac{n-i}{n-i+1}\right)^{\delta_{(i)}}, \tag{5}$$

where $Z_{(1)} \le \cdots \le Z_{(n)}$ are the ordered $Z_i$ and $\delta_{(i)}$ is the indicator attached to $Z_{(i)}$. In this paper, we replace $\bar F$ and $\Lambda$ with their corresponding Kaplan-Meier and Nelson-Aalen estimators, respectively.
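A minimal sketch of the two estimators named above may help fix ideas. It uses the standard product-limit and Nelson-Aalen increments (deaths divided by the number at risk at each uncensored time); the function name `km_and_na` and the toy data are hypothetical and chosen only for illustration.

```python
def km_and_na(z, delta):
    """Kaplan-Meier survival and Nelson-Aalen cumulative hazard at each event time.

    z     : observed times Z_i = min(X_i, T_i)
    delta : 1 if the observation is uncensored, 0 if censored
    Returns a list of (time, KM survival, NA cumulative hazard) tuples,
    one per distinct uncensored time, with values taken just after the jump.
    """
    event_times = sorted(set(t for t, d in zip(z, delta) if d == 1))
    surv, cumhaz = 1.0, 0.0
    out = []
    for t in event_times:
        at_risk = sum(1 for zi in z if zi >= t)
        deaths = sum(1 for zi, di in zip(z, delta) if zi == t and di == 1)
        surv *= 1.0 - deaths / at_risk   # product-limit update
        cumhaz += deaths / at_risk       # Nelson-Aalen update
        out.append((t, surv, cumhaz))
    return out

# Toy data: five subjects, the second and fifth are censored.
steps = km_and_na([1, 2, 3, 4, 5], [1, 0, 1, 1, 0])
for t, s, h in steps:
    print(t, round(s, 4), round(h, 4))
```

Both estimators are step functions that jump only at uncensored observations, which is what makes the plug-in integrals used later easy to evaluate.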
Observe that $\mathcal{E}(X)$ can also be written as

$$\mathcal{E}(X) = \int_0^\tau \bar F(x)\Lambda(x)\,dx, \tag{6}$$

where $\tau = \sup\{x : \bar F(x) > 0\}$, and due to this, we propose the following estimator of the CRE:

$$\mathcal{E}(\hat F) = \int_0^\tau \hat{\bar F}(x)\hat\Lambda(x)\,dx. \tag{7}$$

In this paper, we will prove that this estimator is consistent and that its asymptotic distribution is normal.

Testing for exponentiality has attracted a great deal of recent statistical research and is of some importance in statistical inference. The tests are usually constructed by using characterization results from reliability theory and also by using different information measures, such as similarity or discrimination measures, for comparing distribution functions (cf. Baringhaus and Henze 2000; Baratpour and Habibi Rad 2012, and references therein). Let $X$ and $Y$ be non-negative random variables with distribution functions $F$ and $G$, respectively. To compare $X$ and $Y$, the comparison distribution function (Parzen 1998) is defined as $D(u) = F(G^{-1}(u))$, for $0 \le u \le 1$ (note that if $G = F$, then $D(u)$ is the cumulative distribution function of the uniform distribution). Our test statistic is motivated by considering the CRE for the comparison distribution function, that is

$$C(X, Y) = -\int_0^1 \bigl(1 - D(u)\bigr)\log\bigl(1 - D(u)\bigr)\,du. \tag{8}$$

If $Y$ is a non-negative random variable with distribution function $G$ and $X$ is exponentially distributed with mean $\lambda$, then

$$C(\exp, Y) = \frac{1}{\lambda}\int_0^\infty y\,e^{-y/\lambda}\,dG(y) \tag{9}$$

compares the distribution function $G$ with the exponential distribution. If $Y$ is also exponentially distributed with mean $\lambda$, then $C(\exp, Y) = \frac{1}{4}$, which is indeed the value of the CRE for a standard uniform random variable. Viewing the difference $C(\exp, Y) - \frac{1}{4}$ as a measure of the deviation of the distribution of $Y$ from the exponential distribution, we give another goodness-of-fit test for the exponential distribution.
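The value $\frac{1}{4}$ can be verified directly. The short derivation below assumes that the comparison CRE is the CRE of definition (3) applied to $D$ on $[0, 1]$, that is, with $1 - D(u)$ in the role of the survival function, and that $X$ is exponential with mean $\lambda$, so that $1 - D(u) = e^{-G^{-1}(u)/\lambda}$.

```latex
\begin{aligned}
C(\exp, Y)
  &= -\int_0^1 \bigl(1 - D(u)\bigr)\log\bigl(1 - D(u)\bigr)\,du
   = \frac{1}{\lambda}\int_0^1 G^{-1}(u)\,e^{-G^{-1}(u)/\lambda}\,du \\
  &= E\!\left[\frac{Y}{\lambda}\,e^{-Y/\lambda}\right], \\[4pt]
\text{and, if } Y \text{ is exponential with mean } \lambda:\quad
C(\exp, Y)
  &= \int_0^\infty t\,e^{-t}\,e^{-t}\,dt
   = \int_0^\infty t\,e^{-2t}\,dt
   = \frac{1}{4}.
\end{aligned}
```

The same value is obtained for a standard uniform variable, since $-\int_0^1 (1-u)\log(1-u)\,du = \frac{1}{4}$.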
The rest of the paper is organized as follows. In Sect. 2 we give the large sample properties of the empirical CRE. In Sect. 3 we apply the comparison CRE to construct a goodness-of-fit test for the exponential distribution. Section 4 is devoted to the simulation results and a couple of numerical examples and finally, some concluding remarks are given in Sect. 5.

Asymptotic properties of $\mathcal{E}(\hat F)$
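In this section, we investigate the consistency and asymptotic normality of $\mathcal{E}(\hat F)$. We first recall some notation from standard counting process methods. Let

$$N(t) = \sum_{i=1}^n I(Z_i \le t,\ \delta_i = 1)$$

be the number of failures or deaths up to time $t$, i.e., the number of uncensored observations up to $t$, and let

$$Y(t) = \sum_{i=1}^n I(Z_i \ge t)$$

be the at-risk process. The Nelson-Aalen estimator of the cumulative hazard function is given by

$$\hat\Lambda(t) = \int_0^t \frac{dN(s)}{Y(s)}$$

(cf. Kalbfleisch and Prentice 2002, p. 168), and $M(t) = N(t) - \int_0^t Y(s)\,d\Lambda(s)$ is a square integrable martingale with respect to the natural filtration.

Theorem 2.1 Let $y(t) = \bar F(t)C(t)$. Then, as $n \to \infty$, $\mathcal{E}(\hat F) \xrightarrow{p} \mathcal{E}(X)$ and $\sqrt{n}\,\bigl(\mathcal{E}(\hat F) - \mathcal{E}(X)\bigr)$ converges in distribution to a Gaussian random variable $Z$ with mean zero and variance $\sigma^2$, where $\xrightarrow{p}$ represents convergence in probability.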
Proof First, one can easily show by using the Glivenko-Cantelli theorem that $Y(t)/n$ converges to $y(t)$ uniformly in $t$. This and Rebolledo's theorem (see Kalbfleisch and Prentice 2002, pp. 166-168) imply that $\sqrt{n}\,(\hat\Lambda(t) - \Lambda(t))$ converges to a Gaussian random variable with mean zero and variance $v(t) = \int_0^t d\Lambda(s)/y(s)$. The result now follows from the one-dimensional case of Theorem 3.1 in Sengupta et al. (1998), by replacing $K(t)$ and $X(t)$ by $\hat{\bar F}(t)$, the Kaplan-Meier estimator of $\bar F$, and $\sqrt{n}\,(\hat\Lambda(t) - \Lambda(t))$, respectively. By the standard counting process method, an estimator $\hat\sigma^2$ of $\sigma^2$ can also be obtained; in its expression, $t \wedge u$ stands for $\min\{t, u\}$.
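As an informal plausibility check on the shape of the limiting variance, the following heuristic delta-method sketch may be useful. It is not the argument of Sengupta et al. (1998) and is not claimed to reproduce the expression stated in the theorem. It assumes $\bar F = e^{-\Lambda}$, so that $\hat{\bar F} - \bar F \approx -\bar F\,(\hat\Lambda - \Lambda)$, and it assumes that $\sqrt{n}\,(\hat\Lambda - \Lambda)$ behaves like a mean-zero Gaussian process $W$ with $\operatorname{Cov}(W(t), W(u)) = v(t \wedge u)$.

```latex
\begin{aligned}
\sqrt{n}\,\bigl(\hat{\bar F}(x)\hat\Lambda(x) - \bar F(x)\Lambda(x)\bigr)
  &\approx \bar F(x)\,\bigl(1 - \Lambda(x)\bigr)\,W(x), \\
\sqrt{n}\,\bigl(\mathcal{E}(\hat F) - \mathcal{E}(X)\bigr)
  &\approx \int_0^\tau \bar F(x)\,\bigl(1 - \Lambda(x)\bigr)\,W(x)\,dx, \\
\operatorname{Var}\!\left[\int_0^\tau \bar F(x)\,\bigl(1 - \Lambda(x)\bigr)\,W(x)\,dx\right]
  &= \int_0^\tau\!\!\int_0^\tau
     \bar F(t)\,\bar F(u)\,\bigl(1 - \Lambda(t)\bigr)\bigl(1 - \Lambda(u)\bigr)\,
     v(t \wedge u)\,dt\,du .
\end{aligned}
```

Under these assumptions the variance is a double integral against $v(t \wedge u)$, which is consistent with the appearance of $t \wedge u$ in the variance estimator.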
Remark 2.2 In the censored sample case, an analogous estimator of the cumulative entropy can also be given by

$$CE(\hat F) = -\int_0^\tau \hat F(x)\log \hat F(x)\,dx,$$

where $\hat F = 1 - \hat{\bar F}$. By applying the same method, one can easily conclude that, as $n \to \infty$, $CE(\hat F)$ converges in probability to $CE(X)$. Furthermore, $\sqrt{n}\,\bigl(CE(\hat F) - CE(X)\bigr)$ converges in distribution to a zero mean Gaussian random variable whose variance can be estimated by the same counting process method. This extends the result of Di Crescenzo and Longobardi (2009), who provide a central limit theorem for the empirical cumulative entropy based on random samples from the exponential distribution.
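To make the plug-in estimator of the CRE concrete, the sketch below integrates the product of the Kaplan-Meier and Nelson-Aalen step functions over the observed range. Several choices here are assumptions rather than the paper's prescription: the integral is taken up to the largest observed time, ties are assumed absent (as is natural for continuous data), and the function name `empirical_cre` is hypothetical.

```python
def empirical_cre(z, delta):
    """Plug-in CRE estimate: integral of KM(x) * NA(x) over [0, max(z)].

    z     : observed (possibly censored) times, assumed to have no ties
    delta : 1 for an uncensored observation, 0 for a censored one
    Both estimators are right-continuous step functions, so the integral
    is a finite sum over the gaps between consecutive ordered observations.
    """
    data = sorted(zip(z, delta))
    n = len(data)
    surv, cumhaz = 1.0, 0.0   # KM survival and NA cumulative hazard
    prev, total = 0.0, 0.0
    for i, (t, d) in enumerate(data):
        total += surv * cumhaz * (t - prev)  # values held on [prev, t)
        if d == 1:
            at_risk = n - i                  # number of observations with Z >= t
            surv *= 1.0 - 1.0 / at_risk
            cumhaz += 1.0 / at_risk
        prev = t
    return total

# Toy data: five subjects, the second and fifth are censored.
estimate = empirical_cre([1, 2, 3, 4, 5], [1, 0, 1, 1, 0])
print(estimate)
```

For the toy data the three non-zero contributions are 0.8 x 0.2 x 2, (8/15) x (8/15) x 1 and (4/15) x (31/30) x 1, which sum to 0.88.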

A goodness-of-fit test for the exponential distribution
Let $X_1, X_2, \ldots, X_n$ be a random sample from the population of a non-negative random variable $X$ with continuous distribution function $F$. In this section, we apply the measure (8) to construct a test statistic for testing the hypothesis $H_0: F(x) = 1 - e^{-x/\lambda}$ versus the alternative $H_a: F(x) \ne 1 - e^{-x/\lambda}$. Under the null hypothesis, $C(\exp, X) = \frac{1}{4}$ (which is indeed the value of the CRE for a standard uniform random variable), so a large or small value of the difference $C(\exp, X) - \frac{1}{4}$ leads us to reject the null hypothesis in favor of the alternative $H_a$. Using the standard U-statistic theory (cf. Lee 1990), we propose the following statistic $C_n$, an estimator of $C(\exp, X)$, as our test statistic:

$$C_n = \frac{1}{n}\sum_{i=1}^n \frac{X_i}{\bar X}\,e^{-X_i/\bar X},$$

where $\bar X = \frac{1}{n}\sum_{i=1}^n X_i$. The following theorem gives the asymptotic distribution of the test statistic.
On the other hand, from the standard U-statistic theory (cf. Lehmann 1999, p. 369), under the null hypothesis we have λ . Now the result immediately follows from Theorem 2.13 in Randles (1982).
We reject H 0 in favor of H a at the significance level α if 382n percentile of the standard normal distribution. In the next section, we use Monte Carlo simulation to compare the power of our test statistic with that of some other statistics for testing the fit of the exponential distribution to a random sample.
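The test can also be illustrated without relying on the normal approximation. The sketch below assumes the plug-in form $C_n = \frac{1}{n}\sum_i (X_i/\bar X)\,e^{-X_i/\bar X}$, which is one natural estimator of $C(\exp, X)$ with $\lambda$ replaced by $\bar X$. Because this statistic is unchanged when all observations are rescaled, its null distribution does not depend on $\lambda$ and can be simulated from standard exponential samples. The function names `c_n` and `mc_p_value` and the sample values are hypothetical.

```python
import math
import random

def c_n(x):
    """Assumed plug-in statistic: mean of (X_i / Xbar) * exp(-X_i / Xbar)."""
    xbar = sum(x) / len(x)
    return sum((xi / xbar) * math.exp(-xi / xbar) for xi in x) / len(x)

def mc_p_value(x, reps=2000, seed=0):
    """Two-sided Monte Carlo p-value for |C_n - 1/4| under exponentiality.

    The null distribution is simulated from Exp(1) samples of the same size,
    which is valid because c_n is scale invariant.
    """
    rng = random.Random(seed)
    n = len(x)
    observed = abs(c_n(x) - 0.25)
    hits = 0
    for _ in range(reps):
        sim = [rng.expovariate(1.0) for _ in range(n)]
        if abs(c_n(sim) - 0.25) >= observed:
            hits += 1
    return (hits + 1) / (reps + 1)

# Hypothetical failure times, used only to show the call pattern.
sample = [0.3, 1.2, 0.7, 2.5, 0.1, 1.9, 0.6, 3.4, 0.9, 1.1]
print(c_n(sample), mc_p_value(sample))
```

A small p-value would indicate that the observed deviation of $C_n$ from $\frac{1}{4}$ is unusual for exponential data of that sample size.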

Simulation study
Recently, Baratpour and Habibi Rad (2012) provided a goodness-of-fit test statistic, $T_n$, based on a discrimination measure arising from a version of the Kullback-Leibler information measure, to test the hypothesis $H_0$ versus the alternative $H_a$. Their test statistic is computed from the order statistics $X_{(i)}$ of the sample, and $H_0$ is rejected at significance level $\alpha$ if $T_n \ge T_{n,1-\alpha}$, where $T_{n,1-\alpha}$ is the $100(1-\alpha)$th percentile of $T_n$ under $H_0$. They also provided a Monte Carlo simulation study comparing the performance of $T_n$ with that of the statistic $W^2$ introduced by Van-Soest (1969), the statistic $S^*$ introduced by Finkelstein and Schafer (1971), and the statistic $KLC_{mn}$, all of which were proposed for testing $H_0$ against $H_a$. In the $KLC_{mn}$ statistic, the window size $m$ is a positive integer smaller than $n/2$, $X_{(j)} = X_{(1)}$ if $j < 1$, and $X_{(j)} = X_{(n)}$ if $j > n$. $H_0$ is rejected for large values of $W^2$ and $S^*$ and for small values of $KLC_{mn}$. We have undertaken a simulation exercise to investigate the performance of our test statistic, comparing it with the above statistics $T_n$, $W^2$, $S^*$, and $KLC_{mn}$. In our simulation, we considered the following distributions, and the empirical powers of the test statistics were compared for each of them.
(i) a Weibull distribution with parameters $\beta$ and $\lambda$,
(ii) a gamma distribution with parameters $\beta$ and $\lambda$,
(iii) a lognormal distribution with parameters $\mu$ and $\sigma^2$,
(iv) an inverse Gaussian distribution with parameters $\mu$ and $\lambda$.

As in Baratpour and Habibi Rad (2012), for each case we set the parameters such that $E(X^2)/(2E(X)) = 1$ for the Weibull distribution, $\lambda = \frac{2}{1+\beta}$ for the gamma distribution, $\sigma^2 = \frac{2}{3}(\ln 2 - \mu)$ for the lognormal distribution and $\lambda = \frac{\mu^2}{2-\mu}$ for the inverse Gaussian distribution. The empirical power was computed for each statistic from a total of 100,000 generated samples of sizes $n = 5, 10, 15, 20, 25$. The power was taken as the fraction of times, out of 100,000, that the corresponding statistic fell in its rejection region. Tables 1, 2, 3 and 4 summarize the results of the simulation for each example. One can see from the tables that the power of every test against each alternative increases with the sample size, which is in line with the consistency of the tests. In general, there is no big difference between the power of the test statistic $C_n$ and that of the other tests, but $C_n$ has the added advantages of a simple form and a known asymptotic distribution.
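The explicit parameter relations for the gamma, lognormal and inverse Gaussian cases can be checked against the moment condition $E(X^2) = 2E(X)$. The sketch below does this with closed-form moments, under parametrizations that are assumed here because the density functions are not reproduced above: gamma with shape $\beta$ and scale $\lambda$, lognormal with log-mean $\mu$ and log-variance $\sigma^2$, and inverse Gaussian with mean $\mu$ and shape $\lambda$. The function names are hypothetical.

```python
import math

def ratio_gamma(beta):
    """E(X^2) / (2 E(X)) for a gamma with shape beta and scale 2 / (1 + beta)."""
    lam = 2.0 / (1.0 + beta)
    m1 = beta * lam
    m2 = beta * (beta + 1.0) * lam ** 2
    return m2 / (2.0 * m1)

def ratio_lognormal(mu):
    """Same ratio for a lognormal with sigma^2 = (2/3)(ln 2 - mu); needs mu < ln 2."""
    s2 = (2.0 / 3.0) * (math.log(2.0) - mu)
    m1 = math.exp(mu + s2 / 2.0)
    m2 = math.exp(2.0 * mu + 2.0 * s2)
    return m2 / (2.0 * m1)

def ratio_inverse_gaussian(mu):
    """Same ratio for an inverse Gaussian with shape mu^2 / (2 - mu); needs mu < 2."""
    lam = mu ** 2 / (2.0 - mu)
    m1 = mu
    m2 = mu ** 2 + mu ** 3 / lam   # variance of the inverse Gaussian is mu^3 / lam
    return m2 / (2.0 * m1)

for value in (0.5, 1.0, 1.5):
    print(ratio_gamma(value), ratio_lognormal(value - 1.0), ratio_inverse_gaussian(value))
```

Under these parametrizations each ratio equals 1 for every admissible parameter value, so all alternatives share the value of $E(X^2)/(2E(X))$ with the exponential distribution of mean 1.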

Data Analysis
In this section, we give a couple of numerical examples based on real-life data sets to illustrate the use of the test statistic $C_n$ for assessing the fit of an exponential distribution to real data.

5
(C 29 − 0.25) gives a P-value of 0.379. Thus, the test does not reject the null hypothesis that the failure times follow an exponential distribution at significance level α = 0.05. Using three other test statistics, Lawless (1982) obtained the same result for the above failure data.

Conclusion
In this paper, we have considered the asymptotic behaviour of the empirical cumulative residual entropy. We showed that the empirical CRE based on a right-censored sample is consistent and, after centering and scaling by $\sqrt{n}$, converges in distribution to a normal random variable. It was also shown that the same result holds for the empirical cumulative entropy, which extends the result of Di Crescenzo and Longobardi (2009). We used the CRE of the comparison distribution function to propose a new goodness-of-fit test for the exponential distribution. A simulation exercise was undertaken to compare the performance of this test statistic with that of four other test statistics, and the results indicated the consistency and competitive power of the proposed test statistic. Finally, the use of the test statistic for testing goodness of fit for the exponential distribution was illustrated with a couple of numerical examples.