Abstract
Zerbet and Nikulin (Commun Statist Theor Meth 32(3): 573–583, 2003) presented a new statistic \({Z}_{k}\) for detecting outliers in the exponential distribution and compared it with Dixon's statistic \({D}_{k}\). Jabbari et al. (Commun Statist Theor Meth 39(4): 698–706, 2010) extended the statistic \({Z}_{k}\) to the gamma distribution. In this paper, we generalize the statistic \({Z}_{k}\) to \({Z}_{k}^{*}\) for detecting outliers in the Rayleigh distribution and compare the results with the generalized Dixon's statistic. The distribution of the statistic \({Z}_{k}^{*}\) under slippage alternatives is obtained. The critical value and power of the new test are also calculated and compared with the critical value of Dixon's statistic. The results show that the test based on the statistic \({Z}_{k}^{*}\) is more powerful than the test based on the statistic \({D}_{k}\).
1 Introduction
Bol'shev (1969) generalized Chauvenet's test for rejecting outlying observations (see Bol'shev 1969; Voinov and Nikulin 1993, 1996). This method is suitable for detecting k outliers in a univariate data set. Chauvenet's test can be used in the exponential case. Ibragimov and Khalna (1978) considered various modifications of this test. Several authors have considered the problem of testing for one outlier in the exponential distribution (Chikkagoudar and Kunchur 1983; Kabe 1970; Lewis and Fiellerm 1979; Likes 1966). Only two types of statistics exist for testing multiple outliers. The first is Dixon's statistic, while the second is based on the ratio of the sum of the observations suspected to be outliers to the sum of all observations in the sample. Most of these authors considered the general case of the gamma model, with the results for the exponential model given as a special case. This approach focuses on alternative models, namely slippage alternatives in exponential samples (see Barnett and Lewis 1978). Zerbet and Nikulin (2003) proposed a statistic, different from the well-known Dixon's statistic \({D}_{k}\), to detect multiple outliers. In this paper, we generalize the statistic \({Z}_{k}\) to \({Z}_{k}^{*}\) for detecting outliers in the Rayleigh distribution. The distribution of the test statistic under slippage alternatives is obtained, and tables of critical values are given for various sample sizes n and numbers of outliers k. The power of these tests is also calculated and compared. The results show that the test based on the statistic \({Z}_{k}^{*}\) is more powerful than the test based on the statistic \({D}_{k}\).
2 Statistical inference
Let \({X}_{1},{X}_{2},\dots ,{X}_{n}\) be arbitrary independent random variables. In this paper, we want to test the hypothesis \({H}_{0}\): \({X}_{1},{X}_{2},\dots ,{X}_{n}\) derive from a Rayleigh distribution.
Therefore, the probability density function of these variables under null hypothesis is:
Under the slippage alternative \({H}_{k}\), we have:
where \(\beta \ge 1\), \(\beta\) is unknown, and \({X}_{(1)},{X}_{(2)},\dots ,{X}_{(n)}\) are the order statistics corresponding to the observations \({X}_{1},{X}_{2},\dots ,{X}_{n}\). This hypothesis can be considered an important sub-hypothesis of the one stating that k of the n observations are suspected to be outliers (for \(\beta >1\), these k observations are called upper outliers). The hypothesis \({H}_{0}\) corresponds to \(\beta =1\). To test \({H}_{0}\), we propose the following statistic:
For \(m=1\), the statistic \({Z}_{k}^{*}\) is the one proposed by Zerbet and Nikulin (2003).
Following the idea of Chauvenet's test, we adopt the following decision criterion: the hypothesis \({H}_{0}\) is rejected when \({Z}_{k}^{*}>{z}_{c}\), where \({z}_{c}={z}_{c}(\alpha )\) is the critical value corresponding to the significance level \(\alpha\).
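The decision rule above needs a critical value \({z}_{c}(\alpha )\). As a rough illustration only, the sketch below estimates such a value by simulation under \({H}_{0}\) and then applies the rule. It rests on several assumptions that are not taken from the paper: the Rayleigh scale parametrization (`sigma`), the function names, and in particular `toy_statistic`, a hypothetical scale-free placeholder that is not claimed to equal \({Z}_{k}^{*}\).

```python
import math
import random


def rayleigh_sample(n, sigma, rng):
    """Draw n Rayleigh(sigma) variates by inverse-CDF sampling (assumed parametrization)."""
    return [sigma * math.sqrt(-2.0 * math.log(1.0 - rng.random())) for _ in range(n)]


def toy_statistic(sample, k):
    """Hypothetical placeholder statistic, NOT the paper's Z_k^*.

    Ratio of the excess of the k largest squared observations over X_(n-k)^2
    to the total excess of all squared observations over X_(1)^2.
    """
    sq = sorted(x * x for x in sample)  # sq[j-1] = X_(j)^2
    n = len(sq)
    top_excess = sum(sq[j] - sq[n - k - 1] for j in range(n - k, n))
    total_excess = sum(s - sq[0] for s in sq)
    return top_excess / total_excess


def mc_critical_value(stat, n, k, alpha, reps=20000, seed=0):
    """Empirical (1 - alpha) quantile of stat under H0, estimated by simulation."""
    rng = random.Random(seed)
    values = sorted(stat(rayleigh_sample(n, 1.0, rng), k) for _ in range(reps))
    index = min(reps - 1, math.ceil((1.0 - alpha) * reps) - 1)
    return values[index]


def reject_h0(stat, sample, k, z_c):
    """Decision rule from the text: reject H0 when the statistic exceeds z_c."""
    return stat(sample, k) > z_c
```

Because the placeholder is a ratio of differences of squared order statistics, it does not change when the whole sample is rescaled, so simulating with `sigma = 1.0` is enough under this assumed parametrization. Any other statistic with the same call signature could be passed in place of `toy_statistic`.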
3 The distribution of the statistics \({Z}_{k}^{*}\) under alternatives
In this section, we find the distribution of the statistic \({Z}_{k}^{*}\), following the method of Zerbet and Nikulin (2003). The distribution of this statistic under the slippage alternative hypothesis \({H}_{k}\) is given by the following theorem.
Theorem 3.1.
The distribution of the statistic \({Z}_{k}^{*}\), under \({H}_{k}\) is given by:
where
Proof.
To prove this theorem, we must obtain the distribution of the statistic \({Z}_{k}^{*}\) under the alternative hypothesis \({H}_{k}\).
We first compute the corresponding alternative distribution of the statistic:
where \(V={X}_{(n-k)}^{2}-{X}_{(1)}^{2}\) and \(W=\sum_{j=n-k+1}^{n}({X}_{(j)}^{2}-{X}_{(n-k)}^{2})\).
Let \({Y}_{j}={X}_{(j)}^{2}-{X}_{(j-1)}^{2}\). We obviously obtain that:
\(\sum_{j=2}^{n-k}{Y}_{j}={X}_{(n-k)}^{2}-{X}_{(1)}^{2}\) and \(\sum_{j=n-k+1}^{n}(n-j+1){Y}_{j}=\sum_{j=n-k+1}^{n}({X}_{(j)}^{2}-{X}_{(n-k)}^{2})\). Then,
The characteristic function of \((V,W)\) is
Knowing that \({Y}_{j}, j=1,2,\dots ,n-k\), follow the Rayleigh distribution with parameters 1 and \(\theta {(k\beta +n-k-j+1)}^{-1}\), and that \({Y}_{n-k+j}, j=1,2,\dots ,k\), have the same distribution but with parameters 1 and \(\left(\frac{\theta }{\beta }\right){(k-j+1)}^{-1}\) (see Chikkagoudar and Kunchur 1983), the characteristic function \({\phi }_{(V,W)}\) is
Therefore we have,
with \({\alpha }_{j}=\theta {(k\beta +n-k-j+1)}^{-1}\) and \({b}_{j}=\left(\frac{\theta }{\beta }\right){(k-j+1)}^{-1}\). Therefore, the joint density function of \((V,W)\) can be obtained as follows:
To find the joint probability density function of the variables \((V,W)\), we first calculate the following products:
and
Substituting Eqs. (2)–(5) into Eq. (1), the joint pdf of the variables \((V,W)\) is as follows:
In the process of finding the joint pdf of the variables \((V,W)\), we use the fact that
Consequently, the pdf of \({U}_{k}\) is
Then,
Then the distribution function of \({Z}_{k}^{*}\) can be found from (1) using the relation
and the proof is complete.
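The proof rewrites \(V\) and \(W\) as sums of the spacings \({Y}_{j}\) of the squared order statistics. The sketch below checks those two identities numerically on arbitrary positive data. The function name and the test data are illustrative assumptions, and the identities hold for any sample, not only Rayleigh ones.

```python
import random


def spacing_identities(sample, k):
    """Return (V, sum of Y_j for j=2..n-k, W, weighted sum of Y_j for j=n-k+1..n)."""
    sq = sorted(x * x for x in sample)  # sq[j-1] = X_(j)^2
    n = len(sq)
    # Y_j = X_(j)^2 - X_(j-1)^2 for j = 2..n, keyed by the 1-based index j
    y = {j: sq[j - 1] - sq[j - 2] for j in range(2, n + 1)}
    v = sq[n - k - 1] - sq[0]
    w = sum(sq[j - 1] - sq[n - k - 1] for j in range(n - k + 1, n + 1))
    v_from_y = sum(y[j] for j in range(2, n - k + 1))
    w_from_y = sum((n - j + 1) * y[j] for j in range(n - k + 1, n + 1))
    return v, v_from_y, w, w_from_y


rng = random.Random(42)
sample = [rng.uniform(0.1, 5.0) for _ in range(10)]  # arbitrary positive data
v, v_from_y, w, w_from_y = spacing_identities(sample, k=3)
print(abs(v - v_from_y) < 1e-9, abs(w - w_from_y) < 1e-9)
```

The weight \((n-j+1)\) appears because each spacing \({Y}_{j}\) with \(j>n-k\) is counted once for every order statistic from \({X}_{(j)}\) up to \({X}_{(n)}\).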
Corollary.
Under \({H}_{0}\), the distribution of the statistic \({Z}_{k}^{*}\) is obtained from Theorem 3.1 by setting \(\beta =1\).
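For the null case \(\beta =1\), a standard fact about exponential spacings can serve as a sanity check, stated here independently of the parametrization used in the proof: if \(X\) is Rayleigh with scale \(\sigma\), then \({X}^{2}\) is exponential with mean \(2{\sigma }^{2}\), and the normalized spacings \((n-j+1)({X}_{(j)}^{2}-{X}_{(j-1)}^{2})\) are independent exponentials with that same mean. The sketch below checks the means by simulation. The scale parametrization `sigma` and the function name are assumptions, not taken from the paper.

```python
import math
import random


def normalized_spacing_means(n, sigma, reps, seed=0):
    """Monte Carlo mean of (n-j+1) * (X_(j)^2 - X_(j-1)^2) for j = 2..n under H0."""
    rng = random.Random(seed)
    totals = [0.0] * (n - 1)
    for _ in range(reps):
        # Squared Rayleigh(sigma) draws via inverse CDF: X^2 = -2 sigma^2 log(1 - U)
        sq = sorted(-2.0 * sigma ** 2 * math.log(1.0 - rng.random()) for _ in range(n))
        for j in range(2, n + 1):
            totals[j - 2] += (n - j + 1) * (sq[j - 1] - sq[j - 2])
    return [t / reps for t in totals]


means = normalized_spacing_means(n=8, sigma=1.0, reps=20000)
print([round(m, 2) for m in means])  # each entry should be close to 2 * sigma**2
```

This only illustrates the null case. It does not simulate the slippage alternative \({H}_{k}\), whose exact form is given by the paper's own equations.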
4 Power comparison of the tests and conclusions
The critical values of the statistics \({Z}_{k}^{*}\) and \({D}_{k}\), for the significance levels \(\alpha =0.05\) and \(\alpha =0.1\), for \(k=1,2,\dots\) such that \(k<n\), and \(n=8\left(1\right)12\), are given in Tables 1 and 2, respectively. Dixon's statistic \({D}_{k}\) is given by
For more details about the distribution of Dixon's statistic, see Likes (1966) and Chikkagoudar and Kunchur (1983).
According to Tables 1 and 2, the critical value of \({Z}_{k}^{*}\) increases as n increases, whereas the critical value of \({D}_{k}\) decreases as n increases.
Also, the critical value of \({Z}_{k}^{*}\) decreases as k increases, whereas the critical value of \({D}_{k}\) increases as k increases.
References
Barnett V, Lewis T (1978) Outliers in statistical data. John Wiley and Sons Inc, New York
Bol'shev LN (1969) On tests for rejecting outlying observations. Trudy Inta prikladnoi Mat. Tblissi Gosudart univ 2:159–177 (In Russian)
Bol'shev LN, Ubaidullaeva M (1974) Chauvenet's test in the classical theory of errors. Theory Prob Appl 19:683–692
Chikkagoudar MS, Kunchur SH (1983) Distribution of test statistics for multiple outliers in exponential samples. Commun Statist Theor Meth 12:2127–2142
Greenwood PE, Nikulin MS (1996) A guide to chi-squared testing. John Wiley and Sons Inc, New York
Ibragimov IA, Khalna (1978) Some asymptotic results concerning the Chauvenet test. Teor Veroyatnost i Primenen 23(3):593–597
Jabbari Nooghabi M, Jabbari Nooghabi H, Nasiri P (2010) Detecting outliers in gamma distribution. Commun Statist Theor Meth 39(4):698–706
Kabe DG (1970) Testing outliers from an exponential population. Metrika 15:15–18
Laurent AG (1963) Conditional distribution of order statistics and distribution of the reduced ith order statistic of the exponential model. Ann Math Statist 34:652–657
Lewis T, Fiellerm NRJ (1979) A recursive algorithm for null distribution for outliers: I. Gamma samples. Technometrics 21:371–376
Likes J (1966) Distribution of Dixon's statistics in the case of an exponential population. Metrika 11:46–54
Voinov VG, Nikulin MN (1993) Unbiased estimators and their applications, 1. Kluwer Academic Publishers, Dordrecht
Voinov VG, Nikulin MN (1996) Unbiased estimators and their applications, 2. Kluwer Academic Publishers, Dordrecht
Zerbet A, Nikulin MN (2003) A new statistic for detecting outliers in exponential case. Commun Statist Theor Meth 32(3):573–583
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Deiri, E. A new statistic for detecting outliers in Rayligh distribution. Life Cycle Reliab Saf Eng 10, 135–138 (2021). https://doi.org/10.1007/s41872-020-00150-z