Exact distributions of tests of outliers for exponential samples

Kumar, Nirpeksh

doi:10.1007/s00362-017-0908-6

Regular Article
Published: 29 April 2017

Volume 60, pages 2031–2061, (2019)

Statistical Papers

Nirpeksh Kumar¹

Abstract

In this paper, we propose an algorithm to derive the exact distributions of discordancy tests for exponential samples under the slippage alternative, provided that their survival functions involve linear combinations of independent and identically distributed exponential random variables with arbitrary real coefficients. In addition, we define various performance measures in terms of the conditional probabilities that the observed value of the test statistic exceeds the critical value, given that the contaminants occupy specific positions in the ordered sample. This makes it possible to calculate the performance measures of discordancy tests for exponential samples to any desired degree of accuracy. For illustration, we derive the distributions of the maximum likelihood ratio tests for single and multiple outliers in exponential samples and calculate their performance measures accurately to six decimal places. Moreover, the definitions of the performance criteria are not restricted to discordancy tests for exponential samples; they apply equally to discordancy tests for samples from other distributions.

References

Balasooriya U, Gadag V (1994) Tests for upper outliers in the two-parameter exponential distribution. J Stat Comput Simul 50:249–259
Barnett VA, Lewis T (1994) Outliers in statistical data. Wiley, Chichester
Chikkagoudar MS, Kunchur SM (1983) Distributions of test statistics for multiple outliers in exponential samples. Commun Stat 12:2127–2142
Chikkagoudar MS, Kunchur SM (1987) Comparison of many outlier procedures for exponential samples. Commun Stat 16:627–645
Dixon WJ (1950) Analysis of extreme values. Ann Math Stat 21:488–506
Dumitrescu MEB, Enchescu DN, Hristea FT (1994) On the performances of an outlier test in the case of the exponential distribution. Comput Stat Data Anal 17(2):119–127
Fieller N (2014) Multivariate outliers. Wiley, New York
Fisher RA (1929) Tests of significance in harmonic analysis. Proc R Soc Lond Ser A 125:54–59
Fung KY, Paul SR (1985) Comparison of outlier detection procedures in Weibull or extreme-value distribution. Commun Stat 14:895–917
Giraudeau B, Chastang C (1999) Two discordancy tests for location slippage and dispersion slippage outlier detection in agreement studies. Statistician 48:517–527
Hadi AS (1994) A modification of a method for the detection of outliers in multivariate samples. J R Stat Soc Ser B 56:393–396
Hawkins DM (1972) Analysis of a slippage test for the chi squared distribution. S Afr Stat J 6:11–17
Hawkins DM (1980) Identification of outliers. Chapman and Hall, London
Hayes K, Kinsella T (2003) Spurious and non-spurious power in performance criteria for tests of discordancy. J R Stat Soc Ser D (Stat) 52:69–82
Huffer F (1988) Divided differences and the joint distribution of linear combinations of spacings. J Appl Probab 25:346–354
Huffer FW, Lin CT (2001) Computing the joint distribution of general linear combinations of spacings or exponential variates. Stat Sin 11:1141–1157
Huffer FW, Lin CT (2006) Linear combinations of spacings. In: Kotz S, Balakrishnan N, Read CB, Vidakovic B (eds) Encyclopedia of statistical sciences, Wiley, Hoboken, vol 12, pp 7866–7875
Jain RB, Pingel LA (1981) A procedure for estimating the number of outliers. Commun Stat 10:1029–1041
Johnson NL, Kotz S, Balakrishnan N (1994) Continuous univariate distributions. Wiley, New York
Kabe DG (1970) Testing outliers from an exponential population. Metrika 15:15–18
Kimber AC (1982) Tests for many outliers in an exponential sample. Appl Stat 31:263–271
Kimber AC (1988) Testing upper and lower outlier pairs in gamma samples. Commun Stat Simul 17:1055–1072
Kimber AC, Stevens HJ (1981) The null distribution of a test for two upper outliers in an exponential sample. Appl Stat 30:153–157
Kumar N (2013) A procedure for testing suspected observations. Stat Pap 54:471–478
Kumar N (2015) Testing of suspected observations in an exponential sample with unknown origin. Commun Stat 44:3668–3679
Kumar N, Lin CT (2017) Testing for multiple upper and lower outliers in an exponential sample. J Stat Comput Simul 87:870–881
Lalitha S, Kumar N (2012) Multiple outlier test for upper outliers in an exponential sample. J Appl Stat 39:1323–1330
Lewis T, Fieller NRJ (1979) A recursive algorithm for null distribution for outliers: I. gamma samples. Technometrics 21:371–376
Lin CT, Balakrishnan N (2009) Exact computation of the null distribution of a test for multiple outliers in an exponential sample. Comput Stat Data Anal 53:3281–3290
Lin CT, Wang SC (2015) Discordancy tests for two-parameter exponential samples. Stat Pap 56:569–582
Meeker WQ, Escobar LA (1998) Statistical methods for reliability data. Wiley, New York
Prescott P (1979) Critical values for a sequential test for many outliers. Appl Stat 28:36–39
Rosner B (1975) On the detection of many outliers. Technometrics 17:221–227
Zerbet A, Nikulin M (2003) A new statistic for detecting outliers in exponential case. Commun Stat 32:573–583
Zhang J (1998) Tests for multiple upper or lower outliers in an exponential sample. J Appl Stat 25:245–255


Acknowledgements

The author would like to thank two anonymous reviewers and the editor for their helpful and constructive comments that have improved the paper.

Author information

Authors and Affiliations

Department of Statistics, Banaras Hindu University, Varanasi, India
Nirpeksh Kumar

Corresponding author

Correspondence to Nirpeksh Kumar.

Appendices

Exact distribution of MLR test for testing a single outlier for the sample size $n=3$

Consider a single outlier problem with $n=3$. The probabilities $\Pr [T>d|B(r)]$ ($r=1,2,3$) in (2) can be written as

$$\begin{aligned} \Pr [T>d|B(r)] =\Pr \left( \frac{1-3d}{u_1^r}Z_1+\frac{1-2d}{u_2^r}Z_2+\frac{1-d}{u_3^r}Z_3>0\right) . \end{aligned}$$

Using (1), we first calculate

$$\begin{aligned} \Pr [T>d|B(3)] =\Pr \left( \frac{1-3d}{b+2}Z_1+\frac{1-2d}{b+1}Z_2+\frac{1-d}{b}Z_3>0\right) = \Pr \,(\mathbf {AZ}>0), \end{aligned}$$

where $\mathbf {A}=((1-3d)/(b+2),(1-2d)/(b+1),(1-d)/(b))$. Clearly, $\Pr [T>d|B(3)]$ is 1 for $0<d\le 1/3$, and 0 for $d>1$. Letting $\mathbf {c}=(0,(b+1)(1-d)/[bd+(1-d)], -b(1-2d)/[bd+(1-d)])'$, we have $\mathbf {Ac}=0$. Now using recursion from (4) and the properties of deleting zero entries and renumbering the random variables again, we have

$$\begin{aligned}&\Pr [T>d|B(3)] =\frac{(b+1)(1-d)}{bd+(1-d)}\Pr \left( \frac{1-3d}{b+2}Z_1+\frac{1-d}{b}Z_2>0\right) \\&\quad +\frac{-b(1-2d)}{bd+(1-d)}\Pr \left( \frac{1-3d}{b+2}Z_1+\frac{1-2d}{b+1}Z_2>0\right) . \end{aligned}$$
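The recursion step above can be checked numerically. The following Python sketch is illustrative only: it does not implement recursion (4) itself but evaluates each probability with the standard partial-fraction formula for a linear combination of independent unit exponentials with nonzero, pairwise distinct coefficients, $\Pr (\sum _i a_i Z_i>0)=\sum _{i:a_i>0}\prod _{j\ne i}a_i/(a_i-a_j)$. The helper name `prob_positive` and the values $b=1/2$, $d=2/5$ are assumptions chosen for the example, not taken from the paper.

```python
from fractions import Fraction as F

def prob_positive(a):
    """P(sum_i a[i] * Z_i > 0) for i.i.d. unit exponentials Z_i.

    Partial-fraction formula; requires the a[i] to be nonzero
    and pairwise distinct.
    """
    total = F(0)
    for i, ai in enumerate(a):
        if ai > 0:
            term = F(1)
            for j, aj in enumerate(a):
                if j != i:
                    term *= ai / (ai - aj)
            total += term
    return total

# Illustrative values (assumptions), chosen with 1/3 < d <= 1/2.
b, d = F(1, 2), F(2, 5)
A = [(1 - 3 * d) / (b + 2), (1 - 2 * d) / (b + 1), (1 - d) / b]
D = b * d + (1 - d)
c = [F(0), (b + 1) * (1 - d) / D, -b * (1 - 2 * d) / D]

c_sum = sum(c)                               # entries of c add up to 1
A_dot_c = sum(x * y for x, y in zip(A, c))   # Ac = 0
lhs = prob_positive(A)                       # direct evaluation of Pr(AZ > 0)
# Recursion step: drop column 2 for the c[1] term, column 3 for the c[2] term.
rhs = c[1] * prob_positive([A[0], A[2]]) + c[2] * prob_positive([A[0], A[1]])
print(c_sum, A_dot_c, lhs, rhs)  # prints: 1 0 125/128 125/128
```

Exact rational arithmetic is used so that the two sides agree exactly rather than up to rounding.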

The same process is applied to the two resulting probabilities with $\mathbf {A_1}=((1-3d)/(b+2),(1-d)/b)$ and $\mathbf {A_2}=((1-3d)/(b+2),(1-2d)/(b+1))$. Taking $\mathbf {c_1}=((b+2)(1-d)/[2(bd+(1-d))], -b(1-3d)/[2(bd+(1-d))])'$ to calculate $\Pr (\mathbf {A_1 Z}>0)$ and $\mathbf {c_2}=((b+2)(1-2d)/[(bd+(1-d))], -(b+1)(1-3d)/[(bd+(1-d))])'$ to calculate $\Pr (\mathbf {A_2 Z}>0)$, it follows that

$$\begin{aligned} \Pr [T>d|B(3)] ={\left\{ \begin{array}{ll} \frac{(b+1)(b+2)(1-d)^2}{2[bd+(1-d)]^2}-\frac{b(b+2)(1-2d)^2}{[bd+(1-d)]^2} &{} \frac{1}{3}<d\le \frac{1}{2},\\ {} \frac{(b+1)(b+2)(1-d)^2}{2[bd+(1-d)]^2} &{} \frac{1}{2}<d\le 1. \end{array}\right. } \end{aligned}$$

(31)

In a similar manner, we can obtain

$$\begin{aligned} \Pr [T>d|B(2)] ={\left\{ \begin{array}{ll} \frac{(b+1)(b+2)(1-d)^2}{[d+b(1-d)][1+d+b(1-d)]}-\frac{(b+2)(1-2d)^2}{[d+b(1-d)][bd+(1-d)]} &{} \frac{1}{3}<d\le \frac{1}{2},\\ {} \frac{(b+1)(b+2)(1-d)^2}{[d+b(1-d)][1+d+b(1-d)]} &{} \frac{1}{2}<d\le 1, \end{array}\right. } \end{aligned}$$

(32)

and

$$\begin{aligned} \Pr [T>d|B(1)] ={\left\{ \begin{array}{ll} \frac{2(b+2)(1-d)^2}{[1+d+b(1-d)]}-\frac{(b+2)(1-2d)^2}{[2d+b(1-2d)]} &{} \frac{1}{3}<d\le \frac{1}{2},\\ {} \frac{2(b+2)(1-d)^2}{[1+d+b(1-d)]} &{} \frac{1}{2}<d\le 1. \end{array}\right. } \end{aligned}$$

(33)

From (6), we have

$$\begin{aligned} \Pr [B(1)] = \frac{b}{b+2}, \quad \Pr [B(2)] = \frac{2b}{(b+1)(b+2)}, \quad \text {and}\quad \Pr [B(3)] = \frac{2}{(b+1)(b+2)}. \end{aligned}$$

(34)

Now using (31–34), we get the distribution function of T under H for $n=3$ which is

$$\begin{aligned} \Pr [T<d|H] = 1-\sum _{r=1}^{3}{\Pr [T>d|B(r)] \Pr [B(r)]}. \end{aligned}$$

(35)

The density function of T for $n=3$ can be obtained by differentiating (35) with respect to d. Also, the density function under the labelled slippage $H_n$ can be obtained by differentiating $1-\Pr [T>d|B(3)]$ with respect to d and is equivalent to the density obtained by Chikkagoudar and Kunchur (1983) for $n=3$.

The critical value of the test $T$ for a single outlier can be obtained from (35) by substituting $b=1$; for the significance level $\alpha =0.05$ this yields $d=0.8709$. Now, using (18–19), we can calculate the various performance measures. For example, for a shift of size $b=0.5$, the P, $\textit{NSP}$, $\textit{SP}$, $\textit{NSE}$ and $\textit{SE}$ are 0.0701, 0.0523, 0.0178, 0.4810 and 0.4488 respectively.
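The numbers in the preceding paragraph can be reproduced directly from (31)–(35). The Python sketch below is illustrative only: the function names are hypothetical, the critical value is found by simple bisection, and the split of the power into $\textit{NSP}$ (the $B(3)$ term), $\textit{SP}$ (the $B(1)$ and $B(2)$ terms), $\textit{NSE}$ and $\textit{SE}$ is an assumption inferred from the quoted values, since the definitions in (18–19) are not restated in this appendix.

```python
def surv_given_position(d, b, r):
    """Pr[T > d | B(r)] from (31)-(33), valid for 1/3 < d <= 1."""
    q = (1 - d) ** 2
    s = (1 - 2 * d) ** 2 if d <= 0.5 else 0.0  # second term drops out for d > 1/2
    D = b * d + (1 - d)
    E = d + b * (1 - d)
    if r == 3:
        return (b + 1) * (b + 2) * q / (2 * D**2) - b * (b + 2) * s / D**2
    if r == 2:
        return (b + 1) * (b + 2) * q / (E * (1 + E)) - (b + 2) * s / (E * D)
    if r == 1:
        return 2 * (b + 2) * q / (1 + E) - (b + 2) * s / (2 * d + b * (1 - 2 * d))
    raise ValueError("r must be 1, 2 or 3")

def position_prob(b, r):
    """Pr[B(r)] from (34)."""
    return {1: b / (b + 2),
            2: 2 * b / ((b + 1) * (b + 2)),
            3: 2 / ((b + 1) * (b + 2))}[r]

def critical_value(alpha, tol=1e-12):
    """Bisection on the null survival function (b = 1), which decreases in d."""
    lo, hi = 1 / 3, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        surv = sum(surv_given_position(mid, 1.0, r) * position_prob(1.0, r)
                   for r in (1, 2, 3))
        if surv > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

d_crit = critical_value(0.05)
b = 0.5
# Contribution of each contaminant position to the overall power, as in (35).
parts = {r: surv_given_position(d_crit, b, r) * position_prob(b, r) for r in (1, 2, 3)}
P = sum(parts.values())
NSP = parts[3]                     # assumed: contaminant is the largest observation
SP = parts[1] + parts[2]           # assumed: contaminant is not the largest
NSE = position_prob(b, 3) - NSP    # assumed complement within B(3)
SE = position_prob(b, 1) + position_prob(b, 2) - SP
print(round(d_crit, 4), P, NSP, SP, NSE, SE)
```

Under these assumptions the computed values agree with those quoted above to within about $10^{-4}$.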

Exact distribution of MLR test for testing two outliers for the sample size $n=3$

Consider the two-outlier problem with $n=3$. The probabilities $\Pr [T_2>d|B(r,s)]$, $(r,s)\in S^{(2)}=\{(1,2),(1,3),(2,3)\}$, in (14) can be written as

$$\begin{aligned} \Pr [T_2>d|B(r,s)] =\Pr \left( \frac{2-3d}{u_1^{r,s}}Z_1+\frac{2(1-d)}{u_2^{r,s}}Z_2+\frac{1-d}{u_3^{r,s}}Z_3 >0\right) . \end{aligned}$$

Using (10), we first calculate

$$\begin{aligned} \Pr [T_2>d|B(1,2)] = \Pr \left( \frac{2-3d}{2b+1}Z_1+\frac{2(1-d)}{b+1}Z_2+(1-d)Z_3>0 \right) = \Pr (\mathbf {AZ}>0), \end{aligned}$$

where $\mathbf {A}=((2-3d)/(2b+1),2(1-d)/(b+1),1-d)$. Clearly, $\Pr [T_2>d|B(1,2)]$ is 1 for $0<d\le 2/3$ and 0 for $d>1$. Letting $\mathbf {c}=(-2(2b+1)(d-1)/(2b+d-bd), (3d-2)(b+1)/(2b+d-b d),0)'$, we have $\mathbf {Ac}=0$. Now using recursion from (4) and the properties of deleting zero entries and renumbering the random variables again, we have

$$\begin{aligned} \Pr [T_2>d|B(1,2)]&=\frac{-2(2b+1)(d-1)}{2b+d-bd}\Pr \left( \frac{2(1-d)}{b+1}Z_1+(1-d)Z_2>0\right) \\&\quad +\frac{(3d-2)(b+1)}{2b+d-b d}\Pr \left( \frac{2-3d}{2b+1}Z_1+(1-d)Z_2>0\right) . \end{aligned}$$

The same process is applied to calculate the two probabilities with $\mathbf {A_1}=(2(1-d)/(b+1),1-d)$ and $\mathbf {A_2}=((2-3d)/(2b+1),1-d)$. Note that $\Pr (\mathbf {A_1 Z}>0)$ is 1 for $0<d\le 1$. Taking $\mathbf {c_2}=((2b+1)(1-d)/[(2b+2d-2bd-1)], (3d-2)/[(2b+2d-2bd-1)])'$ to calculate $\Pr (\mathbf {A_2 Z}>0)$, it follows that

$$\begin{aligned} \Pr [T_2>d|B(1,2)]&= \frac{-2(2b + 1)(d - 1)}{2b + d - bd}\\&\quad -\frac{(2b + 1)(3d - 2)(b + 1)(d - 1)}{(2b + d - bd)(2b + 2d - 2bd - 1)},\quad \frac{2}{3}<d\le 1. \end{aligned}$$

(36)

In a similar manner, we can obtain

$$\begin{aligned} \Pr [T_2>d|B(1,3)]&= \frac{-2(2b + 1)(d - 1)}{2b + d - bd}\\&\quad -\frac{(2b + 1)(3d - 2)(b + 1)(d - 1)}{(bd - d + 1)(2b + d - bd)},\quad \frac{2}{3}<d\le 1. \end{aligned}$$

(37)

$$\begin{aligned} \Pr [T_2>d|B(2,3)]&= \frac{-(2b + 1)(d - 1)}{bd - d + 1}\\&\quad -\frac{b(2b + 1)(3d - 2)(d - 1)}{(bd - d + 1)^2},\quad \frac{2}{3}<d\le 1. \end{aligned}$$

(38)

From (11), we have

$$\begin{aligned} \Pr [B(1,2)] =&\frac{2b^2}{(2b + 1)(b + 1)}, \quad \Pr [B(1,3)] = \frac{2b}{(2b + 1)(b + 1)}, \nonumber \\&\quad \text {and}\quad \Pr [B(2,3)] = \frac{1}{2b + 1}. \end{aligned}$$

(39)

Combining the probabilities in (36–38) and using (39), we obtain the distribution function of test statistic $T_2$ as follows.

$$\begin{aligned} \Pr [T_2<d]&= 1-\Pr [T_2>d|B(1,2)]\Pr [B(1,2)]-\Pr [T_2>d|B(1,3)]\Pr [B(1,3)]\\&\quad -\Pr [T_2>d|B(2,3)]\Pr [B(2,3)]. \end{aligned}$$

(40)

The null distribution of $T_2$ can be obtained from (40) by letting $b=1$. For the significance level $\alpha =0.05$, the critical value for the test $T_2$ can be calculated from the null distribution of $T_2$ which is equal to $d=0.991559$.

Now, using (22–24) and plugging (36–39) into them, we can calculate the various performance measures for the test. For example, for the size of the shift $b=0.5$, the P, $\textit{NSP}$, $\textit{SP}$, $\textit{SW}$, $\textit{NSE}$, $\textit{SE}$ and $\textit{PSE}$ are 0.0579, 0.0329, 0, 0.0250, 0.4671, 0, 0.4750 and 0.0658 respectively.
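The critical value and the overall power quoted above can be reproduced from (36)–(40). The following Python sketch is a hedged illustration: the function names are hypothetical, bisection is used as a generic root finder, and only quantities that follow directly from (36)–(40) are computed, namely the overall power, the contribution of $B(2,3)$ and the conditional probability given $B(2,3)$, which appear to correspond to the quoted values 0.0579, 0.0329 and 0.0658.

```python
def surv_T2(d, b, pair):
    """Pr[T_2 > d | B(r,s)] from (36)-(38), valid for 2/3 < d <= 1."""
    S = 2 * b + d - b * d
    V = b * d - d + 1
    W = 2 * b + 2 * d - 2 * b * d - 1
    lead = 2 * (2 * b + 1) * (1 - d) / S
    tail = (2 * b + 1) * (3 * d - 2) * (b + 1) * (1 - d)
    if pair == (1, 2):
        return lead + tail / (S * W)
    if pair == (1, 3):
        return lead + tail / (V * S)
    if pair == (2, 3):
        return ((2 * b + 1) * (1 - d) / V
                + b * (2 * b + 1) * (3 * d - 2) * (1 - d) / V**2)
    raise ValueError("pair must be (1,2), (1,3) or (2,3)")

def pair_prob(b, pair):
    """Pr[B(r,s)] from (39)."""
    denom = (2 * b + 1) * (b + 1)
    return {(1, 2): 2 * b**2 / denom,
            (1, 3): 2 * b / denom,
            (2, 3): 1 / (2 * b + 1)}[pair]

PAIRS = [(1, 2), (1, 3), (2, 3)]

def survival_T2(d, b):
    """1 - Pr[T_2 < d] as in (40)."""
    return sum(surv_T2(d, b, p) * pair_prob(b, p) for p in PAIRS)

def critical_value_T2(alpha, tol=1e-13):
    """Bisection on the null survival function (b = 1) over 2/3 < d <= 1."""
    lo, hi = 2 / 3, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if survival_T2(mid, 1.0) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

d_crit = critical_value_T2(0.05)
b = 0.5
power = survival_T2(d_crit, b)                     # overall power at b = 0.5
cond_23 = surv_T2(d_crit, b, (2, 3))               # conditional on B(2,3)
contrib_23 = cond_23 * pair_prob(b, (2, 3))        # contribution of B(2,3)
print(d_crit, power, contrib_23, cond_23)
```

At $b=1$ all three conditional probabilities reduce to $3(1-d)(3d-1)$, so the critical value also has the closed form $d=1-(6-\sqrt{36-36\alpha })/18$.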

Exact distribution of MLR sequential test for testing two outliers for the sample size $n=3$

To calculate the joint probability $\Pr [U_1<d_1, U_2<d_2|B(2,3)]$, we first need the marginal probabilities $\Pr [U_1>d_1|B(2,3)]$ and $\Pr [U_2>d_2|B(2,3)]$, which can be written as

$$\begin{aligned} \Pr [U_1>d_1|B(2,3)]&= P\left[ \sum _{j=1}^{n}[1-(n-j+1)d_1]\frac{Z_j}{u^{2,3}_j}>0\right] , \end{aligned}$$

and

$$\begin{aligned} \Pr [U_2>d_2|B(2,3)]&= P\left[ \sum _{j=1}^{n-1}[1-(n-j)d_2]\frac{Z_j}{u^{2,3}_j}>0\right] , \end{aligned}$$

where $u^{2,3}_1 =2 b+1$, $u^{2,3}_2 = 2 b$ and $u^{2,3}_3 = b$ have been defined previously in (10).

Following the line of argument used in Appendix A, we have

$$\begin{aligned} \Pr [U_1>d_1|B(2,3)] = {\left\{ \begin{array}{ll} - \frac{(2b + 1)(2d_1 - 1)}{2bd_1 - 2d_1 + 1} - \frac{2b(2b + 1)(3d_1 - 1)(d_1 - 1)}{(2bd_1 - 2d_1 + 1)(b - d_1 + bd_1 + 1)} &{} 1/3<d_1\le 1/2,\\ {} \frac{(2b + 1)(2d_1 - 1)(2d_1 - 2)}{2bd_1 - 2d_1 + 1} - \frac{2b(2b + 1)(3d_1 - 1)(d_1 - 1)}{(2bd_1 - 2d_1 + 1)(b - d_1 + bd_1 + 1)} &{} 1/2<d_1\le 1, \end{array}\right. } \end{aligned}$$

(41)

and

$$\begin{aligned} \Pr [U_2>d_2|B(2,3)] = -\frac{(2b + 1)(d_2 - 1)}{2bd_2 - d_2 + 1} \quad 1/2<d_2\le 1. \end{aligned}$$

(42)
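As a quick check of (42), the following worked step uses the elementary two-variable fact that $\Pr (a_1Z_1+a_2Z_2>0)=a_2/(a_2-a_1)$ for $a_1<0<a_2$ and independent unit exponentials $Z_1$, $Z_2$. This shortcut is used here only for illustration and is not the recursion (4) itself. With $a_1=(1-2d_2)/(2b+1)$ and $a_2=(1-d_2)/(2b)$,

$$\begin{aligned} \Pr [U_2>d_2|B(2,3)]&= \frac{\frac{1-d_2}{2b}}{\frac{1-d_2}{2b}-\frac{1-2d_2}{2b+1}} = \frac{(2b+1)(1-d_2)}{(2b+1)(1-d_2)-2b(1-2d_2)}\\&= \frac{(2b+1)(1-d_2)}{2bd_2-d_2+1}, \quad 1/2<d_2\le 1. \end{aligned}$$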

For $0<d_1\le 1, 0<d_2\le 1$, the joint probability $\Pr [U_1>d_1, U_2>d_2|B(2,3)]$ can be written as $P(\mathbf {AZ}>0|B(2,3))$ using (10) and (14) where

$$\begin{aligned} \mathbf {A} =\begin{pmatrix} \frac{1-3 d_1}{u^{2,3}_1} &{} \frac{1-2 d_1}{u^{2,3}_2} &{} \frac{1- d_1}{u^{2,3}_3}\\ \frac{1-2 d_2}{u^{2,3}_1} &{} \frac{1- d_2}{u^{2,3}_2} &{} 0 \end{pmatrix} \end{aligned}$$

and $u^{2,3}_1 =2 b+1$, $u^{2,3}_2 = 2 b$ and $u^{2,3}_3 = b$.

Let $\mathbf {c}=(-((2b + 1)(d_2 - 1))/(2bd_2 - d_2 + 1),(2b(2d_2 - 1))/(2bd_2 - d_2 + 1),0)'$, then $\mathbf {Ac}=((d_2 - d_1 - d_1 d_2)/(2b d_2 - d_2 + 1),0)'$ and using recursion (4), we get

$$\begin{aligned} \Pr [\mathbf {AZ}>0|B(2,3)]&= -\frac{(2b + 1)(d_2 - 1)}{2bd_2 - d_2 + 1}\Pr [\mathbf {A_1 Z}>0|B(2,3)]\\&\quad + \frac{2b(2d_2 - 1)}{2bd_2 - d_2 + 1}\Pr [\mathbf {A_2 Z}>0|B(2,3)], \end{aligned}$$

where

$$\begin{aligned} \mathbf {A_1} =\begin{pmatrix} \frac{d_2 - d_1 - d_1 d_2}{2b d_2 - d_2 + 1} &{} \frac{1-2 d_1}{u^{2,3}_2} &{} \frac{1- d_1}{u^{2,3}_3}\\ 0 &{} \frac{1- d_2}{u^{2,3}_2} &{} 0 \end{pmatrix} \end{aligned}$$

and

$$\begin{aligned} \mathbf {A_2} =\begin{pmatrix} \frac{1-3 d_1}{u^{2,3}_1} &{} \frac{d_2 - d_1 - d_1 d_2}{2b d_2 - d_2 + 1} &{} \frac{1- d_1}{u^{2,3}_3}\\ \frac{1-2 d_2}{u^{2,3}_1} &{} 0 &{} 0 \end{pmatrix}. \end{aligned}$$

Since $1/2 < d_2 \le 1$, all the entries in the second row of $\mathbf {A_2}$ are less than or equal to 0, which implies that $\Pr [\mathbf {A_2 Z} > 0|B(2,3)] = 0$. Moreover, all the entries in the second row of $\mathbf {A_1}$ are greater than or equal to 0, which implies that the second row of $\mathbf {A_1}$ can be deleted.

Thus, we have

$$\begin{aligned} \Pr [\mathbf {A_1 Z}>0|B(2,3)]= \Pr \left[ \frac{d_2 - d_1 - d_1 d_2}{2b d_2 - d_2 + 1}Z_1+\frac{1-2 d_1}{u^{2,3}_2} Z_2+\frac{1- d_1}{u^{2,3}_3} Z_3 >0\right] . \end{aligned}$$

Before proceeding further, we first check the value of $(d_2 - d_1 - d_1 d_2)/(2b d_2 - d_2 + 1)$. It can be easily shown that

$$\begin{aligned} \frac{d_2 - d_1 - d_1 d_2}{2b d_2 - d_2 + 1}&={\left\{ \begin{array}{ll} -\frac{\beta + 2\sqrt{1 - \beta } - 1}{6b + 2\beta - 2b\beta } &{} \quad \text {for}\; 1/3<d_1\le 1/2, 1/2<d_2\le 1, \\ {} -\frac{\beta - 2\sqrt{3 \beta } + 3}{6b + 2\beta - 2b\beta } &{} \quad \text {for}\; 1/2<d_1 \le 1, 1/2<d_2\le 1, \end{array}\right. }\\&<0, \end{aligned}$$

where

$$\begin{aligned} d_1&= {\left\{ \begin{array}{ll} \frac{1+\sqrt{1 - \beta }}{3} &{} \quad \text {for}\; 1/3<d_1\le 1/2, \\ 1 - \frac{\sqrt{3 \beta }}{3} &{} \quad \text {for}\; 1/2<d_1 \le 1, \end{array}\right. } \end{aligned}$$

(43)

$$\begin{aligned} d_2&=\frac{3-\beta }{\beta + 3} \quad \text {for}\; 1/2 <d_2\le 1. \end{aligned}$$

(44)

The values of $d_1$ and $d_2$ are chosen such that, under the null hypothesis $H_0$, $\Pr [U_1>d_1]=\Pr [U_2>d_2]=\beta $, and $\beta $ is determined from the condition $\Pr [U_1<d_1,U_2<d_2]=1-\Pr [U_1>d_1]-\Pr [U_2>d_2]+\Pr [U_1>d_1,U_2>d_2]=1-\alpha $, where $\alpha $ is the significance level. Note that under the null hypothesis $\Pr [U_1>d_1]$ and $\Pr [U_2>d_2]$ coincide with $\Pr [U_1>d_1|B(2,3)]$ and $\Pr [U_2>d_2|B(2,3)]$ respectively.

Using the recursion (4), we obtain

$$\begin{aligned}&\Pr [U_1>d_1,U_2>d_2|B(2,3)] \nonumber \\&\quad ={\left\{ \begin{array}{ll} -\frac{(2b + 1)(3 d_1 + d_2 - 3b d_1 + b d_2 - 3 d_1 d_2 + 4b d_1^2 + 2d_1^2 d_2 - 2d_1^2 - b d_1 d_2 - 1)}{(2b d_1 - 2d_1 {+} 1)(b d_1 - d_2 - d_1 + b d_2 + d_1 d_2 - b d_1 d_2 + 1)} &{} 1/3<d_1 {\le } 1/2, 1/2 {<}d_2\le 1,\\ {} -\frac{2 (2b + 1)(d_1 - 1)^2 (d_2 - 1)}{b d_1 - d_2 - d_1 + b d_2 + d_1 d_2 - b d_1 d_2 + 1} &{} 1/2<d_1 \le 1, 1/2 <d_2\le 1. \end{array}\right. } \end{aligned}$$

(45)
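The second branch of (45) can be sanity-checked by simulation. The Python sketch below is illustrative only: it draws independent unit exponentials $Z_1, Z_2, Z_3$, evaluates the two rows of $\mathbf {AZ}$ with $u^{2,3}_1=2b+1$, $u^{2,3}_2=2b$ and $u^{2,3}_3=b$, and compares the empirical frequency of both rows being positive with the closed form. The values $d_1=0.6$, $d_2=0.7$, $b=0.5$, the sample size and the seed are arbitrary choices made for this example.

```python
import random

def joint_formula(d1, d2, b):
    """Second branch of (45): 1/2 < d1 <= 1 and 1/2 < d2 <= 1."""
    denom = b * d1 - d2 - d1 + b * d2 + d1 * d2 - b * d1 * d2 + 1
    return 2 * (2 * b + 1) * (1 - d1) ** 2 * (1 - d2) / denom

def joint_monte_carlo(d1, d2, b, n_draws=200_000, seed=12345):
    """Empirical frequency of both rows of AZ being positive under B(2,3)."""
    rng = random.Random(seed)
    u1, u2, u3 = 2 * b + 1, 2 * b, b
    hits = 0
    for _ in range(n_draws):
        z1 = rng.expovariate(1.0)
        z2 = rng.expovariate(1.0)
        z3 = rng.expovariate(1.0)
        row1 = (1 - 3 * d1) * z1 / u1 + (1 - 2 * d1) * z2 / u2 + (1 - d1) * z3 / u3
        row2 = (1 - 2 * d2) * z1 / u1 + (1 - d2) * z2 / u2
        if row1 > 0 and row2 > 0:
            hits += 1
    return hits / n_draws

exact = joint_formula(0.6, 0.7, 0.5)
approx = joint_monte_carlo(0.6, 0.7, 0.5)
print(exact, approx)
```

With 200,000 draws the Monte Carlo standard error is roughly 0.001, so agreement to about two decimal places is what one would expect.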

Combining the probabilities obtained in (41), (42) and (45), we can obtain the required probability using

$$\begin{aligned}&\Pr [U_1<d_1,U_2<d_2|B(2,3)]=1-\Pr [U_1>d_1|B(2,3)]-\Pr [U_2>d_2|B(2,3)]\\&\quad +\Pr [U_1>d_1,U_2>d_2|B(2,3)]. \end{aligned}$$

It is also worth mentioning that the range of $\beta $ can be determined using Bonferroni’s inequality (Lin and Balakrishnan 2009). Under $H_0$,

$$\begin{aligned} \Pr \left[ \bigcup \limits _{i=1}^{k}(U_i>d_i)|B(2,3)\right] \le \sum \limits _{i=1}^{k}\Pr [U_i>d_i|B(2,3)] \end{aligned}$$

which gives $\beta \ge \alpha /k$. In particular, when $k=2$, we have $\Pr [U_1>d_1,U_2>d_2|B(2,3)]=2\beta -\alpha \le 1$, which gives the range of $\beta $ for testing two upper outliers as $\alpha /k \le \beta \le (1+\alpha )/2$.

In a similar way, in order to calculate the probability $\Pr [U_1<d_1,U_2<d_2|B(1,2)]$, we can obtain

$$\begin{aligned} \Pr [U_1>d_1|B(1,2)] = {\left\{ \begin{array}{ll} - \frac{(2b + 1)(2 d_1 - 1)}{b + d_1 - b d_1} - \frac{(2 b + 1)(3 d_1 - 1)(b + 1)(d_1 - 1)}{2(b + d_1 - b d_1)^2} &{} 1/3<d_1\le 1/2,\\ {} \frac{(2b + 1)(2 d_1 - 1)(b + 1)(d_1 - 1)}{(b + d_1 - b d_1)^2} - \frac{(2b + 1)(3 d_1 - 1)(b + 1)(d_1 - 1)}{2(b + d_1 - b d_1)^2} &{} 1/2<d_1 \le 1. \end{array}\right. } \end{aligned}$$

(46)

$$\begin{aligned} \Pr [U_2>d_2|B(1,2)] = \frac{(2b + 1)(1-d_2)}{b + d_2} \quad 1/2<d_2 \le 1. \end{aligned}$$

(47)

$$\begin{aligned}&\Pr [U_1>d_1,U_2>d_2|B(1,2)] \nonumber \\&\quad ={\left\{ \begin{array}{ll} -\frac{(2b + 1)(d_2 - 2 d_1 - b + 2b d_1 + b d_2 - 2d_1 d_2 - b d_1^2 + d_1^2 d_2 + 3 d_1^2 - 2 b d_1 d_2 + b d_1^2 d_2)}{(b + d_1 - b d_1)^2} &{} 1/3{<}d_1 \le 1/2, 1/2 {<}d_2\le 1,\\ {} \frac{(2b {+} 1)(b {+} 1)(1{-}d_1)^2 (1-d_2)}{(b + d_1 - b d_1)^2} &{} 1/2<d_1 \le 1, 1/2 <d_2\le 1. \end{array}\right. } \end{aligned}$$

(48)

To calculate $\Pr [U_1<d_1,U_2<d_2|B(1,3)]$, we obtain

$$\begin{aligned} \Pr [U_1>d_1|B(1,3)] = {\left\{ \begin{array}{ll} \frac{(2b + 1)(1-2 d_1)}{b + d_1 - b d_1} - \frac{(2b + 1)(1-3 d_1)(b + 1)(1-d_1)}{(b + d_1 - b d_1) (b - d_1 + b d_1 + 1)} &{} 1/3<d_1 \le 1/2,\\ {} \frac{(2b + 1)(1-2 d_1)(b + 1)(1-d_1)}{(b d_1 - d_1 + 1)(b + d_1 - b d_1)} - \frac{(2b + 1)(1-3 d_1)(b + 1)(1-d_1)}{(b + d_1 - b d_1)(b - d_1 + b d_1 + 1)} &{} 1/2<d_1 \le 1. \end{array}\right. } \end{aligned}$$

(49)

$$\begin{aligned} \Pr [U_2>d_2|B(1,3)] = \frac{(2b + 1)(1-d_2)}{b + d_2} \quad 1/2<d_2 \le 1. \end{aligned}$$

(50)

$$\begin{aligned}&\Pr [U_1>d_1,U_2>d_2|B(1,3)] \nonumber \\&\quad ={\left\{ \begin{array}{ll} -\frac{(2b + 1)(b d_1 - d_1 - b + 2b d_2 + d_1 d_2 + b d_1^2 - d_1^2 d_2 + d_1^2 - 5b d_1 d_2 + 3b d_1^2 d_2)}{(b + d_1 - b d_1)(b + d_2 - b d_2 - d_1 d_2 + b d_1 d_2)} &{} 1/3{<}d_1\le 1/2, 1/2 {<}d_2\le 1,\\ {} \frac{(2b + 1)(b + 1)(1-d_1)^2(1-d_2)}{(b d_1 - d_1 + 1)(b + d_2 - b d_2 - d_1 d_2 + b d_1 d_2)} &{} 1/2<d_1\le 1, 1/2 <d_2\le 1. \end{array}\right. } \end{aligned}$$

(51)

Finally, we can obtain the joint distribution of $(U_1,U_2)$ by plugging the probabilities in (41–42), (45–51) and (39) as follows

$$\begin{aligned} \Pr [U_1<d_1,U_2<d_2|\bar{H}]&=1-\sum _{(r,s)\in S^{(2)}}{\Pr [U_1>d_1|B(r,s)]\Pr [B(r,s)]}\\&\quad - \sum _{(r,s)\in S^{(2)}}{\Pr [U_2>d_2|B(r,s)]\Pr [B(r,s)]} \\&\quad +\sum _{(r,s)\in S^{(2)}}{\Pr [U_1>d_1,U_2>d_2|B(r,s)]\Pr [B(r,s)]}, \end{aligned}$$

where $S^{(2)}=\{(1,2), (1,3), (2,3)\}$.

Note that the exact critical values of the sequential tests for testing $k=2$ upper outliers can be obtained by plugging the expressions of $d_1$ in (43) and $d_2$ in (44) into

$$\begin{aligned} \Pr [U_1<d_1,U_2<d_2|\bar{H}] =1-2 \beta + \Pr [U_1>d_1,U_2>d_2|\bar{H}] =1-\alpha , \end{aligned}$$

where $\alpha $ is the significance level.

Therefore, for $\alpha =0.05$, we obtain the value of $\beta =0.025427$ which yields the critical values to be $d_1= 0.907936$ and $d_2=0.983191$ using (43) and (44).
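The value of $\beta $ and the two critical values can be reproduced numerically. The Python sketch below is illustrative and rests on stated assumptions: it uses the second branches of (43) and (45) (so it presumes $d_1>1/2$, which holds over the search range), evaluates the joint probability at $b=1$ where all position pairs give the same value, and solves the size condition by bisection over the range $\alpha /2\le \beta \le (1+\alpha )/2$ given earlier. Function names are hypothetical.

```python
import math

def d1_of(beta):
    """Second branch of (43), valid when d1 > 1/2."""
    return 1 - math.sqrt(3 * beta) / 3

def d2_of(beta):
    """Equation (44)."""
    return (3 - beta) / (3 + beta)

def joint_null(d1, d2):
    """Second branch of (45) evaluated at b = 1 (null case)."""
    b = 1.0
    denom = b * d1 - d2 - d1 + b * d2 + d1 * d2 - b * d1 * d2 + 1
    return 2 * (2 * b + 1) * (1 - d1) ** 2 * (1 - d2) / denom

def solve_beta(alpha, tol=1e-13):
    """Bisection for 2*beta - Pr[U1>d1, U2>d2] = alpha."""
    lo, hi = alpha / 2, (1 + alpha) / 2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        size = 2 * mid - joint_null(d1_of(mid), d2_of(mid))
        if size < alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

beta = solve_beta(0.05)
d1, d2 = d1_of(beta), d2_of(beta)
print(beta, d1, d2)
```

Substituting (43) and (44) into the null joint probability gives $4\beta ^2/(3+\beta )$, so the condition reduces to the quadratic $2\beta ^2-(6-\alpha )\beta +3\alpha =0$, whose smaller root can be used instead of bisection.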

Once the critical values have been calculated, we can compute the various performance measures discussed in Sect. 6 by plugging in the probabilities in (41–51) and (39) for different sizes of the shift $b<1$. For example, for $b=0.5$, the P, $\textit{NSP}$, $\textit{SP}$, $\textit{NSE}$, ${ PL}_1$, ${ PL}_2$ and ${ PL}_3$ are 0.00113, 0.00057, 0, 0.48379, 0.00008, 0.00048 and 0.01564 respectively.

About this article

Cite this article

Kumar, N. Exact distributions of tests of outliers for exponential samples. Stat Papers 60, 2031–2061 (2019). https://doi.org/10.1007/s00362-017-0908-6

Received: 11 September 2015
Revised: 05 April 2017
Published: 29 April 2017
Issue Date: December 2019
DOI: https://doi.org/10.1007/s00362-017-0908-6
