Exact distributions of tests of outliers for exponential samples

  • Regular Article
  • Statistical Papers

Abstract

In this paper, we propose an algorithm to derive the exact distributions of discordancy tests for exponential samples under the slippage alternative, provided that their survival functions involve linear combinations of independent and identically distributed exponential random variables with arbitrary real coefficients. In addition, we define various performance measures in terms of the conditional probabilities that the observed value of the test statistic exceeds the critical value, given that the contaminants occupy specific positions in the ordered sample. This makes it possible to calculate the performance measures of discordancy tests for exponential samples to any desired degree of accuracy. For illustration, we derive the distributions of the maximum likelihood ratio tests for single and multiple outliers in exponential samples and then calculate their performance measures accurately to six decimal places. Moreover, the definitions of the performance criteria are not restricted to discordancy tests for exponential samples; they are equally applicable to discordancy tests for samples from other distributions.

References

  • Balasooriya U, Gadag V (1994) Tests for upper outliers in the two-parameter exponential distribution. J Stat Comput Simul 50:249–259

  • Barnett VA, Lewis T (1994) Outliers in statistical data. Wiley, Chichester

  • Chikkagoudar MS, Kunchur SM (1983) Distributions of test statistics for multiple outliers in exponential samples. Commun Stat 12:2127–2142

  • Chikkagoudar MS, Kunchur SM (1987) Comparison of many outlier procedures for exponential samples. Commun Stat 16:627–645

  • Dixon WJ (1950) Analysis of extreme values. Ann Math Stat 21:488–506

  • Dumitrescu MEB, Enchescu DN, Hristea FT (1994) On the performances of an outlier test in the case of the exponential distribution. Comput Stat Data Anal 17(2):119–127

  • Fieller N (2014) Multivariate outliers. Wiley, New York

  • Fisher RA (1929) Tests of significance in harmonic analysis. Proc R Soc Lond Ser A 125:54–59

  • Fung KY, Paul SR (1985) Comparison of outlier detection procedures in Weibull or extreme-value distribution. Commun Stat 14:895–917

  • Giraudeau B, Chastang C (1999) Two discordancy tests for location slippage and dispersion slippage outlier detection in agreement studies. Statistician 48:517–527

  • Hadi AS (1994) A modification of a method for the detection of outliers in multivariate samples. J Roy Stat Soc B 56:393–396

  • Hawkins DM (1972) Analysis of a slippage test for the chi squared distribution. S Afr Stat J 6:11–17

  • Hawkins DM (1980) Identification of outliers. Chapman and Hall, London

  • Hayes K, Kinsella T (2003) Spurious and non-spurious power in performance criteria for tests of discordancy. J R Stat Soc Ser D (Stat) 52:69–82

  • Huffer F (1988) Divided differences and the joint distribution of linear combinations of spacings. J Appl Probab 25:346–354

  • Huffer FW, Lin CT (2001) Computing the joint distribution of general linear combinations of spacings or exponential variates. Stat Sin 11:1141–1157

  • Huffer FW, Lin CT (2006) Linear combinations of spacings. In: Kotz S, Balakrishnan N, Read CB, Vidakovic B (eds) Encyclopedia of statistical sciences, Wiley, Hoboken, vol 12, pp 7866–7875

  • Jain RB, Pingel LA (1981) A procedure for estimating the number of outliers. Commun Stat 10:1029–1041

  • Johnson NL, Kotz S, Balakrishnan N (1994) Continuous univariate distributions. Wiley, New York

  • Kabe DG (1970) Testing outliers from an exponential population. Metrika 15:15–18

  • Kimber AC (1982) Tests for many outliers in an exponential sample. Appl Stat 31:263–271

  • Kimber AC (1988) Testing upper and lower outlier pairs in gamma samples. Commun Stat Simul 17:1055–1072

  • Kimber AC, Stevens HJ (1981) The null distribution of a test for two upper outliers in an exponential sample. Appl Stat 30:153–157

  • Kumar N (2013) A procedure for testing suspected observations. Stat Pap 54:471–478

  • Kumar N (2015) Testing of suspected observations in an exponential sample with unknown origin. Commun Stat 44:3668–3679

  • Kumar N, Lin CT (2017) Testing for multiple upper and lower outliers in an exponential sample. J Stat Comput Simul 87:870–881

  • Lalitha S, Kumar N (2012) Multiple outlier test for upper outliers in an exponential sample. J Appl Stat 39:1323–1330

  • Lewis T, Fieller NRJ (1979) A recursive algorithm for null distribution for outliers: I. gamma samples. Technometrics 21:371–376

  • Lin CT, Balakrishnan N (2009) Exact computation of the null distribution of a test for multiple outliers in an exponential sample. Comput Stat Data Anal 53:3281–3290

  • Lin CT, Wang SC (2015) Discordancy tests for two-parameter exponential samples. Stat Pap 56:569–582

  • Meeker WQ, Escobar LA (1998) Statistical methods for reliability data. Wiley, New York

  • Prescott P (1979) Critical values for a sequential test for many outliers. Appl Stat 28:36–39

  • Rosner B (1975) On the detection of many outliers. Technometrics 17:221–227

  • Zerbet A, Nikulin M (2003) A new statistic for detecting outliers in exponential case. Commun Stat 32:573–583

  • Zhang J (1998) Tests for multiple upper or lower outliers in an exponential sample. J Appl Stat 25:245–255

Acknowledgements

The author would like to thank two anonymous reviewers and the editor for their helpful and constructive comments that have improved the paper.

Author information

Corresponding author

Correspondence to Nirpeksh Kumar.

Appendices

Exact distribution of MLR test for testing a single outlier for the sample size \(n=3\)

Consider a single outlier problem with \(n=3\). The probabilities \(\Pr [T>d|B(r)]\) (\(r=1,2,3\)) in (2) can be written as

$$\begin{aligned} \Pr [T>d|B(r)] =\Pr \left( \frac{1-3d}{u_1^r}Z_1+\frac{1-2d}{u_2^r}Z_2+\frac{1-d}{u_3^r}Z_3>0\right) . \end{aligned}$$

Using (1), we first calculate

$$\begin{aligned} \Pr [T>d|B(3)] =\Pr \left( \frac{1-3d}{b+2}Z_1+\frac{1-2d}{b+1}Z_2+\frac{1-d}{b}Z_3>0\right) = \Pr \,(\mathbf {AZ}>0), \end{aligned}$$

where \(\mathbf {A}=((1-3d)/(b+2),(1-2d)/(b+1),(1-d)/(b))\). Clearly, \(\Pr [T>d|B(3)]\) is 1 for \(0<d\le 1/3\), and 0 for \(d>1\). Letting \(\mathbf {c}=(0,(b+1)(1-d)/[bd+(1-d)], -b(1-2d)/[bd+(1-d)])'\), we have \(\mathbf {Ac}=0\). Now using recursion from (4) and the properties of deleting zero entries and renumbering the random variables again, we have

$$\begin{aligned}&\Pr [T>d|B(3)] =\frac{(b+1)(1-d)}{bd+(1-d)}\Pr \left( \frac{1-3d}{b+2}Z_1+\frac{1-d}{b}Z_2>0\right) \\&\quad +\frac{-b(1-2d)}{bd+(1-d)}\Pr \left( \frac{1-3d}{b+2}Z_1+\frac{1-2d}{b+1}Z_2>0\right) . \end{aligned}$$

The same process is applied to the obtained two probabilities with \(\mathbf {A_1}=((1-3d)/(b+2),(1-d)/b)\) and \(\mathbf {A_2}=((1-3d)/(b+2),(1-2d)/(b+1))\). Taking \(\mathbf {c_1}=((b+2)(1-d)/[2(bd+(1-d))], -b(1-3d)/[2(bd+(1-d))])'\) to calculate \(\Pr (\mathbf {A_1 Z}>0)\) and \(\mathbf {c_2}=((b+2)(1-2d)/[(bd+(1-d))], -(b+1)(1-3d)/[(bd+(1-d))])'\) to calculate \(\Pr (\mathbf {A_2 Z}>0)\), it follows that

$$\begin{aligned} \Pr [T>d|B(3)] ={\left\{ \begin{array}{ll} \frac{(b+1)(b+2)(1-d)^2}{2[bd+(1-d)]^2}-\frac{b(b+2)(1-2d)^2}{[bd+(1-d)]^2} &{} \frac{1}{3}<d\le \frac{1}{2},\\ {} \frac{(b+1)(b+2)(1-d)^2}{2[bd+(1-d)]^2} &{} \frac{1}{2}<d\le 1. \end{array}\right. } \end{aligned}$$
(31)
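
As a worked step (a sketch, not part of the original derivation), the two-variable probabilities behind (31) can be checked directly from the elementary fact that \(\Pr (aZ_2>\alpha Z_1)=a/(a+\alpha )\) for independent, identically distributed exponential \(Z_1, Z_2\) and \(a,\alpha >0\). For \(d>1/3\) the first coefficient of both \(\mathbf {A_1}\) and \(\mathbf {A_2}\) is negative, so

$$\begin{aligned} \Pr (\mathbf {A_1 Z}>0)&=\frac{(1-d)/b}{(1-d)/b+(3d-1)/(b+2)}=\frac{(b+2)(1-d)}{2[bd+(1-d)]},\quad \frac{1}{3}<d\le 1,\\ \Pr (\mathbf {A_2 Z}>0)&=\frac{(1-2d)/(b+1)}{(1-2d)/(b+1)+(3d-1)/(b+2)}=\frac{(b+2)(1-2d)}{bd+(1-d)},\quad \frac{1}{3}<d\le \frac{1}{2}, \end{aligned}$$

while \(\Pr (\mathbf {A_2 Z}>0)=0\) for \(d>1/2\), because both coefficients of \(\mathbf {A_2}\) are then non-positive. Multiplying these by the weights \((b+1)(1-d)/[bd+(1-d)]\) and \(-b(1-2d)/[bd+(1-d)]\) reproduces the two branches of (31).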

In a similar manner, we can obtain

$$\begin{aligned} \Pr [T>d|B(2)] ={\left\{ \begin{array}{ll} \frac{(b+1)(b+2)(1-d)^2}{[d+b(1-d)][1+d+b(1-d)]}-\frac{(b+2)(1-2d)^2}{[d+b(1-d)][bd+(1-d)]} &{} \frac{1}{3}<d\le \frac{1}{2},\\ {} \frac{(b+1)(b+2)(1-d)^2}{[d+b(1-d)][1+d+b(1-d)]} &{} \frac{1}{2}<d\le 1, \end{array}\right. } \end{aligned}$$
(32)

and

$$\begin{aligned} \Pr [T>d|B(1)] ={\left\{ \begin{array}{ll} \frac{2(b+2)(1-d)^2}{[1+d+b(1-d)]}-\frac{(b+2)(1-2d)^2}{[2d+b(1-2d)]} &{} \frac{1}{3}<d\le \frac{1}{2},\\ {} \frac{2(b+2)(1-d)^2}{[1+d+b(1-d)]} &{} \frac{1}{2}<d\le 1. \end{array}\right. } \end{aligned}$$
(33)

From (6), we have

$$\begin{aligned} \Pr [B(1)] = \frac{b}{b+2}, \quad \Pr [B(2)] = \frac{2b}{(b+1)(b+2)}, \quad \text {and}\quad \Pr [B(3)] = \frac{2}{(b+1)(b+2)}. \end{aligned}$$
(34)
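
The position probabilities in (34) can be sanity-checked by simulation. The sketch below assumes the slippage model implied by (34), namely two observations with unit rate and one contaminant with rate \(b\); the function names are illustrative and do not come from the paper.

```python
import random


def position_probs(b):
    """Closed forms from (34): Pr[B(1)], Pr[B(2)], Pr[B(3)] for n = 3."""
    den = (b + 1) * (b + 2)
    return [b / (b + 2), 2 * b / den, 2 / den]


def simulate_position_probs(b, trials=200_000, seed=0):
    """Estimate Pr[B(r)] by simulation.

    Assumed model: two regular observations are exponential with rate 1
    and the single contaminant is exponential with rate b.
    """
    rng = random.Random(seed)
    counts = [0, 0, 0]
    for _ in range(trials):
        regular = [rng.expovariate(1.0), rng.expovariate(1.0)]
        contaminant = rng.expovariate(b)
        # 0-based position of the contaminant in the ordered sample
        rank = sum(x < contaminant for x in regular)
        counts[rank] += 1
    return [c / trials for c in counts]


if __name__ == "__main__":
    print(position_probs(0.5))
    print(simulate_position_probs(0.5))
```

With \(b=1\) the three closed forms all equal \(1/3\), as expected when there is no contaminant.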

Now using (31)–(34), we get the distribution function of T under H for \(n=3\) which is

$$\begin{aligned} \Pr [T<d|H] = 1-\sum _{r=1}^{3}{\Pr [T>d|B(r)] \Pr [B(r)]}. \end{aligned}$$
(35)

The density function of T for \(n=3\) can be obtained by differentiating (35) with respect to d. Also, the density function under the labelled slippage \(H_n\) can be obtained by differentiating \(1-\Pr [T>d|B(3)]\) with respect to d and is equivalent to the density obtained by Chikkagoudar and Kunchur (1983) for \(n=3\).

The critical value of the test T for a single outlier can be obtained from (35) by substituting \(b=1\); for the significance level \(\alpha =0.05\) this gives \(d=0.8709\). Now, using (18)–(19), we can calculate the various performance measures. For example, for a shift of size \(b=0.5\), the P, \(\textit{NSP}\), \(\textit{SP}\), \(\textit{NSE}\) and \(\textit{SE}\) are 0.0701, 0.0523, 0.0178, 0.4810 and 0.4488, respectively.
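
The calculation in this appendix can be sketched numerically as follows. The code simply transcribes (31)–(35); the function names and the bisection search are illustrative choices, not the author's implementation. The reading of the \(B(3)\) term as the quoted \(\textit{NSP}\) value is an inference from the numbers, since the definitions in (18)–(19) are not reproduced here.

```python
def position_probs(b):
    """Pr[B(1)], Pr[B(2)], Pr[B(3)] from (34)."""
    den = (b + 1) * (b + 2)
    return [b / (b + 2), 2 * b / den, 2 / den]


def pr_exceed_given_position(d, b, r):
    """Pr[T > d | B(r)] for n = 3, transcribed from (31)-(33)."""
    if d <= 1 / 3:
        return 1.0  # T is never below 1/3 when n = 3
    if d > 1:
        return 0.0
    D = b * d + (1 - d)
    if r == 3:
        first = (b + 1) * (b + 2) * (1 - d) ** 2 / (2 * D ** 2)
        second = b * (b + 2) * (1 - 2 * d) ** 2 / D ** 2
    elif r == 2:
        E = d + b * (1 - d)
        first = (b + 1) * (b + 2) * (1 - d) ** 2 / (E * (1 + E))
        second = (b + 2) * (1 - 2 * d) ** 2 / (E * D)
    else:  # r == 1
        first = 2 * (b + 2) * (1 - d) ** 2 / (1 + d + b * (1 - d))
        second = (b + 2) * (1 - 2 * d) ** 2 / (2 * d + b * (1 - 2 * d))
    # the second term only appears on the branch 1/3 < d <= 1/2
    return first - second if d <= 0.5 else first


def exceed_prob(d, b):
    """Pr[T > d] for shift b, i.e. one minus the right-hand side of (35)."""
    probs = position_probs(b)
    return sum(pr_exceed_given_position(d, b, r) * probs[r - 1]
               for r in (1, 2, 3))


def critical_value(alpha, tol=1e-12):
    """Solve Pr[T > d] = alpha at b = 1 by bisection on (1/3, 1)."""
    lo, hi = 1 / 3, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if exceed_prob(mid, 1.0) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    d = critical_value(0.05)
    print("critical value:", round(d, 4))
    print("Pr[T > d] at b = 0.5:", round(exceed_prob(d, 0.5), 4))
    b3_term = pr_exceed_given_position(d, 0.5, 3) * position_probs(0.5)[2]
    print("B(3) term at b = 0.5:", round(b3_term, 4))
```

Under these assumptions the script gives a critical value of about 0.8709 and an exceedance probability of about 0.0701 at \(b=0.5\), in line with the values quoted above.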

Exact distribution of MLR test for testing two outliers for the sample size \(n=3\)

Consider the two-outlier problem with \(n=3\). The probabilities \(\Pr [T_2>d|B(r,s)]\), \((r,s)\in S^{(2)}=\{(1,2),(1,3),(2,3)\}\), in (14) can be written as

$$\begin{aligned} \Pr [T_2>d|B(r,s)] =\Pr \left( \frac{2-3d}{u_1^{r,s}}Z_1+\frac{2(1-d)}{u_2^{r,s}}Z_2+\frac{1-d}{u_3^{r,s}}Z_3 >0\right) . \end{aligned}$$

Using (10), we first calculate

$$\begin{aligned} \Pr [T_2>d|B(1,2)]{=} \Pr \left( \frac{2-3d}{2b+1}Z_1+\frac{2(1-d)}{b+1}Z_2{+}(1-d)Z_3>0 \right) {=}\Pr (\mathbf {AZ}>0), \end{aligned}$$

where \(\mathbf {A}=((2-3d)/(2b+1),2(1-d)/(b+1),1-d)\), that is, \(u^{1,2}_1=2b+1\), \(u^{1,2}_2=b+1\) and \(u^{1,2}_3=1\). Clearly, \(\Pr [T_2>d|B(1,2)]\) is 1 for \(0<d\le 2/3\) and 0 for \(d>1\). Letting \(\mathbf {c}=(-2(2b+1)(d-1)/(2b+d-bd), (3d-2)(b+1)/(2b+d-b d),0)'\), we have \(\mathbf {Ac}=0\). Now using the recursion from (4) and the properties of deleting zero entries and renumbering the random variables, we have

$$\begin{aligned} \Pr [T_2>d|B(1,2)]&=\frac{-2(2b+1)(d-1)}{2b+d-bd}\Pr \left( \frac{2(1-d)}{b+1}Z_1+(1-d)Z_2>0\right) \\&\quad +\frac{(3d-2)(b+1)}{2b+d-b d}\Pr \left( \frac{2-3d}{2b+1}Z_1+(1-d)Z_2>0\right) . \end{aligned}$$

The same process is applied to calculate the two probabilities with \(\mathbf {A_1}=(2(1-d)/(b+1),1-d)\) and \(\mathbf {A_2}=((2-3d)/(2b+1),1-d)\). Note that \(\Pr (\mathbf {A_1 Z}>0)\) is 1 for \(0<d\le 1\). Taking \(\mathbf {c_2}=((2b+1)(1-d)/[(2b+2d-2bd-1)], (3d-2)/[(2b+2d-2bd-1)])'\) to calculate \(\Pr (\mathbf {A_2 Z}>0)\), it follows that

$$\begin{aligned} \Pr [T_2>d|B(1,2)]= & {} \frac{-2(2b + 1)(d - 1)}{2b + d - bd}\nonumber \\&-\frac{(2b + 1)(3d - 2)(b + 1)(d - 1)}{(2b + d - bd)(2b + 2d - 2bd - 1)},\quad \frac{2}{3}<d\le 1.\nonumber \\ \end{aligned}$$
(36)

In a similar manner, we can obtain

$$\begin{aligned} \Pr [T_2>d|B(1,3)]= & {} \frac{-2(2b + 1)(d - 1)}{2b + d - bd}\nonumber \\&-\frac{(2b + 1)(3d - 2)(b + 1)(d - 1)}{(bd - d + 1)(2b + d - bd)},\quad \frac{2}{3}<d\le 1. \end{aligned}$$
(37)
$$\begin{aligned} \Pr [T_2>d|B(2,3)]= & {} \frac{-(2b + 1)(d - 1)}{bd - d + 1}\nonumber \\&\quad -\frac{b(2b + 1)(3d - 2)(d - 1)}{(bd - d + 1)^2},\quad \frac{2}{3}<d\le 1.\qquad \end{aligned}$$
(38)

From (11), we have

$$\begin{aligned} \Pr [B(1,2)] =&\frac{2b^2}{(2b + 1)(b + 1)}, \quad \Pr [B(1,3)] = \frac{2b}{(2b + 1)(b + 1)}, \nonumber \\&\quad \text {and}\quad \Pr [B(2,3)] = \frac{1}{2b + 1}. \end{aligned}$$
(39)

Combining the probabilities in (36)–(38) and using (39), we obtain the distribution function of the test statistic \(T_2\) as follows.

$$\begin{aligned} \Pr [T_2<d]&= 1-\Pr [T_2>d|B(1,2)]\Pr [B(1,2)]-\Pr [T_2>d|B(1,3)]\Pr [B(1,3)]\nonumber \\&\quad -\Pr [T_2>d|B(2,3)]\Pr [B(2,3)]. \end{aligned}$$
(40)

The null distribution of \(T_2\) can be obtained from (40) by letting \(b=1\). For the significance level \(\alpha =0.05\), the critical value of the test \(T_2\) calculated from this null distribution is \(d=0.991559\).

Now, using (22)–(24) and plugging (36)–(39) into them, we can calculate the various performance measures for the test. For example, for the size of the shift \(b=0.5\), the P, \(\textit{NSP}\), \(\textit{SP}\), \(\textit{SW}\), \(\textit{NSE}\), \(\textit{SE}\) and \(\textit{PSE}\) are 0.0579, 0.0329, 0, 0.0250, 0.4671, 0, 0.4750 and 0.0658 respectively.
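
The two-outlier calculation can be sketched in the same way. The code below transcribes (36)–(40); the function names and the bisection search are illustrative assumptions rather than the author's implementation, and only the overall exceedance probability and its \(B(2,3)\) term are compared with the quoted figures, because the definitions in (22)–(24) are not reproduced here.

```python
def pair_probs(b):
    """Pr[B(1,2)], Pr[B(1,3)], Pr[B(2,3)] from (39)."""
    den = (2 * b + 1) * (b + 1)
    return {(1, 2): 2 * b ** 2 / den, (1, 3): 2 * b / den,
            (2, 3): 1 / (2 * b + 1)}


def pr_t2_exceed(d, b, pair):
    """Pr[T_2 > d | B(r,s)] for n = 3, transcribed from (36)-(38)."""
    if d <= 2 / 3:
        return 1.0  # T_2 is never below 2/3 when n = 3
    if d > 1:
        return 0.0
    F = 2 * b + d - b * d
    D = b * d - d + 1
    lead = -2 * (2 * b + 1) * (d - 1) / F
    common = (2 * b + 1) * (3 * d - 2) * (b + 1) * (d - 1)
    if pair == (1, 2):
        return lead - common / (F * (2 * b + 2 * d - 2 * b * d - 1))
    if pair == (1, 3):
        return lead - common / (D * F)
    if pair == (2, 3):
        return (-(2 * b + 1) * (d - 1) / D
                - b * (2 * b + 1) * (3 * d - 2) * (d - 1) / D ** 2)
    raise ValueError("pair must be (1, 2), (1, 3) or (2, 3)")


def t2_exceed_prob(d, b):
    """Pr[T_2 > d] for shift b, i.e. one minus the right-hand side of (40)."""
    return sum(pr_t2_exceed(d, b, pair) * p
               for pair, p in pair_probs(b).items())


def t2_critical_value(alpha, tol=1e-13):
    """Solve Pr[T_2 > d] = alpha at b = 1 by bisection on (2/3, 1)."""
    lo, hi = 2 / 3, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if t2_exceed_prob(mid, 1.0) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    d = t2_critical_value(0.05)
    print("critical value:", round(d, 5))
    print("Pr[T_2 > d] at b = 0.5:", round(t2_exceed_prob(d, 0.5), 4))
```

Under these assumptions the critical value agrees with 0.991559 to within about one unit in the sixth decimal place, and the exceedance probability at \(b=0.5\) is about 0.0579.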

Exact distribution of MLR sequential test for testing two outliers for the sample size \(n=3\)

To calculate the joint probability \(\Pr [U_1<d_1, U_2<d_2|B(2,3)]\), we first need the marginal probabilities \(\Pr [U_1>d_1|B(2,3)]\) and \(\Pr [U_2>d_2|B(2,3)]\), which are

$$\begin{aligned} \Pr [U_1>d_1|B(2,3)]&= \Pr \left[ \sum _{j=1}^{n}[1-(n-j+1)d_1]\frac{Z_j}{u^{2,3}_j}>0\right] , \end{aligned}$$

and

$$\begin{aligned} \Pr [U_2>d_2|B(2,3)]&= \Pr \left[ \sum _{j=1}^{n-1}[1-(n-j)d_2]\frac{Z_j}{u^{2,3}_j}>0\right] , \end{aligned}$$

where \(u^{2,3}_1 =2 b+1\), \(u^{2,3}_2 = 2 b\) and \(u^{2,3}_3 = b\) have been defined previously in (10).

Following the line of argument used to calculate the probabilities in Appendix A, we have

$$\begin{aligned} \Pr [U_1>d_1|B(2,3)] = {\left\{ \begin{array}{ll} - \frac{(2b + 1)(2d_1 - 1)}{2bd_1 - 2d_1 + 1} - \frac{2b(2b + 1)(3d_1 - 1)(d_1 - 1)}{(2bd_1 - 2d_1 + 1)(b - d_1 + bd_1 + 1)} &{} 1/3<d_1\le 1/2,\\ {} \frac{(2b + 1)(2d_1 - 1)(2d_1 - 2)}{2bd_1 - 2d_1 + 1} - \frac{2b(2b + 1)(3d_1 - 1)(d_1 - 1)}{(2bd_1 - 2d_1 + 1)(b - d_1 + bd_1 + 1)} &{} 1/2<d_1\le 1, \end{array}\right. } \end{aligned}$$
(41)

and

$$\begin{aligned} \Pr [U_2>d_2|B(2,3)] = -\frac{(2b + 1)(d_2 - 1)}{2bd_2 - d_2 + 1} \quad 1/2<d_2\le 1. \end{aligned}$$
(42)

For \(0<d_1\le 1, 0<d_2\le 1\), the joint probability \(\Pr [U_1>d_1, U_2>d_2|B(2,3)]\) can be written as \(P(\mathbf {AZ}>0|B(2,3))\) using (10) and (14) where

$$\begin{aligned} \mathbf {A} =\begin{pmatrix} \frac{1-3 d_1}{u^{2,3}_1} &{} \frac{1-2 d_1}{u^{2,3}_2} &{} \frac{1- d_1}{u^{2,3}_3}\\ \frac{1-2 d_2}{u^{2,3}_1} &{} \frac{1- d_2}{u^{2,3}_2} &{} 0 \end{pmatrix} \end{aligned}$$

and \(u^{2,3}_1 =2 b+1\), \(u^{2,3}_2 = 2 b\) and \(u^{2,3}_3 = b\).

Let \(\mathbf {c}=(-((2b + 1)(d_2 - 1))/(2bd_2 - d_2 + 1),(2b(2d_2 - 1))/(2bd_2 - d_2 + 1),0)'\), then \(\mathbf {Ac}=((d_2 - d_1 - d_1 d_2)/(2b d_2 - d_2 + 1),0)'\) and using recursion (4), we get

$$\begin{aligned} \Pr [\mathbf {AZ}>0|B(2,3)]= & {} -\frac{(2b + 1)(d_2 - 1)}{2bd_2 - d_2 + 1}\Pr [\mathbf {A_1 Z}>0|B(2,3)]\\&+ \frac{2b(2d_2 - 1)}{2bd_2 - d_2 + 1}\Pr [\mathbf {A_2 Z}>0|B(2,3)], \end{aligned}$$

where

$$\begin{aligned} \mathbf {A_1} =\begin{pmatrix} \frac{d_2 - d_1 - d_1 d_2}{2b d_2 - d_2 + 1} &{} \frac{1-2 d_1}{u^{2,3}_2} &{} \frac{1- d_1}{u^{2,3}_3}\\ 0 &{} \frac{1- d_2}{u^{2,3}_2} &{} 0 \end{pmatrix} \end{aligned}$$

and

$$\begin{aligned} \mathbf {A_2} =\begin{pmatrix} \frac{1-3 d_1}{u^{2,3}_1} &{} \frac{d_2 - d_1 - d_1 d_2}{2b d_2 - d_2 + 1} &{} \frac{1- d_1}{u^{2,3}_3}\\ \frac{1-2 d_2}{u^{2,3}_1} &{} 0 &{} 0 \end{pmatrix}. \end{aligned}$$

Since \(1/2 < d_2 \le 1\), all the entries in the second row of \(\mathbf {A_2}\) are less than or equal to 0, which implies that \(\Pr [\mathbf {A_2 Z} > 0|B(2,3)] = 0\). Moreover, all the entries in the second row of \(\mathbf {A_1}\) are greater than or equal to 0, which implies that the second row of \(\mathbf {A_1}\) can be deleted.

Thus, we have

$$\begin{aligned} \Pr [\mathbf {A_1 Z}>0|B(2,3)]= \Pr \left[ \frac{d_2 - d_1 - d_1 d_2}{2b d_2 - d_2 + 1}Z_1+\frac{1-2 d_1}{u^{2,3}_2} Z_2+\frac{1- d_1}{u^{2,3}_3} Z_3 >0\right] . \end{aligned}$$

Before proceeding further, we first check the value of \((d_2 - d_1 - d_1 d_2)/(2b d_2 - d_2 + 1)\). It can be easily shown that

$$\begin{aligned} \frac{d_2 - d_1 - d_1 d_2}{2b d_2 - d_2 + 1}&={\left\{ \begin{array}{ll} -\frac{\beta + 2\sqrt{1 - \beta } - 1}{6b + 2\beta - 2b\beta } &{} \quad \text {for}\; 1/3<d_1\le 1/2, 1/2<d_2\le 1, \\ {} -\frac{\beta - 2\sqrt{3 \beta } + 3}{6b + 2\beta - 2b\beta } &{} \quad \text {for}\; 1/2<d_1 \le 1, 1/2<d_2\le 1, \end{array}\right. }\\&<0, \end{aligned}$$

where

$$\begin{aligned} d_1&= {\left\{ \begin{array}{ll} \frac{1+\sqrt{1 - \beta }}{3} &{} \quad \text {for}\; 1/3<d_1\le 1/2, \\ 1 - \frac{\sqrt{3 \beta }}{3} &{} \quad \text {for}\; 1/2<d_1 \le 1, \end{array}\right. } \end{aligned}$$
(43)
$$\begin{aligned} d_2&=\frac{3-\beta }{\beta + 3} \quad \text {for}\; 1/2 <d_2\le 1. \end{aligned}$$
(44)

The values of \(d_1\) and \(d_2\) are chosen such that, under the null hypothesis \(H_0\), \(\Pr [U_1>d_1]=\Pr [U_2>d_2]=\beta \), and \(\beta \) is then fixed by the condition \(\Pr [U_1<d_1,U_2<d_2]=1-\Pr [U_1>d_1]-\Pr [U_2>d_2]+\Pr [U_1>d_1,U_2>d_2]=1-\alpha \), where \(\alpha \) is the significance level. Note that under the null hypothesis \(\Pr [U_1>d_1]\) and \(\Pr [U_2>d_2]\) are equal to \(\Pr [U_1>d_1|B(2,3)]\) and \(\Pr [U_2>d_2|B(2,3)]\), respectively.
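
As a worked step (a sketch added for clarity, obtained by setting \(b=1\) in (41) and (42)), the null exceedance probabilities are \(6d_1-9d_1^2\) for \(1/3<d_1\le 1/2\), \(3(1-d_1)^2\) for \(1/2<d_1\le 1\), and \(3(1-d_2)/(1+d_2)\) for \(1/2<d_2\le 1\). Equating each to \(\beta \) and solving gives (43) and (44):

$$\begin{aligned} 6d_1-9d_1^2&=\beta \;\Rightarrow \; d_1=\frac{1+\sqrt{1-\beta }}{3},\qquad 3(1-d_1)^2=\beta \;\Rightarrow \; d_1=1-\frac{\sqrt{3\beta }}{3},\\ \frac{3(1-d_2)}{1+d_2}&=\beta \;\Rightarrow \; d_2=\frac{3-\beta }{3+\beta }. \end{aligned}$$

In the first equation the root \((1-\sqrt{1-\beta })/3\) is discarded because it is smaller than \(1/3\).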

Using the recursion (4), we obtain

$$\begin{aligned}&\Pr [U_1>d_1,U_2>d_2|B(2,3)] \nonumber \\&\quad ={\left\{ \begin{array}{ll} -\frac{(2b + 1)(3 d_1 + d_2 - 3b d_1 + b d_2 - 3 d_1 d_2 + 4b d_1^2 + 2d_1^2 d_2 - 2d_1^2 - b d_1 d_2 - 1)}{(2b d_1 - 2d_1 {+} 1)(b d_1 - d_2 - d_1 + b d_2 + d_1 d_2 - b d_1 d_2 + 1)} &{} 1/3<d_1 {\le } 1/2, 1/2 {<}d_2\le 1,\\ {} -\frac{2 (2b + 1)(d_1 - 1)^2 (d_2 - 1)}{b d_1 - d_2 - d_1 + b d_2 + d_1 d_2 - b d_1 d_2 + 1} &{} 1/2<d_1 \le 1, 1/2 <d_2\le 1. \end{array}\right. } \end{aligned}$$
(45)

Combining the probabilities obtained in (41), (42) and (45), we can obtain the required probability using

$$\begin{aligned}&\Pr [U_1<d_1,U_2<d_2|B(2,3)]=1-\Pr [U_1>d_1|B(2,3)]-\Pr [U_2>d_2|B(2,3)]\\&\quad +\Pr [U_1>d_1,U_2>d_2|B(2,3)]. \end{aligned}$$
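
The closed forms (41), (42) and (45) can be cross-checked by simulation on the branch \(1/2<d_1\le 1\), \(1/2<d_2\le 1\). The sketch below evaluates the two linear combinations displayed at the start of this appendix with \(u^{2,3}=(2b+1,2b,b)\) and independent unit exponential \(Z_j\); taking the \(Z_j\) to have unit rate is an assumption that does not affect the sign of a linear combination. Function names are illustrative.

```python
import random


def exact_probs_b23(d1, d2, b):
    """(41), (42) and (45) as printed, for 1/2 < d1 <= 1 and 1/2 < d2 <= 1.

    Evaluate away from 2*b*d1 - 2*d1 + 1 == 0, where (41) as printed
    has a removable singularity.
    """
    G = 2 * b * d1 - 2 * d1 + 1
    H = b - d1 + b * d1 + 1
    p1 = ((2 * b + 1) * (2 * d1 - 1) * (2 * d1 - 2) / G
          - 2 * b * (2 * b + 1) * (3 * d1 - 1) * (d1 - 1) / (G * H))
    p2 = -(2 * b + 1) * (d2 - 1) / (2 * b * d2 - d2 + 1)
    den = b * d1 - d2 - d1 + b * d2 + d1 * d2 - b * d1 * d2 + 1
    p12 = -2 * (2 * b + 1) * (d1 - 1) ** 2 * (d2 - 1) / den
    return p1, p2, p12


def simulate_b23(d1, d2, b, trials=200_000, seed=1):
    """Monte Carlo estimates of the same three probabilities."""
    rng = random.Random(seed)
    u = (2 * b + 1, 2 * b, b)  # u_j^{2,3} as given in the text
    n1 = n2 = n12 = 0
    for _ in range(trials):
        s = [rng.expovariate(1.0) / uj for uj in u]  # terms Z_j / u_j
        # linear combinations from the two displayed sums with n = 3
        e1 = (1 - 3 * d1) * s[0] + (1 - 2 * d1) * s[1] + (1 - d1) * s[2] > 0
        e2 = (1 - 2 * d2) * s[0] + (1 - d2) * s[1] > 0
        n1 += e1
        n2 += e2
        n12 += e1 and e2
    return n1 / trials, n2 / trials, n12 / trials


if __name__ == "__main__":
    print(exact_probs_b23(0.7, 0.8, 0.5))
    print(simulate_b23(0.7, 0.8, 0.5))
```

The simulated frequencies should agree with the closed forms up to Monte Carlo error, which is of order \(10^{-3}\) for the default number of trials.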

It is also worth mentioning that the range of \(\beta \) can be determined using Bonferroni’s inequality (Lin and Balakrishnan 2009). Under \(H_0\),

$$\begin{aligned} \Pr \left[ \bigcup \limits _{i=1}^{k}(U_i>d_i)|B(2,3)\right] \le \sum \limits _{i=1}^{k}\Pr [U_i>d_i|B(2,3)] \end{aligned}$$

which gives \(\beta \ge \alpha /k\). In particular, when \(k=2\), we have \(\Pr [U_1>d_1,U_2>d_2|B(2,3)]=2\beta -\alpha \le 1\), which gives the range of \(\beta \) for testing two upper outliers as \(\alpha /k \le \beta \le (1+\alpha )/2\).

In a similar way, in order to calculate the probability \(\Pr [U_1<d_1,U_2<d_2|B(1,2)]\), we can obtain

$$\begin{aligned} \Pr [U_1>d_1|B(1,2)] = {\left\{ \begin{array}{ll} - \frac{(2b + 1)(2 d_1 - 1)}{b + d_1 - b d_1} - \frac{(2 b + 1)(3 d_1 - 1)(b + 1)(d_1 - 1)}{2(b + d_1 - b d_1)^2} &{} 1/3<d_1\le 1/2,\\ {} \frac{(2b + 1)(2 d_1 - 1)(b + 1)(d_1 - 1)}{(b + d_1 - b d_1)^2} - \frac{(2b + 1)(3 d_1 - 1)(b + 1)(d_1 - 1)}{2(b + d_1 - b d_1)^2} &{} 1/2<d_1 \le 1. \end{array}\right. } \end{aligned}$$
(46)
$$\begin{aligned} \Pr [U_2>d_2|B(1,2)] = \frac{(2b + 1)(1-d_2)}{b + d_2} \quad 1/2<d_2 \le 1. \end{aligned}$$
(47)
$$\begin{aligned}&\Pr [U_1>d_1,U_2>d_2|B(1,2)] \nonumber \\&\quad ={\left\{ \begin{array}{ll} -\frac{(2b + 1)(d_2 - 2 d_1 - b + 2b d_1 + b d_2 - 2d_1 d_2 - b d_1^2 + d_1^2 d_2 + 3 d_1^2 - 2 b d_1 d_2 + b d_1^2 d_2)}{(b + d_1 - b d_1)^2} &{} 1/3{<}d_1 \le 1/2, 1/2 {<}d_2\le 1,\\ {} \frac{(2b {+} 1)(b {+} 1)(1{-}d_1)^2 (1-d_2)}{(b + d_1 - b d_1)^2} &{} 1/2<d_1 \le 1, 1/2 <d_2\le 1. \end{array}\right. } \end{aligned}$$
(48)

To calculate \(\Pr [U_1<d_1,U_2<d_2|B(1,3)]\), we get

$$\begin{aligned} \Pr [U_1>d_1|B(1,3)] = {\left\{ \begin{array}{ll} \frac{(2b + 1)(1-2 d_1)}{b + d_1 - b d_1} - \frac{(2b + 1)(1-3 d_1)(b + 1)(1-d_1)}{(b + d_1 - b d_1) (b - d_1 + b d_1 + 1)} &{} 1/3<d_1 \le 1/2,\\ {} \frac{(2b + 1)(1-2 d_1)(b + 1)(1-d_1)}{(b d_1 - d_1 + 1)(b + d_1 - b d_1)} - \frac{(2b + 1)(1-3 d_1)(b + 1)(1-d_1)}{(b + d_1 - b d_1)(b - d_1 + b d_1 + 1)} &{} 1/2<d_1 \le 1. \end{array}\right. } \end{aligned}$$
(49)
$$\begin{aligned} \Pr [U_2>d_2|B(1,3)] = \frac{(2b + 1)(1-d_2)}{b + d_2} \quad 1/2<d_2 \le 1. \end{aligned}$$
(50)
$$\begin{aligned}&\Pr [U_1>d_1,U_2>d_2|B(1,3)] \nonumber \\ \quad&={\left\{ \begin{array}{ll} -\frac{(2b + 1)(b d_1 - d_1 - b + 2b d_2 + d_1 d_2 + b d_1^2 - d_1^2 d_2 + d_1^2 - 5b d_1 d_2 + 3b d_1^2 d_2)}{(b + d_1 - b d_1)(b + d_2 - b d_2 - d_1 d_2 + b d_1 d_2)} &{} 1/3{<}d_1\le 1/2, 1/2 {<}d_2\le 1,\\ {} \frac{(2b + 1)(b + 1)(1-d_1)^2(1-d_2)}{(b d_1 - d_1 + 1)(b + d_2 - b d_2 - d_1 d_2 + b d_1 d_2)} &{} 1/2<d_1\le 1, 1/2 <d_2\le 1. \end{array}\right. } \end{aligned}$$
(51)

Finally, we can obtain the joint distribution of \((U_1,U_2)\) by combining the probabilities in (41)–(42), (45)–(51) and (39) as follows

$$\begin{aligned} \Pr [U_1<d_1,U_2<d_2|\bar{H}]&=1-\sum _{(r,s)\in S^{(2)}}{\Pr [U_1>d_1|B(r,s)]\Pr [B(r,s)]}\\&\quad - \sum _{(r,s)\in S^{(2)}}{\Pr [U_2>d_2|B(r,s)]\Pr [B(r,s)]} \\&\quad +\sum _{(r,s)\in S^{(2)}}{\Pr [U_1>d_1,U_2>d_2|B(r,s)]\Pr [B(r,s)]}, \end{aligned}$$

where \(S^{(2)}=\{(1,2), (1,3), (2,3)\}\).

Note that the exact critical values of the sequential tests for testing \(k=2\) upper outliers can be obtained by plugging the expressions of \(d_1\) in (43) and \(d_2\) in (44) into

$$\begin{aligned} \Pr [U_1<d_1,U_2<d_2|\bar{H}] =1-2 \beta + \Pr [U_1>d_1,U_2>d_2|\bar{H}] =1-\alpha , \end{aligned}$$

where \(\alpha \) is the significance level.

Therefore, for \(\alpha =0.05\), we obtain the value of \(\beta =0.025427\) which yields the critical values to be \(d_1= 0.907936\) and \(d_2=0.983191\) using (43) and (44).
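
The search for \(\beta \) can be sketched as a one-dimensional root-finding problem. The code below assumes the branch \(1/2<d_1\le 1\), \(1/2<d_2\le 1\) and uses the fact that, at \(b=1\), each of (45), (48) and (51) reduces to \(6(1-d_1)^2(1-d_2)\); the bisection bracket \([\alpha /2,\alpha ]\) and the function names are illustrative choices, not taken from the paper.

```python
import math


def d1_from_beta(beta):
    """(43), branch 1/2 < d1 <= 1."""
    return 1 - math.sqrt(3 * beta) / 3


def d2_from_beta(beta):
    """(44)."""
    return (3 - beta) / (3 + beta)


def joint_exceed_null(d1, d2):
    """(45) evaluated at b = 1 on the branch d1 > 1/2, d2 > 1/2."""
    return 6 * (1 - d1) ** 2 * (1 - d2)


def solve_beta(alpha, tol=1e-14):
    """Find beta with 1 - 2*beta + Pr[U1 > d1, U2 > d2] = 1 - alpha."""
    def excess(beta):
        d1, d2 = d1_from_beta(beta), d2_from_beta(beta)
        return 2 * beta - joint_exceed_null(d1, d2) - alpha

    # alpha/2 is the Bonferroni lower bound; beta cannot exceed alpha
    lo, hi = alpha / 2, alpha
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if excess(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    beta = solve_beta(0.05)
    print("beta:", round(beta, 6))
    print("d1:", round(d1_from_beta(beta), 6))
    print("d2:", round(d2_from_beta(beta), 6))
```

Under these assumptions the computed \(\beta \), \(d_1\) and \(d_2\) agree with the quoted values to within about one unit in the sixth decimal place.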

Once the critical values are calculated, we can compute the various performance measures discussed in Sect. 6 by plugging in the probabilities in (41)–(51) and (39) for different sizes of the shift \(b<1\). For example, for \(b=0.5\), the P, \(\textit{NSP}\), \(\textit{SP}\), \(\textit{NSE}\), \({ PL}_1\), \({ PL}_2\) and \({ PL}_3\) are 0.00113, 0.00057, 0, 0.48379, 0.00008, 0.00048 and 0.01564, respectively.

Cite this article

Kumar, N. Exact distributions of tests of outliers for exponential samples. Stat Papers 60, 2031–2061 (2019). https://doi.org/10.1007/s00362-017-0908-6
