A procedure for outlier identification in data sets from continuous distributions
We propose a procedure, based on sums of reciprocals of p-values, for identifying outliers in univariate or multivariate data sets drawn from continuous distributions. Using results of Csörgő (1990), we find the limiting distribution of the relevant statistic for completely specified models. By simulation, we obtain approximate quantiles for the asymptotic distribution (which does not depend on the specific model or on the dimension in which the data live) and, when parameters are estimated, for the finite-sample distribution of our statistic in different dimensions, for the multivariate Gaussian model and for a multivariate double exponential model with independent coordinates. Monte Carlo evaluation shows that the proposed procedure is effective in identifying outliers and that it is sensitive to sample size, a feature seldom found in outlier identification methods.
Key Words: Outlier identification, St. Petersburg paradox, continuous distributions
AMS subject classification: 62H99, 62G35
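To make the idea of a sum of reciprocals of p-values concrete, the following is a minimal sketch, not the authors' statistic. It assumes a completely specified univariate standard normal model and two-sided p-values, and it reports the raw, unnormalized sum; the function names, the choice of p-value, and the simulation settings are all assumptions made for illustration. Under the model each p-value is uniform on (0, 1), so each reciprocal is heavy-tailed, which is the link to the St. Petersburg game.

```python
import math
import random


def two_sided_pvalue(x):
    """Two-sided p-value of x under an assumed N(0, 1) model: P(|Z| >= |x|)."""
    return math.erfc(abs(x) / math.sqrt(2.0))


def reciprocal_pvalue_sum(sample):
    """Raw sum of 1/p_i over the sample (illustrative, not normalized).

    Extremely large |x| can make the p-value underflow to 0.0; this sketch
    does not guard against that case.
    """
    return sum(1.0 / two_sided_pvalue(x) for x in sample)


def simulated_quantile(n, level, reps=2000, seed=0):
    """Monte Carlo approximation of a quantile of the sum under the model.

    Draws `reps` samples of size n from N(0, 1) and returns the empirical
    quantile of the resulting sums at the given level.
    """
    rng = random.Random(seed)
    stats = sorted(
        reciprocal_pvalue_sum([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(reps)
    )
    index = min(reps - 1, int(level * reps))
    return stats[index]


if __name__ == "__main__":
    clean = [0.1, -0.4, 1.2, -0.9, 0.3, 0.7, -1.5, 0.0, 0.5, -0.2]
    contaminated = clean + [4.5]  # one hypothetical outlying observation
    threshold = simulated_quantile(len(contaminated), 0.95)
    print("clean sum:       ", reciprocal_pvalue_sum(clean))
    print("contaminated sum:", reciprocal_pvalue_sum(contaminated))
    print("simulated 95% quantile:", threshold)
```

Because a single very small p-value dominates the sum, one extreme observation moves the statistic by orders of magnitude. The simulated quantile depends on the sample size n, which illustrates in a simplified way why a procedure of this kind is sensitive to sample size.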
- Balakrishnan, N. and Cutler, C. D. (1996). Maximum likelihood estimation of the Laplace parameters based on Type-II censored samples. In H. N. Nagaraja, P. K. Sen, and D. F. Morrison, eds., Statistical Theory and Applications: Papers in Honor of Herbert A. David, pp. 145–151. Springer-Verlag, New York.
- Barnett, V. and Lewis, T. (1993). Outliers in Statistical Data, 3rd ed. John Wiley & Sons, New York.
- Csörgő, S. and Dodunekova, R. (1991). Limit theorems for the Petersburg game. In M. G. Hahn, D. M. Mason, and D. C. Wiener, eds., Sums, Trimmed Sums and Extremes, pp. 285–315. Birkhäuser, Boston.
- Fang, K. T., Kotz, S., and Ng, K. W. (1990). Elliptically Symmetric Multivariate and Related Distributions, vol. 36 of Monographs on Statistics and Applied Probability. Chapman and Hall, London.
- Shafer, G. (1988). The St. Petersburg paradox. In S. Kotz, N. L. Johnson, and C. B. Read, eds., Encyclopedia of Statistical Sciences, vol. 8, pp. 865–870. Wiley, New York.