Abstract
In one dimension, order statistics and ranks are widely used because they form a basis for distribution-free tests and some robust estimation procedures. In more than one dimension, the concepts of order statistics and ranks are not clear-cut, and several definitions have been proposed in recent years. The proposed definitions are based on different concepts of depth. In this paper, we define a new notion of order statistics and ranks for multivariate data based on density estimation. The resulting ranks are invariant under affine transformations and asymptotically distribution free. We use the corresponding order statistics to define a class of multivariate estimators of location that can be regarded as multivariate L-estimators. Under mild assumptions on the underlying distribution, we show the asymptotic normality of the estimators. A modification of the proposed estimates results in a high breakdown point procedure that can deal with patches of outliers. The main idea is to order the observations according to their likelihood f(X_1), ..., f(X_n). If the density f happens to be ellipsoidal, the above ranking is similar to the rankings that are derived from the various notions of depth. We propose to define a ranking based on a kernel estimate of the density f. One advantage of estimating the likelihoods is that the underlying distribution does not need to have a density. In addition, because the approximate likelihoods are only used to rank the observations, they can be derived from a density estimate using a fixed bandwidth. This fixed bandwidth overcomes the curse of dimensionality that typically plagues density estimation in high dimension.
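The ranking idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the Gaussian kernel, the bandwidth value `h`, the function names, and the choice of a plain trimmed mean as the L-type estimator are all assumptions made here for illustration, and the sketch makes no attempt to reproduce the affine invariance claimed for the paper's ranks.

```python
import numpy as np

def kernel_likelihoods(X, h=1.0):
    """Fixed-bandwidth Gaussian kernel density estimate at each sample point.

    The kernel and bandwidth h are illustrative assumptions.
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    # Pairwise squared Euclidean distances between observations.
    diffs = X[:, None, :] - X[None, :, :]
    sq_dist = np.sum(diffs ** 2, axis=2)
    kernel_values = np.exp(-0.5 * sq_dist / h ** 2)
    norm = n * (2.0 * np.pi) ** (d / 2.0) * h ** d
    return kernel_values.sum(axis=1) / norm

def density_ranks(X, h=1.0):
    """Rank observations by estimated likelihood; rank 1 is the most likely."""
    f_hat = kernel_likelihoods(X, h)
    order = np.argsort(-f_hat, kind="stable")
    ranks = np.empty(len(f_hat), dtype=int)
    ranks[order] = np.arange(1, len(f_hat) + 1)
    return ranks

def trimmed_location(X, h=1.0, keep=0.5):
    """Average the fraction `keep` of observations with the highest
    estimated likelihood (a simple, hypothetical L-type location estimate)."""
    X = np.asarray(X, dtype=float)
    ranks = density_ranks(X, h)
    m = max(1, int(np.ceil(keep * len(X))))
    return X[ranks <= m].mean(axis=0)

# Four points near the origin plus one distant outlier.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [-0.1, 0.0], [10.0, 10.0]])
ranks = density_ranks(X, h=1.0)
location = trimmed_location(X, h=1.0, keep=0.75)
```

Because the estimated likelihoods are used only to order the points, the outlier receives the last rank and is excluded from the average regardless of how accurate the density estimate itself is.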
Additional information
The research was partially supported by grant #37 from the CONICYT and by grant 5-81089 from NSERC.
Cite this article
Fraiman, R., Meloche, J., García-Escudero, L.A. et al. Multivariate L-estimation. Test 8, 255–317 (1999). https://doi.org/10.1007/BF02595872
DOI: https://doi.org/10.1007/BF02595872