Test, Volume 8, Issue 2, pp 255–317

Multivariate L-estimation

  • Ricardo Fraiman
  • Jean Meloche
  • Luis A. García-Escudero
  • Alfonso Gordaliza
  • Xuming He
  • Ricardo Maronna
  • Víctor J. Yohai
  • Simon J. Sheather
  • Joseph W. McKean
  • Christopher G. Small
  • Andrew Wood
Article

Abstract

In one dimension, order statistics and ranks are widely used because they form a basis for distribution-free tests and some robust estimation procedures. In more than one dimension, the concept of order statistics and ranks is not clear, and several definitions have been proposed in recent years. The proposed definitions are based on different concepts of depth. In this paper, we define a new notion of order statistics and ranks for multivariate data based on density estimation. The resulting ranks are invariant under affine transformations and asymptotically distribution free. We use the corresponding order statistics to define a class of multivariate estimators of location that can be regarded as multivariate L-estimators. Under mild assumptions on the underlying distribution, we show the asymptotic normality of the estimators. A modification of the proposed estimates results in a high breakdown point procedure that can deal with patches of outliers. The main idea is to order the observations according to their likelihood f(X1), ..., f(Xn). If the density f happens to be ellipsoidal, the above ranking is similar to the rankings that are derived from the various notions of depth. We propose to define a ranking based on a kernel estimate of the density f. One advantage of estimating the likelihoods is that the underlying distribution does not need to have a density. In addition, because the approximate likelihoods are only used to rank the observations, they can be derived from a density estimate using a fixed bandwidth. This fixed bandwidth overcomes the curse of dimensionality that typically plagues density estimation in high dimension.
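The ranking idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the kernel, the bandwidth, the standardization step, or the weights of the L-estimator, so everything below is an assumption made for illustration: a Gaussian kernel, a fixed bandwidth `h`, and 0/1 weights that average the fraction `keep` of observations with the highest approximate likelihood. The function names are hypothetical, and the sketch applies no prior standardization of the data, so it should not be expected to be affine equivariant as written.

```python
import numpy as np


def likelihood_ranks(X, h=1.0):
    """Rank observations by a fixed-bandwidth Gaussian kernel density estimate.

    Assumption: Gaussian kernel and bandwidth h are illustrative choices,
    not taken from the paper.
    """
    X = np.asarray(X, dtype=float)
    # Pairwise squared Euclidean distances between all observations.
    diff = X[:, None, :] - X[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    # Kernel density estimate at each observation, up to a constant factor.
    # The constant is omitted because only the ordering is used.
    f_hat = np.exp(-0.5 * sq_dist / h ** 2).mean(axis=1)
    # Rank 1 is the observation with the highest approximate likelihood.
    order = np.argsort(-f_hat, kind="stable")
    ranks = np.empty(len(X), dtype=int)
    ranks[order] = np.arange(1, len(X) + 1)
    return f_hat, ranks


def trimmed_likelihood_mean(X, h=1.0, keep=0.5):
    """Average the `keep` fraction of observations with the best ranks.

    Assumption: 0/1 weights are one simple instance of an L-type
    location estimate built on the likelihood ranks.
    """
    X = np.asarray(X, dtype=float)
    _, ranks = likelihood_ranks(X, h)
    m = max(1, int(np.ceil(keep * len(X))))
    return X[ranks <= m].mean(axis=0)
```

For example, calling `trimmed_likelihood_mean(X, h=1.0, keep=0.5)` on an `n × d` array averages the half of the sample lying in the highest-density region, so a small cluster of distant points receives low ranks and is left out of the average.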

Key Words

Approximate likelihood depth, asymptotic normality, equivariance, multivariate order statistics

AMS subject classification

Primary 62G05; secondary 62G20


Copyright information

© Sociedad Española de Estadistica e Investigación Operativa 1999

Authors and Affiliations

  • Ricardo Fraiman (2)
  • Jean Meloche (1)
  • Luis A. García-Escudero (3)
  • Alfonso Gordaliza (3)
  • Xuming He (4)
  • Ricardo Maronna (5)
  • Víctor J. Yohai (6)
  • Simon J. Sheather (7)
  • Joseph W. McKean (8)
  • Christopher G. Small (9)
  • Andrew Wood (10)

  1. Department of Statistics, The University of British, Canada
  2. Departamento de Matemática, Universidad de San Andrés, Argentina
  3. University of Valladolid, Spain
  4. University of Illinois, USA
  5. Universidad de La Plata and Comision de Investigaciones Científicas Provincia de Buenos Aires, Argentina
  6. Universidad de Buenos Aires and CONICET, Argentina
  7. University of New South Wales, Australia
  8. Western Michigan University, USA
  9. University of Waterloo, Canada
  10. University of Nottingham, UK
