Abstract
Multivariate outlier identification is often based on robust location and scatter estimates and usually performed relative to an elliptically shaped distribution. On the other hand, the idea of outlying observations is closely related to the notion of data depth, where observations with minimum depth are potential outliers. Here, we are not generally bound to the idea of an elliptical shape of the underlying distribution. Koshevoy and Mosler (1997) introduced zonoid trimmed regions which define a data depth. Recently, Paris Scholz (2002) and Becker and Paris Scholz (2004) investigated a new approach for robust estimation of convex bodies resulting from zonoids. We follow their approach and explore how the minimum volume zonoid (MZE) estimators can be used for multivariate outlier identification in the case of non-elliptically shaped null distributions.
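As an illustration of the depth notion referred to above, the following minimal sketch computes zonoid trimmed regions and the resulting zonoid depth in the univariate special case only, where the region D_α of an empirical sample reduces to an interval whose endpoints are weighted means of the smallest and largest observations, with each weight capped at 1/(nα). The function names `zonoid_region_1d` and `zonoid_depth_1d`, the bisection tolerance, and the sample data are assumptions made for this example. It is not the MZE procedure of the paper.

```python
def zonoid_region_1d(xs, alpha):
    """Endpoints (lower, upper) of the univariate zonoid trimmed region D_alpha.

    D_alpha is the set of weighted means sum(l_i * x_i) with sum(l_i) = 1 and
    0 <= l_i <= 1 / (n * alpha), for 0 < alpha <= 1.
    """
    n = len(xs)
    cap = 1.0 / (n * alpha)  # largest weight a single observation may receive

    def extreme(ordered):
        # Greedily put as much weight as allowed on the first values in `ordered`.
        remaining, total = 1.0, 0.0
        for x in ordered:
            w = min(cap, remaining)
            total += w * x
            remaining -= w
            if remaining <= 0.0:
                break
        return total

    ascending = sorted(xs)
    return extreme(ascending), extreme(ascending[::-1])


def zonoid_depth_1d(y, xs, tol=1e-9):
    """Zonoid depth of y: the largest alpha such that y lies in D_alpha.

    Returns 0.0 if y lies outside the range of the data. The regions are
    nested (they shrink as alpha grows), so bisection on alpha is valid.
    """
    if y < min(xs) or y > max(xs):
        return 0.0
    a, b = 0.0, 1.0  # invariant: y in D_alpha for alpha <= a
    while b - a > tol:
        m = 0.5 * (a + b)
        lower, upper = zonoid_region_1d(xs, m)
        if lower <= y <= upper:
            a = m
        else:
            b = m
    return a


# Hypothetical sample with one value far from the bulk of the data.
sample = [0.0, 1.0, 2.0, 3.0, 10.0]
for point in sample:
    print(point, round(zonoid_depth_1d(point, sample), 4))
```

In this sample both extreme observations receive the minimum positive depth 1/n, while points close to the mean receive depth near 1. The multivariate case requires solving a linear program rather than sorting.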
References
ATKINSON, A.C., RIANI, M., CERIOLI, A. (2004): Exploring multivariate data with the forward search. Springer, New York.
BARNETT, V., and LEWIS, T. (1994): Outliers in statistical data. 3rd ed., Wiley, New York.
BECKER, C., and GATHER, U. (1999): The masking breakdown point of multivariate outlier identification rules. J. Amer. Statist. Assoc., 94, 947–955.
BECKER, C., and GATHER, U. (2001): The largest nonidentifiable outlier: A comparison of multivariate simultaneous outlier identification rules. Comput. Statist. and Data Anal., 36, 119–127.
BECKER, C., and PARIS SCHOLZ, S. (2004): MVE, MCD, and MZE: A simulation study comparing convex body minimizers. Allgemeines Statistisches Archiv, 88, 155–162.
CROUX, C., and HAESBROECK, G. (2000): Principal component analysis based on robust estimators of the covariance or correlation matrix: Influence functions and efficiencies. Biometrika, 87, 603–618.
DAVIES, P.L. (1987): Asymptotic behaviour of S-estimates of multivariate location parameters and dispersion matrices. Ann. Statist., 15, 1269–1292.
DAVIES, P.L., and GATHER, U. (1993): The identification of multiple outliers. Invited paper with discussion and rejoinder. J. Amer. Statist. Assoc., 88, 782–801.
DAVIES, P.L., and GATHER, U. (2005a): Breakdown and groups (with discussion and rejoinder). To appear in Ann. Statist.
DAVIES, P.L., and GATHER, U. (2005b): Breakdown and groups II. To appear in Ann. Statist.
GATHER, U., and BECKER, C. (1997): Outlier identification and robust methods. In: G.S. Maddala and C.R. Rao (Eds.): Handbook of statistics, Vol. 15: Robust inference. Elsevier, Amsterdam, 123–143.
HEALY, M.J.R. (1968): Multivariate normal plotting. Applied Statistics, 17, 157–161.
KOSHEVOY, G., and MOSLER, K. (1997): Zonoid trimming for multivariate distributions. Ann. Statist., 25, 1998–2017.
KOSHEVOY, G., and MOSLER, K. (1998): Lift zonoids, random convex hulls, and the variability of random vectors. Bernoulli, 4, 377–399.
KOSHEVOY, G., MÖTTÖNEN, J., and OJA, H. (2003): A scatter matrix estimate based on the zonotope. Ann. Statist., 31, 1439–1459.
LIU, R.Y. (1992): Data depth and multivariate rank tests. In: Y. Dodge (Ed.): L1-Statistical analysis and related methods. North Holland, Amsterdam, 279–294.
LOPUHAÄ, H.P., and ROUSSEEUW, P.J. (1991): Breakdown points of affine equivariant estimators of multivariate location and covariance matrices. Ann. Statist., 19, 229–248.
PARIS SCHOLZ, S. (2002): Robustness concepts and investigations for estimators of convex bodies. Thesis, Department of Statistics, University of Dortmund (in German).
ROCKE, D.M. (1996): Robustness properties of S-estimators of multivariate location and shape in high dimension. Ann. Statist., 24, 1327–1345.
ROUSSEEUW, P.J. (1985): Multivariate estimation with high breakdown point. In: W. Grossmann, G. Pflug, I. Vincze, W. Wertz (Eds.): Mathematical statistics and applications, Vol. 8. Reidel, Dordrecht, 283–297.
ROUSSEEUW, P.J., and VAN DRIESSEN, K. (1999): A fast algorithm for the minimum covariance determinant estimator. Technometrics, 41, 212–223.
ROUSSEEUW, P.J., and LEROY, A.M. (1987): Robust regression and outlier detection. Wiley, New York.
Copyright information
© 2006 Springer Berlin · Heidelberg
About this paper
Cite this paper
Becker, C., Scholz, S.P. (2006). Deepest Points and Least Deep Points: Robustness and Outliers with MZE. In: Spiliopoulou, M., Kruse, R., Borgelt, C., Nürnberger, A., Gaul, W. (eds) From Data and Information Analysis to Knowledge Engineering. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-31314-1_30
DOI: https://doi.org/10.1007/3-540-31314-1_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-31313-7
Online ISBN: 978-3-540-31314-4
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)