Statistics and Computing, Volume 23, Issue 6, pp 677–688

RDELA—a Delaunay-triangulation-based, location and covariance estimator with high breakdown point

  • Steffen Liebscher
  • Thomas Kirschstein
  • Claudia Becker

Abstract

We propose an approach that uses the Delaunay triangulation to identify a robust, outlier-free subsample. Provided that the non-outlying points form a convex structure (e.g. of elliptical shape), this subsample can then be used to obtain robust estimates of location and scatter by applying the classical mean and covariance. The estimators derived from our approach are shown to have a high breakdown point. In addition, we provide a diagnostic plot for expanding the initial subset in a data-driven way, which further increases the estimators’ efficiency.
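
For concreteness, the final estimation step can be written out as follows. The notation is an assumption made here for illustration and is not taken from the abstract: H denotes the index set of the subsample retained after the triangulation step, h = |H| its size, and x_i the observations. The formulas are the standard sample mean and sample covariance restricted to H. Any consistency or small-sample correction the authors may apply is not stated in the abstract and is therefore omitted.

```latex
% H: index set of the retained (outlier-free) subsample, h = |H|  (assumed notation)
\hat{\mu}_H = \frac{1}{h} \sum_{i \in H} x_i,
\qquad
\hat{\Sigma}_H = \frac{1}{h-1} \sum_{i \in H}
    (x_i - \hat{\mu}_H)(x_i - \hat{\mu}_H)^{\top}
```

Because both sums run only over H, observations excluded from the subsample have no direct influence on either estimate.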

Keywords

Breakdown point, Delaunay triangulation, Minimum covariance determinant, Robust estimation

Copyright information

© Springer Science+Business Media, LLC 2012

Authors and Affiliations

  • Steffen Liebscher (1)
  • Thomas Kirschstein (1)
  • Claudia Becker (1)
  1. Martin-Luther-University, Halle (Saale), Germany
