Random Start Forward Searches with Envelopes for Detecting Clusters in Multivariate Data

  • Anthony Atkinson
  • Marco Riani
  • Andrea Cerioli
Part of the Studies in Classification, Data Analysis, and Knowledge Organization book series (STUDIES CLASS)

Abstract

During a forward search the plot of minimum Mahalanobis distances of observations not in the subset provides a test for outliers. However, if clusters are present in the data, their simple identification requires that there are searches that initially include a preponderance of observations from each of the unknown clusters. We use random starts to provide such searches, combined with simulation envelopes for precise inference about clustering.
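
The idea in the abstract can be illustrated with a minimal sketch in Python (NumPy). This is not the authors' implementation. The function names `forward_search_min_distances` and `simulation_envelope`, the parameters `m0`, `n_sim` and `probs`, the rule of growing the subset by taking the m + 1 observations closest to the current fit, and the use of standard normal samples with random starts to build the envelope are all assumptions made for illustration.

```python
import numpy as np


def forward_search_min_distances(X, m0, rng):
    """One forward search from a random start (illustrative sketch only).

    Returns the subset sizes m = m0, ..., n - 1 and, for each m, the minimum
    Mahalanobis distance among observations not in the current subset.
    """
    n, _ = X.shape
    # Random start: m0 observations drawn without replacement.
    subset = rng.choice(n, size=m0, replace=False)
    sizes, min_dists = [], []
    for m in range(m0, n):
        mu = X[subset].mean(axis=0)
        cov = np.cov(X[subset], rowvar=False)
        diff = X - mu
        # Squared Mahalanobis distances of all n observations from the subset fit.
        # pinv is used so that a near-singular covariance does not raise an error.
        d2 = np.einsum("ij,ij->i", diff @ np.linalg.pinv(cov), diff)
        d2 = np.maximum(d2, 0.0)  # guard against tiny negative rounding errors
        outside = np.setdiff1d(np.arange(n), subset)
        sizes.append(m)
        min_dists.append(np.sqrt(d2[outside].min()))
        # Assumed update rule: next subset = the m + 1 closest observations.
        subset = np.argsort(d2)[: m + 1]
    return np.array(sizes), np.array(min_dists)


def simulation_envelope(n, p, m0, n_sim, rng, probs=(0.01, 0.5, 0.99)):
    """Pointwise quantiles of the min-distance curve for data without clusters.

    Assumption: reference samples are standard normal, which is enough for a
    sketch because Mahalanobis distances are affine invariant.
    """
    sims = np.array([
        forward_search_min_distances(rng.standard_normal((n, p)), m0, rng)[1]
        for _ in range(n_sim)
    ])
    return np.quantile(sims, probs, axis=0)
```

In use, one would run `forward_search_min_distances` from many random starts on the same data and overlay the resulting curves on the envelope. Under these assumptions, a search that begins mostly inside one cluster may produce a curve that leaves the envelope around the subset size at which observations from another cluster start to enter.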

Keywords

Mahalanobis Distance; Subset Size; Random Start; Forward Search; Outlier Test

Copyright information

© Springer-Verlag Heidelberg 2006

Authors and Affiliations

  • Anthony Atkinson (1)
  • Marco Riani (2)
  • Andrea Cerioli (2)

  1. Department of Statistics, London School of Economics, London
  2. Department of Economics, University of Parma, Italy
