
High Dimensional Clustering Using Parallel Coordinates and the Grand Tour

  • Edward J. Wegman
  • Qiang Luo
Part of the Studies in Classification, Data Analysis, and Knowledge Organization book series (STUDIES CLASS)

Summary

In this paper, we present graphical techniques for cluster analysis of high-dimensional data. Parallel coordinate plots and parallel coordinate density plots map multivariate data into a two-dimensional display. The parallel coordinate representation has elegant duality properties with ordinary Cartesian plots, so that higher-dimensional mathematical structures can be analyzed. Our high-interaction software allows rapid editing of data to remove outliers and to isolate clusters by brushing. Our brushing techniques adjust not only hue but also saturation. Saturation adjustment makes it possible to handle comparatively massive data sets by using the α-channel of the Silicon Graphics workstation to compensate for heavy overplotting.
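As an illustration of the ideas in this paragraph, the following is a minimal sketch in plain Python and not the authors' software. It assumes unit spacing between parallel axes and per-axis rescaling to [0, 1]. The function names (`to_polyline`, `dual_point`, `segment_height`, `stacked_opacity`) are hypothetical. The sketch shows how one observation becomes a polyline, how points on a Cartesian line y = m·x + b become segments through a single common point (the point-line duality), and how a small per-line opacity accumulates under overplotting, assuming standard "over" compositing of same-colored lines.

```python
def to_polyline(obs, lows, highs):
    """Map one d-dimensional observation to the vertices of its polyline.

    Axis j is drawn as the vertical line x = j, and the j-th coordinate is
    rescaled to [0, 1] along that axis (assumed scaling, for illustration).
    """
    vertices = []
    for j, (value, lo, hi) in enumerate(zip(obs, lows, highs)):
        span = hi - lo
        # A constant variable has no range; place it mid-axis.
        height = 0.5 if span == 0 else (value - lo) / span
        vertices.append((float(j), height))
    return vertices


def dual_point(m, b):
    """Common point of all segments that represent points on y = m*x + b.

    Assumes two parallel axes at horizontal positions 0 and 1.
    """
    if m == 1:
        raise ValueError("slope 1 corresponds to a point at infinity")
    return (1.0 / (1.0 - m), b / (1.0 - m))


def segment_height(x1, x2, t):
    """Height at horizontal position t of the line through (0, x1) and (1, x2)."""
    return x1 + t * (x2 - x1)


def stacked_opacity(alpha, n_lines):
    """Opacity of a pixel after n_lines same-colored lines of opacity alpha."""
    return 1.0 - (1.0 - alpha) ** n_lines


if __name__ == "__main__":
    data = [(1.0, 10.0, 5.0), (3.0, 20.0, 5.0)]
    lows = [min(col) for col in zip(*data)]
    highs = [max(col) for col in zip(*data)]
    print(to_polyline(data[1], lows, highs))

    # Every point on y = -x + 2 yields a segment through the same dual point.
    t, height = dual_point(-1.0, 2.0)
    for x in (0.0, 1.0, 5.0):
        print(segment_height(x, -x + 2.0, t), "should equal", height)

    # With a low alpha, only heavily overplotted regions become saturated.
    print(stacked_opacity(0.01, 10), stacked_opacity(0.01, 500))
```

With a small `alpha`, sparse regions stay faint while dense regions approach full saturation, which is the effect the saturation brushing relies on to reveal structure under heavy overplotting.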

The grand tour is a generalized rotation of the coordinate axes in a high-dimensional space. Coupled with the full-dimensional plots allowed by the parallel coordinate display, it allows the data analyst to explore data that are both high-dimensional and massive in size. In this paper we describe both techniques and illustrate their use for inverse regression and clustering. We have used these techniques to analyze data sets on the order of 250,000 observations in 8 dimensions. Because the analysis requires color graphics, in the present paper we illustrate the methods with a more modest data set of 3848 observations. Other illustrations are available on our web page.
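One common way to realize such a generalized rotation is to compose rotations in every pair of coordinate planes, with angles that advance at mutually incommensurate rates (often called the torus method). The sketch below is a plain-Python illustration under that assumption, not necessarily the construction used in the authors' software. All function names, and the choice of square roots of primes as angular rates, are hypothetical.

```python
import math


def identity(d):
    """d x d identity matrix as a list of rows."""
    return [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]


def rotate_plane(matrix, i, j, angle):
    """Multiply matrix (in place, on the right) by a rotation in the (i, j) plane."""
    c, s = math.cos(angle), math.sin(angle)
    for row in matrix:
        ri, rj = row[i], row[j]
        row[i] = c * ri - s * rj
        row[j] = s * ri + c * rj
    return matrix


def first_primes(n):
    """First n primes by trial division (adequate for small n)."""
    primes = []
    candidate = 2
    while len(primes) < n:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def default_rates(d):
    """Assumed angular rates: square roots of distinct primes, one per axis pair."""
    return [math.sqrt(p) for p in first_primes(d * (d - 1) // 2)]


def tour_rotation(d, t, rates):
    """Orthogonal d x d matrix at time t: a product of plane rotations."""
    q = identity(d)
    pairs = [(i, j) for i in range(d) for j in range(i + 1, d)]
    for (i, j), rate in zip(pairs, rates):
        rotate_plane(q, i, j, rate * t)
    return q


def apply_rotation(points, q):
    """Express each point (a row vector x) in the rotated axes: x -> x Q."""
    d = len(q)
    return [[sum(x[k] * q[k][j] for k in range(d)) for j in range(d)]
            for x in points]


if __name__ == "__main__":
    data = [[1.0, 0.0, 2.0, -1.0], [0.5, 1.5, -0.5, 3.0]]
    rates = default_rates(4)
    for step in range(3):
        frame = apply_rotation(data, tour_rotation(4, 0.05 * step, rates))
        print(frame)  # each frame would be redrawn as a parallel coordinate plot
```

Because every frame is an orthogonal transformation of the data, distances between observations are preserved, so clusters stay intact while the viewing axes change. Redrawing each rotated frame as a parallel coordinate plot gives the coupling of the two techniques described above.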

Keywords

Generalize Rotation; High Density Region; Inverse Regression; Explanatory Covariates; Brushing Technique
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer-Verlag Berlin Heidelberg 1997

Authors and Affiliations

  • Edward J. Wegman (1)
  • Qiang Luo (1)
  1. Center for Computational Statistics, George Mason University, Fairfax, USA
