Statistics and Computing, Volume 3, Issue 4, pp 171–177

Density based exploration of bivariate data

  • Adrian Bowman
  • Peter Foster

Abstract

The difficulties of assessing details of the shape of a bivariate distribution, and of contrasting subgroups, from a raw scatterplot are discussed. The use of contours of a density estimate in highlighting features of distributional shape is illustrated on data on the development of aircraft technology. The estimated density height at each observation imposes an ordering on the data which can be used to select contours which contain specified proportions of the sample. This leads to a display which is reminiscent of a boxplot and which allows simple but effective comparison of different groups. Some simple properties of this technique are explored.
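The ordering idea can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the product-Gaussian kernel, the normal-reference bandwidth, the simulated sample and the names `kde_heights` and `contour_levels` are all assumptions made for the example. The contour intended to hold a proportion p of the sample is drawn at the (1 - p) quantile of the estimated density heights at the observations.

```python
import numpy as np

def kde_heights(data, bandwidths):
    """Evaluate a product-Gaussian kernel density estimate at each observation."""
    data = np.asarray(data, dtype=float)            # shape (n, 2)
    h = np.asarray(bandwidths, dtype=float)         # shape (2,)
    # Scaled pairwise differences between observations, shape (n, n, 2).
    diffs = (data[:, None, :] - data[None, :, :]) / h
    kernels = np.exp(-0.5 * np.sum(diffs**2, axis=2)) / (2 * np.pi * np.prod(h))
    return kernels.mean(axis=1)

def contour_levels(heights, proportions):
    """Density level whose contour encloses roughly each proportion of the sample."""
    heights = np.asarray(heights, dtype=float)
    # Observations with height >= level lie inside the contour, so a contour
    # holding a fraction p of the sample sits at the (1 - p) quantile.
    return {p: float(np.quantile(heights, 1.0 - p)) for p in proportions}

# Hypothetical data: 200 simulated bivariate normal observations.
rng = np.random.default_rng(0)
sample = rng.normal(size=(200, 2))

# Assumed bandwidth choice: normal-reference rule for two dimensions.
bw = sample.std(axis=0, ddof=1) * len(sample) ** (-1 / 6)

heights = kde_heights(sample, bw)
levels = contour_levels(heights, [0.25, 0.5, 0.75])
```

Passing the values in `levels` to a contour-plotting routine, with the density evaluated on a grid, would give nested regions analogous to the box of a boxplot; repeating this per subgroup allows the groups to be overlaid and compared.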

Interesting features of a distribution such as ‘arms’ and multimodality are found along the directions where the largest probability mass is located. These directions can be quantified through the modes of a density estimate based on the direction of each observation.
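A rough sketch of the directional idea follows. The abstract does not say how directions are measured or smoothed, so the choices here are assumptions: directions are angles about a user-supplied centre, the kernel is von Mises with concentration `kappa`, and modes are local maxima on a regular angular grid. The function names and the two-armed test data are hypothetical.

```python
import numpy as np

def direction_density(data, centre, kappa, grid_size=360):
    """Circular kernel density estimate of observation directions about a centre."""
    data = np.asarray(data, dtype=float)
    angles = np.arctan2(data[:, 1] - centre[1], data[:, 0] - centre[0])
    grid = np.linspace(-np.pi, np.pi, grid_size, endpoint=False)
    # von Mises kernel: exp(kappa * cos(theta - theta_i)) / (2 * pi * I0(kappa)).
    kernels = np.exp(kappa * np.cos(grid[:, None] - angles[None, :]))
    density = kernels.mean(axis=1) / (2 * np.pi * np.i0(kappa))
    return grid, density

def circular_modes(grid, density):
    """Grid angles at local maxima of the density, treating the grid as periodic."""
    left = np.roll(density, 1)      # neighbour on one side, wrapping around
    right = np.roll(density, -1)    # neighbour on the other side
    is_peak = (density > left) & (density >= right)
    return grid[is_peak], density[is_peak]

# Hypothetical data: two 'arms' leaving the origin at angles 0 and pi/2.
rng = np.random.default_rng(1)
radius = rng.uniform(1.0, 5.0, size=200)
theta = np.concatenate([rng.normal(0.0, 0.08, 100),
                        rng.normal(np.pi / 2, 0.08, 100)])
arms = np.column_stack([radius * np.cos(theta), radius * np.sin(theta)])

grid, density = direction_density(arms, centre=(0.0, 0.0), kappa=10.0)
modes, mode_heights = circular_modes(grid, density)
```

Larger values of `kappa` smooth less and may reveal more modes; the centre could equally be taken as a sample mean or the mode of the bivariate density estimate, which is a design choice the abstract leaves open.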

Keywords

Boxplot, contour, kernel density estimate, mode, circular data

Copyright information

© Chapman & Hall 1993

Authors and Affiliations

  • Adrian Bowman (1)
  • Peter Foster (2)
  1. Statistics Department, The University, Glasgow, UK
  2. Statistical Laboratory, Mathematics Department, The University, Manchester, UK
