The difficulties of assessing details of the shape of a bivariate distribution, and of contrasting subgroups, from a raw scatterplot are discussed. The use of contours of a density estimate in highlighting features of distributional shape is illustrated on data on the development of aircraft technology. The estimated density height at each observation imposes an ordering on the data which can be used to select contours which contain specified proportions of the sample. This leads to a display which is reminiscent of a boxplot and which allows simple but effective comparison of different groups. Some simple properties of this technique are explored.
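The selection of contours by sample proportion can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: it uses SciPy's `gaussian_kde` with its default bandwidth (the abstract does not say which kernel or smoothing parameter is used), and the function name `contour_levels` and the simulated sample are hypothetical.

```python
import numpy as np
from scipy.stats import gaussian_kde


def contour_levels(points, proportions=(0.25, 0.5, 0.75)):
    """Return the density height at each observation and, for each
    requested proportion p, the contour height enclosing a fraction p
    of the sample.

    points: array of shape (n, 2), one bivariate observation per row.
    """
    points = np.asarray(points, dtype=float)
    # Assumption: Gaussian kernel with SciPy's default bandwidth rule.
    kde = gaussian_kde(points.T)
    # Estimated density height at every observation; this is the
    # ordering of the data referred to in the text.
    heights = kde(points.T)
    # A fraction p of the observations have a height above the
    # (1 - p) quantile of the heights, so that quantile is the level
    # of the contour containing proportion p of the sample.
    levels = {p: float(np.quantile(heights, 1.0 - p)) for p in proportions}
    return heights, levels


# Hypothetical data for illustration only.
rng = np.random.default_rng(0)
sample = rng.normal(size=(200, 2))
heights, levels = contour_levels(sample)
```

Drawing the density estimate's contours at the heights in `levels` would give nested regions holding roughly 25%, 50% and 75% of the observations, in the spirit of the box and whiskers of a boxplot.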
Interesting features of a distribution such as ‘arms’ and multimodality are found along the directions where the largest probability mass is located. These directions can be quantified through the modes of a density estimate based on the direction of each observation.
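A rough sketch of the directional idea follows. It rests on several assumptions that the abstract does not settle: directions are measured from a centre (here the sample mean unless one is supplied), the circular density estimate uses a von Mises kernel with a hypothetical concentration parameter `kappa`, and modes are located as local maxima on a grid of angles. The function name `direction_modes` and the simulated two-armed data are illustrative only.

```python
import numpy as np


def direction_modes(points, centre=None, kappa=20.0, grid_size=360):
    """Estimate a circular density of observation directions and
    return the grid angles (in radians) at which it has local maxima."""
    points = np.asarray(points, dtype=float)
    if centre is None:
        # Assumption: directions are taken from the sample mean.
        centre = points.mean(axis=0)
    offsets = points - centre
    angles = np.arctan2(offsets[:, 1], offsets[:, 0])

    grid = np.linspace(-np.pi, np.pi, grid_size, endpoint=False)
    # Von Mises kernel estimate: the average over observations of
    # exp(kappa * cos(theta - theta_i)) / (2 * pi * I0(kappa)).
    kernel = np.exp(kappa * np.cos(grid[:, None] - angles[None, :]))
    density = kernel.mean(axis=1) / (2.0 * np.pi * np.i0(kappa))

    # A grid point is a mode if it exceeds both circular neighbours.
    is_peak = (density > np.roll(density, 1)) & (density > np.roll(density, -1))
    return grid[is_peak], grid, density


# Hypothetical data: two 'arms' leaving the origin at 0 and pi/2.
rng = np.random.default_rng(1)
radii = rng.uniform(0.5, 3.0, size=200)
theta = np.where(np.arange(200) < 100, 0.0, np.pi / 2)
theta = theta + rng.normal(scale=0.05, size=200)
arms = np.column_stack([radii * np.cos(theta), radii * np.sin(theta)])
modes, grid, density = direction_modes(arms, centre=np.zeros(2))
```

For data like these, the angles in `modes` should lie close to the two arm directions. Larger values of `kappa` give a less smooth estimate, which can produce spurious modes.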
Keywords: boxplot, contour, kernel density estimate, mode, circular data