Abstract
Much of the classification literature ignores notions of probability. In our view, this is due partly to a dominant tendency in the early days of computers to develop heuristic clustering algorithms, and partly to long traditions of classification outside the statistical/probabilistic orbit, of which biological taxonomy and book classification are primary examples. Statisticians have rightly stressed the role of probabilistic concepts in formulating classification problems and in interpreting classifications, but we believe that they are wrong in suggesting, as they sometimes seem to, that other approaches are unsatisfactory. Probability has its proper place in classification, but it is neither an essential nor always an appropriate tool. We discuss circumstances where non-probabilistically based classifications are fully justified.
Considerations influencing the differences between the two approaches include: 1) Irrespective of whether things are to be assembled into classes (arranged hierarchically or not) or assigned to previously recognised classes, methodology depends on whether the things may be regarded as representing groups or as samples from groups; 2) Models are basic to the formulation of statistically based classifications, but they may also underpin non-probabilistic classifications; overt models are not a characteristic of heuristic classification algorithms; 3) In principle, probabilistic models allow the significance and number of clusters justified by data to be assessed. In non-probabilistic classifications (and in probabilistic ones too), the eighteenth-century concept of approximation offers a good basis for assessing the adequacy and stability of clusters.
© 1998 Springer-Verlag Berlin · Heidelberg
Gower, J.C., Ross, G.J.S. (1998). Non-probabilistic Classification. In: Rizzi, A., Vichi, M., Bock, H.-H. (eds) Advances in Data Science and Classification. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-72253-0_3
DOI: https://doi.org/10.1007/978-3-642-72253-0_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64641-9
Online ISBN: 978-3-642-72253-0