Non-probabilistic Classification

Gower, John C.; Ross, Gavin J. S.

doi:10.1007/978-3-642-72253-0_3

Part of the book series: Studies in Classification, Data Analysis, and Knowledge Organization

Abstract

Much of the classification literature ignores notions of probability. In our view, this is due in part to a dominant tendency, in the early days of computers, to develop heuristic clustering algorithms, and in part to long traditions of classification outside the statistical/probabilistic orbit, of which biological taxonomy and book classification are primary examples. Statisticians have rightly stressed the role of probabilistic concepts in formulating classification problems and in interpreting classifications, but we believe that they are wrong in suggesting, as they sometimes seem to, that other approaches are unsatisfactory. Probability has its proper place in classification, but it is neither an essential nor always an appropriate tool. We discuss circumstances where non-probabilistically based classifications are fully justified.

Considerations influencing the differences between the two approaches include: 1) Irrespective of whether things are to be assembled into classes (arranged hierarchically or not) or assigned to previously recognised classes, methodology depends on whether the things may be regarded as representing groups or as samples from groups; 2) Models are basic to the formulation of statistically based classifications, but they may also underpin non-probabilistic classifications; overt models are not a characteristic of heuristic classification algorithms; 3) In principle, probabilistic models allow the significance and number of clusters justified by data to be assessed. In non-probabilistic classifications (and in probabilistic ones too), the eighteenth-century concept of approximation offers a good basis for assessing the adequacy and stability of clusters.

Author information

Authors and Affiliations

Department of Statistics, The Open University, Walton Hall, Milton Keynes, MK7 6AA, UK
John C. Gower
Statistics Department, IACR Rothamsted, Harpenden, Herts, AL5 2JQ, UK
Gavin J. S. Ross

Editor information

Editors and Affiliations

Dipartimento di Statistica, Probabilità e Statistiche Applicate, Università di Roma “La Sapienza”, Piazzale Aldo Moro 5, I-00185, Roma, Italia
Alfredo Rizzi
Dipartimento di Metodi Quantitativi e Teoria Economica, Università “G. D’Annunzio” di Chieti, Viale Pindaro 42, I-65127, Pescara, Italia
Maurizio Vichi
Institut für Statistik und Wirtschaftsmathematik, Rheinisch-Westfälische Technische Hochschule (RWTH) Aachen, Wüllnerstraße 3, D-52056, Aachen, Germany
Hans-Hermann Bock

Cite this paper

Gower, J.C., Ross, G.J.S. (1998). Non-probabilistic Classification. In: Rizzi, A., Vichi, M., Bock, H.-H. (eds) Advances in Data Science and Classification. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-72253-0_3

DOI: https://doi.org/10.1007/978-3-642-72253-0_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64641-9
Online ISBN: 978-3-642-72253-0
eBook Packages: Springer Book Archive
