Summary
A problem common to all clustering techniques is the difficulty of deciding how many clusters are present in the data. The aim of this paper is to compare three methods based on the hypervolume criterion with four other well-known methods. These procedures for determining the number of clusters are evaluated on artificial data sets. To provide a variety of solutions, the data sets are analysed by six clustering methods. We conclude by reporting the performance of each method and by giving some guidance for choosing between them.
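The summary does not define the hypervolume criterion. As a rough illustration only, the sketch below assumes it scores a partition by the total measure of the convex hulls of its clusters, restricted here to two-dimensional data where that measure is an area. The function names (`convex_hull`, `hull_area`, `hypervolume_criterion`) and the toy data are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def convex_hull(points: Sequence[Point]) -> List[Point]:
    """Convex hull by Andrew's monotone chain, vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o: Point, a: Point, b: Point) -> float:
        # Positive when o -> a -> b makes a counter-clockwise turn.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def hull_area(points: Sequence[Point]) -> float:
    """Area of the convex hull (shoelace formula); 0.0 for degenerate clusters."""
    hull = convex_hull(points)
    if len(hull) < 3:
        return 0.0
    twice_area = 0.0
    for i in range(len(hull)):
        x1, y1 = hull[i]
        x2, y2 = hull[(i + 1) % len(hull)]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def hypervolume_criterion(partition: Sequence[Sequence[Point]]) -> float:
    """ASSUMED definition: sum of the convex hull areas of the clusters."""
    return sum(hull_area(cluster) for cluster in partition)


# Artificial data: two well-separated unit squares, each with one interior point.
square_a = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.5, 0.5)]
square_b = [(10.0, 0.0), (11.0, 0.0), (11.0, 1.0), (10.0, 1.0), (10.5, 0.5)]

one_cluster = hypervolume_criterion([square_a + square_b])
two_clusters = hypervolume_criterion([square_a, square_b])
print(one_cluster, two_clusters)
```

Under this assumed definition, treating the toy data as one cluster covers the empty gap between the squares, while the two-cluster partition does not, so the criterion drops sharply. The minimum of such a criterion over all partitions cannot increase as more clusters are allowed, so a separate stopping rule is needed to pick the number of clusters; the specific rules compared in the paper are not reproduced here.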
Copyright information
© 1994 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hardy, A. (1994). An examination of procedures for determining the number of clusters in a data set. In: Diday, E., Lechevallier, Y., Schader, M., Bertrand, P., Burtschy, B. (eds) New Approaches in Classification and Data Analysis. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-51175-2_20
DOI: https://doi.org/10.1007/978-3-642-51175-2_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-58425-4
Online ISBN: 978-3-642-51175-2
eBook Packages: Springer Book Archive