The Effects of Initial Values and the Covariance Structure on the Recovery of some Clustering Methods

Hajnal, Istvan; Loosveldt, Geert

doi:10.1007/978-3-642-59789-3_7

Part of the book series: Studies in Classification, Data Analysis, and Knowledge Organization (STUDIES CLASS)

Abstract

Several clustering methods are compared in a simulation study, with data generated in a mixture-modeling framework. The methods included are several hierarchical methods, k-means as implemented in the FASTCLUS procedure of SAS, and cluster analysis by means of normal mixtures with the NORMIX program. We demonstrate that the poor recovery reported in some studies for normal-mixture clustering is due partly to the use of bad initial values and partly to the specification of the within-cluster covariance structure. We further find that an important factor in the relative success of FASTCLUS is its initial seed selection.
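The role of initial seeds can be illustrated with a small sketch. It is not the authors' simulation design and it does not reproduce FASTCLUS or NORMIX: it assumes a plain Lloyd-style k-means, three spherical bivariate normal components, hand-picked seed sets, and the adjusted Rand index as the recovery measure (the abstract does not say which measure the study used). All function names, cluster centres, and seed values are hypothetical.

```python
import random
from collections import Counter
from math import comb

def make_mixture(n_per_cluster, centers, sd, rng):
    """Draw points from spherical bivariate normal components (assumed design)."""
    points, labels = [], []
    for k, (cx, cy) in enumerate(centers):
        for _ in range(n_per_cluster):
            points.append((rng.gauss(cx, sd), rng.gauss(cy, sd)))
            labels.append(k)
    return points, labels

def kmeans(points, seeds, n_iter=100):
    """Plain Lloyd iterations started from the given seeds (not FASTCLUS)."""
    centers = list(seeds)
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assign each point to its nearest current centre.
        new_labels = [
            min(range(len(centers)),
                key=lambda k: (p[0] - centers[k][0]) ** 2 + (p[1] - centers[k][1]) ** 2)
            for p in points
        ]
        # Move each centre to the mean of its members; empty clusters stay put.
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, new_labels) if lab == k]
            if members:
                centers[k] = (sum(x for x, _ in members) / len(members),
                              sum(y for _, y in members) / len(members))
        if new_labels == labels:
            break
        labels = new_labels
    return labels

def adjusted_rand(a, b):
    """Adjusted Rand index between two partitions given as label lists."""
    sum_ab = sum(comb(c, 2) for c in Counter(zip(a, b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(a).values())
    sum_b = sum(comb(c, 2) for c in Counter(b).values())
    expected = sum_a * sum_b / comb(len(a), 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0
    return (sum_ab - expected) / (max_index - expected)

rng = random.Random(0)
true_centers = [(0.0, 0.0), (30.0, 0.0), (30.0, 8.0)]  # hypothetical layout
points, truth = make_mixture(50, true_centers, 1.0, rng)

# "Good" seeds: one near each true component.
good_seeds = list(true_centers)
# "Bad" seeds: two inside the first component, one between the other two.
bad_seeds = [(-0.5, 0.0), (0.5, 0.0), (30.0, 4.0)]

ari_good = adjusted_rand(truth, kmeans(points, good_seeds))
ari_bad = adjusted_rand(truth, kmeans(points, bad_seeds))
print("recovery with good seeds:", round(ari_good, 3))
print("recovery with bad seeds: ", round(ari_bad, 3))
```

With the bad seeds, the iterations settle on a local optimum that splits one component and merges the other two, so recovery drops even though the components are well separated. This is only a toy version of the seed-sensitivity effect the abstract describes.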

Author information

Authors and Affiliations

Department of Sociology, University of Leuven, Edward Van Evenstraat 2B, 3000, Leuven, Belgium
Istvan Hajnal & Geert Loosveldt

Editor information

Editors and Affiliations

University of Groningen Heymans Institute (PA), Grote Kruisstraat 2/1, NL-9712 TS, Groningen, The Netherlands
Henk A. L. Kiers
Facultés Universitaires Notre-Dame de la Paix, University of Namur, Rempart de la Vierge, 8, B-5000, Namur, Belgium
Jean-Paul Rasson (Directeur du Department de Mathématique)
Data Theory Group Department of Education, Leiden University, P.O. Box 9555, NL-2300 RB, Leiden, The Netherlands
Patrick J. F. Groenen
Lehrstuhl für Wirtschaftsinformatik III Schloß, University of Mannheim, D-68131, Mannheim, Germany
Martin Schader

About this paper

Cite this paper

Hajnal, I., Loosveldt, G. (2000). The Effects of Initial Values and the Covariance Structure on the Recovery of some Clustering Methods. In: Kiers, H.A.L., Rasson, JP., Groenen, P.J.F., Schader, M. (eds) Data Analysis, Classification, and Related Methods. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-59789-3_7

DOI: https://doi.org/10.1007/978-3-642-59789-3_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67521-1
Online ISBN: 978-3-642-59789-3
eBook Packages: Springer Book Archive
