Abstract
We assess the performance of a new clustering method, the Hierarchical Factor Classification (HFC) of variables, which is based on evaluating the least differences among the representative variables of groups, as defined by a set of two-dimensional Principal Components Analyses (PCA). As an additional feature, at each step the method provides a principal plane onto which both the grouped variables and the units, as seen only by these variables, can be projected. We compare the results of the method with those of single and complete linkage clustering, applied to simulated data with known correlation structure, and we evaluate the results with a coherence measure based on the entropy between the expected partitions and those found by the methods. We found that HFC performed as well as, and in some cases better than, both single and complete linkage clustering in detecting the known group structures in the simulated data, with the advantage that the groups of variables and the units can be viewed on principal planes where the usual interpretations apply.
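The comparison protocol described above can be illustrated with a minimal sketch. It covers only the two baseline methods (single and complete linkage on variables), not HFC itself. Everything in it is an assumption made for illustration: the simulation design (one latent factor per group, with a hypothetical `loading` parameter), the dissimilarity 1 - |r| between variables, and the use of normalized mutual information as a stand-in for the entropy-based coherence measure, whose exact definition is not given in this abstract. All function names are hypothetical.

```python
import math
import random


def simulate(n_units, group_sizes, loading=0.9, seed=0):
    """Simulate variables with a known group structure (assumed design):
    each variable is loading * (latent factor of its group) + noise."""
    rng = random.Random(seed)
    noise_sd = math.sqrt(1.0 - loading ** 2)
    variables, truth = [], []
    for group, size in enumerate(group_sizes):
        latent = [rng.gauss(0, 1) for _ in range(n_units)]
        for _ in range(size):
            variables.append([loading * f + noise_sd * rng.gauss(0, 1) for f in latent])
            truth.append(group)
    return variables, truth


def correlation(x, y):
    """Pearson correlation between two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def linkage_partition(variables, n_groups, method="single"):
    """Agglomerative clustering of variables on the dissimilarity 1 - |r|,
    stopped when n_groups clusters remain. method: 'single' or 'complete'."""
    p = len(variables)
    d = [[1.0 - abs(correlation(variables[i], variables[j])) for j in range(p)]
         for i in range(p)]
    aggregate = min if method == "single" else max
    clusters = [[i] for i in range(p)]
    while len(clusters) > n_groups:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                dist = aggregate(d[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or dist < best[0]:
                    best = (dist, a, b)
        _, a, b = best
        clusters[a] += clusters[b]
        del clusters[b]
    labels = [0] * p
    for k, members in enumerate(clusters):
        for i in members:
            labels[i] = k
    return labels


def entropy(labels):
    """Shannon entropy (natural log) of a partition given as a label list."""
    n = len(labels)
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return -sum(c / n * math.log(c / n) for c in counts.values())


def normalized_mutual_information(a, b):
    """Stand-in agreement score between two partitions, in [0, 1]."""
    h_a, h_b = entropy(a), entropy(b)
    if h_a == 0 or h_b == 0:
        return 1.0 if h_a == h_b else 0.0
    mutual_information = h_a + h_b - entropy(list(zip(a, b)))
    return mutual_information / math.sqrt(h_a * h_b)


variables, truth = simulate(n_units=200, group_sizes=[4, 4, 4])
for method in ("single", "complete"):
    found = linkage_partition(variables, n_groups=3, method=method)
    print(method, normalized_mutual_information(truth, found))
```

A score near 1 means the partition found by a method matches the expected one up to relabeling of the groups. Lowering `loading` weakens the within-group correlations and makes the known structure harder to recover.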
Abbreviations
- HFC: Hierarchical Factor Classification
- PCA: Principal Components Analysis
Rights and permissions
This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Camiz, S., Pillar, V.D. Comparison of single and complete linkage clustering with the hierarchical factor classification of variables. Community Ecology 8, 25–30 (2007). https://doi.org/10.1556/ComEc.8.2007.1.4
DOI: https://doi.org/10.1556/ComEc.8.2007.1.4