Abstract
In the analysis of multidimensional ecological data, it is often useful to identify groups of variables, since such groups may reflect similar ecological processes. The usual approach, applying well-known clustering procedures with an appropriate similarity measure among the variables, is open to criticism, yet methods designed specifically for clustering variables have been neither investigated in detail nor widely used. Here we introduce a new clustering method, the Hierarchical Factor Classification (HFC) of variables, which is based on the evaluation of the least differences among representative variables of groups, as revealed by a two-dimensional Principal Components Analysis (PCA). As an additional feature, the method provides at each step a principal plane onto which both the grouped variables and the units, described only by these variables, can be projected. The method can be adapted to count data, so that it can be used to classify both the rows and the columns of a contingency table by using the chi-square metric. As an example, we apply both methods to vegetation and soil data from the Campos in Southern Brazil.
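The agglomerative idea outlined in the abstract can be illustrated with a minimal Python sketch. It rests on assumptions that the abstract does not spell out: variables are standardized, each pair of group representatives is submitted to a two-variable PCA, the pair with the smallest second eigenvalue is merged, and the first principal component becomes the representative of the new group. The function name `hfc_sketch` and its return format are hypothetical, and the code is not a reproduction of the authors' algorithm.

```python
import numpy as np


def hfc_sketch(X):
    """Agglomerative clustering of the columns of X (units x variables).

    Assumed criterion (not confirmed by the source): merge the two groups
    whose representatives give the smallest second eigenvalue in a
    two-variable PCA; the first principal component represents the new group.
    """
    X = np.asarray(X, dtype=float)
    # Assumption: variables are centered and scaled to unit variance.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    n, p = Z.shape

    reps = {j: Z[:, j] for j in range(p)}   # one representative per group
    members = {j: [j] for j in range(p)}    # original variables per group
    merges = []
    next_id = p

    while len(reps) > 1:
        best = None
        keys = sorted(reps)
        for pos, a in enumerate(keys):
            for b in keys[pos + 1:]:
                pair = np.column_stack([reps[a], reps[b]])
                cov = pair.T @ pair / n              # representatives are centered
                evals, evecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
                if best is None or evals[0] < best[0]:
                    # evecs[:, 1] belongs to the largest eigenvalue (first PC).
                    best = (evals[0], a, b, pair @ evecs[:, 1])

        second_eig, a, b, first_pc = best
        members[next_id] = members.pop(a) + members.pop(b)
        del reps[a], reps[b]
        reps[next_id] = first_pc
        merges.append((a, b, float(second_eig), list(members[next_id])))
        next_id += 1

    return merges
```

Each returned tuple holds the two merged group identifiers, the second eigenvalue at that node (used here as the level of the hierarchy), and the original variables in the new group. Under the same assumptions, the two principal components computed at a node span the principal plane on which variables and units could be plotted.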
Abbreviations
- CA: Correspondence Analysis
- HFC: Hierarchical Factor Classification
- PCA: Principal Components Analysis
Rights and permissions
This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Camiz, S., Denimal, J.J. & Pillar, V.D. Hierarchical factor classification of variables in ecology. COMMUNITY ECOLOGY 7, 165–179 (2006). https://doi.org/10.1556/ComEc.7.2006.2.4
DOI: https://doi.org/10.1556/ComEc.7.2006.2.4
Keywords
- Classification of variables
- Correspondence analysis
- Hierarchical classification
- Principal components analysis