New combinatorial clustering methods

  • János Podani
Part of the Advances in vegetation science book series (AIVS, volume 10)

Abstract

Sixteen clustering methods are compatible with the general recurrence equation of combinatorial SAHN (sequential, agglomerative, hierarchical and nonoverlapping) classificatory strategies. These are subdivided into two classes: the d-SAHN methods seek minimal between-cluster distances, whereas the h-SAHN strategies seek maximal within-cluster homogeneity. The parameters and some basic features of all combinatorial methods are listed to allow comparisons between these two families of clustering procedures. Interest is centred on the h-SAHN techniques; the derivation of updating parameters is presented and the monotonicity properties are examined. Three new strategies are described: a weighted and an unweighted variant of the minimization of the increase of average distance within clusters, and a homogeneity-optimizing flexible method. The performance of d- and h-SAHN techniques is compared using field data from the rock grassland communities of the Sashegy Nature Reserve, Budapest, Hungary.
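
As an illustration of what a combinatorial recurrence looks like in practice, the sketch below uses the Lance and Williams form, d(k, i∪j) = α_i d(k,i) + α_j d(k,j) + β d(i,j) + γ |d(k,i) − d(k,j)|, with the standard parameter values for three classical d-SAHN methods (single linkage, complete linkage, group average). It is not the chapter's own code: the function names `lw_params` and `sahn` are hypothetical, the plain closest-pair search is an assumption made for brevity, and the h-SAHN parameters are omitted because the abstract does not list them.

```python
def lw_params(method, n_i, n_j):
    """Return (alpha_i, alpha_j, beta, gamma) for a few classical d-SAHN methods."""
    if method == "single":      # nearest neighbour
        return 0.5, 0.5, 0.0, -0.5
    if method == "complete":    # farthest neighbour
        return 0.5, 0.5, 0.0, 0.5
    if method == "average":     # unweighted group average
        n = n_i + n_j
        return n_i / n, n_j / n, 0.0, 0.0
    raise ValueError(f"unknown method: {method}")


def sahn(d, method="average"):
    """Agglomerate a symmetric distance matrix d (list of lists).

    Returns a list of (members_of_new_cluster, fusion_level) tuples.
    Illustrative closest-pair version; no attempt at efficiency.
    """
    n = len(d)
    # Distances between current clusters, keyed by sorted pairs of cluster ids.
    dist = {(a, b): float(d[a][b]) for a in range(n) for b in range(a + 1, n)}
    members = {a: (a,) for a in range(n)}
    merges = []
    next_id = n
    while len(members) > 1:
        # Closest pair of clusters under the current distances.
        (i, j), d_ij = min(dist.items(), key=lambda kv: kv[1])
        a_i, a_j, beta, gamma = lw_params(method, len(members[i]), len(members[j]))
        updated = {}
        for k in members:
            if k in (i, j):
                continue
            d_ki = dist[(min(k, i), max(k, i))]
            d_kj = dist[(min(k, j), max(k, j))]
            # Recurrence: only the old distances are needed, not the raw data.
            updated[(k, next_id)] = (a_i * d_ki + a_j * d_kj
                                     + beta * d_ij + gamma * abs(d_ki - d_kj))
        # Drop every distance involving the two fused clusters, add the new ones.
        dist = {p: v for p, v in dist.items() if i not in p and j not in p}
        dist.update(updated)
        members[next_id] = members.pop(i) + members.pop(j)
        merges.append((members[next_id], d_ij))
        next_id += 1
    return merges


# Four objects on a line at positions 0, 1, 3 and 7 (made-up toy data).
positions = (0, 1, 3, 7)
toy = [[abs(p - q) for q in positions] for p in positions]
single_levels = [level for _, level in sahn(toy, "single")]
```

For the toy data, the single linkage fusion levels are 1, 2 and 4. Because every update depends only on the previous distances and the cluster sizes, a different strategy can be tried by swapping the parameter tuple returned by `lw_params`.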

Keywords

Cluster homogeneity; Dendrogram; Flexible method; Hierarchical classification; Rock grassland; Ultrametric

Abbreviations

CP: Closest pair

RNN: Reciprocal nearest neighbor

SAHN: Sequential, agglomerative, hierarchical and nonoverlapping

Copyright information

© Kluwer Academic Publishers 1989

Authors and Affiliations

  • János Podani (1, 2)
  1. Department of Plant Taxonomy and Ecology, L. Eötvös University, Budapest, Hungary
  2. Research Institute of Ecology and Botany, Hungarian Academy of Sciences, Vácrátót, Hungary