On plexus representation of dissimilarities

Abstract

Correspondence analysis has found widespread application in the analysis of vegetation gradients. However, it is not clear how robust it is when structures other than a simple gradient exist, and the introduction of instrumental variables in canonical correspondence analysis does not avoid these difficulties. In this paper I examine some simple methods based on the notion of the plexus (sensu McIntosh), in which graphs or networks are used to display some of the structure of the data so that an informed choice of models is possible. I show that two different classes of plexus model are available. In one class, a global Euclidean model is used to obtain a well-separated pair decomposition (WSPD) of a set of points, which implicitly involves all dissimilarities; in the other, a Riemannian view is taken and emphasis is placed locally, i.e., on small dissimilarities. I show an example of each class applied to vegetation data.

Abbreviations

MST: Minimal Spanning Tree

CA: Correspondence Analysis

w-b-c: Williams, Bunt and Clay (1991)
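
As an illustration only, and not the procedure used in the paper, the minimal spanning tree (MST) abbreviated above can be computed from a symmetric dissimilarity matrix with Prim's algorithm. The function name `minimum_spanning_tree` and the four-site matrix `D` are hypothetical, and the numbers are made up for the sketch. The resulting tree keeps only n - 1 of the dissimilarities, which is one simple way of drawing a graph over the sites.

```python
from typing import List, Tuple


def minimum_spanning_tree(d: List[List[float]]) -> List[Tuple[int, int, float]]:
    """Prim's algorithm on a symmetric dissimilarity matrix.

    Returns the tree as a list of (i, j, d_ij) edges.
    """
    n = len(d)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [float("inf")] * n  # smallest dissimilarity linking each site to the tree
    parent = [-1] * n          # tree site that realises that smallest dissimilarity
    best[0] = 0.0              # start the tree from site 0
    edges = []
    for _ in range(n):
        # Pick the site outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u, d[parent[u]][u]))
        # Update the best known link for every site still outside the tree.
        for v in range(n):
            if not in_tree[v] and d[u][v] < best[v]:
                best[v] = d[u][v]
                parent[v] = u
    return edges


# Hypothetical dissimilarities among four sites (made-up numbers).
D = [
    [0.0, 0.2, 0.9, 0.8],
    [0.2, 0.0, 0.3, 0.7],
    [0.9, 0.3, 0.0, 0.4],
    [0.8, 0.7, 0.4, 0.0],
]
tree = minimum_spanning_tree(D)
```

For this made-up matrix the tree is the chain 0-1-2-3, using the three smallest dissimilarities (0.2, 0.3 and 0.4) and ignoring the large ones.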

References

  1. Agarwal, P. K., J. Matousek and S. Suri. 1992. Farthest neighbors, maximum spanning trees, and related problems in higher dimensions. Comput. Geom.: Theory and Appl. 4: 189–201.

  2. Al Ayouti, B. 1992. New forms of graphical representation in data analysis: additive forests. Proc. Conf. Distancia, Rennes.

  3. Allison, L. and C. S. Wallace. 1994. The posterior probability distribution of alignments and its application to parameter estimation of evolutionary trees and to optimisation of multiple alignments. J. Molecular Evolution 39: 418–430.

  4. Althöfer, I., G. Das, D. Dobkin, D. Joseph and J. Soares. 1993. On sparse spanners of weighted graphs. Discrete Comput. Geom. 9: 81–100.

  5. Ash, P. F. and E. D. Bolker. 1986. Generalized Dirichlet tessellation. Geometriae Dedicata 20: 209–243.

  6. Aurenhammer, F. 1991. Voronoi diagrams - a survey of a fundamental geometric data structure. A. C. M. Computing Surveys 23: 345–405.

  7. Austin, M. P. 1976. On nonlinear species response models in ordination. Vegetatio 33: 33–41.

  8. Austin, M. P. 1990. Community theory and competition in vegetation. In: J. B. Grace and D. Tilman (eds.), Perspectives on Plant Competition. Academic Press, San Diego, pp. 215–238.

  9. Babad, Y. M. and J. A. Hoffer. 1984. Even no data has value. Commun. Assoc. Comput. Mach. 27: 748–756.

  10. Bandelt, H-J. and A. W. M. Dress. 1992. A canonical decomposition theory for metrics on a finite set. Adv. Math. 92: 47–105.

  11. Barkman, J. J. 1965. Die Kryptogamenflora einiger Vegetationstypen in Drente und ihr Zusammenhang mit Boden und Mikroklima. In: R. Tuxen (ed.), Biosoziologie, Ber Symp. Int. Ver. Vegetskunde. Stolzenau/Weser 1960, pp. 157–171.

  12. Beals, E. W. 1973. Ordination: mathematical elegance and ecological naivete. J. Ecol. 61: 23–35.

  13. Birks, H. J. B., S. M. Peglar and H. A. Austin. 1996. An annotated bibliography of canonical correspondence analysis and related constrained ordination methods 1986–1993. Abstracta Botanica 20: 17–36.

  14. Bradfield, G. E. and N. C. Kenkel. 1987. Nonlinear ordination using flexible shortest path adjustment of ecological distances. Ecology 68: 750–753.

  15. Cai, L. 1994. NP-completeness of minimum spanner problems. Discrete Appl. Math. 48: 187–194.

  16. Camerini, P. M. 1978. The min-max spanning tree problem and some extensions. Inform. Proc. Lett. 7: 10–14.

  17. Camerini, P. M., F. Maffioli, S. Martello and P. Toth. 1986. Most and least uniform spanning trees. Discrete Applied Math. 15: 81–187.

  18. Chatterjee, S. and A. Narayanan. 1992. A new approach to discrimination and classification using a Hausdorff type metric. Austral. J. Statist. 34: 391–406.

  19. Critchlow, D. 1985. Metric Methods for Analyzing Partially Ranked Data. Springer-Verlag, New York.

  20. Culik II, K. and H. A. Maurer. 1978. String representations of graphs. Internt. J. Computer Math. Sect. A 6: 272–301.

  21. Dale, M. B. 1975. On the objectives of ordination. Vegetatio 30: 15–32.

  22. Dale, M. B. 1994. Straightening the horseshoe: a Riemannian resolution? Coenoses 9: 43–53.

  23. De’ath, G. 1999. Principal curves: a new technique for indirect and direct gradient analysis. Ecology 80: 2237–2253.

  24. de Soete, G. 1988. Tree representations of proximity data by least squares methods. In: H. H. Bock (ed.), Classification and Related Methods of Data Analysis. North Holland, Amsterdam, pp. 147–156.

  25. de Vries, D. M. 1952. Objective combination of species. Acta Bot. Neerl. 1: 497–499.

  26. de Vries, D. M., J. P. Baretta and G. Haming. 1954. Constellation of frequent herbage plants based on their correlation in occurrence. Vegetatio 5/6: 105–111.

  27. Deichsel, G. 1980. Random walk clustering in large data sets. COMPSTAT 1980. Physica-Verlag, Vienna, pp. 454–459.

  28. Diday, E. and P. Bertrand. 1986. An extension to hierarchical clustering: the pyramidal presentation. In: E. S. Gelsema and L. N. Kanal (eds.), Pattern Recognition in Practice. Elsevier Science, Amsterdam, pp. 411–424.

  29. Dobkin, D., S. J. Friedman, and K. J. Supowit. 1990. Delaunay graphs are almost as good as complete graphs. Discrete Comput. Geom. 5: 399–407.

  30. Dress, A. W. M., D. H. Huson and V. Moulton. 1996. Analyzing and visualizing sequence and distance data using “SplitsTree”. Discrete Applied Math. 71: 95–109.

  31. Duckworth, J. C., R. G. H. Bunce and A. J. C. Malloch. 2000. Vegetation-environment relationships in Atlantic European calcareous grasslands. J. Veg. Sci. 11: 15–22.

  32. Edgoose, T. and L. Allison. 1999. MML Markov classification of sequential data. Statistics and Computing 9: 269–278.

  33. Eilertson, O., R. H. Økland, T. Økland and O. Pederson. 1989. The effects of scale range, species removal and downweighting of rare species on eigenvalue and gradient length in DCA ordination. J. Veg. Sci. 1: 261–270.

  34. Eppstein, D. 1992. The farthest point Delaunay triangulation minimizes angles. Computational Geometry Theory and Applications 1: 143–148.

  35. Escofier, B., H. Benali and K. Bachar. 1990. Comment introduire la contiguité en analyse des correspondances? Application en segmentation d’image. Rapport de recherche de l’INRIA - Rennes, RR-1191, 22 pages - Mars 1990.

  36. Falinski, J. 1960. Zastosowanie taksonomii wroclawskiej do fitosocjologii. Acta Soc. bot. Pol. 29: 333–361.

  37. Famili, A. and P. Turney. 1991. Intelligently Helping Human Planner in Industrial Process Planning. AIEDAM 5: 109–124.

  38. Fayyad, U., G. Piatetsky-Shapiro and P. Smyth. 1996. From Data Mining to Knowledge Discovery. In: U. Fayyad et al. (eds.), Advances in Knowledge Discovery and Data Mining. AAAI/MIT Press, Menlo Park, CA, pp. 1–34.

  39. Friedman, J. H. and L. C. Rafsky. 1979. Multivariate generalizations of the Wald-Wolfowitz and Smirnov two-sample tests. Ann. Statist. 7: 697–717.

  40. Friedman, J. H. and L. C. Rafsky. 1983. Graph-theoretic measures of multivariate association and prediction. Ann. Statist. 11: 377–391.

  41. Gabriel, K. R. and C. L. Odoroff. 1984. Resistant lower rank approximation of matrices. In: E. Diday, M. Jambu, L. Lebart, J. Pagés and R. Tomassone (eds.), Data Analysis and Informatics III. North Holland, Amsterdam, pp. 23–30.

  42. Gabriel, K. R., C. L. Odoroff and S. Choi. 1988. Fitting lower dimensional ordinations to incomplete similarity data. In: H. H. Bock (ed.), Classification and Related Methods of Data Analysis. Elsevier-North Holland, pp. 445–454.

  43. Gabriel, K. R. and R. R. Sokal. 1969. A new statistical approach to geographical analysis. Syst. Zool. 18: 54–64.

  44. Gimingham, C. H. 1961. North European heath communities: a network of variation. J. Ecol. 49: 655–694.

  45. Godehardt, E. and H. Herrmann. 1988. Multigraphs as a tool for numerical classification. In: H. H. Bock (ed.), Classification and Related Methods of Data Analysis. Elsevier-North Holland, pp. 219–229.

  46. Goodall, D. W. and R. W. Johnson. 1982. Non-linear ordination in several dimensions: a maximum likelihood approach. Vegetatio 48: 197–208.

  47. Goodall, D. W. and R. W. Johnson. 1987. Maximum likelihood ordination: some improvements. Vegetatio 73: 3–13.

  48. Hill, M. O. 1973. Reciprocal averaging: an eigenvector method of ordination. J. Ecol. 61: 237–249.

  49. Hubert, L. and P. Arabie. 1992. Correspondence analysis and optimal structural representations. Psychometrika 56: 119–140.

  50. Hubert, L. and P. Arabie. 1994. The analysis of proximity matrices through sums of matrices having (anti-)Robinson forms. Brit. J. Math. Statist. Psychol. 47: 1–40.

  51. Hubert, L. and J. Schultz. 1975. Hierarchical clustering and the concept of space distortion. Brit. J. Math. Statist. Psychol. 28: 121–133.

  52. Hubert, L. and J. Schultz. 1976. Quadratic assignment as a general data analysis strategy. Brit. J. Math. Statist. Psychol. 29: 190–241.

  53. Huisman, J., H. Olff and L. F. M. Fresco. 1993. A hierarchical set of models for species response analysis. J. Veg. Sci. 4: 37–46.

  54. Ihm, P. and H. van Groenewoud. 1975. A multivariate ordering of vegetation data based on Gaussian type gradient response curves. J. Ecol. 63: 767–777.

  55. Karadžić, B. and R. Popović. 1994. A generalized standardization procedure in ecological ordination: test with Principal Components Analysis. J. Veg. Sci. 5: 259–262.

  56. Keil, J. M. and C. A. Gutwin. 1992. Classes of graphs which approximate the complete Euclidean graph. Computational Geometry 7: 13–28.

  57. Kendall, D. G. 1971. Seriation from abundance matrices. In: F. R. Hodson, D. G. Kendall and P. Tautu (eds.), Mathematics in the Archaeological and Historical Sciences. Edinburgh Univ. Press. pp. 215–252.

  58. Klauer, K. C. 1989. Ordinal network representation: representing proximities by graphs. Psychometrika 54: 737–750.

  59. Levcopoulos, C. and A. Lingas. 1989. There are planar graphs almost as good as complete graphs and about as cheap as the minimal spanning tree. Proc. Internatl. Symp. Optimal Algorithms. Lecture Notes in Computer Science 401, Springer, Berlin, pp. 9–13.

  60. Maa, J-F, D. K. Pearl and R. Bartoszyński. 1996. Reducing multidimensional two-sample data to one-dimensional interpoint comparisons. Annals Statistics 24: 1069–1074.

  61. McIntosh, R. P. 1973. Matrix and plexus techniques. In: R. H. Whittaker (ed.), Ordination and Classification of Communities. Dr. W. Junk, den Haag, pp. 157–191.

  62. Minchin, P. R. 1987. An evaluation of the relative robustness of techniques for ecological ordination. Vegetatio 69: 89–107.

  63. Mulder, H. M. and A. Schrijver. 1979. Median graphs and Helly hypergraphs. Discrete Mathematics 25: 41–50.

  64. Murtagh, F. 1983. A probability theory of hierarchic clustering using random dendrograms. J. Statist. Comput. Simul. 18: 145–157.

  65. Naga, R. A. and G. Antille. 1990. Stability of robust and non-robust principal components analysis. Comput. Statist. Data Anal. 10: 169–174.

  66. Naouri, J-C. 1970. Analyse factorielle des correspondances continues. Publ. l’Inst. Statist. Univ. Paris 19: 1–100.

  67. O’Callaghan, J. 1974. An alternative definition for the neighbourhood of a point. I. E. E. E. Trans. Comput. C-24: 1121–1125.

  68. Oksanen, J. and P. R. Minchin. 1997. Instability of ordination results under changes in input data order: explanations and remedies. J. Veg. Sci. 8: 447–454.

  69. Orth, B. 1988. Representing similarities by distance graphs: monotone network analysis (MONA). In: H. H. Bock (ed.), Classification and Related Methods of Data Analysis. North Holland, Amsterdam, pp. 489–496.

  70. Posse, C. 1995. Tools for two-dimensional exploratory projection pursuit. J. Computer Graphics Statist. 4: 83–100.

  71. Taguri, M., M. Hiramatsu, T. Kittaka and K. Wakimoto. 1976. Graphical representation of correlation analysis of ordered data by linked vector pattern. J. Jap. Statist. Soc. 6: 17–25.

  72. Tamassia, R. and I. G. Tollis. 1995. Graph Drawing. DIMACS Internatl. Workshop, Princeton 1994. Lecture Notes in Computer Science 894, Springer, Berlin.

  73. Tausch, R. J., D. A. Charlet, D. A. Weixelman and D. C. Zamudio. 1995. Patterns of ordination and classification instability resulting from changes in input data order. J. Veg. Sci. 6: 897–902.

  74. ter Braak, C. J. F. 1986. Canonical correspondence analysis: a new eigenvector technique for multivariate direct gradient analysis. Ecology 67: 1167–1179.

  75. Toussaint, G. T. 1980. The relative neighbourhood graph of a finite planar set. Patt. Recog. 12: 261–268.

  76. Vaidya, P. M. 1991. A sparse graph almost as good as the complete graph on points in K dimensions. Discrete Comput. Geom. 6: 369–381.

  77. Van Groenewoud, H. 1992. The robustness of Correspondence, Detrended Correspondence and TWINSPAN analysis. J. Veg. Sci. 3: 239–246.

  78. Vasilevich, V. I. 1967. A continuum in the coniferous and parvifoliate forest of the Karelian isthmus. Bot. Zhur SSSR 52: 45–53 (in Russian).

  79. Veltkamp, R. C. 1992. The γ-neighbourhood graph. Computational Geometry: Theory and Applications 1: 227–246.

  80. Wallace, C. S. 1995. Multiple factor analysis by MML estimation. Tech. Rep. 95/218, Dept. Computer Science, Monash University, Clayton, Victoria 3168, Australia. 21 pp.

  81. Wallace, C. S. and D. L. Dowe. 2000. MML clustering of multi-state, Poisson, von Mises circular and Gaussian distributions. Statistics and Computing 10: 73–83.

  82. Williams, W. T. 1973. Partition of information: the CENTPERC problem. Austral. J. Bot. 21: 277–281.

  83. Williams, W. T. 1980. TWONET: A new program for the computation of a two-neighbour network. Austral. Comput. J. 12: 70.

  84. Williams, W. T., J. S. Bunt and H. J. Clay. 1991. Yet another method of species-sequencing. Marine Ecol. Prog. Ser. 72: 283–287.

  85. Wishart, D. 1969. Mode analysis: A generalisation of nearest neighbour which reduces chaining effects. In: A. J. Cole (ed.), Numerical Taxonomy. Academic Press, New York. pp. 282–308.

  86. Yanai, H. 1988. Partial correspondence analysis and its properties. In: E. Diday, C. Hayashi, M. Jambu and N. Ohsumi (eds.), Recent Developments in Clustering and Data Analysis. Academic Press, New York and London. pp. 259–266.

Author information

Corresponding author

Correspondence to M. B. Dale.

Rights and permissions

This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article

Dale, M.B. On plexus representation of dissimilarities. COMMUNITY ECOLOGY 1, 43–56 (2000). https://doi.org/10.1556/ComEc.1.2000.1.7

Keywords

  • Correlation
  • Euclidean representation
  • Gradients
  • Graph theory
  • Plexus
  • Riemannian representation
  • Spanning tree
  • SplitsTree