One possible explanation of variation in vegetation is based on the variable Poisson model. In this model, the occurrence of each species is presumed to follow a Poisson distribution, but the value of the Poisson parameter for any species varies from point to point as a result of environmental variation. In the extreme case, the habitat divides into areas favourable to a community and areas that are unfavourable, or at least unoccupied. The study area can then be viewed as a series of patches within which each species follows a Poisson distribution, although different patches may have different values of the Poisson parameter for any particular species.
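As a rough illustration of the variable Poisson model described above, the following sketch simulates quadrat counts for a single species across three patches. The patch rates, the number of quadrats, and all function names are assumptions made for this example and do not come from the paper. Within each patch the variance-to-mean ratio should sit near 1, as expected for a Poisson distribution, while the pooled counts should appear strongly overdispersed.

```python
import math
import random
from statistics import mean, pvariance


def poisson_sample(lam, rng):
    """Draw one Poisson(lam) count using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_patches(patch_rates, quadrats_per_patch, seed=0):
    """Return one list of counts per patch; each patch has its own rate."""
    rng = random.Random(seed)
    return [
        [poisson_sample(lam, rng) for _ in range(quadrats_per_patch)]
        for lam in patch_rates
    ]


def dispersion_index(counts):
    """Variance-to-mean ratio; close to 1 for a homogeneous Poisson sample."""
    return pvariance(counts) / mean(counts)


# Hypothetical rates for one species in three patches (illustrative only).
patches = simulate_patches([0.5, 4.0, 12.0], quadrats_per_patch=500)
pooled = [c for patch in patches for c in patch]

within = [dispersion_index(patch) for patch in patches]
pooled_index = dispersion_index(pooled)

print("within-patch dispersion:", [round(w, 2) for w in within])
print("pooled dispersion:", round(pooled_index, 2))
```

The pooled overdispersion is the signal that a mixture analysis tries to explain by recovering the patches.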
In this paper, I use a method of fuzzy clustering (mixture modelling) based on the minimum message length principle to examine the variation in the Poisson parameter of individual species. The method uses the difference between the message length for the null, 1-cluster case and the message length for the optimal cluster solution, appropriately normalised, as a measure of the amount of pattern any analysis captures. I also compare the Poisson results with results obtained by assuming that the within-patch distribution is Gaussian. The Poisson alternative consistently captures more pattern than the Gaussian, but at the expense of a much larger number of clusters. Overall, the Gaussian alternative is strongly supported. Other mechanisms that might introduce extra clusters, for example within-cluster correlation or spatial dependency between observations, would presumably apply equally to both models. In the limit, the variable Poisson model converges on the individualistic model of vegetation, and the Gaussian on something like the community unit model. With these data, the individualistic model is strongly rejected. Difficulties with comparing model classes mean that this conclusion must remain tentative.
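The pattern measure described above can be sketched for a single species as follows. This is not the minimum message length computation used in the paper: as a stand-in, it approximates a two-part message length by the negative log-likelihood of an EM-fitted Poisson mixture plus a BIC-style cost of (p/2) ln n for p free parameters. The normalisation (dividing the saving by the null length) is also an assumption, since the text says only that the difference is "appropriately normalised". All names and simulated rates are hypothetical.

```python
import math
import random


def poisson_sample(lam, rng):
    """Draw one Poisson(lam) count using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def poisson_logpmf(x, lam):
    return x * math.log(lam) - lam - math.lgamma(x + 1)


def logsumexp(values):
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def mixture_nll(counts, k, iterations=100):
    """Fit a k-component Poisson mixture by EM; return its negative log-likelihood."""
    n, lo, hi = len(counts), min(counts), max(counts)
    # Spread the initial rates across the observed range; floor keeps log() finite.
    rates = [max(lo + (hi - lo) * (j + 0.5) / k, 0.05) for j in range(k)]
    weights = [1.0 / k] * k

    def component_logs(x):
        return [math.log(weights[j]) + poisson_logpmf(x, rates[j]) for j in range(k)]

    for _ in range(iterations):
        # E-step: fuzzy (posterior) membership of each count in each cluster.
        resp = []
        for x in counts:
            logs = component_logs(x)
            total = logsumexp(logs)
            resp.append([math.exp(v - total) for v in logs])
        # M-step: update mixing weights and Poisson rates.
        for j in range(k):
            mass = sum(r[j] for r in resp)
            weights[j] = (mass + 1e-9) / (n + k * 1e-9)
            weighted_sum = sum(r[j] * x for r, x in zip(resp, counts))
            rates[j] = max(weighted_sum / max(mass, 1e-9), 0.05)
    return -sum(logsumexp(component_logs(x)) for x in counts)


def approx_message_length(counts, k):
    """Crude two-part length in nats: data cost plus BIC-style parameter cost."""
    free_params = 2 * k - 1  # k rates and k - 1 independent weights
    return mixture_nll(counts, k) + 0.5 * free_params * math.log(len(counts))


def pattern_captured(counts, max_k=3):
    """Return (best k, normalised saving relative to the 1-cluster null)."""
    lengths = {k: approx_message_length(counts, k) for k in range(1, max_k + 1)}
    best_k = min(lengths, key=lengths.get)
    return best_k, (lengths[1] - lengths[best_k]) / lengths[1]


rng = random.Random(1)
patchy = [poisson_sample(lam, rng) for lam in (1.0, 10.0) for _ in range(200)]
flat = [poisson_sample(3.0, rng) for _ in range(400)]

best_k_patchy, gain_patchy = pattern_captured(patchy)
best_k_flat, gain_flat = pattern_captured(flat)
print("patchy data:", best_k_patchy, round(gain_patchy, 3))
print("homogeneous data:", best_k_flat, round(gain_flat, 3))
```

Under these assumptions, the patchy sample should yield a clearly positive saving with more than one cluster, while the homogeneous sample should yield little or none. A Gaussian comparison would swap the component density and parameter count but keep the same bookkeeping.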
Minimum Message Length
Dale, M.B. Minimum message length clustering, environmental heterogeneity and the variable Poisson model. COMMUNITY ECOLOGY 2, 171–180 (2001). https://doi.org/10.1556/ComEc.2.2001.2.4
Keywords
- Fuzzy clustering
- Gaussian distribution
- Mixture modelling