Community Ecology, Volume 2, Issue 2, pp 171–180

Minimum message length clustering, environmental heterogeneity and the variable Poisson model

  • M. B. Dale
Open Access


One possible explanation of variation in vegetation is based on the variable Poisson model. In this model, species occurrence is presumed to follow a Poisson distribution, but the value of the Poisson parameter for any species varies from point to point, as a result of environmental variation. As an extreme, this includes dividing the given habitat into areas favourable to a community and areas which are unfavourable, or at least not occupied. The spatial area can then be viewed as a series of patches within which each species follows a Poisson distribution, although different patches may have different values for the Poisson parameter for any particular species.
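A small worked sketch may help make the patch idea concrete. It assumes a hypothetical habitat made of a few patch types, each with its own Poisson parameter for one species, and uses the law of total variance to show that counts pooled across patches are overdispersed (variance greater than mean) even though counts within each patch are Poisson. The function name, patch weights and rates are illustrative assumptions, not values from the paper.

```python
def mixture_mean_variance(weights, rates):
    """Mean and variance of a species count pooled over Poisson patches.

    weights: relative area of each patch type (hypothetical values)
    rates:   Poisson parameter of the species in each patch type
    """
    total = sum(weights)
    probs = [w / total for w in weights]
    mean = sum(p * lam for p, lam in zip(probs, rates))
    # Law of total variance: the within-patch Poisson variance averages
    # to the overall mean, and variation of the rate between patches
    # adds on top of it.
    between = sum(p * (lam - mean) ** 2 for p, lam in zip(probs, rates))
    return mean, mean + between


# Two equally common patch types, one unfavourable and one favourable.
mean, var = mixture_mean_variance([0.5, 0.5], [1.0, 5.0])
print(mean, var, var / mean)
```

With a single patch type the variance equals the mean, as for a plain Poisson distribution; any variation in the parameter between patches pushes the variance-to-mean ratio above one.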

In this paper, I use a method of fuzzy clustering (mixture modelling) based on the minimum message length principle to examine the variation in the Poisson parameter of individual species. The method uses the difference between the message length for the null, 1-cluster case and the message length for the optimal cluster solution, appropriately normalised, as a measure of the amount of pattern any analysis captures. I also compare the Poisson results with results obtained by assuming the within-patch distribution is Gaussian. The Poisson alternative consistently results in a greater capture of pattern than the Gaussian, but at the expense of a much larger number of clusters. Overall, the Gaussian alternative is strongly supported. Other mechanisms that might introduce extra clusters, for example within-cluster correlation or spatial dependency between observations, would presumably apply equally to both models. The variable Poisson model, in the limit, converges on the individualistic model of vegetation, the Gaussian on something like the community unit model. With these data, the individualistic model is strongly rejected. Difficulties with comparing model classes mean this conclusion must remain tentative.
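The pattern measure described above can be illustrated with a rough sketch. It is not the MML computation used in the paper: the length here is a crude two-part approximation (data cost at the maximum-likelihood rate, an assumed 0.5 log n cost per rate parameter, and a cost for stating each quadrat's cluster), the clusters are hard rather than fuzzy, and normalising by the null length is an assumption, since the abstract does not state the normalisation. All function names are hypothetical.

```python
import math


def poisson_nll(counts, rate):
    """Negative log-likelihood (in nats) of counts under Poisson(rate)."""
    return sum(rate - c * math.log(rate) + math.lgamma(c + 1) for c in counts)


def approx_length(groups):
    """Crude two-part length for a hard partition of counts (not true MML)."""
    n = sum(len(g) for g in groups)
    data_cost = 0.0
    for g in groups:
        rate = max(sum(g) / len(g), 1e-9)  # avoid log(0) for all-zero groups
        data_cost += poisson_nll(g, rate)
    # Assumed parameter cost: 0.5 * log(n) per estimated rate.
    param_cost = 0.5 * len(groups) * math.log(n)
    # Cost of stating which cluster each quadrat belongs to.
    label_cost = -sum(len(g) * math.log(len(g) / n) for g in groups)
    return data_cost + param_cost + label_cost


def pattern_captured(groups):
    """(null length - clustered length) / null length; assumed normalisation."""
    pooled = [c for g in groups for c in g]
    null_len = approx_length([pooled])
    return (null_len - approx_length(groups)) / null_len


# Hypothetical counts of one species in two candidate patches.
print(pattern_captured([[0, 0, 1, 0, 1], [9, 11, 10, 12, 8]]))  # positive
print(pattern_captured([[2, 3], [3, 2]]))                        # negative
```

A positive value means the clustered description is shorter than the 1-cluster description; a negative value means the extra parameters and labels cost more than they save, so the split captures no pattern.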


Keywords: Fuzzy clustering; Gaussian distribution; Mixture modelling; Pattern



Minimum Message Length





Copyright information

© Akadémiai Kiadó, Budapest 2001

This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Authors and Affiliations

  1. Australian School of Environmental Studies, Griffith University, Nathan, Australia
