Abstract
In this paper, we examine the application of a particular approach to induction, the minimum message length (MML) principle, and illustrate some of the problems that can be addressed through its use. The MML principle seeks to identify an optimal model within some specified parameterised class of models; for this paper we have chosen to concentrate on a single model class, that of mixture separation or fuzzy clustering. The first section presents, in outline, an MML methodology for fuzzy clustering. We then present some applications, including the nature of the within-cluster model, an examination of the univocality of results for different groups of species, and the effectiveness of presence data compared to purely quantitative data. Finally, we examine some possibilities for extending MML methodology to include within-class correlation of species, the existence of dependence between observed samples, and the comparison of different classes of models.
Abbreviations
- MML: Minimum Message Length
- MDL: Minimum Description Length
Acknowledgments
Our thanks to Sanyi Bartha who provided some extremely useful and important comments on an earlier draft.
Rights and permissions
This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Dale, M.B., Salmina, L. & Mucina, L. Minimum message length clustering: an explication and some applications to vegetation data. COMMUNITY ECOLOGY 2, 231–247 (2001). https://doi.org/10.1556/ComEc.2.2001.2.11
DOI: https://doi.org/10.1556/ComEc.2.2001.2.11