In this paper we examine the use of the minimum message length criterion for evaluating alternative models of data when the samples are serially ordered in space and, implicitly, in time. Many data sets from vegetation studies can be arranged in a sequence, and in such cases the user may elect to constrain the clustering by zones in preference to an unconstrained clustering. We use the minimum message length principle to determine whether such a choice provides an effective model of the data. Pollen data provide a suitably organised set of samples, but they have other properties that make it desirable to examine several different models for the distribution of palynomorphs within the clusters. The results suggest that zonation is not a strongly preferred model, since it captures only a small part of the patterns present. It represents a user expectation regarding the nature of variation in the data and results in some patterns being neglected. By using unconstrained clustering within zones, we can recover some of this overlooked pattern. We then examine other evidence for the nature of change in vegetation, and finally discuss the usefulness of minimum message length as a guiding principle in model choice and its relationship to other possible criteria.
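The comparison of a zoned model against an unzoned one can be illustrated with a minimal sketch. It is not the authors' procedure and not the Wallace-Freeman MML estimator. It assumes a single real-valued attribute, a Gaussian within each zone, and a crude two-part code in which each real parameter costs half the log of the zone size. The function names `gaussian_nll` and `zoned_message_length` and the toy data are hypothetical. Lengths are in nits, and because the data part uses densities, only differences between models fitted to the same data are meaningful.

```python
import math

def gaussian_nll(xs, eps=1e-3):
    """Negative log-likelihood (nits) of xs under a maximum-likelihood Gaussian.

    The standard deviation is floored at eps (an assumed measurement
    precision) so that constant or single-sample zones stay finite.
    """
    n = len(xs)
    mean = sum(xs) / n
    var = max(sum((x - mean) ** 2 for x in xs) / n, eps ** 2)
    return 0.5 * n * (math.log(2 * math.pi * var) + 1.0)

def zoned_message_length(xs, cuts):
    """Crude two-part message length for a sequence split into contiguous zones.

    cuts holds the sorted interior boundary indices, so the zones are
    xs[0:c1], xs[c1:c2], and so on. An empty list means a single zone.
    """
    n = len(xs)
    bounds = [0, *cuts, n]
    zones = [xs[a:b] for a, b in zip(bounds, bounds[1:])]
    k = len(zones)
    # First part: the number of zones, which boundaries were chosen,
    # and two parameters (mean, variance) per zone.
    structure = math.log(n) + math.log(math.comb(n - 1, k - 1))
    params = sum(2 * 0.5 * math.log(len(z)) for z in zones)
    # Second part: the data encoded given the stated model.
    data = sum(gaussian_nll(z) for z in zones)
    return structure + params + data

# Toy sequence with an abrupt shift halfway along the core (made-up values).
shifted = [0.1, -0.2, 0.0, 0.2, -0.1, 5.1, 4.9, 5.0, 5.2, 4.8]
one_zone = zoned_message_length(shifted, [])
two_zones = zoned_message_length(shifted, [5])

# Toy sequence with no shift: the extra zone has to pay for itself.
flat = [0.1, -0.2, 0.0, 0.2, -0.1, 0.1, -0.2, 0.0, 0.2, -0.1]
flat_one = zoned_message_length(flat, [])
flat_two = zoned_message_length(flat, [5])
```

Under these assumptions the two-zone model gives the shorter message for the shifted sequence and the longer one for the flat sequence. A zonation is worth stating only when the saving in the data part exceeds the cost of describing the boundaries and the extra parameters.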
Minimum Message Length
About this article
Dale, M.B., Allison, L. & Dale, P.E.R. Model selection using Minimal Message Length: an example using pollen data. Community Ecology 11, 187–201 (2010). https://doi.org/10.1556/ComEc.11.2010.2.7
- Compositional data
- Minimum message length
- User expectation
- Within-cluster model