Model selection using Minimal Message Length: an example using pollen data

Abstract

In this paper we examine the use of the minimum message length criterion for evaluating alternative models of data when the samples are serially ordered in space and, implicitly, in time. Much data from vegetation studies can be arranged in a sequence, and in such cases the user may elect to constrain the clustering into zones in preference to an unconstrained clustering. We use the minimum message length principle to determine whether such a choice provides an effective model of the data. Pollen data provide a suitably organised set of samples, but have other properties that make it desirable to examine several different models for the distribution of palynomorphs within the clusters. The results suggest that zonation is not a particularly preferred model, since it captures only a small part of the pattern present. It represents a user expectation regarding the nature of variation in the data and results in some patterns being neglected. By using unconstrained clustering within zones, we can recover some of this overlooked pattern. We then examine other evidence for the nature of change in vegetation, and finally discuss the usefulness of minimum message length as a guiding principle in model choice and its relationship to other possible criteria.

Abbreviations

MML: Minimum Message Length

Author information

Corresponding author

Correspondence to M. B. Dale.

Rights and permissions

This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article

Dale, M.B., Allison, L. & Dale, P.E.R. Model selection using Minimal Message Length: an example using pollen data. Community Ecology 11, 187–201 (2010). https://doi.org/10.1556/ComEc.11.2010.2.7

Keywords

  • Censoring
  • Clustering
  • Complexity
  • Compositional data
  • Constrained
  • Gaussian
  • Geometric
  • Minimum message length
  • Unconstrained
  • User expectation
  • Within-cluster model