Community Ecology, Volume 11, Issue 2, pp 187–201

Model selection using Minimum Message Length: an example using pollen data

  • M. B. Dale
  • L. Allison
  • P. E. R. Dale
Open Access


In this paper we examine the use of the minimum message length criterion in the process of evaluating alternative models of data when the samples are serially ordered in space and implicitly in time. Much data from vegetation studies can be arranged in a sequence and in such cases the user may elect to constrain the clustering by zones, in preference to an unconstrained clustering. We use the minimum message length principle to determine if such a choice provides an effective model of the data. Pollen data provide a suitably organised set of samples, but have other properties which make it desirable to examine several different models for the distribution of palynomorphs within the clusters. The results suggest that zonation is not a particularly preferred model since it captures only a small part of the patterns present. It represents a user expectation regarding the nature of variation in the data and results in some patterns being neglected. By using unconstrained clustering within zones, we can recover some of this overlooked pattern. We then examine other evidence for the nature of change in vegetation and finally discuss the usefulness of the minimum message length as a guiding principle in model choice and its relationship to other possible criteria.


Keywords: Censoring; Clustering; Complexity; Compositional data; Constrained; Gaussian; Geometric; Minimum message length; Unconstrained; User expectation; Within-cluster model



Minimum Message Length



Copyright information

© Akadémiai Kiadó, Budapest 2010

This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Authors and Affiliations

  1. Griffith School of Environment, Environmental Futures Centre, Australian Rivers Institute, Griffith University, Nathan, Australia
  2. Dept. Computer Science and Software Engineering, Monash University, Clayton, Australia
