Enhancing MML Clustering Using Context Data with Climate Applications

  • Conference paper
AI 2009: Advances in Artificial Intelligence (AI 2009)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5866)

Abstract

In Minimum Message Length (MML) clustering (unsupervised classification, mixture modelling) the aim is to infer a set of classes that best explains the observed data items. There are cases where parts of the observed data do not need to be explained by the inferred classes but can be used to improve the inference and resulting predictions. Our main contribution is to provide a simple and flexible way of using such context data in MML clustering. This is done by replacing the traditional mixing proportion vector with a new context matrix. We show how our method can be used to give evidence regarding the presence of apparent long-term trends in climate-related atmospheric pressure records. Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) solutions for our model have also been implemented to compare with the MML solution.
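
The replacement of the mixing proportion vector by a context matrix can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: one-dimensional Gaussian classes, a discrete context label per data item (for example, a time period), the names `context_matrix`, `log_likelihood`, `aic` and `bic`, and the made-up pressure values. Each row of the context matrix holds the class proportions for one context value, so a matrix with identical rows reduces to the traditional mixture model. Only the standard AIC and BIC formulas are shown; the MML message length used in the paper is not reproduced here.

```python
import math


def gaussian_pdf(x, mean, sd):
    """Density of a univariate Gaussian at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def mixture_density(x, weights, classes):
    """Mixture density at x for the given class proportions."""
    return sum(w * gaussian_pdf(x, m, s) for w, (m, s) in zip(weights, classes))


def log_likelihood(data, contexts, context_matrix, classes):
    """Log-likelihood where the class proportions depend on each item's context.

    context_matrix[c] is the row of class proportions used for context value c.
    The context itself is not modelled; it only selects the row.
    """
    return sum(
        math.log(mixture_density(x, context_matrix[c], classes))
        for x, c in zip(data, contexts)
    )


def aic(log_lik, n_params):
    """Akaike Information Criterion (lower is better)."""
    return 2.0 * n_params - 2.0 * log_lik


def bic(log_lik, n_params, n_items):
    """Bayesian Information Criterion (lower is better)."""
    return n_params * math.log(n_items) - 2.0 * log_lik


# Hypothetical data: pressure-like values, each with a context label (0 or 1).
data = [998.0, 1003.0, 1001.0, 1019.0, 1022.0, 1018.0]
contexts = [0, 0, 0, 1, 1, 1]

# Two hypothetical classes, given as (mean, standard deviation).
classes = [(1000.0, 5.0), (1020.0, 5.0)]

# Identical rows: equivalent to a single mixing proportion vector [0.5, 0.5].
uniform_matrix = [[0.5, 0.5], [0.5, 0.5]]
# Context-dependent rows: class proportions differ between the two contexts.
context_matrix = [[0.8, 0.2], [0.2, 0.8]]

n_classes = len(classes)
n_contexts = len(context_matrix)
# Assumed parameter counts: (K - 1) free proportions per row, plus mean and sd per class.
params_traditional = (n_classes - 1) + 2 * n_classes
params_context = n_contexts * (n_classes - 1) + 2 * n_classes

ll_traditional = log_likelihood(data, contexts, uniform_matrix, classes)
ll_context = log_likelihood(data, contexts, context_matrix, classes)

print("traditional: AIC", aic(ll_traditional, params_traditional),
      "BIC", bic(ll_traditional, params_traditional, len(data)))
print("context:     AIC", aic(ll_context, params_context),
      "BIC", bic(ll_context, params_context, len(data)))
```

In this sketch the context data is never scored by the likelihood; it only selects which row of proportions applies to an item. The extra rows add free parameters, which is why a penalised criterion is needed to decide whether the context-dependent model is justified by the data.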

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Visser, G., Dowe, D.L., Uotila, P. (2009). Enhancing MML Clustering Using Context Data with Climate Applications. In: Nicholson, A., Li, X. (eds) AI 2009: Advances in Artificial Intelligence. AI 2009. Lecture Notes in Computer Science, vol 5866. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10439-8_36

  • DOI: https://doi.org/10.1007/978-3-642-10439-8_36

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-10438-1

  • Online ISBN: 978-3-642-10439-8

  • eBook Packages: Computer Science, Computer Science (R0)
