Unsupervised Learning of Correlated Multivariate Gaussian Mixture Models Using MML

  • Yudi Agusta
  • David L. Dowe
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2903)


Mixture modelling, or unsupervised classification, is the problem of identifying and modelling components (or clusters, or classes) in a body of data. We consider here the application of the Minimum Message Length (MML) principle to a mixture modelling problem of multivariate Gaussian distributions. Earlier work in MML mixture modelling includes the multinomial, Gaussian, Poisson, von Mises circular, and Student t distributions; in these applications, all variables in a component are assumed to be uncorrelated with each other. In this paper, we propose a more general type of MML mixture modelling which allows the variables within a component to be correlated. Two MML approximations are used: the Wallace and Freeman (1987) approximation and Dowe’s MMLD approximation (2002). The former is used for calculating the relative abundances (mixing proportions) of each component and the latter is used for estimating the distribution parameters involved in the components of the mixture model. The proposed method is applied to the analysis of two real-world datasets: the well-known (Fisher) Iris dataset and a diabetes dataset. The modelling results are then compared, in terms of their probability bit-costings, with those obtained using two other modelling criteria, AIC and BIC (which is identical to Rissanen’s 1978 MDL). The comparison shows that the proposed MML method performs better than both of these criteria. Furthermore, the MML method also infers the three underlying Iris species more closely than either AIC or BIC.
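As a rough illustration of the two baseline criteria named above, the following Python sketch counts the free parameters of a Gaussian mixture with unrestricted (correlated) covariance matrices and evaluates AIC and BIC in their standard textbook forms. It does not implement the MML method of the paper; the function names are hypothetical, and the exact form of AIC and BIC used by the authors may differ (for example by a constant factor).

```python
import math


def full_covariance_mixture_params(n_components: int, n_dims: int) -> int:
    """Count free parameters of a Gaussian mixture with full covariances.

    Hypothetical helper for illustration; not taken from the paper.
    """
    mixing = n_components - 1                      # proportions sum to 1
    means = n_components * n_dims                  # one mean vector per component
    # Each covariance matrix is symmetric: d(d+1)/2 free entries.
    covariances = n_components * n_dims * (n_dims + 1) // 2
    return mixing + means + covariances


def aic(log_likelihood: float, n_params: int) -> float:
    """Textbook AIC: 2k - 2 ln L (smaller is better)."""
    return 2.0 * n_params - 2.0 * log_likelihood


def bic(log_likelihood: float, n_params: int, n_samples: int) -> float:
    """Textbook BIC: k ln N - 2 ln L (smaller is better)."""
    return n_params * math.log(n_samples) - 2.0 * log_likelihood


if __name__ == "__main__":
    n_samples, n_dims = 150, 4  # size of the Fisher Iris dataset
    for k in range(1, 5):
        p = full_covariance_mixture_params(k, n_dims)
        # Penalty terms only; the log-likelihood would come from a fitted model.
        print(k, p, 2.0 * p, round(p * math.log(n_samples), 2))
```

For four attributes, each additional component adds 15 free parameters, so with 150 observations the BIC penalty grows roughly two and a half times as fast as the AIC penalty. This is one reason the two criteria can select different numbers of components on the same data.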


Keywords: Unsupervised Classification; Mixture Modelling; Machine Learning; Knowledge Discovery and Data Mining; Minimum Message Length; MML; Classification; Clustering; Intrinsic Classification; Numerical Taxonomy; Information Theory; Statistical Inference




References

  1. Agusta, Y., Dowe, D.L.: Clustering of Gaussian and t Distributions using Minimum Message Length. In: Proc. Int’l. Conf. Knowledge Based Computer Systems - KBCS-2002, Mumbai, India, pp. 289–299. Vikas Publishing House Pvt. Ltd. (2002)
  2. Agusta, Y., Dowe, D.L.: MML Clustering of Continuous-Valued Data Using Gaussian and t Distributions. In: McKay, B., Slaney, J.K. (eds.) Canadian AI 2002. LNCS (LNAI), vol. 2557, pp. 143–154. Springer, Heidelberg (2002)
  3. Akaike, H.: A new look at the statistical model identification. IEEE Transactions on Automatic Control AC-19(6), 716–723 (1974)
  4. Chaitin, G.J.: On the length of programs for computing finite sequences. J. of the Association for Computing Machinery 13, 547–569 (1966)
  5. Cheeseman, P., Stutz, J.: Bayesian Classification (AutoClass): Theory and Results. In: Advances in Knowledge Discovery and Data Mining, pp. 153–180. AAAI Press/MIT Press (1996)
  6. Dowe, D.L., Baxter, R.A., Oliver, J.J., Wallace, C.S.: Point Estimation using the Kullback-Leibler Loss Function and MML. In: Wu, X., Kotagiri, R., Korb, K.B. (eds.) PAKDD 1998. LNCS, vol. 1394, pp. 87–95. Springer, Heidelberg (1998)
  7. Edwards, R.T., Dowe, D.L.: Single factor analysis in MML mixture modelling. In: Wu, X., Kotagiri, R., Korb, K.B. (eds.) PAKDD 1998. LNCS, vol. 1394, pp. 96–109. Springer, Heidelberg (1998)
  8. Figueiredo, M.A.T., Jain, A.K.: Unsupervised Learning of Finite Mixture Models. IEEE Trans. on Pattern Analysis and Machine Intelligence 24(3), 381–396 (2002)
  9. Fisher, R.A.: The use of multiple measurements in taxonomic problems. Annals of Eugenics 7, 179–188 (1936)
  10. Fitzgibbon, L.J., Dowe, D.L., Allison, L.: Change-Point Estimation Using New Minimum Message Length Approximations. In: Ishizuka, M., Sattar, A. (eds.) PRICAI 2002. LNCS (LNAI), vol. 2417, pp. 244–254. Springer, Heidelberg (2002)
  11. Fitzgibbon, L.J., Dowe, D.L., Allison, L.: Univariate Polynomial Inference by Monte Carlo Message Length Approximation. In: Proc. 19th International Conf. of Machine Learning (ICML 2002), Sydney, pp. 147–154. Morgan Kaufmann, San Francisco (2002)
  12. Fraley, C., Raftery, A.E.: How Many Clusters? Which Clustering Method? Answers Via Model-Based Cluster Analysis. Computer J. 41(8), 578–588 (1998)
  13. Fraley, C., Raftery, A.E.: MCLUST: Software for Model-Based Cluster and Discriminant Analysis. Technical Report 342, Statistics Dept., Washington Uni., Seattle, USA (1998)
  14. Hunt, L.A., Jorgensen, M.A.: Mixture model clustering using the Multimix program. Australian and New Zealand Journal of Statistics 41(2), 153–171 (1999)
  15. Kolmogorov, A.N.: Three approaches to the quantitative definition of information. Problems of Information Transmission 1, 4–7 (1965)
  16. Lam, E.: Improved approximations in MML. Honours Thesis, School of Computer Science and Software Engineering, Monash Uni., Clayton 3800 Australia (2000)
  17. McLachlan, G.J., Peel, D.: Finite Mixture Models. John Wiley, NY (2000)
  18. McLachlan, G.J., Peel, D., Basford, K.E., Adams, P.: The EMMIX software for the fitting of mixtures of Normal and t-components. J. Stat. Software 4 (1999)
  19. Reaven, G.M., Miller, R.G.: An Attempt to Define the Nature of Chemical Diabetes Using a Multidimensional Analysis. Diabetologia 16, 17–24 (1979)
  20. Rissanen, J.: Modeling by shortest data description. Automatica 14, 465–471 (1978)
  21. Schafer, J.L.: Analysis of Incomplete Multivariate Data. Chapman & Hall, London (1997)
  22. Schwarz, G.: Estimating the dimension of a model. Ann. Stat. 6, 461–464 (1978)
  23. Sclove, S.L.: Application of model-selection criteria to some problems in multivariate analysis. Psychometrika 52(3), 333–343 (1987)
  24. Solomonoff, R.J.: A formal theory of inductive inference. Information and Control 7, 1–22 and 224–254 (1964)
  25. Tan, P.J., Dowe, D.L.: MML Inference of Decision Graphs with Multi-way Joins. In: McKay, B., Slaney, J.K. (eds.) Canadian AI 2002. LNCS (LNAI), vol. 2557, pp. 131–142. Springer, Heidelberg (2002)
  26. Wallace, C.S.: An improved program for classification. In: Proc. 9th Aust. Computer Science Conference (ACSC-9), vol. 8, pp. 357–366. Monash Uni., Australia (1986)
  27. Wallace, C.S., Boulton, D.M.: An information measure for classification. Computer J. 11(2), 185–194 (1968)
  28. Wallace, C.S., Dowe, D.L.: Intrinsic classification by MML - the Snob program. In: Proc. 7th Aust. Joint Conf. on AI, pp. 37–44. World Scientific, Singapore (1994)
  29. Wallace, C.S., Dowe, D.L.: MML Mixture Modelling of Multi-State, Poisson, von Mises Circular and Gaussian Distributions. In: Proc. 6th International Workshop on Artificial Intelligence and Statistics, Florida, pp. 529–536 (1997)
  30. Wallace, C.S., Dowe, D.L.: Minimum Message Length and Kolmogorov Complexity. Computer J. 42(4), 270–283 (1999), Special issue on Kolmogorov Complexity
  31. Wallace, C.S., Dowe, D.L.: Refinements of MDL and MML Coding. Computer J. 42(4), 330–337 (1999), Special issue on Kolmogorov Complexity
  32. Wallace, C.S., Dowe, D.L.: MML clustering of multi-state, Poisson, von Mises circular and Gaussian distributions. Statistics and Computing 10, 73–83 (2000)
  33. Wallace, C.S., Freeman, P.R.: Estimation and Inference by Compact Coding. J. Royal Statistical Society (B) 49(3), 240–265 (1987)

Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Yudi Agusta (1)
  • David L. Dowe (1)
  1. Computer Science & Software Eng., Monash University, Clayton, Australia
