Minimum Message Length Grouping of Ordered Data
Explicit segmentation is the partitioning of data into homogeneous regions by specifying cut-points. W. D. Fisher (1958) gave an early example of explicit segmentation based on the minimisation of squared error. Fisher called this the grouping problem and devised a polynomial-time Dynamic Programming Algorithm (DPA). Oliver, Baxter and colleagues (1996, 1997, 1998) have applied the information-theoretic Minimum Message Length (MML) principle to explicit segmentation. They have derived formulas for specifying cut-points imprecisely and have shown empirically that their criterion is superior to other segmentation methods (AIC, MDL and BIC). We use a simple MML criterion and Fisher’s DPA to perform numerical Bayesian (summing and) integration (using message lengths) over the cut-point location parameters. This gives an estimate of the number of segments, which we then use to estimate the cut-point positions and segment parameters by minimising the MML criterion. This approach is shown to have lower Kullback-Leibler distances on generated data.
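As an illustration of the grouping problem described above, the following is a minimal sketch of a dynamic programme that splits ordered data into k contiguous segments while minimising total squared error. It uses the squared-error criterion attributed to Fisher, not the MML criterion, and it is not taken from the paper: the function name `fisher_grouping`, the helper `sse`, and the convention that a cut-point is the index where a new segment starts are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple


def fisher_grouping(data: Sequence[float], k: int) -> Tuple[float, List[int]]:
    """Split ordered data into k contiguous segments minimising squared error.

    Returns (total squared error, cut-points), where each cut-point is the
    index at which a new segment starts (a convention chosen for this sketch).
    """
    n = len(data)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= len(data)")

    # Prefix sums of x and x^2 give the squared error of data[i:j] in O(1).
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, x in enumerate(data):
        s1[i + 1] = s1[i] + x
        s2[i + 1] = s2[i] + x * x

    def sse(i: int, j: int) -> float:
        """Squared error of data[i:j] about its own mean (requires i < j)."""
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # cost[m][j]: best squared error for data[:j] split into m segments.
    # back[m][j]: start index of the last segment in that best split.
    cost = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    cost[0][0] = 0.0
    for m in range(1, k + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):
                c = cost[m - 1][i] + sse(i, j)
                if c < cost[m][j]:
                    cost[m][j] = c
                    back[m][j] = i

    # Walk the back-pointers from the last segment to recover the cut-points.
    cuts: List[int] = []
    j = n
    for m in range(k, 1, -1):
        j = back[m][j]
        cuts.append(j)
    cuts.reverse()
    return cost[k][n], cuts


if __name__ == "__main__":
    error, cuts = fisher_grouping([1.0, 1.0, 1.0, 5.0, 5.0, 5.0], k=2)
    print(error, cuts)
```

The triple loop runs in O(k n^2) time, which is polynomial in the number of data points. In this sketch the number of segments k is supplied by the caller; estimating it is the part the abstract addresses with the MML criterion.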
Keywords: Message Length, Order Data, Segment Parameter, Minimum Message Length, Computer Science
- 1. H. Akaike. Information theory and an extension of the maximum likelihood principle. In B. N. Petrov and F. Csaki, editors, Proceeding 2nd International Symposium on Information Theory, pages 267–281. Akademia Kiado, Budapest, 1973.
- 2. R. A. Baxter and J. J. Oliver. MDL and MML: Similarities and differences. Technical report TR 207, Dept. of Computer Science, Monash University, Clayton, Victoria 3168, Australia, 1994.
- 3. R. A. Baxter and J. J. Oliver. The kindest cut: minimum message length segmentation. In S. Arikawa and A. K. Sharma, editors, Proc. 7th Int. Workshop on Algorithmic Learning Theory, volume 1160 of LNCS, pages 83–90. Springer-Verlag Berlin, 1996.
- 4. J. H. Conway and N. J. A. Sloane. Sphere Packings, Lattices and Groups. Springer-Verlag, London, 1988.
- 5. D. L. Dowe, R. A. Baxter, J. J. Oliver, and C. S. Wallace. Point estimation using the Kullback-Leibler loss function and MML. In Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD98), volume 1394 of LNAI, pages 87–95, 1998.
- 6. D. L. Dowe, J. J. Oliver, and C. S. Wallace. MML estimation of the parameters of the spherical Fisher distribution. In S. Arikawa and A. K. Sharma, editors, Proc. 7th Int. Workshop on Algorithmic Learning Theory, volume 1160 of LNCS, pages 213–227. Springer-Verlag Berlin, 1996.
- 10. J. J. Oliver, R. A. Baxter, and C. S. Wallace. Minimum message length segmentation. In X. Wu, R. Kotagiri, and K. Korb, editors, Research and Development in Knowledge Discovery and Data Mining (PAKDD-98), pages 83–90. Springer, 1998.
- 11. J. J. Oliver and C. S. Forbes. Bayesian approaches to segmenting a simple time series. Technical Report 97/336, Dept. Computer Science, Monash University, Australia 3168, December 1997.
- 17. M. Viswanathan, C. S. Wallace, D. L. Dowe, and K. Korb. Finding cutpoints in noisy binary sequences - a revised empirical evaluation. In 12th Australian Joint Conference on Artificial Intelligence, 1999. A sequel has been submitted to Machine Learning Journal.