MML Inference of Decision Graphs with Multi-way Joins

  • Peter J. Tan
  • David L. Dowe
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2557)


A decision tree is a comprehensible representation that has been widely used in many machine learning domains. But in the area of supervised learning, decision trees have their limitations. Two notable problems are those of replication and fragmentation. One way of solving these problems is to introduce decision graphs, a generalization of the decision tree, which address them by allowing for disjunctions, or joins. While various decision graph systems are available, all of these systems impose some form of restriction on the proposed representations, often leading either to a new redundancy or to the original redundancy not being removed. In this paper, we propose an unrestricted representation called the decision graph with multi-way joins, which has improved representational power and is able to use training data efficiently. An algorithm to infer these decision graphs with multi-way joins using the Minimum Message Length (MML) principle is also introduced. On both real-world and artificial data with only discrete attributes (including at least five UCI data-sets), and in terms of both “right”/“wrong” classification accuracy and I.J. Good’s logarithm-of-probability “bit-costing” predictive accuracy, our novel multi-way join decision graph program significantly out-performs both C4.5 and C5.0. Our program also out-performs the Oliver and Wallace binary join decision graph program on the only data-set available for comparison.
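The “bit-costing” predictive accuracy mentioned above is I.J. Good’s logarithmic score (Good, 1952): a probabilistic classifier is charged the negative base-2 logarithm of the probability it assigned to the class that actually occurred, so confident correct predictions cost little and confident wrong ones cost a lot. A minimal sketch of this scoring rule (function and variable names here are illustrative, not from the paper):

```python
import math

def bit_cost(prob_of_true_class: float) -> float:
    # Good's logarithmic score: the cost, in bits, of the
    # probability the classifier assigned to the observed class.
    return -math.log2(prob_of_true_class)

# Probabilities a hypothetical classifier gave to the true class
# on three test items; lower total bit-cost is better.
probs = [0.9, 0.8, 0.5]
total_bits = sum(bit_cost(p) for p in probs)
```

A prediction of 0.5 for the true class costs exactly 1 bit, while 0.25 costs 2 bits; summing over a test set gives the total “bit-costing” used to compare classifiers.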


Keywords: machine learning, decision trees, decision graphs, supervised learning, probabilistic prediction, minimum message length, MML, MDL




  1. C.L. Blake and C.J. Merz. UCI Repository of Machine Learning Databases, 1998.
  2. D.L. Dowe, G.E. Farr, A.J. Hurst, and K.L. Lentin. Information-theoretic football tipping. In N. de Mestre, editor, Third Australian Conference on Mathematics and Computers in Sport, pages 233–241. Bond University, Qld, Australia, 1996.
  3. D.L. Dowe and N. Krusel. A decision tree model of bushfire activity. In Proceedings of the 6th Australian Joint Conference on Artificial Intelligence, pages 287–292, 1993.
  4. D.L. Dowe, J.J. Oliver, L. Allison, C.S. Wallace, and T.I. Dix. A Decision Graph Explanation of Protein Secondary Structure Prediction. In Proceedings of the Hawaii International Conference on System Sciences (HICSS), Biotechnology Computing Track, pages 669–678, 1993.
  5. I.J. Good. Rational Decisions. Journal of the Royal Statistical Society, Series B, 14:107–114, 1952.
  6. I.J. Good. Corroboration, Explanation, Evolving Probability, Simplicity, and a Sharpened Razor. British Journal for the Philosophy of Science, 19:123–143, 1968.
  7. Ron Kohavi. Bottom-up induction of oblivious read-once decision graphs: Strengths and limitations. In National Conference on Artificial Intelligence, pages 613–618, 1994.
  8. Yishay Mansour and David McAllester. Boosting using branching programs. In Proc. 13th Annual Conference on Computational Learning Theory, pages 220–224. Morgan Kaufmann, San Francisco, 2000.
  9. Manish Mehta, Jorma Rissanen, and Rakesh Agrawal. MDL-based Decision Tree Pruning. In The First International Conference on Knowledge Discovery & Data Mining, pages 216–221. AAAI Press, 1995.
  10. J.J. Oliver. Decision Graphs: An Extension of Decision Trees. In Proceedings of the Fourth International Workshop on Artificial Intelligence and Statistics, pages 343–350, 1993. Extended version available as TR 173, Dept. of Computer Science, Monash University, Clayton, Victoria 3168, Australia.
  11. J.J. Oliver, D.L. Dowe, and C.S. Wallace. Inferring Decision Graphs Using the Minimum Message Length Principle. In Proceedings of the 5th Joint Conference on Artificial Intelligence, pages 361–367. World Scientific, Singapore, 1992.
  12. J.J. Oliver and C.S. Wallace. Inferring Decision Graphs. In Workshop 8, International Joint Conference on AI, Sydney, Australia, August 1991.
  13. J.R. Quinlan. C4.5: Programs for Machine Learning. Morgan Kaufmann, San Mateo, CA, 1992. The latest version of C5 is available from
  14. J.R. Quinlan and R. Rivest. Inferring Decision Trees Using the Minimum Description Length Principle. Information and Computation, 80:227–248, 1989.
  15. J.J. Rissanen. Modeling by shortest data description. Automatica, 14:465–471, 1978.
  16. C.S. Wallace and D.M. Boulton. An Information Measure for Classification. Computer Journal, 11:185–194, 1968.
  17. C.S. Wallace and D.L. Dowe. Minimum Message Length and Kolmogorov Complexity. Computer Journal (Special Issue on Kolmogorov Complexity), 42(4):270–283, 1999.
  18. C.S. Wallace and D.L. Dowe. MML Clustering of multi-state, Poisson, von Mises circular and Gaussian distributions. Statistics and Computing, 10(1):73–83, January 2000.
  19. C.S. Wallace and P.R. Freeman. Estimation and Inference by Compact Coding. Journal of the Royal Statistical Society, Series B, 49(3):240–265, 1987.
  20. C.S. Wallace and J.D. Patrick. Coding Decision Trees. Machine Learning, 11:7–22, 1993.

Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • Peter J. Tan (1)
  • David L. Dowe (1)
  1. School of Computer Science and Software Engineering, Monash University, Clayton, Australia
