MML Inference of Decision Graphs with Multi-way Joins
A decision tree is a comprehensible representation that has been widely used in many machine learning domains. But in the area of supervised learning, decision trees have their limitations. Two notable problems are those of replication and fragmentation. One way of solving these problems is to introduce decision graphs, a generalization of the decision tree, which address the above problems by allowing for disjunctions, or joins. While various decision graph systems are available, all of these systems impose some form of restriction on the representations they can propose, often leading either to a new redundancy or to the original redundancy not being removed. In this paper, we propose an unrestricted representation called the decision graph with multi-way joins, which has improved representational power and is able to use training data efficiently. An algorithm to infer these decision graphs with multi-way joins using the Minimum Message Length (MML) principle is also introduced. On both real-world and artificial data with only discrete attributes (including at least five UCI data-sets), and in terms of both "right"/"wrong" classification accuracy and I.J. Good's logarithm of probability "bit-costing" predictive accuracy, our novel multi-way join decision graph program significantly out-performs both C4.5 and C5.0. Our program also out-performs the Oliver and Wallace binary join decision graph program on the only data-set available for comparison.
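The "bit-costing" predictive accuracy mentioned above is I.J. Good's logarithmic score: a probabilistic classifier is charged the negative base-2 logarithm of the probability it assigned to the true class, so confident correct predictions cost little and confident wrong predictions cost many bits. A minimal sketch (the predicted probabilities below are hypothetical, purely for illustration):

```python
import math

def bit_cost(prob_true_class: float) -> float:
    """I.J. Good's logarithmic score in bits: the cost of a
    probabilistic prediction is -log2 of the probability the
    classifier assigned to the class that actually occurred.
    Lower total cost indicates better probabilistic prediction."""
    return -math.log2(prob_true_class)

# Hypothetical per-item probabilities assigned to the true class:
predictions = [0.9, 0.8, 0.6, 0.95]
total_bits = sum(bit_cost(p) for p in predictions)
mean_bits = total_bits / len(predictions)
```

Note that a prediction of 0.5 on a two-class problem costs exactly 1 bit, and a certain, correct prediction (probability 1.0) costs 0 bits, which is why this measure rewards well-calibrated probabilities rather than bare "right"/"wrong" counts.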
Keywords: Machine learning, decision trees, decision graphs, supervised learning, probabilistic prediction, minimum message length, MML, MDL
- 1.C.L. Blake and C.J. Merz. UCI repository of machine learning databases, 1998. http://www.ics.uci.edu/~mlearn/MLRepository.html.
- 2.D.L. Dowe, G.E. Farr, A.J. Hurst, and K.L. Lentin. Information-theoretic football tipping. In N. de Mestre, editor, Third Australian Conference on Mathematics and Computers in Sport, pages 233–241. Bond University, Qld, Australia, 1996.
- 3.D.L. Dowe and N. Krusel. A decision tree model of bushfire activity. In Proceedings of 6th Australian Joint Conference on Artificial Intelligence, pages 287–292, 1993.
- 4.D.L. Dowe, J.J. Oliver, L. Allison, C.S. Wallace, and T.I. Dix. A Decision Graph Explanation of Protein Secondary Structure Prediction. In Proceedings of the Hawaii International Conference on System Science (HICSS), Biotechnology Computing Track, pages 669–678, 1993.
- 7.Ron Kohavi. Bottom-up induction of oblivious read-once decision graphs: Strengths and limitations. In National Conference on Artificial Intelligence, pages 613–618, 1994.
- 8.Yishay Mansour and David McAllester. Boosting using branching programs. In Proc. 13th Annual Conference on Comput. Learning Theory, pages 220–224. Morgan Kaufmann, San Francisco, 2000.
- 9.Manish Mehta, Jorma Rissanen, and Rakesh Agrawal. MDL-based Decision Tree Pruning. In The First International Conference on Knowledge Discovery & Data Mining, pages 216–221. AAAI Press, 1995.
- 10.J.J. Oliver. Decision Graphs - An Extension of Decision Trees. In Proceedings of the Fourth International Workshop on Artificial Intelligence and Statistics, pages 343–350, 1993. Extended version available as TR 173, Dept. of Computer Science, Monash University, Clayton, Victoria 3168, Australia.
- 11.J.J. Oliver, D.L. Dowe, and C.S. Wallace. Inferring Decision Graphs Using the Minimum Message Length Principle. In Proceedings of the 5th Joint Conference on Artificial Intelligence, pages 361–367. World Scientific, Singapore, 1992.
- 12.J.J. Oliver and C.S. Wallace. Inferring Decision Graphs. In Workshop 8, International Joint Conference on AI, Sydney, Australia, August 1991.
- 13.J.R. Quinlan. C4.5: Programs for Machine Learning. Morgan Kaufmann, San Mateo, CA, 1992. The latest version of C5 is available from http://www.rulequest.com.