Flexible Decision Trees in a General Data-Mining Environment
We describe a new data-mining platform, CDMS, aimed at the streamlined development, comparison and application of machine learning tools. We discuss its type system, focussing on the treatment of statistical models as first-class values.
This allows rapid construction of composite models – complex models built from simpler ones – such as mixture models, Bayesian networks and decision trees. We illustrate this with a flexible decision tree tool for CDMS which, rather than being limited to discrete target attributes, can model any kind of data using arbitrary probability distributions.
Keywords: Decision Tree, Continuous Attribute, Target Attribute, Probabilistic Score, Decision Tree Model
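The idea of treating models as first-class values, so that a decision tree can carry an arbitrary distribution in each leaf, can be illustrated with a minimal Python sketch. This is not CDMS code and does not reflect its actual type system. Every name here (`Model`, `Gaussian`, `Multistate`, `Leaf`, `Split`, `log_prob`) is a hypothetical choice made for illustration, and the tree is built by hand rather than learned from data.

```python
import math
from dataclasses import dataclass
from typing import Any, Dict, Protocol


class Model(Protocol):
    """Hypothetical interface: anything that can score a target value."""

    def log_prob(self, target: Any) -> float: ...


@dataclass
class Gaussian:
    """Leaf distribution for a continuous target attribute."""
    mean: float
    sd: float

    def log_prob(self, target: float) -> float:
        z = (target - self.mean) / self.sd
        return -0.5 * z * z - math.log(self.sd * math.sqrt(2.0 * math.pi))


@dataclass
class Multistate:
    """Leaf distribution for a discrete target attribute."""
    probs: Dict[str, float]

    def log_prob(self, target: str) -> float:
        return math.log(self.probs[target])


@dataclass
class Leaf:
    """A leaf holds a model as an ordinary value; the tree never inspects it."""
    model: Model

    def log_prob(self, inputs: Dict[str, float], target: Any) -> float:
        return self.model.log_prob(target)


@dataclass
class Split:
    """Binary split on a continuous input attribute (an assumed split form)."""
    attribute: str
    threshold: float
    low: Any   # Leaf or Split
    high: Any  # Leaf or Split

    def log_prob(self, inputs: Dict[str, float], target: Any) -> float:
        branch = self.low if inputs[self.attribute] < self.threshold else self.high
        return branch.log_prob(inputs, target)


# The same tree structure serves a continuous target ...
regression_tree = Split("x", 2.0, Leaf(Gaussian(0.0, 1.0)), Leaf(Gaussian(5.0, 2.0)))
# ... and a discrete one, simply by swapping the leaf models.
class_tree = Split("x", 2.0,
                   Leaf(Multistate({"yes": 0.8, "no": 0.2})),
                   Leaf(Multistate({"yes": 0.1, "no": 0.9})))

continuous_score = regression_tree.log_prob({"x": 1.0}, 0.0)
discrete_score = class_tree.log_prob({"x": 3.0}, "no")
```

Because the tree depends only on the `log_prob` interface, adding a new kind of target data in this sketch means writing one new leaf model, with no change to `Leaf` or `Split`.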