Flexible Decision Trees in a General Data-Mining Environment

  • Joshua W. Comley
  • Lloyd Allison
  • Leigh J. Fitzgibbon
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2690)

Abstract

We describe a new data-mining platform, CDMS, aimed at the streamlined development, comparison and application of machine learning tools. We discuss its type system, focussing on the treatment of statistical models as first-class values.

This allows rapid construction of composite models – complex models built from simpler ones – such as mixture models, Bayesian networks and decision trees. We illustrate this with a flexible decision tree tool for CDMS which, rather than being limited to discrete target attributes, can model any kind of data using arbitrary probability distributions.
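The idea of models as first-class values can be illustrated with a minimal sketch. It is not CDMS code and does not reflect the CDMS API: the class names (`Gaussian`, `Multinomial`, `Leaf`, `Split`), the `log_prob` method, and the `age` attribute are all assumptions made for illustration. The point it tries to show is that a tree whose leaves hold any model value can describe a continuous or a discrete target with the same code.

```python
import math
from dataclasses import dataclass
from typing import Callable, Union

# Hypothetical interface: a "model" is any value with a log_prob(x) method.
# None of these names come from CDMS; they are illustrative assumptions.

@dataclass
class Gaussian:
    mean: float
    sd: float

    def log_prob(self, x: float) -> float:
        # Log density of a normal distribution at x.
        z = (x - self.mean) / self.sd
        return -0.5 * z * z - math.log(self.sd * math.sqrt(2 * math.pi))

@dataclass
class Multinomial:
    probs: dict  # maps each discrete value to its probability

    def log_prob(self, x) -> float:
        return math.log(self.probs[x])

@dataclass
class Leaf:
    model: object  # any model value, passed in like ordinary data

@dataclass
class Split:
    test: Callable[[dict], bool]  # boolean test on the input attributes
    if_true: "Tree"
    if_false: "Tree"

Tree = Union[Leaf, Split]

def tree_log_prob(tree: Tree, inputs: dict, target) -> float:
    """Route the inputs to a leaf, then defer to that leaf's model."""
    while isinstance(tree, Split):
        tree = tree.if_true if tree.test(inputs) else tree.if_false
    return tree.model.log_prob(target)

# The same tree type with a continuous target ...
continuous_tree = Split(
    lambda row: row["age"] < 30,
    Leaf(Gaussian(50.0, 10.0)),
    Leaf(Gaussian(80.0, 15.0)),
)

# ... and with a discrete target, only the leaf models differ.
discrete_tree = Split(
    lambda row: row["age"] < 30,
    Leaf(Multinomial({"yes": 0.9, "no": 0.1})),
    Leaf(Multinomial({"yes": 0.25, "no": 0.75})),
)
```

Calling `tree_log_prob(discrete_tree, {"age": 25}, "yes")` routes the record to the first leaf and returns the log-probability that leaf's model assigns to `"yes"`. Because the tree only requires `log_prob`, a leaf could just as well hold a mixture or another composite model.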

Keywords

Decision Tree, Continuous Attribute, Target Attribute, Probabilistic Score, Decision Tree Model
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Joshua W. Comley (1)
  • Lloyd Allison (1)
  • Leigh J. Fitzgibbon (1)
  1. School of Computer Science and Software Engineering, Monash University, Clayton, Australia
