Multivariate Decision Trees vs. Univariate Ones

  • Mariusz Koziol
  • Michal Wozniak
Part of the Advances in Intelligent and Soft Computing book series (AINSC, volume 57)

Summary

Developing ever more efficient and accurate recognition algorithms is an active area of research, and decision tree classifiers in particular attract intense attention. In this work, methods of univariate and multivariate decision tree induction are presented and their quality is compared in computer experiments. Additionally, the reasons for parallelizing decision tree induction are discussed.
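To make the contrast concrete, here is a minimal sketch (not the authors' implementation; the two-feature example and all numbers are hypothetical) of the two kinds of node tests: a univariate split compares a single feature to a threshold, while a multivariate split thresholds a linear combination of several features.

    # Illustrative sketch only (not the authors' code): one node test of each kind.
    import numpy as np

    def univariate_split(x, feature_index, threshold):
        # Univariate test: compare a single feature to a threshold,
        # e.g. "income <= 30000".
        return x[feature_index] <= threshold

    def multivariate_split(x, weights, threshold):
        # Multivariate (oblique) test: threshold a linear combination
        # of all features, e.g. "0.4*income + 0.6*debt <= 18000".
        return float(np.dot(weights, x)) <= threshold

    # Hypothetical two-feature sample: [income, debt]
    x = np.array([25000.0, 12000.0])
    print(univariate_split(x, feature_index=0, threshold=30000.0))                 # True
    print(multivariate_split(x, weights=np.array([0.4, 0.6]), threshold=18000.0))  # True (17200 <= 18000)

Multivariate (oblique) splits of this kind can yield smaller trees when class boundaries are not axis-parallel, at the cost of a more expensive search at each node.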

Keywords

Decision Tree · Information Gain · Decision Tree Algorithm · Decision Tree Induction · Credit Risk Assessment

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Mariusz Koziol (1)
  • Michal Wozniak (1)

  1. Chair of Systems and Computer Networks, Wroclaw University of Technology, Wroclaw, Poland
