Feature Transformation and Multivariate Decision Tree Induction

  • Huan Liu
  • Rudy Setiono
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1532)

Abstract

Univariate decision trees (UDT's) have inherent problems of replication, repetition, and fragmentation. Multivariate decision trees (MDT's) have been proposed to overcome some of these problems. Close examination of the conventional ways of building MDT's, however, reveals that the fragmentation problem still persists. A novel approach is suggested to minimize the fragmentation problem by separating hyperplane search from decision tree building. This is achieved by feature transformation. Let the initial feature vector be x and the new feature vector after feature transformation T be y, i.e., y = T(x). We can obtain an MDT by (1) building a UDT on y, and (2) replacing the new features y at each node with combinations of the initial features x. We elaborate on the advantages of this approach, the details of T, and why it is expected to perform well. Experiments are conducted in order to confirm the analysis, and results are compared to those of C4.5, OC1, and CART.
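A minimal sketch of the two-step procedure described above is given below. The abstract does not prescribe a particular transformation T; here PCA stands in as an example of a linear T, and scikit-learn's DecisionTreeClassifier serves as the univariate tree, so all names and parameters in the sketch are illustrative assumptions rather than the authors' implementation.

from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.tree import DecisionTreeClassifier

X, labels = load_iris(return_X_y=True)

# y = T(x): here T is a linear map (a PCA projection), chosen only for illustration.
T = PCA(n_components=2).fit(X)
Y = T.transform(X)

# Step (1): build an ordinary univariate decision tree on the new features y.
udt = DecisionTreeClassifier(max_depth=3, random_state=0).fit(Y, labels)

# Step (2): re-express each univariate test "y_j <= t" as a linear (multivariate)
# test on the initial features x.  For this linear T, y_j = w_j . (x - mean),
# so the node test becomes  w_j . x <= t + w_j . mean.
tree = udt.tree_
for node in range(tree.node_count):
    j = tree.feature[node]
    if j < 0:                      # leaf node: no test to rewrite
        continue
    w = T.components_[j]
    t = tree.threshold[node] + w @ T.mean_
    terms = " ".join(f"{wi:+.3f}*x{i}" for i, wi in enumerate(w))
    print(f"node {node}: {terms} <= {t:.3f}")

Because the hyperplane directions are fixed once by T before induction begins, the tree-building step itself remains a standard univariate search, which is the separation of hyperplane search from tree building that the abstract describes.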

References

  1. K.P. Bennett and O.L. Mangasarian. Neural network training via linear programming. In P.M. Pardalos, editor, Advances in Optimization and Parallel Computing, pages 56–67. Elsevier Science Publishers B.V., Amsterdam, 1992.
  2. L. Breiman, J.H. Friedman, R.A. Olshen, and C.J. Stone. Classification and Regression Trees. Wadsworth & Brooks/Cole Advanced Books & Software, 1984.
  3. C.E. Brodley and P.E. Utgoff. Multivariate decision trees. Machine Learning, 19:45–77, 1995.
  4. M. Dash and H. Liu. Feature selection methods for classifications. Intelligent Data Analysis: An International Journal, 1(3), 1997. http://www-east.elsevier.com/ida/free.htm.
  5. U.M. Fayyad and K.B. Irani. Multi-interval discretization of continuous-valued attributes for classification learning. In Proceedings of the Thirteenth International Joint Conference on Artificial Intelligence, pages 1022–1027. Morgan Kaufmann Publishers, Inc., 1993.
  6. J.H. Friedman, R. Kohavi, and Y. Yun. Lazy decision trees. In Proceedings of the Thirteenth National Conference on Artificial Intelligence, pages 717–724, 1996.
  7. L. Fu. Neural Networks in Computer Intelligence. McGraw-Hill, 1994.
  8. B. Hassibi and D.G. Stork. Second order derivatives for network pruning: Optimal brain surgeon. Neural Information Processing Systems, 5:164–171, 1993.
  9. D. Heath, S. Kasif, and S. Salzberg. Learning oblique decision trees. In Proceedings of the Thirteenth International Joint Conference on AI, pages 1002–1007, France, 1993.
  10. K. Kira and L.A. Rendell. The feature selection problem: Traditional methods and a new algorithm. In Proceedings of the Tenth National Conference on Artificial Intelligence, pages 129–134. Menlo Park: AAAI Press/The MIT Press, 1992.
  11. H. Liu and R. Setiono. Chi2: Feature selection and discretization of numeric attributes. In J.F. Vassilopoulos, editor, Proceedings of the Seventh IEEE International Conference on Tools with Artificial Intelligence, November 5–8, 1995, pages 388–391, Herndon, Virginia, 1995. IEEE Computer Society.
  12. C. Matheus and L. Rendell. Constructive induction on decision trees. In Proceedings of the International Joint Conference on AI, pages 645–650, August 1989.
  13. C.J. Merz and P.M. Murphy. UCI repository of machine learning databases. http://www.ics.uci.edu/~mlearn/MLRepository.html. Irvine, CA: University of California, Department of Information and Computer Science, 1996.
  14. J. Mingers. An empirical comparison of selection measures for decision-tree induction. Machine Learning, 3:319–342, 1989.
  15. S. Murthy, S. Kasif, S. Salzberg, and R. Beigel. OC1: Randomized induction of oblique decision trees. In Proceedings of the AAAI Conference (AAAI'93), pages 322–327. AAAI Press/The MIT Press, 1993.
  16. G. Pagallo and D. Haussler. Boolean feature discovery in empirical learning. Machine Learning, 5:71–99, 1990.
  17. J.R. Quinlan. Induction of decision trees. Machine Learning, 1(1):81–106, 1986.
  18. J.R. Quinlan. C4.5: Programs for Machine Learning. Morgan Kaufmann, 1993.
  19. D.E. Rumelhart, J.L. McClelland, and the PDP Research Group. Parallel Distributed Processing, volume 1. The MIT Press, Cambridge, Mass., 1986.
  20. I.K. Sethi. Neural implementation of tree classifiers. IEEE Transactions on Systems, Man, and Cybernetics, 25(8), August 1995.
  21. R. Setiono. A penalty-function approach for pruning feedforward neural networks. Neural Computation, 9(1):185–204, 1997.
  22. R. Setiono and H. Liu. Understanding neural networks via rule extraction. In Proceedings of the International Joint Conference on AI, 1995.
  23. R. Setiono and H. Liu. Analysis of hidden representations by greedy clustering. Connection Science, 10(1):21–42, 1998.
  24. J.W. Shavlik, R.J. Mooney, and G.G. Towell. Symbolic and neural learning algorithms: An experimental comparison. Machine Learning, 6(2):111–143, 1991.
  25. G.G. Towell and J.W. Shavlik. Extracting refined rules from knowledge-based neural networks. Machine Learning, 13(1):71–101, 1993.
  26. P.E. Utgoff and C.E. Brodley. An incremental method for finding multivariate splits for decision trees. In Machine Learning: Proceedings of the Seventh International Conference, pages 58–65. University of Texas, Austin, Texas, 1990.
  27. R. Vilalta, G. Blix, and L. Rendell. Global data analysis and the fragmentation problem in decision tree induction. In M. van Someren and G. Widmer, editors, Machine Learning: ECML-97, pages 312–326. Springer-Verlag, 1997.
  28. J. Wnek and R.S. Michalski. Hypothesis-driven constructive induction in AQ17-HCI: A method and experiments. Machine Learning, 14, 1994.

Copyright information

© Springer-Verlag Berlin Heidelberg 1998

Authors and Affiliations

  • Huan Liu (1)
  • Rudy Setiono (1)
  1. School of Computing, National University of Singapore, Singapore