Analysis of Hepatitis Dataset by Decision Tree Based on Graph-Based Induction

  • Warodom Geamsakul
  • Takashi Matsuda
  • Tetsuya Yoshida
  • Kouzou Ohara
  • Hiroshi Motoda
  • Takashi Washio
  • Hideto Yokoi
  • Katsuhiko Takabayashi
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3609)

Abstract

Graph-Based Induction (GBI) is a machine learning technique that efficiently extracts typical patterns from graph-structured data by stepwise pair expansion (pairwise chunking). Its efficiency comes from its greedy search. We have extended GBI to construct a decision tree that can handle graph-structured data. The resulting method, DT-GBI, builds a decision tree while simultaneously constructing the attributes used for classification: substructures useful for the classification task are extracted by GBI on the fly during tree construction. We applied both GBI and DT-GBI to classification tasks on a real-world hepatitis dataset. Three classification problems were addressed in five experiments. In the first four experiments, DT-GBI was applied to build decision trees that classify 1) cirrhosis and non-cirrhosis (Experiments 1 and 2), 2) type C and type B (Experiment 3), and 3) positive and negative responses to interferon therapy (Experiment 4). Because the patterns extracted in these experiments are considered discriminative, in the last experiment (Experiment 5) GBI was applied to extract descriptive patterns for interferon therapy. This paper reports the preliminary results of the experiments: the constructed decision trees, their predictive accuracies, and the extracted patterns. Some of the patterns match domain experts’ experience, and the overall results are encouraging.
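
The pairwise chunking idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes an undirected graph with string node labels, uses raw pair frequency as the selection criterion, and all function names (`most_frequent_pair`, `chunk_pair`, `pairwise_chunking`) and the `min_freq` parameter are hypothetical names chosen for illustration.

```python
from collections import Counter


def most_frequent_pair(labels, edges):
    """Return ((label_a, label_b), count) for the most frequent adjacent
    label pair, or None if the graph has no edges.
    Counts are per edge, so overlapping occurrences are all counted."""
    counts = Counter(tuple(sorted((labels[u], labels[v]))) for u, v in edges)
    if not counts:
        return None
    # Highest count first; ties broken by the lexicographically smallest pair.
    return min(counts.items(), key=lambda kv: (-kv[1], kv[0]))


def chunk_pair(labels, edges, pair):
    """Greedily merge non-overlapping occurrences of `pair` into chunk nodes."""
    labels = dict(labels)
    merged_into = {}  # absorbed node id -> surviving node id
    used = set()      # nodes already consumed by a chunk in this pass
    for u, v in sorted(edges):
        if u in used or v in used:
            continue
        if tuple(sorted((labels[u], labels[v]))) == pair:
            used.update((u, v))
            merged_into[v] = u
    new_label = "(" + pair[0] + "-" + pair[1] + ")"
    for absorbed, survivor in merged_into.items():
        labels[survivor] = new_label
        del labels[absorbed]
    # Redirect edges of absorbed nodes to the surviving chunk node.
    new_edges = set()
    for u, v in edges:
        u2, v2 = merged_into.get(u, u), merged_into.get(v, v)
        if u2 != v2:
            new_edges.add((min(u2, v2), max(u2, v2)))
    return labels, new_edges


def pairwise_chunking(labels, edges, min_freq=2):
    """Repeat: pick the most frequent pair, chunk it, record it as a pattern."""
    patterns = []
    while True:
        best = most_frequent_pair(labels, edges)
        if best is None or best[1] < min_freq:
            break
        pair, freq = best
        patterns.append((pair, freq))
        labels, edges = chunk_pair(labels, edges, pair)
    return patterns


# Toy path graph a - b - a - b - c (assumed data, for illustration only).
toy_labels = {0: "a", 1: "b", 2: "a", 3: "b", 4: "c"}
toy_edges = {(0, 1), (1, 2), (2, 3), (3, 4)}
print(pairwise_chunking(toy_labels, toy_edges))
```

The search is greedy in the sense the abstract describes: each step commits to a single pair and never backtracks. In the sketch, a chunked pair becomes one node with a combined label, so later steps can grow it into a larger substructure. DT-GBI, as described above, would use such substructures as attributes at the nodes of a decision tree; that part is not shown here.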

Keywords

Data mining; graph-structured data; Decision Tree; Graph-Based Induction; hepatitis dataset analysis

Copyright information

© Springer Berlin Heidelberg 2007

Authors and Affiliations

  • Warodom Geamsakul (1)
  • Takashi Matsuda (1)
  • Tetsuya Yoshida (1)
  • Kouzou Ohara (1)
  • Hiroshi Motoda (1)
  • Takashi Washio (1)
  • Hideto Yokoi (2)
  • Katsuhiko Takabayashi (2)

  1. Institute of Scientific and Industrial Research, Osaka University, Japan
  2. Division for Medical Informatics, Chiba University Hospital, Japan
