Predictive Graph Mining

  • Andreas Karwath
  • Luc De Raedt
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3245)

Abstract

Graph mining approaches are extremely popular and effective for molecular databases. The vast majority of these approaches first derive interesting, i.e. frequent, patterns and then use these as features to build predictive models. Rather than building these models in a two-step, indirect way, the SMIREP system introduced in this paper derives predictive rule models directly from molecular data. SMIREP combines the SMILES and SMARTS representation languages, which are popular in computational chemistry, with the IREP rule-learning algorithm by Fürnkranz. Even though SMIREP is focused on SMILES, its principles are also applicable to graph mining problems in other domains. SMIREP is experimentally evaluated on two benchmark databases.

Keywords

Inductive Logic Programming; Good Rule; Graph Mining; Cyclic Fragment; Predictive Rule
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

References

  1. Dehaspe, L.: Frequent Pattern Discovery in First-Order Logic. K. U. Leuven (1998)
  2. Deshpande, M., Kuramochi, M., Karypis, G.: Frequent sub-structure-based approaches for classifying chemical compounds. In: Proc. ICDM 2003, pp. 35–42 (2003)
  3. Kramer, S., De Raedt, L., Helma, C.: Molecular feature mining in HIV data. In: Provost, F., Srikant, R. (eds.) Proc. KDD 2001, pp. 136–143. ACM Press, New York (2001)
  4. Zaki, M.: Efficiently mining frequent trees in a forest. In: Hand, D., Keim, D., Ng, R. (eds.) Proc. KDD 2002, pp. 71–80. ACM Press, New York (2002)
  5. Yan, X., Han, J.: gSpan: Graph-based substructure pattern mining. In: Proc. ICDM 2002 (2002)
  6. Inokuchi, A., Kashima, H.: Mining significant pairs of patterns from graph structures with class labels. In: Proc. ICDM 2003, pp. 83–90 (2003)
  7. Inokuchi, A., Washio, T., Motoda, H.: Complete mining of frequent patterns from graphs: Mining graph data. Machine Learning 50, 321–354 (2003)
  8. Kuramochi, M., Karypis, G.: Frequent subgraph discovery. In: Proc. ICDM 2001, pp. 179–186 (2001)
  9. Yan, X., Han, J.: CloseGraph: Mining closed frequent graph patterns. In: Proc. KDD 2003 (2003)
  10. Fürnkranz, J., Widmer, G.: Incremental reduced error pruning. In: Cohen, W.W., Hirsh, H. (eds.) Proc. ICML 1994, pp. 70–77. Morgan Kaufmann, San Francisco (1994)
  11. Cohen, W.W.: Fast effective rule induction. In: Proc. ICML 1995, pp. 115–123. Morgan Kaufmann, San Francisco (1995)
  12. King, R.D., Muggleton, S., Srinivasan, A., Sternberg, M.J.E.: Structure-activity relationships derived by machine learning: The use of atoms and their bond connectivities to predict mutagenicity by inductive logic programming. Proc. of the National Academy of Sciences 93, 438–442 (1996)
  13. Weininger, D.: SMILES, a chemical language and information system 1. Introduction and encoding rules. J. Chem. Inf. Comput. Sci. 28, 31–36 (1988)
  14. Quinlan, J.R.: Learning logical definitions from relations. Machine Learning 5, 239–266 (1990)
  15. Srinivasan, A., Muggleton, S., Sternberg, M.E., King, R.D.: Theories for mutagenicity: a study of first-order and feature based induction. A.I. Journal 85, 277–299 (1996)
  16. Cook, Holder: Graph-based data mining. ISTA: Intelligent Systems & their applications 15 (2000)
  17. Gonzalez, J.A., Holder, L.B., Cook, D.J.: Experimental comparison of graph-based relational concept learning with inductive logic programming systems. In: Matwin, S., Sammut, C. (eds.) ILP 2002. LNCS (LNAI), vol. 2583, pp. 84–100. Springer, Heidelberg (2003)
  18. Warodom, G., Matsuda, T., Yoshida, T., Motoda, H., Washio, T.: Classifier construction by graph-based induction for graph-structured data. In: Whang, K.-Y., Jeon, J., Shim, K., Srivastava, J. (eds.) PAKDD 2003. LNCS (LNAI), vol. 2637, pp. 52–62. Springer, Heidelberg (2003)
  19. Geamsakul, W., Matsuda, T., Yoshida, T., Motoda, H., Washio, T.: Constructing a decision tree for graph structured data. In: Proc. MGTS 2003, pp. 1–10 (2003), http://www.ar.sanken.osaka-u.ac.jp/MGTS-2003CFP.html
  20. Horvath, T., Wrobel, S., Bohnebeck, U.: Relational instance-based learning with lists and terms. Machine Learning 43, 53–80 (2001)
  21. Muggleton, S.: Inverting entailment and Progol. Machine Intelligence 14, 133–188 (1995)
  22. Srinivasan, A., King, R.D., Bristol, D.W.: An assessment of ILP-assisted models for toxicology and the PTE-3 experiment. In: Džeroski, S., Flach, P.A. (eds.) ILP 1999. LNCS (LNAI), vol. 1634, pp. 291–302. Springer, Heidelberg (1999)

Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Andreas Karwath (1)
  • Luc De Raedt (1)
  1. Institut für Informatik, Albert-Ludwigs-Universität Freiburg, Freiburg, Germany
