Discovering Emerging Graph Patterns from Chemicals

  • Guillaume Poezevara
  • Bertrand Cuissart
  • Bruno Crémilleux
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5722)

Abstract

Emerging patterns are of great interest for characterizing classes, yet discovering them remains a challenge, especially with graph data. In this paper, we propose a method to mine the whole set of frequent emerging graph patterns, given a frequency threshold and an emergence threshold. Our results are achieved by reformulating the initial problem, which lets us design a process that combines efficient algorithmic and data mining methods. Experiments on a real-world database of chemicals show the feasibility and the efficiency of our approach.
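To make the two thresholds concrete, here is a minimal sketch of how a candidate pattern could be tested against a frequency threshold and an emergence threshold. It is not the method proposed in the paper. It assumes the usual growth-rate reading of emergence (frequency in a target class divided by frequency in the other class), and it replaces subgraph isomorphism with plain set inclusion over hypothetical fragment labels. All names (`frequency`, `growth_rate`, `frequent_emerging`, `target`, `background`) and the toy data are illustrative assumptions.

```python
import math


def frequency(pattern, graphs, contains):
    """Fraction of graphs in which the pattern occurs."""
    if not graphs:
        return 0.0
    return sum(1 for g in graphs if contains(g, pattern)) / len(graphs)


def growth_rate(pattern, target, background, contains):
    """Ratio of the pattern's frequency in the target class to its
    frequency in the background class (assumed emergence measure)."""
    f_target = frequency(pattern, target, contains)
    f_background = frequency(pattern, background, contains)
    if f_background == 0:
        # Pattern absent from the background class: infinite growth rate
        # if it occurs in the target class at all.
        return math.inf if f_target > 0 else 0.0
    return f_target / f_background


def frequent_emerging(patterns, target, background, contains,
                      min_freq, min_growth):
    """Keep candidates that satisfy both thresholds."""
    return [
        p for p in patterns
        if frequency(p, target, contains) >= min_freq
        and growth_rate(p, target, background, contains) >= min_growth
    ]


# Toy stand-in: each "molecule" is a set of fragment labels, and set
# inclusion replaces a real subgraph isomorphism test (assumption).
def contains(graph, pattern):
    return pattern <= graph


target = [
    frozenset({"C-Cl", "C=O"}),
    frozenset({"C-Cl", "C-N"}),
    frozenset({"C-Cl"}),
    frozenset({"C=O"}),
]
background = [
    frozenset({"C=O"}),
    frozenset({"C-N"}),
    frozenset({"C-Cl", "C=O"}),
    frozenset({"C=O", "C-N"}),
]
candidates = [frozenset({"C-Cl"}), frozenset({"C=O"}), frozenset({"C-N"})]

selected = frequent_emerging(candidates, target, background, contains,
                             min_freq=0.5, min_growth=2.0)
print(selected)
```

In this toy data only the `C-Cl` fragment passes both tests: it occurs in 3 of 4 target molecules and 1 of 4 background molecules, a growth rate of 3. With real molecular graphs, `contains` would be a subgraph isomorphism test and the candidates would come from a graph miner rather than a fixed list.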

Keywords

Data mining; emerging patterns; subgraph isomorphism; chemical information

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Guillaume Poezevara (1)
  • Bertrand Cuissart (1)
  • Bruno Crémilleux (1)
  1. Laboratoire GREYC-CNRS UMR 6072, Université de Caen Basse-Normandie, France
