Knowledge and Information Systems, Volume 44, Issue 1, pp 1–25

LC-mine: a framework for frequent subgraph mining with local consistency techniques

  • Brahim Douar
  • Michel Liquiere
  • Chiraz Latiri
  • Yahya Slimani
Regular Paper

Abstract

Developing algorithms that discover all frequently occurring subgraphs in a large graph database is computationally expensive, because graph and subgraph isomorphism tests play a key role throughout the computation. Since subgraph isomorphism testing is a hard problem, fragment miners are exponential in runtime. To alleviate this complexity, we propose to introduce a bias in the projection operator: instead of the costly subgraph isomorphism projection, one can use a polynomial-time projection that has a semantically valid structural interpretation. In this paper, we present LC-mine, a generic and efficient framework for mining frequent subgraphs by means of local consistency techniques from the constraint programming field. Two instances of the framework, both based on the arc consistency technique, are developed and presented. The first follows a breadth-first order, while the second is a pattern-growth approach that explores the search space depth-first. We then show experimentally that an important performance gain can be achieved with no loss, or only a nonsignificant loss, in the quality of the discovered patterns.
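
To make the idea of a polynomial projection concrete, the following is a minimal sketch of an arc-consistency-style projection test between two labeled graphs, with a naive support count on top. It is not the authors' implementation: the graph encoding (a dict of node labels plus a set of directed edges) and the names `ac_projection`, `ac_projects`, and `support` are assumptions made for illustration only. An undirected graph can be encoded by listing each edge in both directions.

```python
from typing import Dict, Hashable, List, Set, Tuple

# Assumed encoding: (node -> label, set of directed edges (source, target)).
Graph = Tuple[Dict[Hashable, str], Set[Tuple[Hashable, Hashable]]]


def ac_projection(pattern: Graph, target: Graph) -> Dict[Hashable, Set[Hashable]]:
    """Filter candidate images of each pattern node by arc consistency."""
    p_labels, p_edges = pattern
    t_labels, t_edges = target

    # Initial domain of a pattern node: all target nodes with the same label.
    domains = {
        u: {v for v, lab in t_labels.items() if lab == p_labels[u]}
        for u in p_labels
    }

    # Successor and predecessor maps of the target graph.
    succ: Dict[Hashable, Set[Hashable]] = {}
    pred: Dict[Hashable, Set[Hashable]] = {}
    for a, b in t_edges:
        succ.setdefault(a, set()).add(b)
        pred.setdefault(b, set()).add(a)

    # Repeat the filtering until no domain shrinks (a fixed point).
    changed = True
    while changed:
        changed = False
        for u1, u2 in p_edges:
            # Keep v in D(u1) only if one of its successors lies in D(u2).
            keep = {v for v in domains[u1] if succ.get(v, set()) & domains[u2]}
            if keep != domains[u1]:
                domains[u1] = keep
                changed = True
            # Keep w in D(u2) only if one of its predecessors lies in D(u1).
            keep = {w for w in domains[u2] if pred.get(w, set()) & domains[u1]}
            if keep != domains[u2]:
                domains[u2] = keep
                changed = True
    return domains


def ac_projects(pattern: Graph, target: Graph) -> bool:
    """True if every pattern node keeps at least one candidate image."""
    return all(ac_projection(pattern, target).values())


def support(pattern: Graph, database: List[Graph]) -> int:
    """Number of database graphs onto which the pattern projects."""
    return sum(ac_projects(pattern, g) for g in database)


# A directed triangle projects onto a directed 6-cycle under this test,
# although the 6-cycle contains no triangle: the relaxation is weaker
# than subgraph isomorphism, which is the bias the abstract refers to.
triangle = ({"a": "C", "b": "C", "c": "C"}, {("a", "b"), ("b", "c"), ("c", "a")})
hexagon = ({i: "C" for i in range(6)}, {(i, (i + 1) % 6) for i in range(6)})
print(ac_projects(triangle, hexagon))
```

Domains only ever shrink and each pass costs time polynomial in the sizes of the two graphs, so the test terminates in polynomial time. A nonempty result is a necessary condition for a subgraph isomorphism, not a sufficient one, so a miner built on such a test may count a pattern as occurring in graphs that do not contain it as a subgraph.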

Keywords

Relational learning · Graph mining · Projection operator · Graph classification

Copyright information

© Springer-Verlag London 2014

Authors and Affiliations

  • Brahim Douar (1, 2)
  • Michel Liquiere (1)
  • Chiraz Latiri (2)
  • Yahya Slimani (3)

  1. LIRMM, Montpellier II University, Montpellier, France
  2. LIPAH, Faculty of Sciences of Tunis, Tunis El Manar University, Tunis, Tunisia
  3. LISI, INSAT, University of Carthage, Tunis, Tunisia
