Frequent subgraph mining in outerplanar graphs

Data Mining and Knowledge Discovery

Abstract

In recent years there has been increased interest in frequent pattern discovery in large databases of graph-structured objects. While the frequent connected subgraph mining problem can be solved in incremental polynomial time for tree datasets, it becomes intractable for arbitrary graph databases. Existing approaches have therefore resorted to various heuristic strategies and restrictions of the search space, but have not identified a practically relevant tractable graph class beyond trees. In this paper, we consider the class of outerplanar graphs, a strict generalization of trees, develop a frequent subgraph mining algorithm for outerplanar graphs, and show that it works in incremental polynomial time for the practically relevant subclass of well-behaved outerplanar graphs, i.e., those that have only polynomially many simple cycles. We evaluate the algorithm empirically on chemo- and bioinformatics applications.
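
As a small illustration of the well-behavedness condition mentioned in the abstract, the following sketch counts the simple cycles of a tiny outerplanar graph by brute force. It is not the algorithm of the paper. The function names (`count_simple_cycles`, `polygon`, `fan_chords`) are hypothetical and chosen only for this example, and the exhaustive search is suitable for toy inputs only. The example graph is a polygon on n vertices with all chords drawn from vertex 0, an outerplanar graph whose number of simple cycles, (n-1)(n-2)/2, grows only quadratically in n.

```python
def count_simple_cycles(num_vertices, edges):
    """Count simple cycles of an undirected simple graph by exhaustive search.

    Illustration only: the running time is exponential in general.
    """
    adjacency = {v: set() for v in range(num_vertices)}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    def closed_walks(start, current, visited):
        # Extend a simple path that begins at `start` and only uses larger
        # vertices, so each cycle is rooted at its smallest vertex.
        found = 0
        for nxt in adjacency[current]:
            if nxt == start and len(visited) >= 3:
                found += 1  # the path closes into a cycle
            elif nxt > start and nxt not in visited:
                found += closed_walks(start, nxt, visited | {nxt})
        return found

    # Every cycle is traversed once in each direction, hence the halving.
    directed = sum(closed_walks(s, s, {s}) for s in range(num_vertices))
    return directed // 2


def polygon(n):
    """Edges of the cycle 0-1-...-(n-1)-0 (the outer face)."""
    return [(i, (i + 1) % n) for i in range(n)]


def fan_chords(n):
    """Non-crossing chords from vertex 0 to vertices 2..n-2."""
    return [(0, k) for k in range(2, n - 1)]


if __name__ == "__main__":
    for n in (6, 8):
        edges = polygon(n) + fan_chords(n)
        print(n, count_simple_cycles(n, edges), (n - 1) * (n - 2) // 2)
```

For the plain polygon without chords the count is 1. Adding the fan chords raises it to 10 for n = 6 and 21 for n = 8, in line with the quadratic formula above.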

Author information

Corresponding author

Correspondence to Tamás Horváth.

Additional information

Responsible editor: M.J. Zaki.

A preliminary version of this paper appeared in: T. Eliassi-Rad, L. Ungar, M. Craven, and D. Gunopulos (Eds.), Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 197–206, ACM Press, New York, NY, 2006.

About this article

Cite this article

Horváth, T., Ramon, J. & Wrobel, S. Frequent subgraph mining in outerplanar graphs. Data Min Knowl Disc 21, 472–508 (2010). https://doi.org/10.1007/s10618-009-0162-1

  • DOI: https://doi.org/10.1007/s10618-009-0162-1
