DB-FSG: An SQL-Based Approach for Frequent Subgraph Mining
Mining frequent subgraphs (FSG) is one form of graph mining for which only main-memory algorithms currently exist. Many applications in social networks, biology, computer networks, chemistry, and the World Wide Web require mining of frequent subgraphs. The focus of this paper is to apply relational database techniques to support frequent subgraph mining. Some of the computations, such as duplicate elimination, canonical labeling, and isomorphism checking, are not straightforward to express in SQL. The contribution of this paper is an efficient mapping of these complex computations to relational operators. Unlike the main-memory counterparts of FSG, our approach addresses the most general graph representation, including multiple edges between any two vertices, bi-directional edges, and cycles. An experimental evaluation of the proposed approach is also presented.
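To make the idea of mapping a mining step to relational operators concrete, the following is a minimal sketch of the simplest case only: counting the support of one-edge patterns with a GROUP BY. It is not the paper's algorithm. The table name `edges`, its columns, and the function `frequent_one_edge_patterns` are assumptions chosen for illustration, and SQLite stands in for whatever relational engine is used.

```python
import sqlite3


def frequent_one_edge_patterns(edges, min_support):
    """Return one-edge patterns that occur in at least min_support graphs.

    edges: iterable of (gid, src_label, edge_label, dst_label) tuples,
    a hypothetical flat encoding of directed, labeled graph edges.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE edges ("
        "gid INTEGER, src_label TEXT, edge_label TEXT, dst_label TEXT)"
    )
    conn.executemany("INSERT INTO edges VALUES (?, ?, ?, ?)", edges)
    # COUNT(DISTINCT gid) counts each graph once, so multiple edges with the
    # same labels inside one graph do not inflate the support.
    rows = conn.execute(
        """
        SELECT src_label, edge_label, dst_label,
               COUNT(DISTINCT gid) AS support
        FROM edges
        GROUP BY src_label, edge_label, dst_label
        HAVING COUNT(DISTINCT gid) >= ?
        ORDER BY src_label, edge_label, dst_label
        """,
        (min_support,),
    ).fetchall()
    conn.close()
    return rows


# Illustrative data: graph 1 contains the edge A -x-> B twice (a multiple edge).
sample_edges = [
    (1, "A", "x", "B"),
    (1, "A", "x", "B"),
    (2, "A", "x", "B"),
    (2, "B", "y", "C"),
    (3, "B", "y", "C"),
    (3, "A", "z", "C"),
]
patterns = frequent_one_edge_patterns(sample_edges, min_support=2)
```

Larger patterns would require joins to extend subgraphs edge by edge, plus the duplicate elimination and canonical labeling steps that the abstract notes are harder to express in SQL; this sketch does not attempt those.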
Keywords: Multiple Edge, Edge Label, Vertex Label, Connectivity Attribute, Graph Mining