DB-FSG: An SQL-Based Approach for Frequent Subgraph Mining

  • Sharma Chakravarthy
  • Subhesh Pradhan
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5181)

Abstract

Mining frequent subgraphs (FSG) is one form of graph mining for which only main-memory algorithms currently exist. Many applications in social networks, biology, computer networks, chemistry, and the World Wide Web require mining of frequent subgraphs. This paper applies relational database techniques to support frequent subgraph mining. Some of the computations, such as duplicate elimination, canonical labeling, and isomorphism checking, are not straightforward to express in SQL. The contribution of this paper is an efficient mapping of these complex computations to relational operators. Unlike the main-memory counterparts of FSG, our approach addresses the most general graph representation, including multiple edges between any two vertices, bi-directional edges, and cycles. An experimental evaluation of the proposed approach is also presented.
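To make the idea of expressing graph-mining steps as relational operators concrete, the following is a minimal sketch of only the simplest step: counting frequent one-edge patterns with a GROUP BY query. It is not the schema or the queries used in the paper; the table name `edges`, its columns, and the function `frequent_one_edge_patterns` are hypothetical names chosen for illustration, and SQLite stands in for whatever relational engine is used. Edges are assumed to be directed, so the label triple needs no further canonicalization here.

```python
import sqlite3


def frequent_one_edge_patterns(edges, min_support):
    """Return one-edge patterns that occur in at least min_support graphs.

    edges: iterable of (graph_id, v1, v2, v1_label, edge_label, v2_label).
    The schema below is an illustrative assumption, not the paper's design.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE edges (gid INTEGER, v1 INTEGER, v2 INTEGER, "
        "v1_label TEXT, e_label TEXT, v2_label TEXT)"
    )
    conn.executemany("INSERT INTO edges VALUES (?, ?, ?, ?, ?, ?)", edges)
    # COUNT(DISTINCT gid) counts each graph once, so repeated (multiple)
    # edges between the same two vertices do not inflate the support.
    rows = conn.execute(
        "SELECT v1_label, e_label, v2_label, COUNT(DISTINCT gid) AS support "
        "FROM edges "
        "GROUP BY v1_label, e_label, v2_label "
        "HAVING COUNT(DISTINCT gid) >= ? "
        "ORDER BY support DESC, v1_label, e_label, v2_label",
        (min_support,),
    ).fetchall()
    conn.close()
    return rows


# Three small example graphs (made-up data for illustration only).
EXAMPLE_EDGES = [
    (1, 1, 2, "A", "x", "B"),
    (1, 1, 2, "A", "x", "B"),  # a second edge between the same two vertices
    (1, 2, 1, "B", "y", "A"),  # reverse direction, forming a cycle
    (2, 1, 2, "A", "x", "B"),
    (2, 2, 3, "B", "z", "C"),
    (3, 1, 2, "A", "x", "B"),
    (3, 2, 1, "B", "y", "A"),
]

if __name__ == "__main__":
    for pattern in frequent_one_edge_patterns(EXAMPLE_EDGES, min_support=2):
        print(pattern)
```

Larger patterns would require joining the edge table with itself and then handling duplicates and isomorphic candidates, which are the steps the abstract describes as not straightforward in SQL; this sketch does not attempt them.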

Keywords

Multiple Edge; Edge Label; Vertex Label; Connectivity Attribute; Graph Mining
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Sharma Chakravarthy (1)
  • Subhesh Pradhan (1)
  1. IT Laboratory & Department of Computer Science and Engineering, The University of Texas at Arlington, Arlington