DB-Subdue: Database Approach to Graph Mining
In contrast to mining over transactional data, graph mining operates on structured data represented as a graph. Data with structural relationships lends itself to graph mining. Subdue is one of the early main memory graph mining algorithms; it detects the best substructure that compresses a graph using the minimum description length principle. The database approach to graph mining presented in this paper overcomes the performance and scalability problems inherent to main memory algorithms. The focus of this paper is the development of graph mining algorithms (specifically Subdue) using SQL and stored procedures in a relational database environment. We not only show how the Subdue class of algorithms can be translated to SQL-based algorithms, but also demonstrate that scalability can be achieved without sacrificing performance.
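To make the idea of expressing graph mining in SQL concrete, the following is a minimal sketch, not the paper's actual schema or algorithm. It assumes a hypothetical pair of tables, `vertices(vno, vlabel)` and `edges(src, dst, elabel)`, and uses joins to count the instances of each labeled one-edge substructure. It ranks substructures by frequency only; the minimum description length evaluation and the iterative expansion to larger substructures are not shown.

```python
import sqlite3


def count_one_edge_substructures(vertices, edges):
    """Count instances of each (source label, edge label, target label) pattern.

    The table and column names are illustrative assumptions, not taken from
    the paper.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE vertices (vno INTEGER PRIMARY KEY, vlabel TEXT)")
    conn.execute("CREATE TABLE edges (src INTEGER, dst INTEGER, elabel TEXT)")
    conn.executemany("INSERT INTO vertices VALUES (?, ?)", vertices)
    conn.executemany("INSERT INTO edges VALUES (?, ?, ?)", edges)

    # Join each edge with the labels of its two endpoints, then group by the
    # resulting label triple to count how often each pattern occurs.
    rows = conn.execute(
        """
        SELECT v1.vlabel, e.elabel, v2.vlabel, COUNT(*) AS instances
        FROM edges e
        JOIN vertices v1 ON v1.vno = e.src
        JOIN vertices v2 ON v2.vno = e.dst
        GROUP BY v1.vlabel, e.elabel, v2.vlabel
        ORDER BY instances DESC, v1.vlabel, e.elabel, v2.vlabel
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    sample_vertices = [(1, "A"), (2, "B"), (3, "A"), (4, "B"), (5, "C")]
    sample_edges = [(1, 2, "ab"), (3, 4, "ab"), (2, 5, "bc")]
    for row in count_one_edge_substructures(sample_vertices, sample_edges):
        print(row)
```

Because the graph lives in tables rather than in process memory, the database engine handles the joins and grouping, which is the kind of property the abstract attributes to the database approach.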
Keywords: Main Memory; Extension Attribute; Association Rule Mining; Graph Mining; Frequent Subgraph