Graph Mining: Repository vs. Canonical Form

  • Christian Borgelt
  • Mathias Fiedler
Conference paper

DOI: 10.1007/978-3-540-78246-9_27

Part of the book series Studies in Classification, Data Analysis, and Knowledge Organization (STUDIES CLASS)
Cite this paper as:
Borgelt C., Fiedler M. (2008) Graph Mining: Repository vs. Canonical Form. In: Preisach C., Burkhardt H., Schmidt-Thieme L., Decker R. (eds) Data Analysis, Machine Learning and Applications. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg

Abstract

In frequent subgraph mining one tries to find all subgraphs that occur with a user-specified minimum frequency in a given graph database. The basic approach is to grow subgraphs, adding an edge and maybe a node in each step, to count the number of database graphs containing them, and to eliminate infrequent subgraphs. The predominant method to avoid redundant search (the same subgraph can be grown in several ways) is to define a canonical form that uniquely identifies a graph up to automorphisms. The obvious alternative, a repository of processed subgraphs, has so far received fairly little attention. However, if the repository is laid out as a hash table with a carefully designed hash function, this approach is competitive with canonical form pruning. In experiments we conducted, the repository-based approach could sometimes outperform canonical form pruning by 15%.
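To make the repository idea concrete, the following is a minimal sketch of a hash-table repository that rejects a subgraph if an isomorphic one has already been processed. It is an illustration under stated assumptions, not the authors' implementation: the names `Graph`, `invariant_hash`, `isomorphic`, and `Repository` are hypothetical, the hash function is a simple stand-in for the carefully designed one the abstract mentions, and the brute-force isomorphism test is only practical for very small graphs without self-loops.

```python
from itertools import permutations


class Graph:
    """Tiny labeled undirected graph; nodes are the indices 0..n-1 (assumed layout)."""

    def __init__(self, labels, edges):
        self.labels = tuple(labels)                  # one label per node index
        self.edges = {frozenset(e) for e in edges}   # undirected edges, no self-loops

    def degree(self, v):
        return sum(1 for e in self.edges if v in e)


def invariant_hash(g):
    """Hash built only from features that do not depend on node numbering.

    Hypothetical stand-in for a carefully designed hash function: isomorphic
    graphs always collide, non-isomorphic graphs usually do not.
    """
    node_part = tuple(sorted((g.labels[v], g.degree(v)) for v in range(len(g.labels))))
    edge_part = tuple(sorted(tuple(sorted(g.labels[v] for v in e)) for e in g.edges))
    return hash((node_part, edge_part))


def isomorphic(a, b):
    """Brute-force label-preserving isomorphism test (small graphs only)."""
    n = len(a.labels)
    if n != len(b.labels) or len(a.edges) != len(b.edges):
        return False
    for perm in permutations(range(n)):
        labels_match = all(a.labels[v] == b.labels[perm[v]] for v in range(n))
        if labels_match and {frozenset(perm[v] for v in e) for e in a.edges} == b.edges:
            return True
    return False


class Repository:
    """Hash table of already processed subgraphs, keyed by the invariant hash."""

    def __init__(self):
        self.buckets = {}

    def add_if_new(self, g):
        """Store g and return True unless an isomorphic graph is already stored."""
        bucket = self.buckets.setdefault(invariant_hash(g), [])
        if any(isomorphic(g, h) for h in bucket):
            return False   # duplicate: this search branch can be pruned
        bucket.append(g)
        return True


repo = Repository()
first = repo.add_if_new(Graph("CON", [(0, 1), (1, 2)]))   # path C-O-N
again = repo.add_if_new(Graph("NOC", [(0, 1), (1, 2)]))   # same path, grown from the other end
other = repo.add_if_new(Graph("CNO", [(0, 1), (1, 2)]))   # path C-N-O, a different subgraph
```

In this sketch the full isomorphism test runs only inside a single hash bucket, which is why the quality of the hash function matters: the fewer non-isomorphic graphs share a bucket, the fewer expensive comparisons are needed.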

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Christian Borgelt (1)
  • Mathias Fiedler (1)
  1. European Center for Soft Computing, Mieres, Spain