Advances in Data Analysis pp 337-349
Canonical Forms for Frequent Graph Mining
A core problem of approaches to frequent graph mining that grow subgraphs into a set of graphs is how to avoid redundant search. A powerful technique for this is a canonical description of a graph, which uniquely identifies it, together with a corresponding test of whether a given description is the canonical one. I introduce a family of canonical forms based on systematic ways to construct spanning trees. I show that the canonical form used in gSpan (Yan and Han, 2002) is a member of this family, and that MoSS/MoFa (Borgelt and Berthold, 2002; Borgelt et al., 2005) is implicitly based on a different member, which I make explicit and exploit in the same way.
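To illustrate what a canonical description and its test provide, here is a minimal Python sketch. It is not the spanning-tree-based family of this chapter, nor the gSpan or MoSS/MoFa form: it substitutes a brute-force minimum over all vertex numberings, which is only feasible for very small graphs. The names `code_word`, `canonical_form`, and `is_canonical`, and the encoding of a graph as a label list plus `(u, v, label)` edge triples, are assumptions made for this example.

```python
from itertools import permutations


def code_word(vertex_labels, edges, order):
    """Describe the graph under one vertex numbering.

    `order` lists the original vertex indices in their new sequence.
    """
    # position[v] is the new index assigned to original vertex v
    position = {v: i for i, v in enumerate(order)}
    labels = tuple(vertex_labels[v] for v in order)
    # Renumber each undirected edge and sort so edge order does not matter
    renumbered = sorted(
        (min(position[u], position[v]), max(position[u], position[v]), lab)
        for u, v, lab in edges
    )
    return (labels, tuple(renumbered))


def canonical_form(vertex_labels, edges):
    """Smallest code word over all vertex numberings (brute force)."""
    vertices = range(len(vertex_labels))
    return min(
        code_word(vertex_labels, edges, order)
        for order in permutations(vertices)
    )


def is_canonical(vertex_labels, edges, order):
    """Canonical form test: is this numbering's code word the smallest?"""
    return code_word(vertex_labels, edges, order) == canonical_form(
        vertex_labels, edges
    )


# Two numberings of the same fragment (C-O and C=N), plus a different one
g1 = (["C", "O", "N"], [(0, 1, "-"), (0, 2, "=")])
g2 = (["N", "C", "O"], [(1, 2, "-"), (0, 1, "=")])
g3 = (["C", "O", "N"], [(0, 1, "="), (0, 2, "-")])

same = canonical_form(*g1) == canonical_form(*g2)
different = canonical_form(*g1) != canonical_form(*g3)
```

In a mining search, a fragment reached through a numbering that fails `is_canonical` could be pruned, since the same fragment is also reached through its canonical numbering. The forms discussed in this chapter serve the same purpose but restrict the candidate code words to those arising from systematic spanning tree constructions rather than all permutations.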