All normalized anti-monotonic overlap graph measures are bounded
In graph mining, a frequency measure for graphs is anti-monotonic if the frequency of a pattern never exceeds the frequency of any of its subpatterns. The efficiency and correctness of most graph pattern miners rely critically on this property. We study the case where frequent subgraphs must be found in a single graph. Vanetik et al. (Data Min Knowl Disc 13(2):243–260, 2006) gave necessary and sufficient conditions for the anti-monotonicity of graph measures that depend only on the edge-overlaps between the instances of the pattern in a labeled graph. We extend these results to homomorphisms, isomorphisms and homeomorphisms on labeled and unlabeled, directed and undirected graphs, for both vertex- and edge-overlap. We show a set of reductions between the different morphisms that preserve overlap. As a secondary contribution, we prove that the popular maximum independent set measure assigns the minimal possible normalized frequency, and we introduce a new measure based on the minimum clique partition that assigns the maximum possible normalized frequency. It follows that all normalized anti-monotonic overlap graph measures are bounded from above and below. We also introduce a new measure, sandwiched between the former two, based on the polynomial-time computable Lovász θ-function.
Keywords: Graph mining; Large network setting; Support measure; Anti-monotonicity; Morphism reductions
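To make the two bounds in the abstract concrete, the following minimal sketch builds a vertex-overlap graph from a list of pattern instances and computes, by brute force, the size of a maximum independent set and of a minimum clique partition. It is an illustration under stated assumptions, not the paper's implementation: the function names (`overlap_graph`, `max_independent_set_size`, `min_clique_partition_size`) are hypothetical, instances are assumed to be given as tuples of data-graph vertices, no normalization is applied, and the exhaustive search is only practical for a handful of instances. The Lovász θ value is not computed here because it requires a semidefinite programming solver.

```python
from itertools import combinations


def overlap_graph(instances):
    """Build a vertex-overlap graph (assumed representation).

    Node i stands for instances[i]; two nodes are adjacent when the
    corresponding instances share at least one data-graph vertex.
    """
    adj = {i: set() for i in range(len(instances))}
    for i, j in combinations(range(len(instances)), 2):
        if set(instances[i]) & set(instances[j]):
            adj[i].add(j)
            adj[j].add(i)
    return adj


def max_independent_set_size(adj):
    """Size of a largest set of pairwise non-overlapping instances (brute force)."""
    nodes = list(adj)
    for k in range(len(nodes), 0, -1):
        for subset in combinations(nodes, k):
            if all(v not in adj[u] for u, v in combinations(subset, 2)):
                return k
    return 0


def min_clique_partition_size(adj):
    """Smallest number of cliques that partition the nodes (brute force)."""
    nodes = list(adj)
    best = len(nodes)  # every node in its own clique always works

    def place(idx, cliques):
        nonlocal best
        if len(cliques) >= best:
            return  # cannot improve on the best partition found so far
        if idx == len(nodes):
            best = len(cliques)
            return
        v = nodes[idx]
        # Try adding v to each existing clique it is fully adjacent to.
        for clique in cliques:
            if all(u in adj[v] for u in clique):
                clique.append(v)
                place(idx + 1, cliques)
                clique.pop()
        # Or open a new clique containing only v.
        cliques.append([v])
        place(idx + 1, cliques)
        cliques.pop()

    place(0, [])
    return best


# Pattern: a single edge. Data graph: a cycle on vertices 1..5,
# so the five instances overlap cyclically (the overlap graph is a 5-cycle).
instances = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)]
adj = overlap_graph(instances)
print(max_independent_set_size(adj))   # 2
print(min_clique_partition_size(adj))  # 3
```

In this example the two quantities differ (2 versus 3), so there is room for a measure strictly in between. For the 5-cycle the Lovász θ-function is known to equal √5 ≈ 2.236, which lies between them, consistent with the sandwich relation mentioned in the abstract.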