# All normalized anti-monotonic overlap graph measures are bounded

## Abstract

In graph mining, a frequency measure for graphs is anti-monotonic if the frequency of a pattern never exceeds the frequency of any of its subpatterns. The efficiency and correctness of most graph pattern miners rely critically on this property. We study the case where frequent subgraphs have to be found in a single graph. Vanetik et al. (Data Min Knowl Disc 13(2):243–260, 2006) already gave necessary and sufficient conditions for anti-monotonicity of graph measures that depend only on the edge-overlaps between the instances of the pattern in a labeled graph. We extend these results to homomorphisms, isomorphisms and homeomorphisms on both labeled and unlabeled, directed and undirected graphs, for vertex- and edge-overlap. We show a set of reductions between the different morphisms that preserve overlap. As a secondary contribution, we prove that the popular maximum independent set measure assigns the minimal possible normalized frequency, and we introduce a new measure based on the minimum clique partition that assigns the maximum possible normalized frequency. It follows that all normalized anti-monotonic overlap graph measures are bounded from above and below. We also introduce a new measure, sandwiched between the former two, based on the polynomial-time computable Lovász *θ*-function.
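The two bounding measures can be illustrated on a toy overlap graph. The sketch below is not taken from the paper: it assumes an overlap graph whose vertices are pattern instances and whose edges join overlapping instances, and it uses hypothetical helper names (`max_independent_set_size`, `min_clique_partition_size`) with brute-force search that is only practical for very small graphs. The 5-cycle is chosen because the two bounds differ on it.

```python
from itertools import combinations


def max_independent_set_size(n, edges):
    """Size of a maximum independent set on vertices 0..n-1 (brute force)."""
    adjacent = {frozenset(e) for e in edges}
    # Try subset sizes from largest to smallest; the first hit is optimal.
    for k in range(n, 0, -1):
        for subset in combinations(range(n), k):
            if all(frozenset(p) not in adjacent
                   for p in combinations(subset, 2)):
                return k
    return 0


def min_clique_partition_size(n, edges):
    """Minimum number of cliques that partition vertices 0..n-1 (brute force)."""
    adjacent = {frozenset(e) for e in edges}
    best = n  # n singleton cliques always form a valid partition

    def place(v, blocks):
        nonlocal best
        if len(blocks) >= best:
            return  # this branch cannot improve on the best partition found
        if v == n:
            best = len(blocks)
            return
        # Option 1: add v to an existing block if it stays a clique.
        for block in blocks:
            if all(frozenset((v, u)) in adjacent for u in block):
                block.append(v)
                place(v + 1, blocks)
                block.pop()
        # Option 2: open a new block containing only v.
        blocks.append([v])
        place(v + 1, blocks)
        blocks.pop()

    place(0, [])
    return best


# Toy overlap graph (an assumption for illustration): five instances
# overlapping in a cycle, i.e. the 5-cycle C5.
c5 = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
lower = max_independent_set_size(5, c5)   # maximum independent set measure
upper = min_clique_partition_size(5, c5)  # minimum clique partition measure
print(lower, upper)
```

On this graph the independent set value is 2 and the clique partition value is 3. The Lovász *θ*-function of the 5-cycle is known to equal √5 ≈ 2.236, which lies strictly between the two; computing it in general requires a semidefinite programming solver and is omitted here.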

## Keywords

Graph mining · Large network setting · Support measure · Anti-monotonicity · Morphism reductions

## References

- Bandyopadhyay S, Sharan R, Ideker T (2006) Systematic identification of functional orthologs based on protein network comparison. Genome Res 16(3): 428–435
- Bollobás B (2001) Random graphs. Cambridge University Press, Cambridge
- Brimkov VE (2004) Clique, chromatic, and Lovász numbers of certain circulant graphs. Electron Notes Discr Math 17: 63–67
- Bringmann B, Nijssen S (2007) What is frequent in a single graph? In: Proceedings of mining and learning with graphs (MLG), Florence, Italy
- Crespi V (2004) Exact formulae for the Lovász theta function of sparse circulant graphs. SIAM J Discr Math 17(4): 670–674
- De Raedt L, Kramer S (2001) The levelwise version space algorithm and its application to molecular fragment finding. In: Nebel B (ed) Proceedings of the 17th international joint conference on artificial intelligence. Morgan Kaufmann, CA, pp 853–862
- Diestel R (2000) Graph theory. Springer, New York
- Fiedler M, Borgelt C (2007) Support computation for mining frequent subgraphs in a single graph. In: Proceedings of the fifth workshop on mining and learning with graphs (MLG’07), Florence
- Fürer M, Kasiviswanathan SP (2008) Approximately counting embeddings into random graphs. In: Proceedings of the 11th international workshop, APPROX 2008, and 12th international workshop, RANDOM 2008 on approximation, randomization and combinatorial optimization: algorithms and techniques, Boston, MA, USA, pp 416–429
- Gross JL, Yellen J (2004) Handbook of graph theory. CRC Press, Boston
- Grünewald et al (2007) Qnet: An agglomerative method for the construction of phylogenetic networks from weighted quartets. Mol Biol Evol 24(2): 532–538
- He H, Singh AK (2007) Efficient algorithms for mining significant substructures in graphs with quality guarantees. In: IEEE international conference on data mining, Omaha, Nebraska, pp 163–172
- Hell P, Nešetřil J (2004) Graphs and homomorphisms. Oxford University Press, Oxford
- Kashtan N, Itzkovitz S, Milo R, Alon U (2004) Efficient sampling algorithm for estimating subgraph concentrations and detecting network motifs. Bioinformatics 20(11): 1746–1758
- Knuth DE (1994) The sandwich theorem. Electron J Combin 1: 48
- Kuramochi M, Karypis G (2005) Finding frequent patterns in a large sparse graph. Data Min Knowl Discov 11(3): 243–271
- LaPaugh AS, Rivest RL (1978) The subgraph homeomorphism problem. In: STOC ’78. ACM Press, New York, NY, USA, pp 40–50
- McGlohon M, Leskovec J, Faloutsos C, Hurst M, Glance N (2007) Finding patterns in blog shapes and blog evolution. In: Proceedings of the international conference on weblogs and social media, Boulder, CO, USA, pp 26–28
- Muggleton S, De Raedt L (1994) Inductive logic programming: theory and methods. J Logic Prog 19, 20: 629–679
- Muzychuk M (2004) A solution of the isomorphism problem for circulant graphs. Proc Lond Math Soc 3: 1–41
- Papadimitriou CH (1994) Computational complexity. Addison-Wesley, Boston
- Ramon J, Francis T, Blockeel H (2000) Learning a Tsume-Go heuristic with Tilde. In: Proceedings of CG2000, the second international conference on computers and games, Hamamatsu, Japan. Lecture Notes in Computer Science, vol 2063. Springer, NY, pp 151–169
- Tong H, Faloutsos C, Gallagher B, Eliassi-Rad T (2007) Fast best-effort pattern matching in large attributed graphs. In: KDD ’07: proceedings of the 13th ACM SIGKDD international conference on knowledge discovery and data mining. ACM, New York, NY, USA, pp 737–746
- Vanetik N, Shimony SE, Gudes E (2006) Support measures for graph data. Data Min Knowl Discov 13(2): 243–260