An efficiently computable subgraph pattern support measure: counting independent observations

Wang, Yuyi; Ramon, Jan; Fannes, Thomas

doi:10.1007/s10618-013-0318-x

Published: 09 May 2013

Data Mining and Knowledge Discovery, Volume 27, pages 444–477 (2013)

Abstract

Graph support measures are functions measuring how frequently a given subgraph pattern occurs in a given database graph. An important class of support measures relies on overlap graphs. A major advantage of overlap-graph based approaches is that they combine anti-monotonicity with counting the occurrences of a subgraph pattern which are independent according to certain criteria. However, existing overlap-graph based support measures are expensive to compute. In this paper, we propose a new support measure which is based on a new notion of independence. We show that our measure is the solution to a sparse linear program, which can be computed efficiently using interior point methods. We study the anti-monotonicity and other properties of this new measure, and relate it to the statistical power of a sample of embeddings in a network. We show experimentally that, in contrast to earlier overlap-graph based proposals, our support measure makes it feasible to mine subgraph patterns in large networks.
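To make the overlap-graph idea concrete, the following is a minimal sketch of the classical maximum-independent-set style of support that the abstract describes as expensive. It is not the new measure proposed in the paper, whose linear program is not stated on this page. The sketch assumes that each embedding is represented by the set of database-graph vertices it uses, and that two embeddings overlap when they share a vertex; the function names `overlap_graph` and `mis_support` and the toy data are hypothetical.

```python
from itertools import combinations


def overlap_graph(embeddings):
    """Adjacency sets of the overlap graph.

    Node i stands for embeddings[i]; i and j are adjacent when the two
    embeddings share at least one database vertex (assumed overlap notion).
    """
    adj = {i: set() for i in range(len(embeddings))}
    for i, j in combinations(range(len(embeddings)), 2):
        if embeddings[i] & embeddings[j]:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def mis_support(embeddings):
    """Size of a maximum independent set of the overlap graph.

    Brute force over subsets, largest first. The running time is
    exponential in the number of embeddings, which illustrates why
    measures of this kind do not scale to large networks.
    """
    adj = overlap_graph(embeddings)
    n = len(embeddings)
    for k in range(n, 0, -1):
        for subset in combinations(range(n), k):
            if all(b not in adj[a] for a, b in combinations(subset, 2)):
                return k
    return 0


# Toy data (hypothetical): a single-edge pattern matched in a star with
# centre 0 and leaves 1, 2, 3, plus one disjoint edge between 4 and 5.
embeddings = [
    frozenset({0, 1}),
    frozenset({0, 2}),
    frozenset({0, 3}),
    frozenset({4, 5}),
]
print(mis_support(embeddings))  # 2
```

The three star embeddings all share vertex 0, so at most one of them can be counted as independent; together with the disjoint edge this gives a support of 2, even though the pattern has four embeddings.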

Notes

This system is part of the MIPS project, available from http://people.cs.kuleuven.be/~jan.ramon/MiGraNT/MIPS/
http://odysseas.calit2.uci.edu/doku.php/public:online_social_networks

Acknowledgments

This work was supported by ERC Starting Grant 240186 “MiGraNT: Mining Graphs and Networks: a Theory-based approach”.

Author information

Authors and Affiliations

Department of Computer Science, KU Leuven, 3001 Heverlee, Leuven, Belgium
Yuyi Wang, Jan Ramon & Thomas Fannes

Corresponding author

Correspondence to Yuyi Wang.

Additional information

Communicated by Tijl De Bie and Peter Flach.

About this article

Cite this article

Wang, Y., Ramon, J. & Fannes, T. An efficiently computable subgraph pattern support measure: counting independent observations. Data Min Knowl Disc 27, 444–477 (2013). https://doi.org/10.1007/s10618-013-0318-x

Received: 18 November 2012
Accepted: 19 April 2013
Published: 09 May 2013
Issue Date: November 2013
DOI: https://doi.org/10.1007/s10618-013-0318-x
