Motif detection speed up by using equations based on the degree sequence

  • Original Article
Social Network Analysis and Mining

Abstract

To identify so-called network motifs, the fixed degree sequence model (fdsm) is usually used. For any real-world network, the fdsm is defined as the set of all graphs with the same degree sequence that have neither multiple edges between nodes nor self-loops. A subgraph is called a network motif if it occurs statistically significantly more often in the observed network than expected in the model. However, approximating this expected value by sampling from the fdsm is computationally expensive and does not scale to large networks. Thus, in this article, we propose a set of equations, based on the degree sequence and a simple independence assumption, to estimate the occurrence of a set of subgraphs in the fdsm. Based on a range of real-world networks, we show that these equations approximate the values in the fdsm very well, with the exception of two data sets. We then propose an efficient way to characterize those data sets in which the equations can be used as an approximation to the fdsm.
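
The article's own equations are not reproduced on this page. As a rough illustration of the general idea only, the following minimal sketch estimates an expected triangle count from nothing but the degree sequence. It assumes the common approximation that nodes i and j are connected with probability d_i * d_j / (2m), and that edges occur independently; both the formula and the function name `expected_triangles` are assumptions made for this example, not the authors' method.

```python
from itertools import combinations


def expected_triangles(degrees):
    """Estimate the expected number of triangles from a degree sequence alone.

    Assumption (not taken from the article): an edge between nodes i and j
    exists with probability d_i * d_j / (2m), independently of other edges.
    """
    two_m = sum(degrees)  # the degree sum equals twice the number of edges
    if two_m == 0:
        return 0.0

    def p(i, j):
        # The raw ratio can exceed 1 for pairs of hubs, so it is capped at 1.
        return min(1.0, degrees[i] * degrees[j] / two_m)

    # Sum the probability that all three edges of a triple are present.
    return sum(
        p(i, j) * p(j, k) * p(i, k)
        for i, j, k in combinations(range(len(degrees)), 3)
    )


if __name__ == "__main__":
    print(expected_triangles([2, 2, 2, 2]))
```

The brute-force sum over all triples costs O(n^3) and is meant only to make the independence assumption concrete. An estimate of this kind avoids sampling graphs from the fdsm, but how well it matches the fdsm for a given network is the question the article addresses.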

Notes

  1. http://www.weizmann.ac.il/mcb/UriAlon/e-coli-transcription-network.

  2. For further information see http://pil.phys.uniroma1.it/~gcalda/cosinsite/extra/data/foodwebs/WEB.html.

  3. http://wws.weizmann.ac.il/mcb/UriAlon/download/collection-complex-networks.

Acknowledgments

This work was funded by the DFG SPP 1736. We thank the official and unofficial reviewers for their helpful comments and insights.

Author information

Corresponding author

Correspondence to Wolfgang E. Schlauch.

About this article

Cite this article

Schlauch, W.E., Zweig, K.A. Motif detection speed up by using equations based on the degree sequence. Soc. Netw. Anal. Min. 6, 47 (2016). https://doi.org/10.1007/s13278-016-0357-6

  • DOI: https://doi.org/10.1007/s13278-016-0357-6
