Network Motif Discovery Using Subgraph Enumeration and Symmetry-Breaking

  • Joshua A. Grochow
  • Manolis Kellis
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4453)

Abstract

The study of biological networks and network motifs can yield significant new insights into systems biology. Previous methods of discovering network motifs – network-centric subgraph enumeration and sampling – have been limited to motifs of 6 to 8 nodes, revealing only the smallest network components. New methods are necessary to identify larger network sub-structures and functional motifs.

Here we present a novel algorithm for discovering large network motifs that achieves these goals. It is based on a symmetry-breaking technique that eliminates repeated isomorphism testing, leading to an exponential speed-up over previous methods. This technique is made possible by replacing the traditional network-based search at the heart of the algorithm with a motif-based search, which also eliminates the need to store all motifs of a given size and enables parallelization and scaling. Additionally, our method enables us to study the clustering properties of discovered motifs, revealing even larger network elements.
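As an illustration of the idea, the following minimal Python sketch searches a network for one fixed query graph (a motif-based search) and uses ordering constraints derived from the query's automorphisms so that each instance is reported once rather than once per symmetry. It is not the authors' implementation. The function names, the dictionary-of-sets network representation, the restriction to undirected graphs, the edge-preserving (not necessarily induced) notion of a match, and the brute-force automorphism enumeration are all assumptions made for this example, and the brute-force step is practical only for very small query graphs.

```python
from itertools import permutations


def automorphisms(n, edges):
    """Brute force: all permutations of 0..n-1 that map the edge set onto itself."""
    edge_set = {frozenset(e) for e in edges}
    return [p for p in permutations(range(n))
            if {frozenset((p[u], p[v])) for u, v in edges} == edge_set]


def symmetry_conditions(n, edges):
    """Return pairs (a, b) meaning: the image of a must be smaller than the image of b.

    Repeatedly pick a query node that some remaining automorphism moves, force it
    to take the smallest image within its orbit, then keep only the automorphisms
    that fix it (its stabilizer), until only the identity remains.
    """
    group = automorphisms(n, edges)
    conditions = []
    while len(group) > 1:
        a = min(v for v in range(n) if any(p[v] != v for p in group))
        orbit = {p[a] for p in group}
        conditions += [(a, b) for b in sorted(orbit) if b != a]
        group = [p for p in group if p[a] == a]
    return conditions


def count_instances(n, edges, network, conditions=()):
    """Count injective, edge-preserving maps of the query into the network.

    network maps each node label to the set of its neighbours; labels must be
    mutually comparable. Maps violating any ordering condition are pruned.
    """
    q_adj = {v: set() for v in range(n)}
    for u, v in edges:
        q_adj[u].add(v)
        q_adj[v].add(u)
    mapping, used = {}, set()

    def extend(q):
        if q == n:
            return 1
        total = 0
        for g in network:
            if g in used:
                continue
            # Every already-mapped query neighbour of q must map to a neighbour of g.
            if any(m in mapping and mapping[m] not in network[g] for m in q_adj[q]):
                continue
            mapping[q] = g
            if all(mapping[a] < mapping[b] for a, b in conditions
                   if a in mapping and b in mapping):
                used.add(g)
                total += extend(q + 1)
                used.discard(g)
            del mapping[q]
        return total

    return extend(0)


# Usage: a triangle query in the complete graph on four nodes.
k4 = {i: {j for j in range(4) if j != i} for i in range(4)}
triangle = [(0, 1), (1, 2), (0, 2)]
print(count_instances(3, triangle, k4))                                    # 24
print(count_instances(3, triangle, k4, symmetry_conditions(3, triangle)))  # 4
```

Without the conditions, each of the four triangles is found six times, once per automorphism of the query. With them, each is found once, so no isomorphism test between found instances is needed afterwards.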

We apply this algorithm to the protein-protein interaction network and transcription regulatory network of S. cerevisiae and discover several large network motifs that were inaccessible to previous methods, including a 29-node cluster of 15-node motifs corresponding to the key transcription machinery of S. cerevisiae.

Copyright information

© Springer Berlin Heidelberg 2007

Authors and Affiliations

  • Joshua A. Grochow (1)
  • Manolis Kellis (1)
  1. Computer Science and AI Laboratory, M.I.T., Broad Institute of M.I.T. and Harvard
