Network Motif Discovery Using Subgraph Enumeration and Symmetry-Breaking
The study of biological networks and network motifs can yield significant new insights into systems biology. Previous methods of discovering network motifs – network-centric subgraph enumeration and sampling – have been limited to motifs of 6 to 8 nodes, revealing only the smallest network components. New methods are necessary to identify larger network sub-structures and functional motifs.
Here we present a novel algorithm for discovering large network motifs that meets this need. It is based on a symmetry-breaking technique that eliminates repeated isomorphism testing, leading to an exponential speed-up over previous methods. The technique is made possible by replacing the traditional network-based search at the heart of the algorithm with a motif-based search, which also removes the need to store all motifs of a given size and enables parallelization and scaling. Additionally, our method lets us study the clustering properties of discovered motifs, revealing even larger network elements.
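To make the idea of a motif-based search with symmetry breaking concrete, the following is a minimal sketch, not the implementation described in this paper. It assumes small undirected graphs with integer node identifiers, finds the automorphisms of the query graph by brute force, and counts induced occurrences only. The function names (`automorphisms`, `symmetry_conditions`, `count_occurrences`) are hypothetical and chosen for illustration.

```python
from itertools import permutations


def automorphisms(nodes, edges):
    """Brute-force all automorphisms of a small undirected query graph."""
    edge_set = {frozenset(e) for e in edges}
    found = []
    for perm in permutations(nodes):
        mapping = dict(zip(nodes, perm))
        # A node bijection that sends every edge to an edge is an automorphism.
        if all(frozenset((mapping[a], mapping[b])) in edge_set for a, b in edges):
            found.append(mapping)
    return found


def symmetry_conditions(nodes, edges):
    """Derive ordering constraints (a, b), meaning image(a) < image(b)."""
    group = automorphisms(nodes, edges)
    conditions = []
    while len(group) > 1:
        # Pick a query node that some remaining automorphism moves.
        v = next(n for n in nodes if any(a[n] != n for a in group))
        orbit = {a[v] for a in group}
        conditions.extend((v, u) for u in sorted(orbit) if u != v)
        # Keep only the automorphisms that fix v (its stabilizer).
        group = [a for a in group if a[v] == v]
    return conditions


def count_occurrences(q_nodes, q_edges, g_adj, break_symmetry=True):
    """Count induced embeddings of the query graph in the network g_adj.

    g_adj maps each network node to the set of its neighbours.
    """
    q_adj = {n: set() for n in q_nodes}
    for a, b in q_edges:
        q_adj[a].add(b)
        q_adj[b].add(a)
    conds = symmetry_conditions(q_nodes, q_edges) if break_symmetry else []
    order = list(q_nodes)

    def extend(mapping):
        if len(mapping) == len(order):
            return 1
        q = order[len(mapping)]
        total = 0
        for g in g_adj:
            if g in mapping.values():
                continue
            # Induced match: adjacency must agree with every mapped query node.
            if any((p in q_adj[q]) != (mapping[p] in g_adj[g]) for p in mapping):
                continue
            mapping[q] = g
            # Prune any partial mapping that violates an ordering constraint.
            if all(mapping[a] < mapping[b]
                   for a, b in conds if a in mapping and b in mapping):
                total += extend(mapping)
            del mapping[q]
        return total

    return extend({})


# Example: a triangle query in the complete graph on four nodes.
k4 = {i: {j for j in range(4) if j != i} for i in range(4)}
triangle_nodes = [0, 1, 2]
triangle_edges = [(0, 1), (1, 2), (0, 2)]
print(count_occurrences(triangle_nodes, triangle_edges, k4))         # 4
print(count_occurrences(triangle_nodes, triangle_edges, k4, False))  # 24
```

With the ordering constraints in place, each of the four triangles in the example is reported once, rather than once per automorphism of the query (six times), so no later isomorphism test is needed to discard duplicates. A practical implementation would likely replace the brute-force automorphism search with a dedicated tool and use a smarter search order; both are simplified here.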
We apply this algorithm to the protein-protein interaction network and the transcription regulatory network of S. cerevisiae and discover several large network motifs that were inaccessible to existing methods, including a 29-node cluster of 15-node motifs corresponding to the key transcription machinery of S. cerevisiae.
Keywords: Motif Generalization, Network Motif, Connected Subgraph, Subgraph Isomorphism, Query Graph