Abstract
In recent years, many algorithms have been developed to narrow down the set of candidate disease genes implicated by genome-wide association studies (GWAS), using knowledge of protein-protein interactions (PPIs). All of these algorithms are based on a common principle: functional association between proteins is correlated with their connectivity or proximity in the PPI network. However, recent research also reveals that networks are organized into recurrent network schemes that underlie the mechanisms of cooperation among proteins with different functions, as well as the crosstalk between different cellular processes. Motivated by these observations, we hypothesize that proteins associated with similar diseases may exhibit patterns of “topological similarity” in PPI networks. We introduce the notion of a “topological profile”, which represents the location of a protein in the network with respect to other proteins. Based on this notion, we develop a novel measure to assess the topological similarity of proteins in a PPI network. We then use this measure to develop algorithms that prioritize candidate disease genes based on the topological similarity between their products and the products of known disease genes. Systematic experimental studies using an integrated human PPI network and the Online Mendelian Inheritance in Man (OMIM) database show that the proposed algorithm, Vavien, clearly outperforms state-of-the-art network-based prioritization algorithms. Vavien is available as a web service at http://www.diseasegenes.org.
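The idea of ranking candidates by topological similarity can be illustrated with a minimal sketch. The abstract does not specify how a topological profile or the similarity measure is computed, so the choices below are assumptions made for illustration only: the profile of a protein is taken to be its random-walk-with-restart proximity to every node, similarity is taken to be the Pearson correlation of two profiles, and a candidate's score is its mean similarity to the known disease proteins. The function names (`rwr_profile`, `pearson`, `prioritize`), the restart probability, and the toy network are all hypothetical and are not taken from Vavien.

```python
from math import sqrt


def rwr_profile(adj, start, restart=0.3, iters=100):
    """Assumed profile: random-walk-with-restart proximity of `start`
    to every node. `adj` maps each node to a list of its neighbours and
    must not contain isolated nodes."""
    nodes = sorted(adj)
    p = {v: 0.0 for v in nodes}
    p[start] = 1.0
    for _ in range(iters):
        nxt = {v: 0.0 for v in nodes}
        for u in nodes:
            # Spread the non-restarting mass evenly over neighbours.
            share = (1.0 - restart) * p[u] / len(adj[u])
            for w in adj[u]:
                nxt[w] += share
        nxt[start] += restart  # restart mass returns to the start node
        p = nxt
    return [p[v] for v in nodes]


def pearson(x, y):
    """Pearson correlation of two equal-length vectors (0.0 if constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def prioritize(adj, seeds, candidates, restart=0.3):
    """Rank candidates by mean profile similarity to the seed proteins."""
    seed_profiles = [rwr_profile(adj, s, restart) for s in seeds]
    scores = {}
    for c in candidates:
        prof = rwr_profile(adj, c, restart)
        sims = [pearson(prof, sp) for sp in seed_profiles]
        scores[c] = sum(sims) / len(sims)
    ranked = sorted(candidates, key=lambda c: scores[c], reverse=True)
    return ranked, scores


# Hypothetical toy network: two star-like modules joined through C.
ppi = {
    "A": ["H"], "B": ["H"], "H": ["A", "B", "C"],
    "C": ["H", "D"], "D": ["C", "E", "F"], "E": ["D"], "F": ["D"],
}
# A is the known disease protein; B sits in the same position around H,
# while E occupies a similar position in the other module.
ranked, scores = prioritize(ppi, seeds=["A"], candidates=["B", "E"])
print(ranked, scores)
```

In this toy network B and E have the same local structure (both are leaves attached to a hub), but only B's profile places it in the same region of the network as the seed A, so B should rank above E under the assumed measure.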
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Erten, S., Bebek, G., Koyutürk, M. (2011). Disease Gene Prioritization Based on Topological Similarity in Protein-Protein Interaction Networks. In: Bafna, V., Sahinalp, S.C. (eds) Research in Computational Molecular Biology. RECOMB 2011. Lecture Notes in Computer Science, vol 6577. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20036-6_7
DOI: https://doi.org/10.1007/978-3-642-20036-6_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-20035-9
Online ISBN: 978-3-642-20036-6
eBook Packages: Computer Science (R0)