Predicting Node Characteristics from Molecular Networks

  • Sara Mostafavi
  • Anna Goldenberg
  • Quaid Morris
Part of the Methods in Molecular Biology book series (MIMB, volume 781)


A large number of genome-scale networks, including protein–protein and genetic interaction networks, are now available for several organisms. In parallel, many studies have focused on analyzing, characterizing, and modeling these networks. Beyond investigating topological characteristics such as degree distribution, clustering coefficient, and average shortest-path distance, another area of particular interest is predicting which nodes (genes) carry a given characteristic (label), for example, which genes cause a particular phenotype or have a given function. In this chapter, we describe methods and algorithms for predicting node labels from network-based datasets, with an emphasis on label propagation algorithms (LPAs) and their relation to local neighborhood methods.
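To make the contrast between the two families of methods concrete, the following is a minimal sketch, not the chapter's own implementation. It assumes one widely used label propagation formulation (iterating f ← αSf + (1 − α)y on a symmetrically normalized network S) and a simple weighted neighbor vote as the local neighborhood baseline. The function names, the toy four-gene path network, and the value α = 0.8 are all illustrative assumptions.

```python
import math

def normalize(W):
    """Symmetric normalization S = D^-1/2 W D^-1/2 of a weighted adjacency matrix."""
    deg = [sum(row) for row in W]
    n = len(W)
    return [[W[i][j] / math.sqrt(deg[i] * deg[j]) if W[i][j] else 0.0
             for j in range(n)] for i in range(n)]

def propagate(W, y, alpha=0.8, iters=200):
    """Label propagation sketch: repeat f <- alpha * S f + (1 - alpha) * y."""
    S = normalize(W)
    n = len(y)
    f = list(y)
    for _ in range(iters):
        f = [alpha * sum(S[i][j] * f[j] for j in range(n)) + (1 - alpha) * y[i]
             for i in range(n)]
    return f

def neighbor_vote(W, y):
    """Local neighborhood baseline: weighted sum of direct neighbors' labels."""
    return [sum(w * label for w, label in zip(row, y)) for row in W]

# Hypothetical toy network: a path of four genes, 0 - 1 - 2 - 3.
W = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
y = [1, 0, 0, 0]  # gene 0 is a known positive; the others are unlabeled

scores = propagate(W, y)    # every gene gets a score, decaying with distance
votes = neighbor_vote(W, y)  # only gene 1, the direct neighbor, gets a vote
print(scores)
print(votes)
```

On this toy network the neighbor vote scores only the gene directly linked to the known positive, whereas the propagated scores are nonzero for every gene and shrink with network distance from gene 0. That is the sense in which propagation extends a purely local method to indirect neighbors.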

Key words

Functional linkage networks; Gene function prediction; Label propagation


Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  • Sara Mostafavi (1)
  • Anna Goldenberg (2)
  • Quaid Morris (3, 4)
  1. Department of Computer Science, Centre for Cellular and Biomolecular Research (CCBR), University of Toronto, Toronto, Canada
  2. Banting and Best Department of Medical Research, Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Canada
  3. Department of Computer Science, Banting and Best Department of Medical Research, Centre for Cellular and Biomolecular Research, Toronto, Canada
  4. Department of Molecular Genetics, University of Toronto, Toronto, Canada
