Functional Interaction Network Construction and Analysis for Disease Discovery

  • Guanming Wu
  • Robin Haw
Part of the Methods in Molecular Biology book series (MIMB, volume 1558)


Network-based approaches project seemingly unrelated genes or proteins onto a large-scale network context. They thereby provide a holistic platform for visualizing and analyzing genomic data generated from high-throughput experiments, reduce the dimensionality of the data by using network modules, and increase statistical power. Based on Reactome, the most popular and comprehensive open-source biological pathway knowledgebase, we have developed a highly reliable protein functional interaction network covering around 60% of all human genes, together with ReactomeFIViz, an app for Cytoscape, the most popular biological network visualization and analysis platform. In this chapter, we describe in detail how this functional interaction network is constructed: integrating multiple external data sources, extracting functional interactions from human-curated pathway databases, training a machine learning classifier (a Naïve Bayesian Classifier), predicting interactions with the trained classifier, and finally building the functional interaction database. We also provide an example of how to use ReactomeFIViz to perform network-based data analysis on a list of genes.
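The classifier step summarized above can be illustrated with a minimal sketch. It is not the chapter's implementation (the key words suggest that is written in Java); it is a small Python illustration of a naïve Bayes classifier over binary evidence features for a protein pair. The three feature names, the toy training pairs, and the function names `train_naive_bayes` and `predict_fi_probability` are all hypothetical, chosen only for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple

# Hypothetical binary evidence features for one protein pair (assumption):
# (has_ppi_evidence, is_coexpressed, shares_go_term)
Example = Tuple[Sequence[bool], bool]  # (features, is_functional_interaction)


def train_naive_bayes(examples: Sequence[Example], alpha: float = 1.0) -> Dict:
    """Fit a Bernoulli naive Bayes model with Laplace smoothing `alpha`."""
    n_features = len(examples[0][0])
    class_counts = {True: 0, False: 0}
    # on_counts[label][j] = number of `label` examples with feature j set
    on_counts: Dict[bool, List[int]] = {
        True: [0] * n_features,
        False: [0] * n_features,
    }
    for features, label in examples:
        class_counts[label] += 1
        for j, value in enumerate(features):
            if value:
                on_counts[label][j] += 1
    if 0 in class_counts.values():
        raise ValueError("need at least one positive and one negative example")
    total = len(examples)
    prior = {c: class_counts[c] / total for c in (True, False)}
    # Smoothed estimate of P(feature j is set | class c)
    likelihood = {
        c: [(on_counts[c][j] + alpha) / (class_counts[c] + 2 * alpha)
            for j in range(n_features)]
        for c in (True, False)
    }
    return {"prior": prior, "likelihood": likelihood}


def predict_fi_probability(model: Dict, features: Sequence[bool]) -> float:
    """Return the posterior probability that a pair is a functional interaction."""
    log_score = {}
    for c in (True, False):
        score = math.log(model["prior"][c])
        for p_on, value in zip(model["likelihood"][c], features):
            score += math.log(p_on if value else 1.0 - p_on)
        log_score[c] = score
    # Normalize in log space to avoid underflow with many features
    top = max(log_score.values())
    pos = math.exp(log_score[True] - top)
    neg = math.exp(log_score[False] - top)
    return pos / (pos + neg)


# Toy training set (made up): positives would come from curated pathways,
# negatives from pairs with no known shared pathway.
TOY_EXAMPLES: List[Example] = [
    ((True, True, True), True),
    ((True, False, True), True),
    ((True, True, False), True),
    ((False, True, True), True),
    ((False, False, False), False),
    ((False, False, True), False),
    ((False, True, False), False),
    ((False, False, False), False),
]

model = train_naive_bayes(TOY_EXAMPLES)
```

In this sketch, a pair supported by all three evidence types scores close to 1 and a pair supported by none scores close to 0. A real pipeline would pick a probability cutoff on held-out data before adding predicted interactions to the network.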

Key words

Functional interaction; Biological network; Biological pathway; Reactome; Network-based analysis; ReactomeFIViz; Cytoscape; Naïve Bayesian Classifier; Java; MySQL



Copyright information

© Springer Science+Business Media LLC 2017

Authors and Affiliations

  1. Informatics and Biocomputing Program, Ontario Institute for Cancer Research, Toronto, Canada
  2. Department of Medical Informatics and Clinical Epidemiology, Oregon Health & Science University, Portland, USA
