Interrogation of genome-wide networks in biology: comparison of knowledge-based and statistical methods
Abstract
Networks are used extensively in the study of biological systems to address a wide range of questions, such as understanding the complex behaviour of a given system or identifying key alterations leading to a disease phenotype. Numerous network-based methods have been developed for inferring molecular interactions from transcriptomic and proteomic data. Each method comes with its own advantages and limitations, and different methods often give different results for the same data. A systematic study is therefore essential to understand how the methods fare in correctly predicting known biological processes and in yielding testable biological hypotheses. To address this, we compared four methods for deriving context-specific perturbations in two case studies and evaluated their performance. The methods can be broadly classified into statistical inference methods and knowledge-based methods. Two of the four, WGCNA and ARACNE, are data-driven approaches that do not rely on prior network information. In contrast, ResponseNet and jActiveModules start from knowledge-based protein–protein interaction networks and integrate condition-specific transcriptome or proteome data. We evaluated the interactions inferred by all four approaches and assessed their biological relevance using three criteria: (1) enrichment of the gold standard gene sets, (2) comparison to gold standard pathways and (3) recovery, from the context-specific perturbed network, of hub genes known to be related to the given condition. Across both case studies, tuberculosis and melanoma, ResponseNet performed best on all three criteria.
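Criteria (1) and (3) can be illustrated with a minimal sketch. It assumes that enrichment is scored with a one-sided hypergeometric test and that hubs are ranked by node degree; the abstract specifies neither choice, and the function names and toy gene identifiers below are hypothetical.

```python
from collections import Counter
from math import comb


def enrichment_p_value(network_genes, gold_genes, background_size):
    """One-sided hypergeometric test (an assumed scoring choice).

    Returns the observed overlap k and P(X >= k), the chance of drawing at
    least k gold-standard genes when picking len(network_genes) genes at
    random from a background of `background_size` genes. Assumes every
    gold-standard gene belongs to the background.
    """
    network_genes, gold_genes = set(network_genes), set(gold_genes)
    n = len(network_genes)               # genes in the inferred network
    big_k = len(gold_genes)              # gold-standard genes in background
    k = len(network_genes & gold_genes)  # observed overlap
    tail = sum(
        comb(big_k, i) * comb(background_size - big_k, n - i)
        for i in range(k, min(big_k, n) + 1)
    )
    return k, tail / comb(background_size, n)


def top_hubs(edges, top_n):
    """Rank nodes of an undirected edge list by degree and keep the top_n."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return [gene for gene, _ in degree.most_common(top_n)]


# Toy data only: placeholder identifiers, not genes from the study.
edges = [("G1", "G2"), ("G1", "G3"), ("G1", "G4"), ("G2", "G3")]
network = {gene for edge in edges for gene in edge}
gold_standard = {"G1", "G2", "G9"}

overlap, p_value = enrichment_p_value(network, gold_standard, background_size=20)
hubs = top_hubs(edges, top_n=1)
recovered_hubs = set(hubs) & gold_standard
```

In practice the p-values obtained for many gene sets would also need a multiple-testing correction, and the degree cutoff that defines a hub is a free parameter of this sketch.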
Keywords
Perturbation network; Omics; Statistical inference methods; Knowledge-based methods
Abbreviations
- ARACNE: The Algorithm for the Reconstruction of Accurate Cellular Networks
- BNs: Bayesian Networks
- DEG: Differentially expressed gene
- FDR: False discovery rate
- GSGS: Gold standard gene sets
- GSPS: Gold standard pathway sets
- HC: Healthy controls
- hPPiN: Human protein–protein interaction network
- HPRD: Human Protein Reference Database
- MIM: Mutual information matrix
- NS: Normal skin
- PM: Primary melanoma
- STRING: Search Tool for the Retrieval of Interacting Genes/Proteins
- TAP: Top activated paths
- TB: Tuberculosis
- TPN: Top perturbed network
- TRP: Top repressed paths
- WGCNA: Weighted Gene Co-expression Network Analysis
Notes
Acknowledgements
We thank the Department of Biotechnology (DBT), Government of India, for funding. Narmada Sambaturu and Amrisha Bhosle are acknowledged for proofreading the manuscript.