Abstract
The structure of protein-protein interaction (PPI) networks has been studied for over a decade. Many theoretical models have been proposed to describe PPI networks, but persistent noise and incompleteness in the data make firm conclusions difficult. Graphlet-based measures are believed to be among the most discerning and sensitive network comparison tools available. Several such measures have been proposed to quantify topological agreement between networks and models, but little work has been done to compare the measures themselves. The last modeling attempt was four years ago, and an update is due. Using the September 2018 release of BioGRID, we fit eight theoretical models to nine BioGRID networks using four different graphlet-based measures. We find the following: (1) Graph Kernel is the best measure, as judged by ROC and AUPR curves; (2) most graphlet measures disagree on the ordering of the data-model fits, although most agree on the top two models (STICKY and Hyperbolic Geometric) and the bottom two (ER and GEO), in direct contradiction to the conclusion reached four years ago that GEO models fit best; (3) the STICKY model is the best overall fit for these PPI networks, but the Hyperbolic Geometric model fits better than STICKY on four species; and (4) even the best models yield p-values for BioGRID that are many orders of magnitude smaller than 1, thus failing any reasonable hypothesis test. We conclude that, although STICKY is the best fit, all BioGRID networks fail all hypothesis tests against all existing models under all existing graphlet-based measures. Further work is needed to determine whether the data or the models are at fault.
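The abstract ranks measures by ROC and AUPR but does not spell out the scoring procedure. The following is a minimal sketch of one way such a ranking could be computed, under assumptions that are not taken from the paper: each measure assigns a distance to a pair of networks, each pair carries a ground-truth label (1 if both networks were generated by the same model, 0 otherwise), and a good measure gives same-model pairs smaller distances. The function names `auroc` and `average_precision` are hypothetical helpers written for this illustration.

```python
from typing import Sequence


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve, computed as the probability that a random
    positive pair scores higher than a random negative pair (ties count 1/2)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative pair")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Step-wise area under the precision-recall curve (average precision).
    Tied scores keep their input order, so results on ties are order-dependent."""
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive pair")
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits = 0
    ap = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            ap += hits / rank  # precision at this recall step
    return ap / total_pos


# Hypothetical distances from one measure for four network pairs.
distances = [0.1, 0.5, 0.4, 0.9]
same_model = [1, 1, 0, 0]          # assumed ground-truth labels
similarity = [-d for d in distances]  # higher score = more similar

roc_area = auroc(similarity, same_model)
pr_area = average_precision(similarity, same_model)
print(roc_area, pr_area)
```

Repeating this for each measure on the same labelled pairs would give one pair of areas per measure, which could then be compared. Negating the distance is a design choice that lets both functions follow the usual "higher score means positive" convention.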
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Maharaj, S., Ohiba, Z., Hayes, W. (2019). Comparing Different Graphlet Measures for Evaluating Network Model Fits to BioGRID PPI Networks. In: Holmes, I., Martín-Vide, C., Vega-Rodríguez, M. (eds) Algorithms for Computational Biology. AlCoB 2019. Lecture Notes in Computer Science, vol 11488. Springer, Cham. https://doi.org/10.1007/978-3-030-18174-1_4
DOI: https://doi.org/10.1007/978-3-030-18174-1_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-18173-4
Online ISBN: 978-3-030-18174-1
eBook Packages: Computer Science, Computer Science (R0)