Abstract
Networks representing complex biological interactions are often intricate, and their thorough quantitative analysis relies on algorithmic tools. In bi-layered graphs, identifying subgraphs of potential biological meaning relies on identifying bicliques between two sets of associated nodes, or variables – for example, diseases and genetic variants. Researchers have developed multiple approaches for forming bicliques, and it is important to understand the features of these models and their applicability to real-life problems. We introduce a novel algorithm specifically designed for finding maximal bicliques in large datasets. In this study, we applied this algorithm to a variety of networks, including artificially generated networks as well as biological networks based on phenotype-genotype and phenotype-pathway interactions. We analyzed performance with respect to network features including density, node degree distribution, and correlation between nodes, and found density to be the major contributor to computational complexity. We also examined sample bicliques and postulate that they could be useful in elucidating the genetic and biological underpinnings of shared disease etiologies and in guiding hypothesis generation. Moving forward, we propose additional features, such as weighted edges between nodes, that could enhance our study of biological networks.
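To make the notion of a maximal biclique concrete, the following minimal sketch enumerates all maximal bicliques of a tiny bipartite graph by brute force and computes the graph's density. It is not the algorithm introduced in the paper: it tries every subset of left-hand nodes, so it is exponential and only suitable for toy inputs. The function names `maximal_bicliques` and `density`, and the node labels `P1`–`P3` (phenotype-like) and `v1`–`v3` (variant-like), are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def maximal_bicliques(adj):
    """Brute-force enumeration of maximal bicliques (illustrative only).

    adj maps each left node to the set of right nodes it is linked to.
    Returns a set of (frozenset(left), frozenset(right)) pairs.
    """
    left_nodes = sorted(adj)
    found = set()
    for size in range(1, len(left_nodes) + 1):
        for subset in combinations(left_nodes, size):
            # Right nodes adjacent to every left node in the subset.
            common = set.intersection(*(adj[u] for u in subset))
            if not common:
                continue
            # Close the left side: all left nodes adjacent to every node in
            # `common`. The resulting pair cannot be extended on either side.
            closed_left = frozenset(u for u in left_nodes if common <= adj[u])
            found.add((closed_left, frozenset(common)))
    return found


def density(adj, n_right):
    """Fraction of possible left-right edges that are present."""
    edges = sum(len(neighbors) for neighbors in adj.values())
    return edges / (len(adj) * n_right)


# Hypothetical toy network: three left nodes, three right nodes, seven edges.
toy = {
    "P1": {"v1", "v2", "v3"},
    "P2": {"v1", "v2"},
    "P3": {"v2", "v3"},
}

bicliques = maximal_bicliques(toy)
toy_density = density(toy, n_right=3)

for left, right in sorted(bicliques, key=lambda b: (len(b[0]), sorted(b[0]))):
    print(sorted(left), sorted(right))
```

Even in this small example, adding edges tends to increase the number of overlapping bicliques, which gives some intuition for why density dominates the computational cost reported in the abstract.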
Acknowledgments
This work was supported by National Institutes of Health grants LM009012, LM010098, and EY022300.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Kershenbaum, A., Cutillo, A., Darabos, C., Murray, K., Schiaffino, R., Moore, J.H. (2016). Bicliques in Graphs with Correlated Edges: From Artificial to Biological Networks. In: Squillero, G., Burelli, P. (eds) Applications of Evolutionary Computation. EvoApplications 2016. Lecture Notes in Computer Science, vol 9597. Springer, Cham. https://doi.org/10.1007/978-3-319-31204-0_10
DOI: https://doi.org/10.1007/978-3-319-31204-0_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-31203-3
Online ISBN: 978-3-319-31204-0
eBook Packages: Computer Science (R0)