Bicliques in Graphs with Correlated Edges: From Artificial to Biological Networks

  • Aaron Kershenbaum
  • Alicia Cutillo
  • Christian Darabos
  • Keitha Murray
  • Robert Schiaffino
  • Jason H. Moore
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9597)

Abstract

Networks representing complex biological interactions are often very intricate and rely on algorithmic tools for thorough quantitative analysis. In bi-layered graphs, identifying subgraphs of potential biological meaning relies on identifying bicliques between two sets of associated nodes, or variables – for example, diseases and genetic variants. Researchers have developed multiple approaches for forming bicliques and it is important to understand the features of these models and their applicability to real-life problems. We introduce a novel algorithm specifically designed for finding maximal bicliques in large datasets. In this study, we applied this algorithm to a variety of networks, including artificially generated networks as well as biological networks based on phenotype-genotype and phenotype-pathway interactions. We analyzed performance with respect to network features including density, node degree distribution, and correlation between nodes, with density being the major contributor to computational complexity. We also examined sample bicliques and postulate that these bicliques could be useful in elucidating the genetic and biological underpinnings of shared disease etiologies and in guiding hypothesis generation. Moving forward, we propose additional features, such as weighted edges between nodes, that could enhance our study of biological networks.
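To make the central object concrete, the following is a minimal sketch of what enumerating maximal bicliques in a small bipartite graph could look like. It is not the algorithm introduced in the paper, whose details are not given in this abstract. It is a brute-force closure approach that is exponential in the number of left-side nodes and only suitable for toy inputs. The function names (`maximal_bicliques`, `density`), the node labels, and the toy disease-to-variant graph are all hypothetical.

```python
from itertools import combinations


def maximal_bicliques(adj):
    """Enumerate maximal bicliques of a bipartite graph by brute force.

    `adj` maps each left-side node (e.g. a disease) to the set of
    right-side nodes (e.g. genetic variants) it is connected to.
    Returns a set of (left_nodes, right_nodes) pairs of frozensets.
    """
    left = sorted(adj)
    found = set()
    for k in range(1, len(left) + 1):
        for subset in combinations(left, k):
            # Right side: nodes adjacent to every member of the subset.
            common = set.intersection(*(adj[u] for u in subset))
            if not common:
                continue
            # Left side: every node adjacent to all of `common`.
            # Closing the subset this way makes the biclique maximal.
            closed = frozenset(u for u in left if common <= adj[u])
            found.add((closed, frozenset(common)))
    return found


def density(adj):
    """Fraction of possible left-right edges that are present."""
    right = set().union(*adj.values())
    edges = sum(len(neighbours) for neighbours in adj.values())
    return edges / (len(adj) * len(right))


# Hypothetical toy graph: three diseases, three variants.
toy = {
    "D1": {"g1", "g2"},
    "D2": {"g1", "g2", "g3"},
    "D3": {"g3"},
}

for left_nodes, right_nodes in sorted(
    maximal_bicliques(toy), key=lambda b: sorted(b[0])
):
    print(sorted(left_nodes), sorted(right_nodes))
print("density:", round(density(toy), 3))
```

On this toy input the sketch finds three maximal bicliques, for example diseases D1 and D2 sharing variants g1 and g2. Because the number of candidate subsets grows quickly as more edges are present, even this small example hints at why the abstract singles out density as the main driver of computational cost.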

Keywords

Bipartite Graph; Degree Distribution; Biological Network; Maximal Clique; Bipartite Network
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Notes

Acknowledgments

This work was supported by National Institutes of Health grants LM009012, LM010098, and EY022300.

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Aaron Kershenbaum (1)
  • Alicia Cutillo (1)
  • Christian Darabos (1, 3)
  • Keitha Murray (2)
  • Robert Schiaffino (2)
  • Jason H. Moore (1)

  1. Institute for Biomedical Informatics, The Perelman School of Medicine, University of Pennsylvania, Philadelphia, USA
  2. Department of Computer Science, Iona College, New Rochelle, USA
  3. Research Computing, Dartmouth College, Hanover, USA
