Integrative Systems Biology Approaches to Identify and Prioritize Disease and Drug Candidate Genes

  • Vivek Kaimal
  • Divya Sardana
  • Eric E. Bardes
  • Ranga Chandra Gudivada
  • Jing Chen
  • Anil G. Jegga
Part of the Methods in Molecular Biology book series (MIMB, volume 700)


Although several computational approaches integrate data from multiple sources to predict or prioritize candidate disease genes, relatively few focus on identifying or ranking drug targets. To address this gap, we have developed an approach that specifically identifies and prioritizes both disease and drug candidate genes. In this chapter, we demonstrate how integrative systems-biology-based approaches, using information extracted from public databases, can identify potential drug targets and candidate genes. We illustrate the method in detail using two neurodegenerative diseases (Alzheimer’s and Parkinson’s) and one neuropsychiatric disease (schizophrenia).

Key words

Candidate gene prioritization; Disease gene ranking; Drug target ranking; Integrative genomics; Systems biology; Alzheimer’s disease; Parkinson’s disease; Schizophrenia



Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  • Vivek Kaimal (1, 4)
  • Divya Sardana (2)
  • Eric E. Bardes (1)
  • Ranga Chandra Gudivada (1)
  • Jing Chen (3)
  • Anil G. Jegga (1, 4, 5)

  1. Division of Biomedical Informatics, Cincinnati Children’s Hospital Medical Center, Cincinnati, USA
  2. Department of Computer Science, College of Engineering, University of Cincinnati, Cincinnati, USA
  3. Department of Environmental Health, University of Cincinnati, Cincinnati, USA
  4. Department of Biomedical Engineering, University of Cincinnati, Cincinnati, USA
  5. Department of Pediatrics, College of Medicine, University of Cincinnati, Cincinnati, USA
