Orphan Diseases, Bioinformatics and Drug Discovery

  • Anil G. Jegga
  • Cheng Zhu
  • Bruce J. Aronow
Part of the Translational Bioinformatics book series (TRBIO, volume 2)


A rare or orphan disease is any disease that affects a small percentage of the population. Because most known orphan diseases are genetic, they are present throughout the life of affected individuals. Many orphan diseases appear early in life, and approximately 30% of children with orphan diseases die before the age of 5. Most of the genes and pathways underlying these diseases remain unknown, which leaves a major gap in orphan disease research. Despite the technological advances and opportunities available for understanding the causes of orphan diseases and developing innovative medical approaches, most current efforts focus on either a single orphan disease or a group of related ones. Relatively few studies have attempted a global analysis of all orphan diseases. Constructing the networks that underlie biological processes and pathways associated with orphan diseases and orphan drugs facilitates identification of the functional units that respond to genetic perturbations and potentially affect disease risk or therapeutics. Analysis of these biological networks can also identify pathways or processes shared by multiple, biologically related orphan diseases. A comprehensive understanding of these molecular bases may provide opportunities for novel interventions that benefit an array of related orphan diseases. In this chapter, we review some of the current bioinformatic analytical options available for orphan disease and drug research, including computational approaches for candidate gene prioritization. We also discuss strategies and present examples and case studies of common drugs being repositioned for the treatment of orphan diseases.


Rare Disease; Exome Sequencing; Orphan Drug; Drug Discovery Process; Orphan Disease
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer Science+Business Media Dordrecht 2012

Authors and Affiliations

  1. Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, USA
  2. Division of Biomedical Informatics, Cincinnati Children’s Hospital Medical Center, Cincinnati, USA
  3. School of Computing Sciences and Informatics, University of Cincinnati College of Engineering and Applied Science, Cincinnati, USA
  4. Genome Informatics Core Laboratory, Division of Biomedical Informatics, Cincinnati Children’s Hospital Medical Center, Cincinnati, USA
