Transcriptomic Data Mining and Repurposing for Computational Drug Discovery

  • Yunguan Wang
  • Jaswanth Yella
  • Anil G. Jegga
Part of the Methods in Molecular Biology book series (MIMB, volume 1903)


Conventional drug discovery is costly and time-consuming, with extremely low success rates and relatively high attrition rates. The disparity between the high cost of drug discovery and vast unmet medical needs has led to the advent of an increasing number of computational approaches that can “connect” a disease with a candidate therapeutic. These include computational drug repurposing, or repositioning, in which the goal is to discover a new indication for an approved drug. Commonly used computational drug discovery approaches are similarity-based and rely on network analysis or machine learning methods. One such approach, commonly referred to as connectivity mapping, matches gene expression signatures from a disease to those from small molecules. In this chapter, we focus on how existing, publicly available transcriptomic data from diseases can be reused to identify novel candidate therapeutics and drug repositioning candidates. To illustrate, we present two case studies: (1) using transcriptional signature similarity, or positive correlation, to identify novel small molecules that are similar to an approved drug and (2) identifying candidate therapeutics via reciprocal connectivity, or negative correlation, between the transcriptional signatures of a disease and a small molecule.
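The two case-study strategies can be illustrated with a minimal sketch. It assumes each signature is a mapping from gene to a differential expression value and uses Spearman rank correlation as the similarity measure; the abstract specifies only "positive" and "negative" correlation, so the choice of metric, the gene names, the compound names, and all numeric values below are hypothetical and are not taken from the chapter.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def connectivity(query, references):
    """Score every reference signature against the query, highest first."""
    genes = sorted(query)  # fixed gene order shared by all signatures
    q = [query[g] for g in genes]
    scores = {
        name: spearman(q, [signature[g] for g in genes])
        for name, signature in references.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical query signature (e.g., disease vs. control log fold changes).
query = {"GENE_A": 2.0, "GENE_B": 1.0, "GENE_C": -0.5, "GENE_D": -1.5}

# Hypothetical small-molecule signatures over the same genes.
references = {
    "compound_mimic": {"GENE_A": 1.8, "GENE_B": 0.7, "GENE_C": -0.2, "GENE_D": -2.0},
    "compound_reverser": {"GENE_A": -1.9, "GENE_B": -0.8, "GENE_C": 0.4, "GENE_D": 1.2},
    "compound_unrelated": {"GENE_A": 0.5, "GENE_B": -1.0, "GENE_C": 1.5, "GENE_D": 0.1},
}

ranked = connectivity(query, references)
for name, score in ranked:
    print(f"{name}: {score:+.2f}")
```

Read from the top, the ranking surfaces compounds whose signatures resemble the query, as in case study 1, where the query would be the signature of an approved drug. Read from the bottom, it surfaces compounds whose signatures oppose the query, as in case study 2, where the query would be a disease signature.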

Key words

Computational drug discovery; Drug repurposing; Drug repositioning; Connectivity Map; Drug discovery; LINCS L1000



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  • Yunguan Wang (1)
  • Jaswanth Yella (1, 3)
  • Anil G. Jegga (1, 2, 3)
  1. Division of Biomedical Informatics, Cincinnati Children’s Hospital Medical Center, Cincinnati, USA
  2. Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, USA
  3. Department of Computer Science, University of Cincinnati College of Engineering, Cincinnati, USA
