Using Drug Expression Profiles and Machine Learning Approach for Drug Repurposing

Computational Methods for Drug Repurposing

Part of the book series: Methods in Molecular Biology ((MIMB,volume 1903))

Abstract

The cost of new drug development has been increasing, and repurposing known medications for new indications is an important way to accelerate drug discovery. One promising approach to drug repositioning is to use machine learning (ML) algorithms to learn patterns in drug-related biological data and then link these patterns to the potential to treat specific diseases. Here we give an overview of the general principles and different types of ML algorithms, as well as common approaches to evaluating predictive performance, with reference to applying ML algorithms to predict repurposing opportunities using drug expression data as features. We highlight common issues and caveats when applying such models to repositioning. We also introduce resources for drug expression data and highlight recent studies that employ this approach.

Acknowledgment

This work is partially supported by the Lo Kwee-Seong Biomedical Research Fund and a Direct Grant from the Chinese University of Hong Kong to HCS.

Author information

Corresponding author

Correspondence to Hon-Cheong So.

Copyright information

© 2019 Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Zhao, K., So, HC. (2019). Using Drug Expression Profiles and Machine Learning Approach for Drug Repurposing. In: Vanhaelen, Q. (eds) Computational Methods for Drug Repurposing. Methods in Molecular Biology, vol 1903. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-8955-3_13

  • DOI: https://doi.org/10.1007/978-1-4939-8955-3_13

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-8954-6

  • Online ISBN: 978-1-4939-8955-3

  • eBook Packages: Springer Protocols
