Using Drug Expression Profiles and Machine Learning Approach for Drug Repurposing

Zhao, Kai; So, Hon-Cheong

doi:10.1007/978-1-4939-8955-3_13

Part of the book series: Methods in Molecular Biology (MIMB, volume 1903)

Abstract

The cost of new drug development has been increasing, and repurposing known medications for new indications is an important way to hasten drug discovery. One promising approach to drug repositioning is to use machine learning (ML) algorithms to learn patterns in drug-related biological data and then relate those patterns to the potential to treat specific diseases. Here we give an overview of the general principles and different types of ML algorithms, as well as common approaches to evaluating predictive performance, with reference to the application of ML algorithms to predict repurposing opportunities using drug expression data as features. We highlight common issues and caveats when applying such models to repositioning. We also introduce resources for drug expression data and highlight recent studies that employ this approach to repositioning.
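As a rough illustration of the workflow the abstract outlines (expression profiles as features, a classifier trained on known indications, and out-of-sample evaluation), the following is a minimal sketch and not the chapter's own method. All data are synthetic, the function names are hypothetical, and a simple nearest-centroid scorer stands in for the ML algorithms the chapter reviews.

```python
import random

def make_synthetic_profiles(n_drugs=60, n_genes=20, seed=0):
    """Simulate drug-induced expression changes (synthetic, not real drug data)."""
    rng = random.Random(seed)
    profiles, labels = [], []
    for i in range(n_drugs):
        label = 1 if i % 3 == 0 else 0  # 1 = drug is known to treat the disease
        # Assumption: indicated drugs share a shifted signal in the first five genes.
        row = [rng.gauss(1.0 if (label and g < 5) else 0.0, 1.0)
               for g in range(n_genes)]
        profiles.append(row)
        labels.append(label)
    return profiles, labels

def centroid(rows):
    """Per-gene mean of a set of expression profiles."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def score(row, pos_centroid, neg_centroid):
    """Higher score = profile is closer to the indicated-drug centroid."""
    d_pos = sum((x - c) ** 2 for x, c in zip(row, pos_centroid))
    d_neg = sum((x - c) ** 2 for x, c in zip(row, neg_centroid))
    return d_neg - d_pos

def cross_validated_scores(profiles, labels, k=5):
    """Score each drug with a model fitted only on the other folds."""
    n = len(profiles)
    out = [0.0] * n
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        test = [i for i in range(n) if i % k == fold]
        pos_c = centroid([profiles[i] for i in train if labels[i] == 1])
        neg_c = centroid([profiles[i] for i in train if labels[i] == 0])
        for i in test:
            out[i] = score(profiles[i], pos_c, neg_c)
    return out

def auc_roc(scores, labels):
    """Chance that a random positive outranks a random negative (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

profiles, labels = make_synthetic_profiles()
cv_scores = cross_validated_scores(profiles, labels)
auc = auc_roc(cv_scores, labels)

# Drugs not currently indicated but scored highly are the repositioning candidates.
not_indicated = [i for i, y in enumerate(labels) if y == 0]
candidates = sorted(not_indicated, key=lambda i: cv_scores[i], reverse=True)[:5]
print(f"cross-validated AUC-ROC: {auc:.3f}")
print("top candidate drug indices:", candidates)
```

Scoring each drug only with a model fitted on the other folds keeps the performance estimate from being inflated by training-set reuse. Ranking the highest-scoring drugs that lack the indication is one plausible way to turn classifier output into repositioning hypotheses.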

Acknowledgment

This work is partially supported by the Lo Kwee-Seong Biomedical Research Fund and a Direct Grant from the Chinese University of Hong Kong to HCS.

Author information

Authors and Affiliations

School of Biomedical Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong
Kai Zhao & Hon-Cheong So
KIZ-CUHK Joint Laboratory of Bioresources and Molecular Research of Common Diseases, Kunming Institute of Zoology, Kunming, China
Hon-Cheong So

Corresponding author

Correspondence to Hon-Cheong So.

Editor information

Editors and Affiliations

Insilico Medicine, Inc., Rockville, MD, USA
Quentin Vanhaelen

About this protocol

Cite this protocol

Zhao, K., So, HC. (2019). Using Drug Expression Profiles and Machine Learning Approach for Drug Repurposing. In: Vanhaelen, Q. (eds) Computational Methods for Drug Repurposing. Methods in Molecular Biology, vol 1903. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-8955-3_13

DOI: https://doi.org/10.1007/978-1-4939-8955-3_13
Published: 14 December 2018
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-8954-6
Online ISBN: 978-1-4939-8955-3
eBook Packages: Springer Protocols
