Can machine learning ‘transform’ peptides/peptidomimetics into small molecules? A case study with ghrelin receptor ligands

Liu, Wenjie; Hopkins, Austin M.; Yan, Peizhi; Du, Shan; Luyt, Leonard G.; Li, Yifeng; Hou, Jinqiang

doi:10.1007/s11030-022-10555-w

Original Article
Published: 04 November 2022

Volume 27, pages 2239–2255, (2023)
Molecular Diversity

Wenjie Liu¹,
Austin M. Hopkins¹,
Peizhi Yan²,
Shan Du³,
Leonard G. Luyt⁴,⁵,
Yifeng Li⁶ &
Jinqiang Hou¹

Abstract

There has been considerable interest in transforming peptides into small molecules as peptide-based molecules often present poorer bioavailability and lower metabolic stability. Our studies looked into building machine learning (ML) models to investigate if ML is able to identify the ‘bioactive’ features of peptides and use the features to accurately discriminate between binding and non-binding small molecules. The ghrelin receptor (GR), a receptor that is implicated in various diseases, was used as an example to demonstrate whether ML models derived from a peptide library can be used to predict small molecule binders. ML models based on three different algorithms, namely random forest, support vector machine, and extreme gradient boosting, were built based on a carefully curated dataset of peptide/peptidomimetic and small molecule GR ligands. The results indicated that ML models trained with a dataset exclusively composed of peptides/peptidomimetics provide limited predictive power for small molecules, but that ML models trained with a diverse dataset composed of an array of both peptides/peptidomimetics and small molecules displayed exceptional results in terms of accuracy and false rates. The diversified models can accurately differentiate the binding small molecules from non-binding small molecules using an external validation set with new small molecules that we synthesized previously. Structural features that are the most critical contributors to binding activity were extracted and are remarkably consistent with the crystallography and mutagenesis studies.
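The abstract reports model quality in terms of accuracy and false rates without defining them on this page. As a minimal sketch, assuming "false rates" means the false positive and false negative rates of a binary binder/non-binder classifier, these quantities could be computed as follows. The function name, the 1/0 label encoding, and the example labels are hypothetical illustrations, not the authors' code or data.

```python
def binary_classification_rates(y_true, y_pred):
    """Return accuracy, false positive rate, and false negative rate.

    Assumed label encoding (not taken from the paper):
    1 = binder, 0 = non-binder.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # binders predicted as binders
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-binders predicted as non-binders
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-binders predicted as binders
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # binders predicted as non-binders

    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
        "false_negative_rate": fn / (fn + tp) if (fn + tp) else 0.0,
    }


# Hypothetical labels for eight compounds, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
rates = binary_classification_rates(y_true, y_pred)
print(rates)
```

Reporting both false rates alongside accuracy shows whether a model errs by flagging non-binders or by missing true binders, which accuracy alone can hide when the classes are imbalanced.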

Acknowledgements

We thank Google Colaboratory (Colab) for providing computational resources. This work was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Thunder Bay Regional Health Research Institute, and Lakehead University.

Author information

Authors and Affiliations

Department of Chemistry, Lakehead University and Thunder Bay Regional Health Research Institute, 980 Oliver Road, Thunder Bay, ON, P7B 6V4, Canada
Wenjie Liu, Austin M. Hopkins & Jinqiang Hou
Department of Electrical and Computer Engineering, The University of British Columbia, Vancouver, BC, Canada
Peizhi Yan
Department of Computer Science, Mathematics, Physics and Statistics, The University of British Columbia, Okanagan, Kelowna, BC, Canada
Shan Du
Department of Chemistry, University of Western Ontario, London, ON, Canada
Leonard G. Luyt
London Regional Cancer Program, Lawson Health Research Institute, London, ON, Canada
Leonard G. Luyt
Department of Computer Science, Brock University, Saint Catharines, ON, Canada
Yifeng Li

Corresponding author

Correspondence to Jinqiang Hou.

Ethics declarations

Conflict of interest

All authors of this manuscript declare that they have no conflicts of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (DOCX 3040 kb)

About this article

Cite this article

Liu, W., Hopkins, A.M., Yan, P. et al. Can machine learning ‘transform’ peptides/peptidomimetics into small molecules? A case study with ghrelin receptor ligands. Mol Divers 27, 2239–2255 (2023). https://doi.org/10.1007/s11030-022-10555-w

Received: 06 September 2022
Accepted: 19 October 2022
Published: 04 November 2022
Issue Date: October 2023
DOI: https://doi.org/10.1007/s11030-022-10555-w
