Can machine learning ‘transform’ peptides/peptidomimetics into small molecules? A case study with ghrelin receptor ligands

  • Original Article
  • Molecular Diversity

Abstract

There has been considerable interest in transforming peptides into small molecules, as peptide-based molecules often show poorer bioavailability and lower metabolic stability. We built machine learning (ML) models to investigate whether ML can identify the ‘bioactive’ features of peptides and use those features to accurately discriminate between binding and non-binding small molecules. The ghrelin receptor (GR), which is implicated in various diseases, served as an example to test whether ML models derived from a peptide library can predict small molecule binders. Models based on three algorithms, namely random forest, support vector machine, and extreme gradient boosting, were built from a carefully curated dataset of peptide/peptidomimetic and small molecule GR ligands. The results indicated that ML models trained on a dataset composed exclusively of peptides/peptidomimetics provide limited predictive power for small molecules, whereas ML models trained on a diverse dataset containing both peptides/peptidomimetics and small molecules displayed exceptional results in terms of accuracy and false rates. On an external validation set of new small molecules that we synthesized previously, the diversified models accurately differentiated binding from non-binding small molecules. The structural features that contribute most to binding activity were extracted and are remarkably consistent with crystallography and mutagenesis studies.
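
The abstract reports model quality in terms of accuracy and false rates but does not define them. As a minimal sketch, assuming that "false rates" means the false-positive and false-negative rates of a binder/non-binder classifier, the following Python shows how these three quantities could be computed from predictions on an external validation set. The function name `evaluate_binders`, the `BinaryMetrics` container, and the example labels are hypothetical and are not taken from the article.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    false_positive_rate: float
    false_negative_rate: float


def evaluate_binders(y_true, y_pred):
    """Score binder (1) versus non-binder (0) predictions.

    Assumption: 'false rates' are taken to mean FP / (FP + TN) and
    FN / (FN + TP); the article may define them differently.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    total = tp + tn + fp + fn
    # Guard each ratio against an empty denominator.
    accuracy = (tp + tn) / total if total else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    return BinaryMetrics(accuracy, fpr, fnr)


# Made-up labels for a hypothetical external set of eight small molecules.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
metrics = evaluate_binders(y_true, y_pred)
print(metrics)
```

Reporting both false rates alongside accuracy matters for a comparison like this one, because a model trained only on peptides could reach moderate accuracy on small molecules while still missing most true binders or flagging most non-binders.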

Acknowledgements

We thank Google Colaboratory (Colab) for providing computational resources. This work was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Thunder Bay Regional Health Research Institute, and Lakehead University.

Author information

Corresponding author

Correspondence to Jinqiang Hou.

Ethics declarations

Conflict of interest

All authors of this manuscript declare that they have no conflicts of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (DOCX 3040 kb)

About this article

Cite this article

Liu, W., Hopkins, A.M., Yan, P. et al. Can machine learning ‘transform’ peptides/peptidomimetics into small molecules? A case study with ghrelin receptor ligands. Mol Divers 27, 2239–2255 (2023). https://doi.org/10.1007/s11030-022-10555-w

  • DOI: https://doi.org/10.1007/s11030-022-10555-w
