Abstract
STAT3 belongs to a family of seven transcription factors. It plays an important role in activating the transcription of various genes involved in a variety of cellular processes. High levels of STAT3 are detected in several types of cancer. Hence, STAT3 inhibition is considered a promising anti-cancer therapeutic strategy. However, since STAT3 inhibitors bind to the shallow SH2 domain of the protein, hydration water molecules are expected to play a significant role in ligand binding, which complicates the discovery of potent binders. To remedy this issue, we herein propose extracting pharmacophores from molecular dynamics (MD) frames of a potent co-crystallized ligand complexed within the STAT3 SH2 domain. Subsequently, we employ a genetic function algorithm coupled with machine learning (GFA-ML) to explore the optimal combination of MD-derived pharmacophores that can account for the variations in bioactivity among a list of inhibitors. To enhance the dataset, the training and testing lists were augmented nearly 100-fold by considering multiple conformers of the ligands. A single significant pharmacophore emerged after 188 ns of MD simulation to represent STAT3-ligand binding. Screening the National Cancer Institute (NCI) database with this model identified one low-micromolar inhibitor that most likely binds to the SH2 domain of STAT3 and inhibits this pathway.
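The idea of searching for the combination of MD-derived pharmacophores that best explains bioactivity can be illustrated with a toy example. The sketch below is not the authors' GFA-ML implementation: it assumes a plain genetic algorithm over bit masks, a leave-one-out 1-nearest-neighbour classifier as a stand-in learner, and fully synthetic data in which rows are ligand conformers and columns are 0/1 flags for fitting each candidate pharmacophore. All names (`fits`, `labels`, `INFORMATIVE`, `fitness`, `evolve`) and all parameter values are hypothetical.

```python
import random

rng = random.Random(0)
N_SAMPLES, N_FEATURES = 40, 12
INFORMATIVE = 3  # hypothetical: the one pharmacophore that tracks activity

# Synthetic data (assumption): rows are ligand conformers, columns are 0/1
# flags saying whether the conformer fits each candidate pharmacophore.
labels = [i % 2 for i in range(N_SAMPLES)]  # 1 = active, 0 = inactive
fits = [[rng.randint(0, 1) for _ in range(N_FEATURES)] for _ in range(N_SAMPLES)]
for row, label in zip(fits, labels):
    row[INFORMATIVE] = label  # only this column carries signal

def loo_accuracy(mask):
    """Leave-one-out accuracy of a 1-NN classifier using Hamming distance."""
    cols = [j for j, bit in enumerate(mask) if bit]
    hits = 0
    for i in range(N_SAMPLES):
        nearest = min(
            (k for k in range(N_SAMPLES) if k != i),
            key=lambda k: sum(fits[i][j] != fits[k][j] for j in cols),
        )
        hits += labels[nearest] == labels[i]
    return hits / N_SAMPLES

def fitness(mask):
    """Cross-validated accuracy minus a small penalty per selected feature."""
    n_selected = sum(mask)
    if n_selected == 0:
        return 0.0
    return loo_accuracy(mask) - 0.01 * n_selected

def evolve(pop_size=20, generations=25, mutation_rate=0.05):
    """Evolve bit masks that select a subset of pharmacophore columns."""
    population = [[rng.randint(0, 1) for _ in range(N_FEATURES)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]   # truncation selection
        children = [ranked[0][:]]           # elitism: keep the current best
        while len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randrange(1, N_FEATURES)          # one-point crossover
            child = mother[:cut] + father[cut:]
            child = [bit ^ (rng.random() < mutation_rate) for bit in child]
            children.append(child)
        population = children
    best = max(population, key=fitness)
    return best, fitness(best)

best_mask, best_score = evolve()
print("selected pharmacophores:", [j for j, b in enumerate(best_mask) if b])
print("penalised leave-one-out accuracy:", round(best_score, 3))
```

The per-feature penalty is what pushes the search toward a small set of pharmacophores rather than the full pool. In a real setting, conformers of the same ligand would need to be kept in the same cross-validation fold to avoid leakage, which this toy example does not attempt.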
Data availability
Data are available upon request from the corresponding author.
Change history
19 October 2023
A Correction to this paper has been published: https://doi.org/10.1007/s10822-023-00540-2
Acknowledgements
The authors thank the Deanship of Academic Research at the University of Jordan for funding this project. The authors would also like to thank Dr. Walhan al Shaer, Fadwa Daoud, and Suha Wehaibi from the Cell Therapy Center for their technical assistance in the biological testing experiments.
Funding
This project was funded by the Deanship of Scientific Research at the University of Jordan, Amman, Jordan.
Author information
Contributions
NJJ: Investigation, Formal analysis, Review & Editing. MH: Investigation, Review & Editing. DA: Formal analysis, Review & Editing. MOT: Conceptualization, Methodology, Supervision, Investigation, Resources, Writing, Review & Editing.
Ethics declarations
Competing interests
The authors declare that they have no competing interests as defined by Springer, or other interests that might be perceived to influence the results and/or discussion reported in this paper.
Ethical approval
Not applicable.
Consent to participate
Not applicable.
Consent for publication
Not applicable.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Jaradat, N.J., Hatmal, M., Alqudah, D. et al. Computational workflow for discovering small molecular binders for shallow binding sites by integrating molecular dynamics simulation, pharmacophore modeling, and machine learning: STAT3 as case study. J Comput Aided Mol Des 37, 659–678 (2023). https://doi.org/10.1007/s10822-023-00528-y
DOI: https://doi.org/10.1007/s10822-023-00528-y