Mathematical deep learning for pose and binding affinity prediction and ranking in D3R Grand Challenges

Abstract

Advanced mathematics, including multiscale weighted colored subgraphs and element-specific persistent homology, was integrated with machine learning, including deep neural networks, to construct mathematical deep learning models for pose and binding affinity prediction and ranking in the last two D3R Grand Challenges in computer-aided drug design and discovery. D3R Grand Challenge 2 focused on pose prediction, binding affinity ranking, and free energy prediction for Farnesoid X receptor ligands. Our models obtained the top place in absolute free energy prediction for free energy set 1 in stage 2. The latest competition, D3R Grand Challenge 3 (GC3), is considered the most difficult challenge so far. It has five subchallenges involving Cathepsin S and five kinase targets, namely VEGFR2, JAK2, p38-α, TIE2, and ABL1. GC3 comprises a total of 26 official competitive tasks, and our predictions were ranked 1st in 10 of them.

Acknowledgements

This work was supported in part by NSF Grants IIS-1302285, DMS-1721024, and DMS-1761320 and by the MSU Center for Mathematical Molecular Biosciences Initiative.

Author information

Corresponding author

Correspondence to Guo-Wei Wei.

About this article

Cite this article

Nguyen, D.D., Cang, Z., Wu, K. et al. Mathematical deep learning for pose and binding affinity prediction and ranking in D3R Grand Challenges. J Comput Aided Mol Des 33, 71–82 (2019). https://doi.org/10.1007/s10822-018-0146-6

Keywords

  • Drug design
  • Pose prediction
  • Binding affinity
  • Machine learning
  • Algebraic topology
  • Graph theory