
Graph Neural Networks for Molecules

Chapter in Machine Learning in Molecular Sciences

Abstract

Graph neural networks (GNNs), which learn representations from graph-structured data, are naturally suited to modeling molecular systems. This review introduces GNNs and their applications to small organic molecules. GNNs rely on message passing, a generic yet powerful framework, to update node features iteratively. Many studies design GNN architectures to learn the topological information of 2D molecular graphs as well as the geometric information of 3D molecular systems. GNNs have been applied to a wide variety of molecular tasks, including molecular property prediction, molecular scoring and docking, molecular optimization and de novo generation, and molecular dynamics simulation. The review also summarizes recent developments in self-supervised learning for molecules with GNNs.
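
To make the message-passing idea concrete, the following is a minimal NumPy sketch of a single node-feature update on a toy "molecule" (a 5-atom chain). The feature dimension, sum aggregation over bonded neighbors, ReLU update, and sum-pooling readout are illustrative assumptions, not the architecture of any particular model covered in this chapter.

```python
# Minimal message-passing sketch on a toy molecular graph (illustrative only).
import numpy as np

def message_passing_step(node_feats, adjacency, w_self, w_neigh):
    """One update: each atom aggregates bonded neighbors' features,
    then applies learned transformations and a ReLU non-linearity.

    node_feats: (n_atoms, d) atom feature matrix
    adjacency:  (n_atoms, n_atoms) 0/1 bond matrix
    w_self, w_neigh: (d, d) weight matrices (random here, learned in practice)
    """
    messages = adjacency @ node_feats           # sum features of bonded neighbors
    updated = node_feats @ w_self + messages @ w_neigh
    return np.maximum(updated, 0.0)             # ReLU

rng = np.random.default_rng(0)
n_atoms, d = 5, 8
x = rng.normal(size=(n_atoms, d))               # initial atom features
a = np.zeros((n_atoms, n_atoms))
for i, j in [(0, 1), (1, 2), (2, 3), (3, 4)]:   # a simple chain "molecule"
    a[i, j] = a[j, i] = 1.0
w1, w2 = rng.normal(size=(d, d)), rng.normal(size=(d, d))

h = message_passing_step(x, a, w1, w2)          # one round of message passing
graph_embedding = h.sum(axis=0)                 # sum-pool node features into a graph vector
print(graph_embedding.shape)                    # (8,)
```

Stacking several such steps lets information propagate beyond immediate neighbors, and the pooled graph vector can then feed a property-prediction head.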



References

  1. Bronstein MM, Bruna J, LeCun Y, Szlam A, Vandergheynst P (2017) Geometric deep learning: going beyond Euclidean data. IEEE Signal Process Mag 34:18–42

    Article  Google Scholar 

  2. Dai H, Kozareva Z, Dai B, Smola A, Song L (2018) Learning steady-states of iterative algorithms over graphs. In: International conference on machine learning, pp 1106–1114

    Google Scholar 

  3. Duvenaud DK, Maclaurin D, Iparraguirre J, Bombarell R, Hirzel T, Aspuru-Guzik A, Adams RP (2015) Convolutional networks on graphs for learning molecular fingerprints. Adv Neural Inf Process Syst 28

    Google Scholar 

  4. Dumontier M, Callahan A, Cruz-Toledo J, Ansell P, Emonet V, Belleau F, Droit A (2014) Bio2RDF release 3: a larger connected network of linked data for the life sciences. In: Proceedings of the 2014 international conference on posters & demonstrations track, vol 1272, pp 401–404

    Google Scholar 

  5. Sanchez-Gonzalez A, Heess N, Springenberg JT, Merel J, Riedmiller M, Hadsell R, Battaglia P (2018) Graph networks as learnable physics engines for inference and control. In: International conference on machine learning, pp 4470–4479

    Google Scholar 

  6. Fout A, Byrd J, Shariat B, Ben-Hur A (2017) Protein interface prediction using graph convolutional networks. Adv Neural Inf Process Syst 30

    Google Scholar 

  7. Qi X, Liao R, Jia J, Fidler S, Urtasun R (2017) 3D graph neural networks for RGBD semantic segmentation. In: Proceedings of the IEEE international conference on computer vision, pp 5199–5208

    Google Scholar 

  8. Wang T, Liao R, Ba J, Fidler S (2018) NerveNet: learning structured policy with graph neural networks. In: International conference on learning representations

    Google Scholar 

  9. LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521:436–444

    Article  Google Scholar 

  10. Zhou J, Cui G, Hu S, Zhang Z, Yang C, Liu Z, Wang L, Li C, Sun M (2020) Graph neural networks: a review of methods and applications. AI Open 1:57–81

    Article  Google Scholar 

  11. He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 770–778

    Google Scholar 

  12. Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9:1735–1780

    Article  Google Scholar 

  13. Chami I, Abu-El-Haija S, Perozzi B, Ré C, Murphy K (2020) Machine learning on graphs: a model and comprehensive taxonomy. arXiv preprint arXiv:2005.03675

  14. Gilmer J, Schoenholz SS, Riley PF, Vinyals O, Dahl GE (2017) Neural message passing for quantum chemistry. In: International conference on machine learning, pp 1263–1272

    Google Scholar 

  15. Battaglia PW, Hamrick JB, Bapst V, Sanchez-Gonzalez A, Zambaldi V, Malinowski M, Tacchetti A, Raposo D, Santoro A, Faulkner R et al (2018) Relational inductive biases, deep learning, and graph networks. arXiv preprint arXiv:1806.01261

  16. Shuman DI, Narang SK, Frossard P, Ortega A, Vandergheynst P (2013) The emerging field of signal processing on graphs: extending high dimensional data analysis to networks and other irregular domains. IEEE Signal Process Mag 30:83–98

    Article  Google Scholar 

  17. Li Y, Tarlow D, Brockschmidt M, Zemel R (2016) Gated graph sequence neural networks

    Google Scholar 

  18. Battaglia P, Pascanu R, Lai M, Jimenez Rezende D et al (2016) Interaction networks for learning about objects, relations and physics. Adv Neural Inf Process Syst 29

    Google Scholar 

  19. Bruna J, Zaremba W, Szlam A, LeCun Y (2013) Spectral networks and locally connected networks on graphs. arXiv preprint arXiv:1312.6203

  20. Defferrard M, Bresson X, Vandergheynst P (2016) Convolutional neural networks on graphs with fast localized spectral filtering. Adv Neural Inf Process Syst 29

    Google Scholar 

  21. Kipf TN, Welling M (2017) Semi-supervised classification with graph convolutional networks. In: Proceedings of the international conference on learning representations

    Google Scholar 

  22. Hamilton W, Ying Z, Leskovec J (2017) Inductive representation learning on large graphs. Adv Neural Inf Process Syst 30

    Google Scholar 

  23. Xu K, Hu W, Leskovec J, Jegelka S (2019) How powerful are graph neural networks? In: International conference on learning representations

    Google Scholar 

  24. Veličković P, Cucurull G, Casanova A, Romero A, Lio P, Bengio Y (2017) Graph attention networks. arXiv preprint arXiv:1710.10903

  25. Cho K, Van Merriënboer B, Gulcehre C, Bahdanau D, Bougares F, Schwenk H, Bengio Y (2014) Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078

  26. Tai KS, Socher R, Manning CD (2015) Improved semantic representations from tree-structured long short-term memory networks. arXiv preprint arXiv:1503.00075

  27. Rampášek L, Galkin M, Dwivedi VP, Luu AT, Wolf G, Beaini D (2022) Recipe for a general, powerful, scalable graph transformer. arXiv preprint arXiv:2205.12454

  28. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser L, Polosukhin I (2017) Attention is all you need. Adv Neural Inf Process Syst 30

    Google Scholar 

  29. Ying C, Cai T, Luo S, Zheng S, Ke G, He D, Shen Y, Liu T-Y (2021) Do transformers really perform badly for graph representation? Adv Neural Inf Process Syst 34:28877–28888

    Google Scholar 

  30. Dwivedi VP, Bresson X (2020) A generalization of transformer networks to graphs. arXiv preprint arXiv:2012.09699

  31. Kim J, Nguyen TD, Min S, Cho S, Lee M, Lee H, Hong S (2022) Pure transformers are powerful graph learners. arXiv preprint arXiv:2207.02505

  32. Choromanski K, Likhosherstov V, Dohan D, Song X, Gane A, Sarlos T, Hawkins P, Davis J, Mohiuddin A, Kaiser L et al (2020) Rethinking attention with performers. arXiv preprint arXiv:2009.14794

  33. Vinyals O, Bengio S, Kudlur M (2015) Order matters: sequence to sequence for sets. arXiv preprint arXiv:1511.06391

  34. Zhang M, Cui Z, Neumann M, Chen Y (2018) An end-to-end deep learning architecture for graph classification. In: Proceedings of the AAAI conference on artificial intelligence, vol 32

    Google Scholar 

  35. Ying Z, You J, Morris C, Ren X, Hamilton W, Leskovec J (2018) Hierarchical graph representation learning with differentiable pooling. Adv Neural Inf Process Syst 31

    Google Scholar 

  36. Lee J, Lee I, Kang J (2019) Self-attention graph pooling. In: International conference on machine learning, pp 3734–3743

    Google Scholar 

  37. Atz K, Grisoni F, Schneider G (2021) Geometric deep learning on molecular representations. Nat Mach Intell 3:1023–1032

    Article  Google Scholar 

  38. Weininger D (1988) SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. J Chem Inf Comput Sci 28:31–36

    Google Scholar 

  39. Krenn M, Häse F, Nigam A, Friederich P, Aspuru-Guzik A (2020) Self-referencing embedded strings (SELFIES): a 100% robust molecular string representation. Mach Learn Sci Technol 1:045024

    Article  Google Scholar 

  40. Rogers D, Hahn M (2010) Extended-connectivity fingerprints. J Chem Inf Model 50:742–754

    Article  Google Scholar 

  41. Durant JL, Leland BA, Henry DR, Nourse JG (2002) Reoptimization of MDL keys for use in drug discovery. J Chem Inf Comput Sci 42:1273–1280

    Article  Google Scholar 

  42. Kearnes S, McCloskey K, Berndl M, Pande V, Riley P (2016) Molecular graph convolutions: moving beyond fingerprints. J Comput Aided Mol Des 30:595–608

    Article  Google Scholar 

  43. Hu W, Liu B, Gomes J, Zitnik M, Liang P, Pande V, Leskovec J (2020) Strategies for pre-training graph neural networks. In: International conference on learning representations

    Google Scholar 

  44. Yang K, Swanson K, Jin W, Coley C, Eiden P, Gao H, Guzman-Perez A, Hopper T, Kelley B, Mathea M et al (2019) Analyzing learned molecular representations for property prediction. J Chem Inf Model 59:3370–3388

    Article  Google Scholar 

  45. Xiong Z, Wang D, Liu X, Zhong F, Wan X, Li X, Li Z, Luo X, Chen K, Jiang H et al (2019) Pushing the boundaries of molecular representation for drug discovery with the graph attention mechanism. J Med Chem 63:8749–8760

    Article  Google Scholar 

  46. Rong Y, Bian Y, Xu T, Xie W, Wei Y, Huang W, Huang J (2020) Self-supervised graph transformer on large-scale molecular data. Adv Neural Inf Process Syst 33:12559–12571

    Google Scholar 

  47. Han J, Rong Y, Xu T, Huang W (2022) Geometrically equivariant graph neural networks: a survey. arXiv preprint arXiv:2202.07230

  48. Blanco-Claraco JL (2021) A tutorial on SE(3) transformation parameterizations and on-manifold optimization. arXiv preprint arXiv:2103.15980

  49. Schütt KT, Arbabzadah F, Chmiela S, Müller KR, Tkatchenko A (2017) Quantum-chemical insights from deep tensor neural networks. Nat Commun 8:1–8

    Article  Google Scholar 

  50. Schütt K, Kindermans P-J, Sauceda Felix HE, Chmiela S, Tkatchenko A, Müller K-R (2017) SchNet: a continuous-filter convolutional neural network for modeling quantum interactions. Adv Neural Inf Process Syst 30

    Google Scholar 

  51. Unke OT, Meuwly M (2019) PhysNet: a neural network for predicting energies, forces, dipole moments, and partial charges. J Chem Theory Comput 15:3678–3693

    Article  Google Scholar 

  52. Gasteiger J, Groß J, Günnemann S (2019) Directional message passing for molecular graphs. In: International conference on learning representations

    Google Scholar 

  53. Klicpera J, Giri S, Margraf JT, Günnemann S (2020) Fast and uncertainty aware directional message passing for non-equilibrium molecules. arXiv preprint arXiv:2011.14115

  54. Gilmore R (2008) Lie groups, physics, and geometry: an introduction for physicists, engineers and chemists. Cambridge University Press

    Google Scholar 

  55. Thomas N, Smidt T, Kearnes S, Yang L, Li L, Kohlhoff K, Riley P (2018) Tensor field networks: rotation- and translation-equivariant neural networks for 3D point clouds. arXiv preprint arXiv:1802.08219

  56. Fuchs F, Worrall D, Fischer V, Welling M (2020) SE(3)-transformers: 3D roto-translation equivariant attention networks. Adv Neural Inf Process Syst 33:1970–1981

    Google Scholar 

  57. Brandstetter J, Hesselink R, van der Pol E, Bekkers EJ, Welling M (2022) Geometric and physical quantities improve E(3) equivariant message passing. In: International conference on learning representations

    Google Scholar 

  58. Anderson B, Hy T-S, Kondor R (2019) Cormorant: covariant molecular neural networks. arXiv:1906.04015

  59. Schütt KT, Unke OT, Gastegger M (2021) Equivariant message passing for the prediction of tensorial properties and molecular spectra. arXiv:2102.03150

  60. Gasteiger J, Groß J, Günnemann S (2020) Directional message passing for molecular graphs. arXiv:2003.03123

  61. Thölke P, De Fabritiis G (2022) TorchMD-NET: equivariant transformers for neural network based molecular potentials. arXiv preprint arXiv:2202.02541

  62. Jing B, Eismann S, Suriana P, Townshend RJL, Dror R (2021) Learning from protein structure with geometric vector perceptrons. In: International conference on learning representations. https://openreview.net/forum?id=1YLJDvSx6J4

  63. Villar S, Hogg DW, Storey-Fisher K, Yao W, Blum-Smith B (2021) Scalars are universal: equivariant machine learning, structured like classical physics. In: Beygelzimer A, Dauphin Y, Liang P, Vaughan JW (eds) Advances in neural information processing systems. https://openreview.net/forum?id=ba27-RzNaIv

  64. Wu Z, Ramsundar B, Feinberg EN, Gomes J, Geniesse C, Pappu AS, Leswing K, Pande V (2018) MoleculeNet: a benchmark for molecular machine learning. Chem Sci 9:513–530

    Article  Google Scholar 

  65. Dwivedi VP, Joshi CK, Laurent T, Bengio Y, Bresson X (2020) Benchmarking graph neural networks. arXiv preprint arXiv:2003.00982

  66. Irwin JJ, Shoichet BK (2005) ZINC—a free database of commercially available compounds for virtual screening. J Chem Inf Model 45:177–182

    Article  Google Scholar 

  67. Chen G, Chen P, Hsieh C-Y, Lee C-K, Liao B, Liao R, Liu W, Qiu J, Sun Q, Tang J et al (2019) Alchemy: a quantum chemistry dataset for benchmarking AI models. arXiv preprint arXiv:1906.09427

  68. Smith JS, Isayev O, Roitberg AE (2017) ANI-1, a data set of 20 million calculated off-equilibrium conformations for organic molecules. Sci Data 4:1–8

    Article  Google Scholar 

  69. Sanchez-Lengeling B, Wei JN, Lee BK, Gerkin RC, Aspuru-Guzik A, Wiltschko AB (2019) Machine learning for scent: learning generalizable perceptual representations of small molecules. arXiv preprint arXiv:1910.10685

  70. Lei Z, Dai C, Chen B (2014) Gas solubility in ionic liquids. Chem Rev 114:1289–1326

    Article  Google Scholar 

  71. Debnath AK, Lopez de Compadre RL, Debnath G, Shusterman AJ, Hansch C (1991) Structure-activity relationship of mutagenic aromatic and heteroaromatic nitro compounds. Correlation with molecular orbital energies and hydrophobicity. J Med Chem 34:786–797

    Google Scholar 

  72. Kuhn M, Letunic I, Jensen LJ, Bork P (2016) The SIDER database of drugs and side effects. Nucleic Acids Res 44:D1075–D1079

    Article  Google Scholar 

  73. Gayvert KM, Madhukar NS, Elemento O (2016) A data-driven approach to predicting successes and failures of clinical trials. Cell Chem Biol 23:1294–1301

    Article  Google Scholar 

  74. Martins IF, Teixeira AL, Pinheiro L, Falcao AO (2012) A Bayesian approach to in silico blood-brain barrier penetration modeling. J Chem Inf Model 52:1686–1697

    Article  Google Scholar 

  75. Wale N, Watson IA, Karypis G (2008) Comparison of descriptor spaces for chemical compound retrieval and classification. Knowl Inf Syst 14:347–375

    Article  Google Scholar 

  76. Tox21 data challenge 2014. https://tripod.nih.gov/tox21/challenge/

  77. Richard AM, Judson RS, Houck KA, Grulke CM, Volarath P, Thillainadarajah I, Yang C, Rathman J, Martin MT, Wambaugh JF et al (2016) ToxCast chemical landscape: paving the road to 21st century toxicology. Chem Res Toxicol 29:1225–1251

    Google Scholar 

  78. Subramanian G, Ramsundar B, Pande V, Denny RA (2016) Computational modeling of SS-secretase 1 (BACE-1) inhibitors using ligand based approaches. J Chem Inf Model 56:1936–1949

    Article  Google Scholar 

  79. Cortés-Ciriano I, Bender A (2019) KekuleScope: prediction of cancer cell line sensitivity and compound potency using convolutional neural networks trained on compound images. J Cheminform 11:1–16

    Article  Google Scholar 

  80. Wang R, Fang X, Lu Y, Wang S (2004) The PDBbind database: collection of binding affinities for protein-ligand complexes with known three-dimensional structures. J Med Chem 47:2977–2980

    Article  Google Scholar 

  81. Su M, Yang Q, Du Y, Feng G, Liu Z, Li Y, Wang R (2018) Comparative assessment of scoring functions: the CASF-2016 update. J Chem Inf Model 59:895–913

    Article  Google Scholar 

  82. AIDS antiviral screen data. https://wiki.nci.nih.gov/display/NCIDTPdata/AIDS+Antiviral+Screen+Data

  83. Rohrer SG, Baumann K (2009) Maximum unbiased validation (MUV) data sets for virtual screening based on PubChem bioactivity data. J Chem Inf Model 49:169–184

    Article  Google Scholar 

  84. Wang Y, Xiao J, Suzek TO, Zhang J, Wang J, Zhou Z, Han L, Karapetyan K, Dracheva S, Shoemaker BA et al (2012) PubChem’s BioAssay database. Nucleic Acids Res 40:D400–D412

    Article  Google Scholar 

  85. Mobley DL, Guthrie JP (2014) FreeSolv: a database of experimental and calculated hydration free energies, with input files. J Comput Aided Mol Des 28:711–720

    Article  Google Scholar 

  86. Delaney JS (2004) ESOL: estimating aqueous solubility directly from molecular structure. J Chem Inf Comput Sci 44:1000–1005

    Google Scholar 

  87. Mendez D, Gaulton A, Bento AP, Chambers J, De Veij M, Félix E, Magariños MP, Mosquera JF, Mutowo P, Nowotka M et al (2019) ChEMBL: towards direct deposition of bioassay data. Nucleic Acids Res 47:D930–D940

    Article  Google Scholar 

  88. Sorkun MC, Khetan A, Er S (2019) AqSolDB, a curated reference set of aqueous solubility and 2D descriptors for a diverse set of compounds. Sci Data 6:1–8

    Article  Google Scholar 

  89. Rupp M, Tkatchenko A, Müller K-R, Von Lilienfeld OA (2012) Fast and accurate modeling of molecular atomization energies with machine learning. Phys Rev Lett 108:058301

    Article  Google Scholar 

  90. Montavon G, Rupp M, Gobre V, Vazquez-Mayagoitia A, Hansen K, Tkatchenko A, Müller K-R, von Lilienfeld OA (2013) Machine learning of molecular electronic properties in chemical compound space. New J Phys 15:095003

    Article  Google Scholar 

  91. Ruddigkeit L, Van Deursen R, Blum LC, Reymond J-L (2012) Enumeration of 166 billion organic small molecules in the chemical universe database GDB-17. J Chem Inf Model 52:2864–2875

    Article  Google Scholar 

  92. Ramakrishnan R, Dral PO, Rupp M, von Lilienfeld OA (2014) Quantum chemistry structures and properties of 134 kilo molecules. Sci Data 1

    Google Scholar 

  93. Hu W, Fey M, Zitnik M, Dong Y, Ren H, Liu B, Catasta M, Leskovec J (2020) Open graph benchmark: datasets for machine learning on graphs. Adv Neural Inf Process Syst 33:22118–22133

    Google Scholar 

  94. Wieder O, Kohlbacher S, Kuenemann M, Garon A, Ducrot P, Seidel T, Langer T (2020) A compact review of molecular property prediction with graph neural networks. Drug Discov Today Technol 37:1–12

    Article  Google Scholar 

  95. Henaff M, Bruna J, LeCun Y (2015) Deep convolutional networks on graph structured data. arXiv preprint arXiv:1506.05163

  96. Li R, Wang S, Zhu F, Huang J (2018) Adaptive graph convolutional neural networks. In: Proceedings of the AAAI conference on artificial intelligence, vol 32

    Google Scholar 

  97. Liao R, Zhao Z, Urtasun R, Zemel RS (2019) LanczosNet: multi-scale deep graph convolutional networks. arXiv preprint arXiv:1901.01484

  98. Ma Y, Wang S, Aggarwal CC, Tang J (2019) Graph convolutional networks with eigenpooling. In: Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery & data mining, pp 723–731

    Google Scholar 

  99. Xu Y, Pei J, Lai L (2017) Deep learning based regression and multiclass models for acute oral toxicity prediction with automatic chemical feature extraction. J Chem Inf Model 57:2672–2685

    Article  Google Scholar 

  100. Li J, Cai D, He X (2017) Learning graph-level representation for drug discovery. arXiv preprint arXiv:1709.03741

  101. Wang X, Li Z, Jiang M, Wang S, Zhang S, Wei Z (2019) Molecule property prediction based on spatial graph embedding. J Chem Inf Model 59:3817–3828

    Article  Google Scholar 

  102. Cho H, Choi IS (2019) Enhanced deep-learning prediction of molecular properties via augmentation of bond topology. ChemMedChem 14:1604–1609

    Article  Google Scholar 

  103. Feinberg EN, Joshi E, Pande VS, Cheng AC (2020) Improvement in ADMET prediction with multitask deep featurization. J Med Chem 63:8835–8848

    Article  Google Scholar 

  104. Withnall M, Lindelöf E, Engkvist O, Chen H (2020) Building attention and edge message passing neural networks for bioactivity and physical-chemical property prediction. J Cheminform 12:1–18

    Article  Google Scholar 

  105. Tang B, Kramer ST, Fang M, Qiu Y, Wu Z, Xu D (2020) A self-attention based message passing neural network for predicting molecular lipophilicity and aqueous solubility. J Cheminform 12:1–9

    Article  Google Scholar 

  106. Smith JS, Isayev O, Roitberg AE (2017) ANI-1: an extensible neural network potential with DFT accuracy at force field computational cost. Chem Sci 8:3192–3203

    Article  Google Scholar 

  107. Lubbers N, Smith JS, Barros K (2018) Hierarchical modeling of molecular energies using a deep neural network. J Chem Phys 148:241715

    Article  Google Scholar 

  108. Qiao Z, Welborn M, Anandkumar A, Manby FR, Miller TF III (2020) OrbNet: deep learning for quantum chemistry using symmetry-adapted atomic-orbital features. J Chem Phys 153:124111

    Article  Google Scholar 

  109. Karamad M, Magar R, Shi Y, Siahrostami S, Gates ID, Farimani AB (2020) Orbital graph convolutional neural network for material property prediction. Phys Rev Mater 4:093801

    Article  Google Scholar 

  110. Anderson B, Hy TS, Kondor R (2019) Cormorant: covariant molecular neural networks. Adv Neural Inf Process Syst 32

    Google Scholar 

  111. Liu Y, Wang L, Liu M, Zhang X, Oztekin B, Ji S (2021) Spherical message passing for 3D graph networks. arXiv preprint arXiv:2102.05013

  112. Gasteiger J, Becker F, Günnemann S (2021) GemNet: universal directional graph neural networks for molecules. Adv Neural Inf Process Syst 34:6790–6802

    Google Scholar 

  113. Schütt K, Unke O, Gastegger M (2021) Equivariant message passing for the prediction of tensorial properties and molecular spectra. In: International conference on machine learning, pp 9377–9388

    Google Scholar 

  114. Chmiela S, Tkatchenko A, Sauceda HE, Poltavsky I, Schütt KT, Müller K-R (2017) Machine learning of accurate energy-conserving molecular force fields. Sci Adv 3:e1603015

    Article  Google Scholar 

  115. Hermann J, Schätzle Z, Noé F (2020) Deep-neural-network solution of the electronic Schrödinger equation. Nat Chem 12:891–897

    Article  Google Scholar 

  116. Gao N, Gännemann S (2021) Ab-initio potential energy surfaces by pairing GNNs with neural wave functions. arXiv preprint arXiv:2110.05064

  117. Xiong J, Xiong Z, Chen K, Jiang H, Zheng M (2021) Graph neural networks for automated de novo drug design. Drug Discov Today 26:1382–1393

    Article  Google Scholar 

  118. Stepniewska-Dziubinska MM, Zielenkiewicz P, Siedlecki P (2018) Development and evaluation of a deep learning model for protein-ligand binding affinity prediction. Bioinformatics 34:3666–3674

    Article  Google Scholar 

  119. Yang L, Yang G, Chen X, Yang Q, Yao X, Bing Z, Niu Y, Huang L, Yang L (2021) Deep scoring neural network replacing the scoring function components to improve the performance of structure-based molecular docking. ACS Chem Neurosci 12:2133–2142

    Article  Google Scholar 

  120. Mysinger MM, Carchia M, Irwin JJ, Shoichet BK (2012) Directory of useful decoys, enhanced (DUD-E): better ligands and decoys for better benchmarking. J Med Chem 55:6582–6594

    Article  Google Scholar 

  121. Liu T, Lin Y, Wen X, Jorissen RN, Gilson MK (2007) BindingDB: a web-accessible database of experimentally determined protein-ligand binding affinities. Nucleic Acids Res 35:D198–D201

    Article  Google Scholar 

  122. Wishart DS, Knox C, Guo AC, Shrivastava S, Hassanali M, Stothard P, Chang Z, Woolsey J (2006) DrugBank: a comprehensive resource for in silico drug discovery and exploration. Nucleic Acids Res 34:D668–D672

    Article  Google Scholar 

  123. Lim S, Lu Y, Cho CY, Sung I, Kim J, Kim Y, Park S, Kim S (2021) A review on compound-protein interaction prediction methods: data, format, representation and model. Comput Struct Biotechnol J 19:1541–1556

    Article  Google Scholar 

  124. Karimi M, Wu D, Wang Z, Shen Y (2019) DeepAffinity: interpretable deep learning of compound-protein affinity through unified recurrent and convolutional neural networks. Bioinformatics 35:3329–3338

    Article  Google Scholar 

  125. Feinberg EN, Sur D, Wu Z, Husic BE, Mai H, Li Y, Sun S, Yang J, Ramsundar B, Pande VS (2018) PotentialNet for molecular property prediction. ACS Cent Sci 4:1520–1530

    Article  Google Scholar 

  126. Gomes J, Ramsundar B, Feinberg EN, Pande VS (2017) Atomic convolutional networks for predicting protein-ligand binding affinity. arXiv preprint arXiv:1703.10603

  127. Lim J, Ryu S, Park K, Choe YJ, Ham J, Kim WY (2019) Predicting drug-target interaction using a novel graph neural network with 3D structure-embedded graph representation. J Chem Inf Model 59:3981–3988

    Article  Google Scholar 

  128. Jiang D, Hsieh C-Y, Wu Z, Kang Y, Wang J, Wang E, Liao B, Shen C, Xu L, Wu J et al (2021) InteractionGraphNet: a novel and efficient deep graph representation learning framework for accurate protein-ligand interaction predictions. J Med Chem 64:18209–18232

    Article  Google Scholar 

  129. Morrone JA, Weber JK, Huynh T, Luo H, Cornell WD (2020) Combining docking pose rank and structure with deep learning improves protein-ligand binding mode prediction over a baseline docking approach. J Chem Inf Model 60:4170–4179

    Article  Google Scholar 

  130. Son J, Kim D (2021) Development of a graph convolutional neural network model for efficient prediction of protein-ligand binding affinities. PLoS ONE 16:e0249404

    Google Scholar 

  131. Knutson C, Bontha M, Bilbrey JA, Kumar N (2022) Decoding the protein-ligand interactions using parallel graph neural networks. Sci Rep 12:1–14

    Google Scholar 

  132. Torng W, Altman RB (2019) Graph convolutional neural networks for predicting drug-target interactions. J Chem Inf Model 59:4131–4149

    Google Scholar 

  133. Gao KY, Fokoue A, Luo H, Iyengar A, Dey S (2018) Interpretable drug target prediction using deep neural representation. IJCAI 2018:3371–3377

    Google Scholar 

  134. Nguyen T, Le H, Quinn TP, Nguyen T, Le TD, Venkatesh S (2021) GraphDTA: predicting drug-target binding affinity with graph neural networks. Bioinformatics 37:1140–1147

    Google Scholar 

  135. Kitchen DB, Decornez H, Furr JR, Bajorath J (2004) Docking and scoring in virtual screening for drug discovery: methods and applications. Nat Rev Drug Discov 3:935–949

    Google Scholar 

  136. Erickson JA, Jalaie M, Robertson DH, Lewis RA, Vieth M (2004) Lessons in molecular recognition: the effects of ligand and protein flexibility on molecular docking accuracy. J Med Chem 47:45–55

    Google Scholar 

  137. Jiang H, Wang J, Cong W, Huang Y, Ramezani M, Sarma A, Dokholyan NV, Mahdavi M, Kandemir MT (2022) Predicting protein-ligand docking structure with graph neural network. J Chem Inf Model 62:2923–2932

    Google Scholar 

  138. Méndez-Lucio O, Ahmad M, del Rio-Chanona EA, Wegner JK (2021) A geometric deep learning approach to predict binding conformations of bioactive molecules. Nat Mach Intell 3:1033–1039

    Google Scholar 

  139. Klebe G, Mietzner T (1994) A fast and efficient method to generate biologically relevant conformations. J Comput-Aided Mol Des 8:583–606

    Google Scholar 

  140. Li L, Cai M (2017) Drug target prediction by multi-view low rank embedding. IEEE/ACM Trans Comput Biol Bioinform 16:1712–1721

    Google Scholar 

  141. Stärk H, Ganea O, Pattanaik L, Barzilay R, Jaakkola T (2022) EquiBind: geometric deep learning for drug binding structure prediction. In: International conference on machine learning, pp 20503–20521

    Google Scholar 

  142. Lu W, Wu Q, Zhang J, Rao J, Li C, Zheng S (2022) TANKBind: trigonometry-aware neural networks for drug-protein binding structure prediction. bioRxiv

    Google Scholar 

  143. Hollingsworth SA, Dror RO (2018) Molecular dynamics simulation for all. Neuron 99:1129–1143. ISSN: 0896-6273. https://www.sciencedirect.com/science/article/pii/S0896627318306846

  144. Karplus M, McCammon JA (2002) Molecular dynamics simulations of biomolecules. Nat Struct Biol 9:646–652. ISSN: 1545-9985. https://doi.org/10.1038/nsb0902-646

  145. De Vivo M, Masetti M, Bottegoni G, Cavalli A (2016) Role of molecular dynamics and related methods in drug discovery. J Med Chem 59:4035–4061. PMID: 26807648. https://doi.org/10.1021/acs.jmedchem.5b01684

  146. Becke AD (2014) Perspective: fifty years of density-functional theory in chemical physics. J Chem Phys 140:18A301. eprint: https://doi.org/10.1063/1.4869598

  147. Harrison JA, Schall JD, Maskey S, Mikulski PT, Knippenberg MT, Morrow BH (2018) Review of force fields and intermolecular potentials used in atomistic computational materials research. Appl Phys Rev 5:031104. eprint: https://doi.org/10.1063/1.5020808

  148. Deringer VL, Caro MA, Csányi G (2019) Machine learning interatomic potentials as emerging tools for materials science. Adv Mater 31:1902765. eprint: https://onlinelibrary.wiley.com/doi/pdf/10.1002/adma.201902765

  149. Gkeka P, Stoltz G, Barati Farimani A, Belkacemi Z, Ceriotti M, Chodera JD, Dinner AR, Ferguson AL, Maillet J-B, Minoux H, Peter C, Pietrucci F, Silveira A, Tkatchenko A, Trstanova Z, Wiewiora R, Lelièvre T (2020) Machine learning force fields and coarse-grained variables in molecular dynamics: application to materials and biological systems. J Chem Theory Comput 16. PMID: https://doi.org/10.1021/acs.jctc.0c00355

  150. Noé F, Tkatchenko A, Müller K-R, Clementi C. Machine learning for molecular simulation. Annu Rev Phys Chem 71:361–390. PMID: 32092281. eprint: https://doi.org/10.1146/annurev-physchem-042018-052331

  151. Bartók AP, De S, Poelking C, Bernstein N, Kermode JR, Csányi G, Ceriotti M (2017) Machine learning unifies the modeling of materials and molecules. Sci Adv 3. eprint: https://advances.sciencemag.org/content/3/12/e1701816.full.pdf

  152. Li Y, Li H, Pickard FC, Narayanan B, Sen FG, Chan MKY, Sankaranarayanan SKRS, Brooks BR, Roux B (2017) Machine learning force field parameters from ab initio data. J Chem Theory Comput 13:4492–4503. PMID: 28800233. eprint: https://doi.org/10.1021/acs.jctc.7b00521

  153. Behler J (2011) Atom-centered symmetry functions for constructing high-dimensional neural network potentials. J Chem Phys 134:074106. eprint: https://doi.org/10.1063/1.3553717

  154. Behler J, Parrinello M (2007) Generalized neural-network representation of high-dimensional potential-energy surfaces. Phys Rev Lett 98:146401. https://link.aps.org/doi/10.1103/PhysRevLett.98.146401

  155. Behler J (2017) First principles neural network potentials for reactive simulations of large molecular and condensed systems. Angew Chem Int Ed 56:12828–12840. eprint: https://onlinelibrary.wiley.com/doi/pdf/10.1002/anie.201703114

  156. Behler J (2016) Perspective: machine learning potentials for atomistic simulations. J Chem Phys 145:170901. eprint: https://doi.org/10.1063/1.4966192

  157. Hu W, Shuaibi M, Das A, Goyal S, Sriram A, Leskovec J, Parikh D, Zitnick CL (2021) ForceNet: a graph neural network for large-scale quantum calculations. arXiv:2103.01436

  158. Mailoa JP, Kornbluth M, Batzner S, Samsonidze G, Lam ST, Vandermause J, Ablitt C, Molinari N, Kozinsky B (2019) A fast neural network approach for direct covariant forces prediction in complex multi-element extended systems. Nat Mach Intell 1:471–479. https://doi.org/10.1038/s42256-019-0098-0

  159. Li Z, Meidani K, Yadav P, Barati Farimani A (2022) Graph neural networks accelerated molecular dynamics. J Chem Phys 156:144103

    Article  Google Scholar 

  160. Wu F, Zhang Q, Jin X, Jiang Y, Li SZ (2022) A score-based geometric model for molecular dynamics simulations. arXiv:2204.08672

  161. Fu X, Xie T, Rebello NJ, Olsen BD, Jaakkola T (2022) Simulate time integrated coarse-grained molecular dynamics with geometric machine learning. arXiv:2204.10348

  162. Noé F, Olsson S, Köhler J, Wu H (2018) Boltzmann generators—sampling equilibrium states of many-body systems with deep learning. arXiv:1812.01729

  163. Elton DC, Boukouvalas Z, Fuge MD, Chung PW (2019) Deep learning for molecular design—a review of the state of the art. Mol Syst Des Eng 4:828–849

    Article  Google Scholar 

  164. DiMasi JA, Grabowski HG, Hansen RW (2016) Innovation in the pharmaceutical industry: new estimates of R&D costs. J Health Econ 47:20–33

    Article  Google Scholar 

  165. Yang X, Wang Y, Byrne R, Schneider G, Yang S (2019) Concepts of artificial intelligence for computer-assisted drug discovery. Chem Rev 119:10520–10594

    Article  Google Scholar 

  166. Dimitrov T, Kreisbeck C, Becker JS, Aspuru-Guzik A, Saikin SK (2019) Autonomous molecular design: then and now. ACS Appl Mater Interfaces 11:24825–24836

    Article  Google Scholar 

  167. Sun M, Zhao S, Gilvary C, Elemento O, Zhou J, Wang F (2020) Graph convolutional networks for computational drug development and discovery. Brief Bioinform 21:919–935

    Article  Google Scholar 

  168. Popova M, Shvets M, Oliva J, Isayev O (2019) MolecularRNN: generating realistic molecular graphs with optimized properties. arXiv preprint arXiv:1905.13372

  169. Bongini P, Bianchini M, Scarselli F (2021) Molecular generative graph neural networks for drug discovery. Neurocomputing 450:242–252

    Article  Google Scholar 

  170. Kingma DP, Welling M (2013) Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114

  171. Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y (2014) Generative adversarial nets. Adv Neural Inf Process Syst 27

    Google Scholar 

  172. Rezende D, Mohamed S (2015) Variational inference with normalizing flows. In: International conference on machine learning, pp 1530–1538

    Google Scholar 

  173. Song Y, Ermon S (2019) Generative modeling by estimating gradients of the data distribution. Adv Neural Inf Process Syst 32

    Google Scholar 

  174. Ho J, Jain A, Abbeel P (2020) Denoising diffusion probabilistic models. Adv Neural Inf Process Syst 33:6840–6851

    Google Scholar 

  175. Brown N, Fiscato M, Segler MH, Vaucher AC (2019) GuacaMol: benchmarking models for de novo molecular design. J Chem Inf Model 59:1096–1108

    Article  Google Scholar 

  176. Polykovskiy D, Zhebrak A, Sanchez-Lengeling B, Golovanov S, Tatanov O, Belyaev S, Kurbanov R, Artamonov A, Aladinskiy V, Veselov M et al (2020) Molecular sets (MOSES): a benchmarking platform for molecular generation models. Front Pharmacol 11:565644

    Article  Google Scholar 

  177. Preuer K, Renz P, Unterthiner T, Hochreiter S, Klambauer G (2018) Fréchet ChemNet distance: a metric for generative models for molecules in drug discovery. J Chem Inf Model 58:1736–1741

    Article  Google Scholar 

  178. Heusel M, Ramsauer H, Unterthiner T, Nessler B, Hochreiter S (2017) GANs trained by a two time-scale update rule converge to a local Nash equilibrium. Adv Neural Inf Process Syst 30

    Google Scholar 

  179. Gómez-Bombarelli R, Wei JN, Duvenaud D, Hernández-Lobato JM, Sánchez-Lengeling B, Sheberla D, Aguilera-Iparraguirre J, Hirzel TD, Adams RP, Aspuru-Guzik A (2018) Automatic chemical design using a data-driven continuous representation of molecules. ACS Cent Sci 4:268–276

    Article  Google Scholar 

  180. Ertl P, Schuffenhauer A (2009) Estimation of synthetic accessibility score of drug-like molecules based on molecular complexity and fragment contributions. J Cheminform 1:1–11

    Article  Google Scholar 

  181. Bickerton GR, Paolini GV, Besnard J, Muresan S, Hopkins AL (2012) Quantifying the chemical beauty of drugs. Nat Chem 4:90–98

    Article  Google Scholar 

  182. Gretton A, Borgwardt KM, Rasch MJ, Schölkopf B, Smola A (2012) A kernel two-sample test. J Mach Learn Res 13:723–773

    MathSciNet  MATH  Google Scholar 

  183. You J, Ying R, Ren X, Hamilton W, Leskovec J (2018) GraphRNN: generating realistic graphs with deep auto-regressive models. In: International conference on machine learning, pp 5708–5717

    Google Scholar 

  184. Friesner RA, Banks JL, Murphy RB, Halgren TA, Klicic JJ, Mainz DT, Repasky MP, Knoll EH, Shelley M, Perry JK et al (2004) Glide: a new approach for rapid, accurate docking and scoring. 1. Method and assessment of docking accuracy. J Med Chem 47:1739–1749

    Google Scholar 

  185. Eastman P, Swails J, Chodera JD, McGibbon RT, Zhao Y, Beauchamp KA, Wang L-P, Simmonett AC, Harrigan MP, Stern CD et al (2017) OpenMM 7: rapid development of high performance algorithms for molecular dynamics. PLoS Comput Biol 13:e1005659

    Article  Google Scholar 

  186. Jeon W, Kim D (2020) Autonomous molecule generation using reinforcement learning and docking to develop potential novel inhibitors. Sci Rep 10:1–11

    Article  Google Scholar 

  187. Hoogeboom E, Satorras VG, Vignac C, Welling M (2022) Equivariant diffusion for molecule generation in 3D. In: International conference on machine learning, pp 8867–8887

    Google Scholar 

  188. Gao W, Fu T, Sun J, Coley CW (2022) Sample efficiency matters: a benchmark for practical molecular optimization. arXiv preprint arXiv:2206.12411

  189. Blum LC, Reymond J-L (2009) 970 million druglike small molecules for virtual screening in the chemical universe database GDB-13. J Am Chem Soc 131:8732–8733

    Article  Google Scholar 

  190. Axelrod S, Gomez-Bombarelli R (2022) GEOM, energy-annotated molecular conformations for property prediction and molecular generation. Sci Data 9:1–14

    Article  Google Scholar 

  191. Grover A, Zweig A, Ermon S (2019) Graphite: iterative generative modelling of graphs. In: International conference on machine learning, pp 2434–2444

    Google Scholar 

  192. Olivecrona M, Blaschke T, Engkvist O, Chen H (2017) Molecular de-novo design through deep reinforcement learning. J Cheminform 9:1–14

    Article  Google Scholar 

  193. Guimaraes GL, Sanchez-Lengeling B, Outeiral C, Farias PLC, Aspuru-Guzik A (2017) Objective-reinforced generative adversarial networks (ORGAN) for sequence generation models. arXiv preprint arXiv:1705.10843

  194. Wang Y, Cao Z, Barati Farimani A (2021) Efficient water desalination with graphene nanopores obtained using artificial intelligence. npj 2D Mater Appl 5:1–9

    Google Scholar 

  195. Grebner C, Matter H, Plowright AT, Hessler G (2020) Automated de novo design in medicinal chemistry: which types of chemistry does a generative neural network learn? J Med Chem 63:8809–8823

    Article  Google Scholar 

  196. You J, Liu B, Ying Z, Pande V, Leskovec J (2018) Graph convolutional policy network for goal-directed molecular graph generation. Adv Neural Inf Process Syst 31

    Google Scholar 

  197. Jin W, Barzilay R, Jaakkola T (2020) Multi-objective molecule generation using interpretable substructures. In: International conference on machine learning, pp 4849–4859

    Google Scholar 

  198. Li Y, Zhang L, Liu Z (2018) Multi-objective de novo drug design with conditional graph generative model. J Cheminform 10:1–24

    Article  Google Scholar 

  199. Khemchandani Y, O’Hagan S, Samanta S, Swainston N, Roberts TJ, Bollegala D, Kell DB (2020) DeepGraphMolGen, a multi-objective, computational strategy for generating molecules with desirable properties: a graph convolution and reinforcement learning approach. J Cheminform 12:1–17

    Article  Google Scholar 

  200. Mercado R, Rastemo T, Lindelöf E, Klambauer G, Engkvist O, Chen H, Bjerrum EJ (2021) Graph networks for molecular design. Mach Learn Sci Technol 2:025023

    Article  Google Scholar 

  201. Podda M, Bacciu D, Micheli A (2020) A deep generative model for fragment-based molecule generation. In: International conference on artificial intelligence and statistics, pp 2240–2250

    Google Scholar 

  202. Chen Z, Min MR, Parthasarathy S, Ning X (2021) A deep generative model for molecule optimization via one fragment modification. Nat Mach Intell 3:1040–1049

    Article  Google Scholar 

  203. Lim J, Hwang S-Y, Moon S, Kim S, Kim WY (2020) Scaffold-based molecular design with a graph generative model. Chem Sci 11:1153–1164

    Article  Google Scholar 

  204. Xie Y, Shi C, Zhou H, Yang Y, Zhang W, Yu Y, Li L (2021) MARS: Markov molecular sampling for multi-objective drug discovery. arXiv preprint arXiv:2103.10432

  205. Shi C, Xu M, Zhu Z, Zhang W, Zhang M, Tang J (2020) GraphAF: a flow-based autoregressive model for molecular graph generation. arXiv preprint arXiv:2001.09382

  206. Luo Y, Yan K, Ji S (2021) GraphDF: a discrete flow model for molecular graph generation. In: International conference on machine learning, pp 7192–7203

    Google Scholar 

  207. Gebauer N, Gastegger M, Schütt K (2019) Symmetry-adapted generation of 3D point sets for the targeted discovery of molecules. Adv Neural Inf Process Syst 32

    Google Scholar 

  208. Gebauer NW, Gastegger M, Hessmann SS, Müller K-R, Schütt KT (2022) Inverse design of 3D molecular structures with conditional generative neural networks. Nat Commun 13:1–11

    Article  Google Scholar 

  209. Simm G, Pinsler R, Hernández-Lobato JM (2020) Reinforcement learning for molecular design guided by quantum mechanics. In: International conference on machine learning, pp 8959–8969

    Google Scholar 

  210. Flam-Shepherd D, Zhigalin A, Aspuru-Guzik A (2022) Scalable fragment-based 3D molecular design with reinforcement learning. arXiv preprint arXiv:2202.00658

  211. Luo Y, Ji S (2021) An autoregressive flow model for 3D molecular geometry generation from scratch. In: International conference on learning representations

    Google Scholar 

  212. Luo S, Guan J, Ma J, Peng J (2021) A 3D generative model for structure-based drug design. Adv Neural Inf Process Syst 34:6229–6239

    Google Scholar 

  213. Liu M, Luo Y, Uchino K, Maruhashi K, Ji S (2022) Generating 3D molecules for target protein binding. arXiv preprint arXiv:2204.09410

  214. Powers A, Yu H, Suriana P, Dror R (2022) Fragment-based ligand generation guided by geometric deep learning on protein-ligand structure. bioRxiv

    Google Scholar 

  215. Imrie F, Bradley AR, van der Schaar M, Deane CM (2020) Deep generative models for 3D linker design. J Chem Inf Model 60:1983–1995

    Article  Google Scholar 

  216. Blaschke T, Olivecrona M, Engkvist O, Bajorath J, Chen H (2018) Application of generative autoencoder in de novo molecular design. Mol Inform 37:1700123

    Article  Google Scholar 

  217. Kipf TN, Welling M (2016) Variational graph auto-encoders. arXiv preprint arXiv:1611.07308

  218. Kusner MJ, Paige B, Hernández-Lobato JM (2017) Grammar variational autoencoder. In: International conference on machine learning, pp 1945–1954

    Google Scholar 

  219. Vignac C, Frossard P (2022) Top-N: equivariant set and graph generation without exchangeability. In: International conference on learning representations

    Google Scholar 

  220. Simonovsky M, Komodakis N (2018) GraphVAE: towards generation of small graphs using variational autoencoders. In: International conference on artificial neural networks, pp 412–422

    Google Scholar 

  221. Kwon Y, Yoo J, Choi Y-S, Son W-J, Lee D, Kang S (2019) Efficient learning of non-autoregressive graph variational autoencoders for molecular graph generation. J Cheminform 11:1–10

    Article  Google Scholar 

  222. Ma T, Chen J, Xiao C (2018) Constrained generation of semantically valid graphs via regularizing variational autoencoders. Adv Neural Inf Process Syst 31

    Google Scholar 

  223. Bresson X, Laurent T (2019) A two-step graph convolutional decoder for molecule generation. arXiv preprint arXiv:1906.03412

  224. Jin W, Barzilay R, Jaakkola T (2018) Junction tree variational autoencoder for molecular graph generation. In: International conference on machine learning, pp 2323–2332

    Google Scholar 

  225. Jin W, Barzilay R, Jaakkola T (2020) Hierarchical generation of molecular graphs using structural motifs. In: International conference on machine learning, pp 4839–4848

    Google Scholar 

  226. Li Y, Hu J, Wang Y, Zhou J, Zhang L, Liu Z (2019) DeepScaffold: a comprehensive tool for scaffold-based de novo drug discovery using deep learning. J Chem Inf Model 60:77–91

    Article  Google Scholar 

  227. Mahmood O, Mansimov E, Bonneau R, Cho K (2021) Masked graph modelling for molecule generation. Nat Commun 12:1–12

    Article  Google Scholar 

  228. Kang S, Cho K (2018) Conditional molecular design with deep generative models. J Chem Inf Model 59:43–52

    Article  Google Scholar 

  229. Lim J, Ryu S, Kim JW, Kim WY (2018) Molecular generative model based on conditional variational autoencoder for de novo molecular design. J Cheminform 10:1–9

    Article  Google Scholar 

  230. Griffiths R-R, Hernández-Lobato JM (2020) Constrained Bayesian optimization for automatic chemical design using variational autoencoders. Chem Sci 11:577–586

    Article  Google Scholar 

  231. Chenthamarakshan V, Das P, Hoffman S, Strobelt H, Padhi I, Lim KW, Hoover B, Manica M, Born J, Laino T et al (2020) CogMol: target-specific and selective drug design for COVID-19 using deep generative models. Adv Neural Inf Process Syst 33:4320–4332

    Google Scholar 

  232. Jin W, Yang K, Barzilay R, Jaakkola T (2019) Learning multimodal graph-to-graph translation for molecular optimization. In: International conference on learning representations

    Google Scholar 

  233. Eckmann P, Sun K, Zhao B, Feng M, Gilson MK, Yu R (2022) LIMO: latent inceptionism for targeted molecule generation. arXiv preprint arXiv:2206.09010

  234. Wang H, Wang J, Wang J, Zhao M, Zhang W, Zhang F, Li W, Xie X, Guo M (2019) Learning graph representation with generative adversarial nets. IEEE Trans Knowl Data Eng 33:3090–3103

    Article  Google Scholar 

  235. De Cao N, Kipf T (2018) MolGAN: an implicit generative model for small molecular graphs. arXiv preprint arXiv:1805.11973

  236. Maziarka L, Pocha A, Kaczmarczyk J, Rataj K, Danel T, Warchol M (2020) Mol-CycleGAN: a generative model for molecular optimization. J Cheminform 12:1–18

    Google Scholar 

  237. Zhu J-Y, Park T, Isola P, Efros AA (2017) Unpaired image-to-image translation using cycle-consistent adversarial networks. In: Proceedings of the IEEE international conference on computer vision, pp 2223–2232

    Google Scholar 

  238. Tsujimoto Y, Hiwa S, Nakamura Y, Oe Y, Hiroyasu T (2021) L-MolGAN: an improved implicit generative model for large molecular graphs

    Google Scholar 

  239. Liu J, Kumar A, Ba J, Kiros J, Swersky K (2019) Graph normalizing flows. Adv Neural Inf Process Syst 32

    Google Scholar 

  240. Dinh L, Krueger D, Bengio Y (2014) NICE: non-linear independent components estimation. arXiv preprint arXiv:1410.8516

  241. Dinh L, Sohl-Dickstein J, Bengio S (2016) Density estimation using real NVP. arXiv preprint arXiv:1605.08803

  242. Madhawa K, Ishiguro K, Nakago K, Abe M (2019) GraphNVP: an invertible flow model for generating molecular graphs. arXiv preprint arXiv:1905.11600

  243. Zang C, Wang F (2020) MoFlow: an invertible flow model for generating molecular graphs. In: Proceedings of the 26th ACM SIGKDD international conference on knowledge discovery & data mining, pp 617–626

    Google Scholar 

  244. Niu C, Song Y, Song J, Zhao S, Grover A, Ermon S (2020) Permutation invariant graph generation via score-based generative modeling. In: International conference on artificial intelligence and statistics, pp 4474–4484

    Google Scholar 

  245. Song Y, Sohl-Dickstein J, Kingma DP, Kumar A, Ermon S, Poole B (2020) Score-based generative modeling through stochastic differential equations. arXiv preprint arXiv:2011.13456

  246. Trippe BL, Yim J, Tischer D, Broderick T, Baker D, Barzilay R, Jaakkola T (2022) Diffusion probabilistic modeling of protein backbones in 3D for the motif-scaffolding problem. arXiv preprint arXiv:2206.04119

  247. Axelrod S, Gomez-Bombarelli R (2020) Molecular machine learning with conformer ensembles. arXiv preprint arXiv:2012.08452

  248. AlQuraishi M, Sorger PK (2021) Differentiable biology: using deep learning for biophysics-based and data-driven modeling of molecular mechanisms. Nat Methods 18:1169–1180

    Article  Google Scholar 

  249. Hawkins PC (2017) Conformation generation: the state of the art. J Chem Inf Model 57:1747–1756

    Article  Google Scholar 

  250. Shi C, Luo S, Xu M, Tang J (2021) Learning gradient fields for molecular conformation generation. In: International conference on machine learning, pp 9558–9568

    Google Scholar 

  251. Kabsch W (1976) A solution for the best rotation to relate two sets of vectors. Acta Crystallogr Sect A Cryst Phys Diffr Theor Gen Crystallogr 32:922–923

    Article  Google Scholar 

  252. Ganea O, Pattanaik L, Coley C, Barzilay R, Jensen K, Green W, Jaakkola T (2021) GeoMol: torsional geometric generation of molecular 3D conformer ensembles. Adv Neural Inf Process Syst 34:13757–13769

    Google Scholar 

  253. Mansimov E, Mahmood O, Kang S, Cho K (2019) Molecular geometry prediction using a deep generative graph neural network. Sci Rep 9:1–13

    Google Scholar 

  254. Simm GN, Hernández-Lobato JM (2019) A generative model for molecular distance geometry. arXiv preprint arXiv:1909.11459

  255. Xu M, Luo S, Bengio Y, Peng J, Tang J (2021) Learning neural generative dynamics for molecular conformation generation. arXiv preprint arXiv:2102.10240

  256. Liberti L, Lavor C, Maculan N, Mucherino A (2014) Euclidean distance geometry and applications. SIAM Rev 56:3–69

    Article  MathSciNet  MATH  Google Scholar 

  257. Xu M, Wang W, Luo S, Shi C, Bengio Y, Gomez-Bombarelli R, Tang J (2021) An end-to-end framework for molecular conformation generation via bilevel programming. In: International conference on machine learning, pp 11537–11547

    Google Scholar 

  258. Luo S, Shi C, Xu M, Tang J (2021) Predicting molecular conformation via dynamic graph score matching. Adv Neural Inf Process Syst 34:19784–19795

    Google Scholar 

  259. Xu M, Yu L, Song Y, Shi C, Ermon S, Tang J (2022) GeoDiff: a geometric diffusion model for molecular conformation generation. In: International conference on learning representations

    Google Scholar 

  260. Jing B, Corso G, Chang J, Barzilay R, Jaakkola T (2022) Torsional diffusion for molecular conformer generation. arXiv preprint arXiv:2206.01729

  261. Gogineni T, Xu Z, Punzalan E, Jiang R, Kammeraad J, Tewari A, Zimmerman P (2020) TorsionNet: a reinforcement learning approach to sequential conformer search. Adv Neural Inf Process Syst 33:20142–20153

    Google Scholar 

  262. Kadurin A, Nikolenko S, Khrabrov K, Aliper A, Zhavoronkov A (2017) druGAN: an advanced generative adversarial autoencoder model for de novo generation of new molecules with desired molecular properties in silico. Mol Pharm 14:3098–3104

    Article  Google Scholar 

  263. Segler MH, Kogej T, Tyrchan C, Waller MP (2018) Generating focused molecule libraries for drug discovery with recurrent neural networks. ACS Cent Sci 4:120–131

    Article  Google Scholar 

  264. Coley CW, Barzilay R, Jaakkola TS, Green WH, Jensen KF (2017) Prediction of organic reaction outcomes using machine learning. ACS Cent Sci 3:434–443

    Google Scholar 

  265. Corey EJ (1967) General methods for the construction of complex molecules. Pure Appl Chem 14:19–38

    Article  Google Scholar 

  266. Coley CW, Green WH, Jensen KF (2018) Machine learning in computer-aided synthesis planning. Acc Chem Res 51:1281–1289

    Article  Google Scholar 

  267. Schneider N, Stiefl N, Landrum GA (2016) What’s what: the (nearly) definitive guide to reaction role assignment. J Chem Inf Model 56:2336–2346

    Article  Google Scholar 

  268. Shi C, Xu M, Guo H, Zhang M, Tang J (2020) A graph to graphs framework for retrosynthesis prediction. In: International conference on machine learning, pp 8818–8827

    Google Scholar 

  269. Sun R, Dai H, Li L, Kearnes S, Dai B (2020) Energy-based view of retrosynthesis. arXiv preprint arXiv:2007.13437

  270. Somnath VR, Bunne C, Coley C, Krause A, Barzilay R (2021) Learning graph models for retrosynthesis prediction. Adv Neural Inf Process Syst 34:9405–9415

    Google Scholar 

  271. Lin Z, Yin S, Shi L, Zhou W, Zhang Y (2022) G2GT: retrosynthesis prediction with graph to graph attention neural network and self-training. arXiv preprint arXiv:2204.08608

  272. Han P, Zhao P, Lu C, Huang J, Wu J, Shang S, Yao B, Zhang X (2022) GNN-retro: retrosynthetic planning with graph neural networks. In: Proceedings of the AAAI conference on artificial intelligence, vol 36, pp 4014–4021

    Google Scholar 

  273. Ryu JY, Kim HU, Lee SY (2018) Deep learning improves prediction of drug-drug and drug-food interactions. Proc Natl Acad Sci 115:E4304–E4311

    Article  Google Scholar 

  274. Niu J, Straubinger RM, Mager DE (2019) Pharmacodynamic drug-drug interactions. Clin Pharmacol Ther 105:1395–1406

    Article  Google Scholar 

  275. Karim MR, Cochez M, Jares JB, Uddin M, Beyan O, Decker S (2019) Drug-drug interaction prediction based on knowledge graph embeddings and convolutional-LSTM network. In: Proceedings of the 10th ACM international conference on bioinformatics, computational biology and health informatics, pp 113–123

    Google Scholar 

  276. Kanehisa M, Goto S (2000) KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res 28:27–30

    Article  Google Scholar 

  277. Zitnik M, Sosič R, Maheshwari S, Leskovec J (2018) BioSNAP datasets: Stanford biomedical network dataset collection. http://snap.stanford.edu/biodata

  278. Whirl-Carrillo M, McDonagh EM, Hebert J, Gong L, Sangkuhl K, Thorn C, Altman RB, Klein TE (2012) Pharmacogenomics knowledge for personalized medicine. Clin Pharmacol Ther 92:414–417

    Article  Google Scholar 

  279. Belleau F, Nolin M-A, Tourigny N, Rigault P, Morissette J (2008) Bio2RDF: towards a mashup to build bioinformatics knowledge systems. J Biomed Inform 41:706–716

    Article  Google Scholar 

  280. Feng Y-H, Zhang S-W, Shi J-Y (2020) DPDDI: a deep predictor for drug-drug interactions. BMC Bioinform 21:1–15

    Article  Google Scholar 

  281. Yu Y, Huang K, Zhang C, Glass LM, Sun J, Xiao C (2021) SumGNN: multi-typed drug interaction prediction via efficient knowledge graph summarization. Bioinformatics 37:2988–2995

    Article  Google Scholar 

  282. Lyu T, Gao J, Tian L, Li Z, Zhang P, Zhang J (2021) MDNN: a multimodal deep neural network for predicting drug-drug interaction events. IJCA I:3536–3542

    Google Scholar 

  283. Lin X, Quan Z, Wang Z-J, Ma T, Zeng X (2020) KGNN: knowledge graph neural network for drug-drug interaction prediction. IJCAI 380:2739–2745

    Google Scholar 

  284. Zhang Y, Li Z, Duan B, Qin L, Peng J (2022) MKGE: knowledge graph embedding with molecular structure information. Comput Biol Chem 107730

    Google Scholar 

  285. He C, Liu Y, Li H, Zhang H, Mao Y, Qin X, Liu L, Zhang X (2022) Multi-type feature fusion based on graph neural network for drug-drug interaction prediction. BMC Bioinform 23:1–18

    Article  Google Scholar 

  286. Feng Y-H, Zhang S-W (2022) Prediction of drug-drug interaction using an attention-based graph neural network on drug molecular graphs. Molecules 27:3004

    Article  Google Scholar 

  287. Nyamabo AK, Yu H, Shi J-Y (2021) SSI-DDI: substructure-substructure interactions for drug-drug interaction prediction. Brief Bioinform 22:bbab133

    Google Scholar 

  288. Nyamabo AK, Yu H, Liu Z, Shi J-Y (2022) Drug-drug interaction prediction with learnable size-adaptive molecular substructures. Brief Bioinform 23:bbab441

    Google Scholar 

  289. Zitnik M, Sosič R, Feldman MW, Leskovec J (2019) Evolution of resilience in protein interactomes across the tree of life. Proc Natl Acad Sci 116:4426–4433

  290. Yang F, Fan K, Song D, Lin H (2020) Graph-based prediction of protein-protein interactions with attributed signed graph embedding. BMC Bioinform 21:1–16

  291. Garay-Ruiz D, Bo C (2022) Chemical reaction network knowledge graphs: the OntoRXN ontology. J Cheminform 14:1–12

  292. Zitnik M, Agrawal M, Leskovec J (2018) Modeling polypharmacy side effects with graph convolutional networks. Bioinformatics 34:i457–i466

  293. Jumper J, Evans R, Pritzel A, Green T, Figurnov M, Ronneberger O, Tunyasuvunakool K, Bates R, Žídek A, Potapenko A et al (2021) Highly accurate protein structure prediction with AlphaFold. Nature 596:583–589

  294. Baek M, DiMaio F, Anishchenko I, Dauparas J, Ovchinnikov S, Lee GR, Wang J, Cong Q, Kinch LN, Schaeffer RD et al (2021) Accurate prediction of protein structures and interactions using a three-track neural network. Science 373:871–876

  295. Mirdita M, Schütze K, Moriwaki Y, Heo L, Ovchinnikov S, Steinegger M (2022) ColabFold: making protein folding accessible to all. Nat Methods 1–4

  296. Lin Z, Akin H, Rao R, Hie B, Zhu Z, Lu W, dos Santos Costa A, Fazel-Zarandi M, Sercu T, Candido S et al (2022) Language models of protein sequences at the scale of evolution enable accurate structure prediction. bioRxiv

  297. Spalević S, Veličković P, Kovačević J, Nikolić M (2020) Hierarchical protein function prediction with tail-GNNs. arXiv preprint arXiv:2007.12804

  298. Gligorijević V, Renfrew PD, Kosciolek T, Leman JK, Berenberg D, Vatanen T, Chandler C, Taylor BC, Fisk IM, Vlamakis H et al (2021) Structure-based protein function prediction using graph convolutional networks. Nat Commun 12:1–14

  299. Evans R, O’Neill M, Pritzel A, Antropova N, Senior AW, Green T, Žídek A, Bates R, Blackwell S, Yim J et al (2021) Protein complex prediction with AlphaFold-Multimer. bioRxiv

  300. Yan Z, Hamilton WL, Blanchette M (2020) Graph neural representational learning of RNA secondary structures for predicting RNA-protein interactions. Bioinformatics 36:i276–i284

  301. Strokach A, Becerra D, Corbi-Verge C, Perez-Riba A, Kim PM (2020) Fast and flexible protein design using deep graph neural networks. Cell Syst 11:402–411

  302. Reymond J-L, Ruddigkeit L, Blum L, Van Deursen R (2012) The enumeration of chemical space. Wiley Interdiscip Rev Comput Mol Sci 2:717–733

  303. Hadsell R, Chopra S, LeCun Y (2006) Dimensionality reduction by learning an invariant mapping. In: Proceedings of the IEEE conference on computer vision and pattern recognition, vol 2, pp 1735–1742

  304. Zhang R, Isola P, Efros AA (2016) Colorful image colorization. In: European conference on computer vision, pp 649–666

  305. Gidaris S, Singh P, Komodakis N (2018) Unsupervised representation learning by predicting image rotations. arXiv preprint arXiv:1803.07728

  306. Pathak D, Krahenbuhl P, Donahue J, Darrell T, Efros AA (2016) Context encoders: feature learning by inpainting. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2536–2544

  307. He K, Chen X, Xie S, Li Y, Dollár P, Girshick R (2022) Masked autoencoders are scalable vision learners. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 16000–16009

  308. Devlin J, Chang M-W, Lee K, Toutanova K (2018) BERT: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805

  309. Liu Y, Ott M, Goyal N, Du J, Joshi M, Chen D, Levy O, Lewis M, Zettlemoyer L, Stoyanov V (2019) RoBERTa: a robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692

  310. He K, Fan H, Wu Y, Xie S, Girshick R (2020) Momentum contrast for unsupervised visual representation learning. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 9729–9738

  311. Chen T, Kornblith S, Norouzi M, Hinton G (2020) A simple framework for contrastive learning of visual representations. In: International conference on machine learning, pp 1597–1607

  312. Bengio Y, Lecun Y, Hinton G (2021) Deep learning for AI. Commun ACM 64:58–65

  313. Xie S, Gu J, Guo D, Qi CR, Guibas L, Litany O (2020) PointContrast: unsupervised pre-training for 3D point cloud understanding. In: European conference on computer vision, pp 574–591

  314. Gao T, Yao X, Chen D (2021) SimCSE: simple contrastive learning of sentence embeddings. arXiv preprint arXiv:2104.08821

  315. Magar R, Wang Y, Farimani AB (2022) Crystal twins: self-supervised learning for crystalline material property prediction. arXiv preprint arXiv:2205.01893

  316. Grill J-B, Strub F, Altché F, Tallec C, Richemond P, Buchatskaya E, Doersch C, Avila Pires B, Guo Z, Gheshlaghi Azar M et al (2020) Bootstrap your own latent—a new approach to self-supervised learning. Adv Neural Inf Process Syst 33:21271–21284

  317. Chen X, He K (2021) Exploring simple siamese representation learning. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 15750–15758

  318. Bardes A, Ponce J, LeCun Y (2021) VICReg: variance-invariance-covariance regularization for self-supervised learning. arXiv preprint arXiv:2105.04906

  319. Xie Y, Xu Z, Zhang J, Wang Z, Ji S (2022) Self-supervised learning of graph neural networks: a unified review. IEEE Trans Pattern Anal Mach Intell

  320. Fang Y, Zhang Q, Chen Z, Fan X, Chen H (2022) Knowledge-informed molecular learning: a survey on paradigm transfer. arXiv preprint arXiv:2202.10587

  321. Zhang Z, Liu Q, Wang H, Lu C, Lee C-K (2021) Motif-based graph self-supervised learning for molecular property prediction. Adv Neural Inf Process Syst 34

  322. He J, Tian K, Luo S, Min Y, Zheng S, Shi Y, He D, Liu H, Yu N, Wang L et al (2022) Masked molecule modeling: a new paradigm of molecular representation learning for chemistry understanding

  323. Liu S, Demirel MF, Liang Y (2019) N-gram graph: simple unsupervised representation for graphs, with applications to molecules. Adv Neural Inf Process Syst 32

  324. Sun F-Y, Hoffman J, Verma V, Tang J (2019) InfoGraph: unsupervised and semi-supervised graph-level representation learning via mutual information maximization. In: International conference on learning representations

  325. Fang X, Liu L, Lei J, He D, Zhang S, Zhou J, Wang F, Wu H, Wang H (2022) Geometry-enhanced molecular representation learning for property prediction. Nat Mach Intell 4:127–134

  326. Li S, Zhou J, Xu T, Dou D, Xiong H (2022) GeomGCL: geometric graph contrastive learning for molecular property prediction. In: Proceedings of the AAAI conference on artificial intelligence, vol 36, pp 4541–4549

  327. Zhou G, Gao Z, Ding Q, Zheng H, Xu H, Wei Z, Zhang L, Ke G (2022) Uni-Mol: a universal 3D molecular representation learning framework

  328. Liu S, Wang H, Liu W, Lasenby J, Guo H, Tang J (2022) Pre-training molecular graph representation with 3D geometry. In: International conference on learning representations

  329. Stärk H, Beaini D, Corso G, Tossou P, Dallago C, Günnemann S, Lió P (2022) 3D Infomax improves GNNs for molecular property prediction. In: Proceedings of the 39th international conference on machine learning

  330. Zaidi S, Schaarschmidt M, Martens J, Kim H, Teh YW, Sanchez-Gonzalez A, Battaglia P, Pascanu R, Godwin J (2022) Pre-training via denoising for molecular property prediction. arXiv preprint arXiv:2206.00133

  331. Liu S, Guo H, Tang J (2022) Molecular geometry pretraining with SE(3)-invariant denoising distance matching. arXiv preprint arXiv:2206.13602

  332. Chen D, Gao K, Nguyen DD, Chen X, Jiang Y, Wei G-W, Pan F (2021) Algebraic graph-assisted bidirectional transformers for molecular property prediction. Nat Commun 12:1–9

  333. Jiao R, Han J, Huang W, Rong Y, Liu Y (2022) 3D equivariant molecular graph pretraining. arXiv preprint arXiv:2207.08824

  334. Wang Y, Xu C, Li Z, Farimani AB (2023) Denoise pre-training on nonequilibrium molecules for accurate and transferable neural potentials. arXiv preprint arXiv:2303.02216

  335. Wang Y, Wang J, Cao Z, Barati Farimani A (2022) Molecular contrastive learning of representations via graph neural networks. Nat Mach Intell 1–9

  336. Zhang S, Hu Z, Subramonian A, Sun Y (2020) Motif-driven contrastive learning of graph representations. arXiv preprint arXiv:2012.12533

  337. Zhu J, Xia Y, Qin T, Zhou W, Li H, Liu T-Y (2021) Dual-view molecule pre-training. arXiv preprint arXiv:2106.10234

  338. Zhu Y, Chen D, Du Y, Wang Y, Liu Q, Wu S (2022) Featurizations matter: a multiview contrastive learning approach to molecular pretraining. In: ICML 2022 2nd AI for science workshop

  339. Wang Y, Magar R, Liang C, Barati Farimani A (2022) Improving molecular contrastive learning via faulty negative mitigation and decomposed fragment contrast. J Chem Inf Model

  340. Fang Y, Yang H, Zhuang X, Shao X, Fan X, Chen H (2021) Knowledge-aware contrastive molecular graph learning. arXiv preprint arXiv:2103.13047

  341. Sun M, Xing J, Wang H, Chen B, Zhou J (2021) MoCL: data-driven molecular fingerprint via knowledge-aware contrastive learning from molecular graph. In: Proceedings of the 27th ACM SIGKDD conference on knowledge discovery & data mining, pp 3585–3594

  342. Fang Y, Zhang Q, Yang H, Zhuang X, Deng S, Zhang W, Qin M, Chen Z, Fan X, Chen H (2022) Molecular contrastive learning with chemical element knowledge graph. In: Proceedings of the AAAI conference on artificial intelligence, vol 36, pp 3968–3976

  343. Gao Z, Tan C, Wu L, Li SZ (2022) CoSP: co-supervised pretraining of pocket and ligand. arXiv preprint arXiv:2206.12241

Author information

Corresponding author

Correspondence to Amir Barati Farimani.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Wang, Y., Li, Z., Barati Farimani, A. (2023). Graph Neural Networks for Molecules. In: Qu, C., Liu, H. (eds) Machine Learning in Molecular Sciences. Challenges and Advances in Computational Chemistry and Physics, vol 36. Springer, Cham. https://doi.org/10.1007/978-3-031-37196-7_2
