Deep Learning and Computational Chemistry

Part of the Methods in Molecular Biology book series (MIMB, volume 2390)

Abstract

Within the context of the latest resurgence in the application of artificial intelligence approaches, deep learning has undergone a renaissance over recent years. These methods have been applied to a number of problems in computational chemistry. Compared to other machine learning approaches, the practical performance advantages of deep neural networks are often unclear. However, deep learning does appear to offer a number of other advantages such as the facile incorporation of multitask learning and the enhancement of generative modeling. The high complexity of contemporary network architectures represents a potentially significant barrier to their future adoption due to the costs of training such models and challenges in interpreting their predictions. When combined with the relative paucity of very large datasets, it is interesting to reflect on whether deep learning is likely to have the kind of transformational impact on computational chemistry that it is commonly held to have had in other domains such as image recognition.

Key words

  • AI
  • Artificial intelligence
  • Computational chemistry
  • Deep learning
  • Explainable AI
  • Generative models
  • Interpretability
  • Machine learning
  • Quantitative structure-activity relationships
  • QSAR
  • Virtual screening

Author information

Corresponding author

Correspondence to Tim James.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature

Cite this protocol

James, T., Hristozov, D. (2022). Deep Learning and Computational Chemistry. In: Heifetz, A. (ed.) Artificial Intelligence in Drug Design. Methods in Molecular Biology, vol 2390. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-1787-8_5

  • DOI: https://doi.org/10.1007/978-1-0716-1787-8_5

  • Publisher Name: Humana, New York, NY

  • Print ISBN: 978-1-0716-1786-1

  • Online ISBN: 978-1-0716-1787-8

  • eBook Packages: Springer Protocols