Deep Molecular Representation in Cheminformatics

Handbook of Deep Learning Applications

Abstract

Quantum-chemical descriptors are powerful predictors for discovering and designing new materials with desired properties. These descriptors are often calculated with wave-function-based methods, which are time-consuming. Recently, machine learning models have been used to predict quantum-chemical descriptors because of their computational advantages. However, it is difficult to generate a suitable molecular representation for training. This work reviews recent molecular representation techniques and then employs variational autoencoders to encode the Bag-of-Bond molecular representation. The encoded representation reduces the dimensionality of the features and extracts the essential information through a deep neural network structure. Results on a benchmark dataset show that the deep encoded molecular representation outperforms the Bag-of-Bond representation in predicting electronic quantum-chemical descriptors.
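To make the input to the autoencoder concrete, the following is a minimal sketch of how a Bag-of-Bond style feature vector might be built for one molecule. It is not the chapter's implementation. It assumes the common formulation in which each atom pair contributes the off-diagonal Coulomb-matrix term Z_a Z_b / |R_a - R_b|, values are grouped by element pair, sorted, and zero-padded to a fixed length. The function name `bag_of_bonds`, the `bag_sizes` argument, and the water geometry are illustrative assumptions, and single-atom (diagonal) terms are left out for brevity.

```python
import math
from collections import defaultdict
from itertools import combinations

# Nuclear charges for a few light elements (enough for this illustration).
NUCLEAR_CHARGE = {"H": 1, "C": 6, "N": 7, "O": 8, "F": 9}


def bag_of_bonds(atoms, bag_sizes):
    """Return a fixed-length Bag-of-Bond style vector for one molecule.

    atoms     : list of (element_symbol, (x, y, z)) tuples in consistent units
    bag_sizes : dict mapping a sorted element pair such as ("C", "H") to the
                padded length of that bag (hypothetical argument; in practice
                the largest bag seen across the whole dataset)
    """
    bags = defaultdict(list)
    for (el_a, pos_a), (el_b, pos_b) in combinations(atoms, 2):
        distance = math.dist(pos_a, pos_b)
        # Off-diagonal Coulomb-matrix entry: Z_a * Z_b / |R_a - R_b|
        value = NUCLEAR_CHARGE[el_a] * NUCLEAR_CHARGE[el_b] / distance
        bags[tuple(sorted((el_a, el_b)))].append(value)

    vector = []
    for pair in sorted(bag_sizes):  # fixed bag order across molecules
        entries = sorted(bags.get(pair, []), reverse=True)
        if len(entries) > bag_sizes[pair]:
            raise ValueError(f"bag {pair} is larger than its padded size")
        # Sorting plus zero-padding makes the vector independent of atom order
        # and gives every molecule the same length.
        vector.extend(entries + [0.0] * (bag_sizes[pair] - len(entries)))
    return vector


# Approximate water geometry in angstroms (illustrative values only).
water = [
    ("O", (0.000, 0.000, 0.000)),
    ("H", (0.757, 0.586, 0.000)),
    ("H", (-0.757, 0.586, 0.000)),
]
sizes = {("C", "O"): 1, ("H", "H"): 1, ("H", "O"): 2}
vector = bag_of_bonds(water, sizes)
print(len(vector), vector)
```

Because every molecule maps to a vector of the same length, a set of such vectors can be stacked into a matrix and passed to an encoder network. The padded bags are typically sparse for small molecules, which is one reason a learned lower-dimensional encoding can be attractive.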

Author information

Corresponding author

Correspondence to Mojtaba Maghrebi.

Copyright information

© 2019 Springer Nature Switzerland AG

Cite this chapter

Jiang, P., Saydam, S., Ramandi, H.L., Crosky, A., Maghrebi, M. (2019). Deep Molecular Representation in Cheminformatics. In: Balas, V., Roy, S., Sharma, D., Samui, P. (eds) Handbook of Deep Learning Applications. Smart Innovation, Systems and Technologies, vol 136. Springer, Cham. https://doi.org/10.1007/978-3-030-11479-4_8
